MFXVideoDECODE_DecodeHeader and Windows Media Foundation

chatelier__pierre · ‎04-07-2022

[context]

I want to use oneAPI/vpl/MFXVideoDECODE to decode raw samples provided by an IMFSourceReader (part of Microsoft Windows Media Foundation). That way, I delegate the media reading to Media Foundation and let the H264 decoding happen through VPL. I need to do that explicitly; I don't want to use "plugins".

[what works]

I have a valid AVI file containing valid H264 (AVC1) samples that can be played by any movie player.

I can myself handle that file with the Media Foundation API.

[what does not work]

In order to call MFXVideoDECODE_Init() , I need a valid mfxVideoParam structure filled with information matching the movie.

Since it seems really difficult to correctly fill an mfxVideoParam by hand, and since there is no way to fill it from an IMFMediaType, I have to resort to MFXVideoDECODE_DecodeHeader().
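On the frame-size part of filling the structure by hand: the oneVPL documentation describes mfxFrameInfo Width as a multiple of 16 and Height as a multiple of 16 for progressive content (32 otherwise), with CropW/CropH carrying the visible size. A minimal sketch of that arithmetic follows. FrameSize and make_frame_size are hypothetical names used for illustration only, not part of oneVPL, and the display size is assumed to come from the IMFMediaType (MF_MT_FRAME_SIZE).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the size-related fields of mfxFrameInfo.
// The real structure lives in the oneVPL headers and has many more fields.
struct FrameSize {
    std::uint16_t Width, Height;               // aligned surface size
    std::uint16_t CropX, CropY, CropW, CropH;  // visible region
};

// Round v up to the next multiple of a (a must be a power of two).
inline std::uint16_t align_up(std::uint16_t v, std::uint16_t a) {
    assert(a != 0 && (a & (a - 1)) == 0);
    return static_cast<std::uint16_t>((v + a - 1) & ~(a - 1));
}

// Derive aligned and visible sizes from the display size of the stream.
inline FrameSize make_frame_size(std::uint16_t display_w,
                                 std::uint16_t display_h,
                                 bool progressive) {
    FrameSize f{};
    f.Width  = align_up(display_w, 16);
    f.Height = align_up(display_h, progressive ? 16 : 32);
    f.CropX  = 0;
    f.CropY  = 0;
    f.CropW  = display_w;
    f.CropH  = display_h;
    return f;
}
```

For a 1920x1080 progressive stream this gives Width 1920 and Height 1088, with CropW/CropH left at 1920x1080. The remaining fields (FourCC, ChromaFormat, frame rate and so on) are not covered by this sketch.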

However, whatever I send to MFXVideoDECODE_DecodeHeader(), it pretends to succeed but does not fill the structure correctly, making the next call to MFXVideoDECODE_Init() fail miserably.

[some details]

I tried to feed the input bitstream of MFXVideoDECODE_DecodeHeader() with:

-the complete raw file bytes

-the first IMFMediaSample from the IMFSourceReader

-the MF_MT_MPEG_SEQUENCE_HEADER blob from the IMFMediaType (that contains the NALUs)

-as expected, I also manually set the mfx.CodecId of the mfxVideoParam to MFX_CODEC_AVC

In every case, MFXVideoDECODE_DecodeHeader() returns success, but no mfx.FrameInfo fields are filled, and of course MFXVideoDECODE_Init() then fails.
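One check that can be run on each of the buffers listed above is whether it actually contains an SPS and a PPS in Annex B form (NAL units preceded by a 00 00 01 start code). The sketch below assumes that the decoder looks for Annex B start codes; this thread does not establish that this was the cause of the problem. The function names are hypothetical, and the scan is a heuristic: a length-prefixed buffer can occasionally contain the same byte pattern by accident.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Returns the nal_unit_type (low 5 bits of the first NAL byte) of every NAL
// unit that follows a 00 00 01 start code in buf. A 4-byte start code
// (00 00 00 01) ends with the same three bytes, so it is matched as well.
inline std::vector<int> annexb_nal_types(const std::vector<std::uint8_t>& buf) {
    std::vector<int> types;
    for (std::size_t i = 0; i + 3 < buf.size(); ++i) {
        if (buf[i] == 0 && buf[i + 1] == 0 && buf[i + 2] == 1) {
            types.push_back(buf[i + 3] & 0x1F);
            i += 2;  // skip the rest of the start code
        }
    }
    return types;
}

// True if the buffer carries both an SPS (type 7) and a PPS (type 8).
inline bool has_sps_and_pps(const std::vector<std::uint8_t>& buf) {
    bool sps = false, pps = false;
    for (int t : annexb_nal_types(buf)) {
        sps = sps || (t == 7);
        pps = pps || (t == 8);
    }
    return sps && pps;
}
```

If has_sps_and_pps() is false for a buffer, the header information may still be present but stored in another layout, for example with length prefixes instead of start codes.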

[the question]

What am I supposed to do to get a correct mfxVideoParam from a sequence opened in Windows Media Foundation?

chatelier__pierre · ‎04-08-2022

(Follow-up)

Problems pile up, driving me crazy and making me believe that oneAPI VPL is just not ready for real usage...

-If I manually fill the mfxVideoParam.mfx.FrameInfo.Width/Height/CropW/CropH fields, I can finally call MFXVideoDECODE_Query() and get an mfxVideoParam accepted by MFXVideoDECODE_Init() (see the initial post in this thread).

-First problem: if, just after a successful MFXVideoDECODE_Init(session, &localDecodeParams), I call MFXVideoDECODE_Reset(session, &decodeParams), I get an "unsupported params" error. This is just nonsense.

-Now, let's assume I can use my mfxSession (because it kind of works a little anyway).

-I can use an IMFSourceReader to read raw H264 samples from my media file. Since I know how an AVC stream works, I know that some samples are I frames and others are P frames. When I want to decode a frame at time <t>, I use IMFSourceReader::SetCurrentPosition() to move to the closest I frame before <t>, then call IMFSourceReader::ReadSample() until timestamp <t> is reached. This is a scheme I regularly use and I know I am doing it correctly.

-So, when decoding a frame at time <t>, I can just fill an mfxBitstream with all the required raw samples I, P, P... and expect a single call to MFXVideoDECODE_DecodeFrameAsync() to output a frame (the one of interest being here the last one after all bytes are consumed)

-But it is not the case. MFXVideoDECODE_DecodeFrameAsync() will always return MFX_ERR_MORE_DATA, even for the very first raw sample at time 0.

-Obviously, this is wrong, there is enough data. I tried to add the data flag MFX_BITSTREAM_COMPLETE_FRAME, but it won't help. Of course, my decodeparam.AsyncDepth is 1 to avoid latency.

-You can also note that all bytes are consumed in the bitstream: the DataLength field is set to 0 by the call to MFXVideoDECODE_DecodeFrameAsync().

-I also tried to call MFXVideoDECODE_DecodeFrameAsync() for each raw sample, expecting the last one to be ok, but the problem is the same: I still get only MFX_ERR_MORE_DATA.

-If I cheat and keep adding the next samples to the bitstream, I can eventually get a surface to decode, but it is usually corrupted (most pixels are good, many are wrong).

-Sometimes, while cheating and sending extra samples, it can even happen that my input stream becomes exhausted (I reach end of stream) and MFXVideoDECODE_DecodeFrameAsync() still outputs MFX_ERR_MORE_DATA. Very unpleasant.

-When I get a surface (thanks to extra samples), I can make a (corrupted) image with it. But when I try to move to another time position, there is usually a hysteresis behaviour that just doesn't give me the expected image afterwards. I would like to be able to call MFXVideoDECODE_Reset(), but since it produces an error from the very beginning, I cannot rely on it.

-Maybe it is an async-related problem? MFXVideoDECODE_DecodeFrameAsync() is supposed to be asynchronous, but how am I supposed to wait for completion? I need to call MFXVideoDECODE_DecodeFrameAsync() to get the syncpoint, in order to call MFXVideoCORE_SyncOperation(), but afterwards I can't call MFXVideoDECODE_DecodeFrameAsync() any more, since all my bytes have been consumed. How can the output of MFXVideoDECODE_DecodeFrameAsync() be synchronous while the function itself is asynchronous? This is API nonsense (or very badly documented).

-Maybe the answer is FrameInterface->Synchronize(), but it requires that MFXVideoDECODE_DecodeFrameAsync() was successful to get a valid surface providing the FrameInterface...

It seems to me that MFXVideoDECODE_DecodeFrameAsync() is mostly broken in my real usage scenario.

In previous betas of oneAPI VPL, there was a vpl::WorkStream::DecodeFrame() that just worked! It was doing the job correctly, giving me the expected vpl_mem surfaces.

I did not have such problems with the Intel Media SDK either. Things seem to have degraded with the new mfx implementation on top of the VPL API.

Can you give hints about workarounds?

AlekhyaV_Intel · ‎04-08-2022

Hi Pierre,

Thank you for posting in Intel Communities. H264 input format is not supported in oneVPL. Currently oneVPL supports H265 input format for decoding. So, I am afraid that none of the modifications to mfxVideoParam would help.

You could try Intel Media SDK, as it supports decoding AVC format video files too. Please refer to the documentation below, which lists the supported input formats for Intel Media SDK:

For Linux:

https://github.com/Intel-Media-SDK/MediaSDK This consists of supported encoders & decoders in 'Readme.rst' file.

https://github.com/Intel-Media-SDK/MediaSDK/blob/master/doc/samples/readme-decode_linux.md

Please download Intel Media SDK from here: https://www.intel.com/content/www/us/en/developer/tools/media-sdk/choose-download.html

Regards,

Alekhya

chatelier__pierre · ‎04-08-2022

The documentation says that the CPU implementation does support H264:

https://github.com/oneapi-src/oneVPL-cpu/blob/master/README.md

Also, you are obviously wrong in assuming that it is not supported, because when I cheat and push extra samples to get a surface, the frame IS indeed decoded and, apart from some corrupted pixels (certainly because of the I/P mess), it is mostly correct.

Moreover, since the Intel Media SDK is deprecated, I expected oneVPL to totally supersede it?!

Lacking AVC support in oneVPL means that I still have to maintain my code based on Intel Media SDK with its own dispatcher, while at the same time having a completely different branch for other codecs that would go through VPL and its new dispatch architecture.

This feels very uncomfortable for long-term code maintenance.

chatelier__pierre · ‎04-08-2022

If you want, you can check all the problems I reported in the attached minimal project.

chatelier__pierre · ‎04-08-2022

A new proof that something is broken in MFXVideoDECODE_DecodeFrameAsync().

So, I explained how MFXVideoDECODE_DecodeFrameAsync(session, bs, nullptr, &decSurfaceOut, &syncp) was not providing the expected surface right after the very first I-frame sample.

But if I immediately call a drain with MFXVideoDECODE_DecodeFrameAsync(session, nullptr, nullptr, &decSurfaceOut, &syncp), the expected surface is delivered!

Of course, the problem is that after such a drain, I would have to close/reinit the whole session to submit new samples (otherwise MFX_ERR_ABORTED is raised).

For me, it means that there IS enough data after my samples submission, but the internals of MFXVideoDECODE_DecodeFrameAsync() are just user-unfriendly and won't deliver them correctly on demand.
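One plausible reading of this behaviour, not confirmed by Intel in this thread, is that the decoder holds a frame back until it has seen what follows it, and only releases it on the next submission or on a drain (a call with no input bitstream). The toy model below illustrates that pattern only. ToyDecoder, Status and decode_all are hypothetical names; none of this is the oneVPL API, and frames are represented by plain integers.

```cpp
#include <vector>

enum class Status { Ok, MoreData };

// Toy model only, NOT the oneVPL API. It holds one frame back until either
// the next frame is submitted or the caller drains by passing no input.
class ToyDecoder {
public:
    // input == nullptr means "drain".
    Status decode(const int* input, int& out) {
        if (input == nullptr) {
            if (!has_pending_) return Status::MoreData;  // nothing buffered
            out = pending_;
            has_pending_ = false;
            return Status::Ok;
        }
        const bool had_pending = has_pending_;
        const int previous = pending_;
        pending_ = *input;      // the new frame is now the one held back
        has_pending_ = true;
        if (!had_pending) return Status::MoreData;
        out = previous;         // release the frame submitted earlier
        return Status::Ok;
    }

private:
    bool has_pending_ = false;
    int pending_ = 0;
};

// Feed every sample, then drain. Returns the frames in output order.
inline std::vector<int> decode_all(ToyDecoder& dec,
                                   const std::vector<int>& samples) {
    std::vector<int> frames;
    int out = 0;
    for (const int& s : samples) {
        if (dec.decode(&s, out) == Status::Ok) frames.push_back(out);
    }
    while (dec.decode(nullptr, out) == Status::Ok) frames.push_back(out);
    return frames;
}
```

In this model, the first submission always reports MoreData even though the frame is complete, and the frame only appears on the drain call, which matches the observation above. Whether the real decoder behaves this way for the same reason is an assumption.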

Ying_Guo_VPL · ‎04-09-2022

Hi Chatelier, Thanks for reaching out to us. Can we start with an out-of-box issue before discussing issues that require more complex steps to reproduce? Let's start with this: when you build our repository without any changes, what still does not work for you? Thanks...ying

chatelier__pierre · ‎04-10-2022

Hello,

I am sorry, but I do not understand what you mean by "building our repository": I just want to deploy the oneAPI VPL SDK that you distribute, and build software upon it.

When I say "it is broken", I mean that, compared to the deprecated Intel Media SDK or the early oneAPI VPL beta (with vpl::WorkStream usage), it fails at delivering the same service.

The service I am talking about is "decoding frames synchronously at random positions in a media where raw samples are provided by an external API".
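The sample-selection part of that service (seek to the closest I frame before the target time, then feed samples up to the target) can be sketched independently of any decoder. The sketch assumes a stream with only I and P frames, so that decode order equals presentation order, as described earlier in this thread. SampleInfo and decode_range are hypothetical names, and the timestamps stand for whatever IMFSourceReader::ReadSample() reports.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

struct SampleInfo {
    std::int64_t time;  // presentation time, in any consistent unit
    bool keyframe;      // true for an I frame
};

// Returns the inclusive index range [first, last] of the samples to feed to
// the decoder in order to obtain the frame shown at time t: from the closest
// keyframe at or before t up to the last sample whose time is <= t.
// Precondition: samples are sorted by time, and samples[0] is a keyframe
// with time <= t.
inline std::pair<std::size_t, std::size_t>
decode_range(const std::vector<SampleInfo>& samples, std::int64_t t) {
    assert(!samples.empty() && samples[0].keyframe && samples[0].time <= t);
    std::size_t first = 0, last = 0;
    for (std::size_t i = 0; i < samples.size() && samples[i].time <= t; ++i) {
        if (samples[i].keyframe) first = i;
        last = i;
    }
    return {first, last};
}
```

With samples at times 0 (I), 40 (P), 80 (P), 120 (I), 160 (P), a request for t = 90 selects indices 0 to 2, and a request for t = 160 selects indices 3 to 4. Streams with B frames would need a different selection rule.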

Attached to this thread, there is a minimal Visual Studio project that precisely shows all the failures encountered:

-the external API is Windows Media Foundation, along with a valid H264-encoded sample media file

-using MFXVideoDECODE_DecodeHeader() fails at providing any usable mfxVideoParam despite many attempts to give it valid information

-using MFXVideoDECODE_Reset() fails even with an mfxVideoParam validated by a successful MFXVideoDECODE_Init()

-MFXVideoDECODE_DecodeFrameAsync() fails at delivering a surface even for the very first I frame of the media, despite many attempts to have a zero-latency-complete-frame submission

You just have to run my sample code and see that it does not work as it should.

The only answer I have had so far on this thread was "AVC is not supported", which is wrong for many reasons (the documentation claims it should work, and when cheating to force some MFXVideoDECODE_DecodeFrameAsync() output, we can get valid frames).

AlekhyaV_Intel · ‎04-12-2022

Hi Pierre,

Thank you for sharing your sample reproducer. Could you please let us know your system details (OS, hardware, Visual Studio version, etc.) and your oneVPL version so that we can try to reproduce your issue?

Regards,

Alekhya

chatelier__pierre · ‎04-12-2022

Sure,

Windows 10 64-bit

Visual Studio 2019 (16.11.11)

Latest oneAPI VPL to this date (2022.0.1 from w_BaseKit_p_2022.1.3.210)

Ying_Guo_VPL · ‎04-15-2022

Hi Pierre, Could you provide the video file you used for decoding so that I can try to reproduce? Thanks...ying

chatelier__pierre · ‎05-01-2022

Hello,

Any news on the subject?

Even without a solution for the moment, do you acknowledge a bug on your side, or a misuse on mine?

chatelier__pierre · ‎04-16-2022

You are not trying very hard...
Not only is the video file attached to the first post of the thread, but it is also part of the Visual Studio project.

AlekhyaV_Intel · ‎05-04-2022

Hi Pierre,

We apologize for the delay. We are checking on this with the team internally. We will get back to you with an update as soon as possible.

Regards,

Alekhya

chatelier__pierre · ‎06-01-2022

Hello,

It's been a while. Are you still working on it?

Pamela_H_Intel · ‎06-02-2022

Yes, Pierre, sorry for the delay. We are discussing the best path forward. Your test scenario is valid. You are doing nothing wrong. There are a couple of things we are looking at modifying so that you get what you need. We will get back to you.

roy5 · ‎07-29-2022

Hello,

Thanks for reaching out to us. Here is my response to the question:

The oneVPL and Media SDK decoders work with elementary streams, e.g. h264, h265, jpeg. AVI is a container format. The AVI input must be demuxed from its container to an h264 elementary stream before feeding it to Media SDK or oneVPL. So, can you try to demux it and then use the generated h264 stream as input to the sample_decode provided with oneVPL?
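When the demuxing is done by another API rather than by a file-based tool, one thing to verify is the layout of the samples it returns. H.264 carried in containers is often stored as length-prefixed NAL units, whereas an elementary stream uses Annex B start codes. The sketch below converts one layout to the other. It assumes 4-byte big-endian length prefixes (the actual prefix size is declared in the stream's configuration record), and whether the samples in this thread are stored that way is not established here. The function name is hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Converts one sample laid out as [4-byte big-endian length][NAL unit]...
// into Annex B form, where each NAL unit is preceded by 00 00 00 01.
// Throws std::runtime_error if the input is truncated.
inline std::vector<std::uint8_t>
length_prefixed_to_annexb(const std::vector<std::uint8_t>& in) {
    static const std::uint8_t kStartCode[4] = {0, 0, 0, 1};
    std::vector<std::uint8_t> out;
    out.reserve(in.size());
    std::size_t pos = 0;
    while (pos < in.size()) {
        if (in.size() - pos < 4)
            throw std::runtime_error("truncated length prefix");
        const std::size_t len = (std::size_t(in[pos]) << 24) |
                                (std::size_t(in[pos + 1]) << 16) |
                                (std::size_t(in[pos + 2]) << 8) |
                                 std::size_t(in[pos + 3]);
        pos += 4;
        if (len > in.size() - pos)
            throw std::runtime_error("NAL unit exceeds sample size");
        const auto first = in.begin() + static_cast<std::ptrdiff_t>(pos);
        const auto last = first + static_cast<std::ptrdiff_t>(len);
        out.insert(out.end(), kStartCode, kStartCode + 4);
        out.insert(out.end(), first, last);  // NAL payload, unchanged
        pos += len;
    }
    return out;
}
```

The SPS and PPS would also need to reach the decoder in Annex B form before the first frame. That step is not shown here.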

Thanks,

Rupak

chatelier__pierre · ‎07-30-2022

But what is the purpose of such a test? The bug is already confirmed in your previous answers.
If you ask me to change the way I submit samples, you are just dodging the initial problem.

I want to submit input to oneVPL after using Media Foundation to decode/demux/you-name-it the original media file.

It used to work with Intel Media SDK; it does not work any more with oneVPL. The input is not the culprit.

Moreover, since I provided you with all the data, you can demux it yourself to investigate, if you know what you are looking for!

Pamela_H_Intel · ‎08-04-2022

Pierre,

Sorry that your path to get to a solution has been so arduous. I know that is terribly frustrating. I think we've solved the issue, but I also think there has been some confusion around your sample which is in avi format.

On 2.June I said:

Your test scenario is valid. You are doing nothing wrong. There are a couple of things we are looking at modifying so that you get what you need. We will get back to you.

Then you provided your sample file, and we did some testing to find out why your scenario was not working as you had expected.

What we found is that you are using a sample file in AVI format (note, this is AVI, not AV1; I think there was some confusion around that). We do not support AVI. AVI is like MP4. It's not a codec. It's a containerized sample. We don't directly support it. You need to un-containerize it.

As far as I can see, we've not ever supported container formats (MPEG/MP4/AVI) directly - I checked all the way from MediaSDK 2014 (section 3.1) to MediaSDK 1.35. These docs explain how you can create the elementary streams that are required by MediaSDK and oneVPL from container format streams like you have.

All that said, just because we don't support something, doesn't mean it won't work. It means we don't promise that it will work. So maybe something just happened to work for you in the past, because the variables happened to align?

Does that make sense?

Pamela

chatelier__pierre · ‎08-04-2022

>Does that make sense?

Nope, not at all.

The fact that it is an AVI is not part of the problem, and you are still mentioning it!

At first, the initial question regarding MFXVideoDECODE_DecodeHeader() was indeed related to the format, but it was quickly replaced by the problem of decoding raw samples *right out of the Media Foundation API*.

Please read the post again and check the sample code again:

I use the Media Foundation API to open the AVI and query raw H264 samples. The container, whatever it is (AVI here) is entirely handled by Media Foundation.

Then I submit the raw H264 samples to the oneAPI; there is no container involved any more.

My bug report shows that the oneAPI is broken regarding MFXVideoDECODE_DecodeFrameAsync() events, not providing expected bitmaps when I, P frames hit a sync point.

Please re-read the sentence "If I cheat and keep adding the next samples to the bitstream, I can eventually get a surface to decode, but it is usually corrupted (most pixels are good, many are wrong)." Among the many other pieces of information I gave you, it shows that the data flow and the API usage are globally correct: there can be some good data. So the workflow is not "impossible because of AVI", otherwise it would not work at all.

If you are still mentioning the AVI format, then you have basically made zero progress on this bug report in 4 months.

Pamela_H_Intel · ‎08-07-2022

Pierre,

Thank you for bringing this to our attention. And thank you for your persistence in making us understand your issue with the oneVPL CPU implementation.

The oneVPL GPU implementation should work the same as MediaSDK.

But the oneVPL CPU implementation is not on par with the GPU implementation. The dev team has thanked both you and us for persisting and will use this documentation as reference for how we should move forward.

I have created a Jira where I reference this discussion so that the developers fully understand what customers are attempting to do, and what is unclear.

Again, thank you for your feedback.

Pamela