decoder bugs

madshi_net
Beginner
Using MSDK 3.0 beta 2 I've run into the following two bugs:

(1) The "DecodeHeader" API returns MFX_ERR_MORE_DATA when being called with the following h264 header sequence:

00 00 00 01 67 64 00 33 AC 34 E6 02 C0 49 FB 84 00 00 0F A0 00 03 0D 42 3C 60 C6 68
00 00 00 01 68 EE BC B0

However, if I just add an AUD (00 00 00 01 09 10) to the end of that data, "DecodeHeader" is happy to do its job. Your own "MP4/AVC Decode using the Intel Media SDK" article/demo stumbles over the same problem. In that demo DecodeHeader always fails.

(2) After seeking in a VC-1 video stream, the very first frame output by your (software) decoder often contains block artifacts. The blocks are usually in areas where there's strong motion. The very next frame is clean. It's only the first frame which shows these artifacts. And it doesn't always occur, but very often. It occurs with and without "injected headers". If you need a sample to reproduce this problem, just let me know. I've seen the same problem with 2 very different files, though (one SD interlaced, the other Blu-Ray 24p), so I'm pretty sure it will happen with all files.

Here's a screenshot of the problem:

screenshot

IDZ_A_Intel
Employee
Hi,

We have received your questions and are looking into them. For your second question on VC1 video seek, could you provide the sample you mentioned? It might also help to know a little more about how the file was created (if there were any additional steps on your part) and which players display the problem. Also, do you see the same artifacts if you decode straight to a YUV?

Thanks,

Jeff

madshi_net
Beginner
Hi Jeff,

thanks for the quick reply. The Intel Media SDK support is second to none!! :)

To answer your questions:

(1) The file is an interlaced SD m2ts, directly taken from a Blu-Ray. Of course I've verified that the problem only occurs with the Intel decoder. It does not occur with the Microsoft VC-1 decoder, so the file doesn't seem to be damaged.

(2) Here's a sample. I've used a hex editor to cut out a section of the file (of course I've left the 192-byte m2ts container structure intact). Please let me know when you've downloaded the sample, so I can remove it from my server again. Thanks:

http://madshi.net/intelvc1.m2ts

(3) The problem occurs with MPC-HC, PotPlayer and also with GraphEdit. So it's not player related.

(4) Most DirectShow splitters don't handle interlaced VC-1 streams in a way that the Intel decoder likes. As a result, if headers are injected after a seek, heavy image corruption can occur. I've worked around this issue by remuxing the m2ts file into an MKV file ("eac3to intelvc1.m2ts 1: intelvc1.mkv") and then splitting the MKV file with the Haali MKV Splitter. This way the Intel decoder is happy with the splitter output and injecting the headers no longer has any negative side effects.

(5) I've been testing with my own VC-1 DirectShow decoder using the Media SDK. But the same problems also occur with the official Intel VC-1 DirectShow filter which is installed by the Media SDK. In GraphEdit: "Haali Media Splitter (intelvc1.mkv) -> Official Media SDK Intel VC-1 Decoder -> Video Renderer".

Of course you can also use a different splitter, but I've found that using anything other than described above increases the problems instead of reducing them. The Intel VC-1 decoder seems to be very picky about which bitstream elements are grouped together for one "Decode" call. Which is ok, I guess, it just makes testing more complicated.

To reproduce the problem, I'd suggest remuxing the m2ts file to MKV, then using GraphEdit with "Haali Media Splitter (mkv) -> Intel VC-1 DirectShow decoder -> Video Renderer". Start playback, pause it, then seek around while paused. Sometimes you'll get a blocky picture, sometimes not.

Edit:

> do you see the same artifacts if you decode straight to a YUV?

I've tested with 2 different video renderers, with NV12 output from the Intel decoder. The problem occurred with both renderers.

IDZ_A_Intel
Employee
I've got the file. Thanks to your detailed response we will have plenty to work with to reproduce the problem here.

madshi_net
Beginner
Hmmmm... Just tested: The problem still occurs even if I completely destroy and recreate the decoder. My current best guess is that the decoder maybe doesn't like getting a non-key frame as the first frame to decode. Although I would guess that after a seek the splitter would send a key frame first. So I'm not really sure what's going on. Well, I'll leave this to you now. If there's anything I can help with, just let me know.

IDZ_A_Intel
Employee
My apologies for the delay.

In response to (1) The "DecodeHeader" API returns MFX_ERR_MORE_DATA:


In this case DecodeHeader appears to be working as intended. MFX_ERR_MORE_DATA indicates that DecodeHeader needs more data to proceed. The parser does not have enough information to know that all of the bytes of the NAL unit at the end of the bitstream are present, because there is no start code for the next NAL unit yet. (As you know, an AUD is one of many sequences that can indicate the end of one NAL unit and the beginning of another; it is not required.) After reading more data, the next call to DecodeHeader should have enough information to proceed.

Though classified as an error, MFX_ERR_MORE_DATA might be better described as a status message indicating that something needs to be done in the program (i.e. read more data) before continuing.
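To illustrate the ambiguity described here, the following is a minimal sketch of a start-code scanner. It is not the SDK's actual parser, and `NalSpan` and `completeNalUnits` are illustrative names, not Media SDK API. It only reports a NAL unit as complete once a following start code proves where the unit ends, which is why the last unit in a buffer stays "open".

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Byte range [begin, end) of one NAL unit payload, excluding its start code.
struct NalSpan {
    std::size_t begin;
    std::size_t end;
};

// Returns only the NAL units whose end is proven by a following start code.
// The final unit in the buffer is never returned, because a streaming parser
// cannot tell whether more of its bytes are still to come.
std::vector<NalSpan> completeNalUnits(const std::vector<std::uint8_t>& buf) {
    std::vector<std::size_t> codeStarts;     // offset of each start code
    std::vector<std::size_t> payloadStarts;  // offset just past each start code
    for (std::size_t i = 0; i + 3 <= buf.size(); ++i) {
        if (buf[i] == 0 && buf[i + 1] == 0 && buf[i + 2] == 1) {
            // Treat a preceding zero byte as part of a 4-byte start code.
            codeStarts.push_back((i > 0 && buf[i - 1] == 0) ? i - 1 : i);
            payloadStarts.push_back(i + 3);
            i += 2;  // skip the rest of this start code
        }
    }
    std::vector<NalSpan> units;
    for (std::size_t k = 0; k + 1 < payloadStarts.size(); ++k) {
        units.push_back(NalSpan{payloadStarts[k], codeStarts[k + 1]});
    }
    return units;
}
```

Fed the SPS and PPS bytes from the original post, this sketch reports only the SPS as complete. With the AUD (00 00 00 01 09 10) appended, the PPS becomes complete as well, which matches the behaviour reported for DecodeHeader.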

We will continue expanding and improving the SDK documentation to make it as clear as possible. Your feedback is appreciated and we will keep it in mind for future releases.

For this answer I assumed that your question was about general decode behavior. If there are concerns with not parsing a specific file correctly we will be glad to look into this further.

Regarding (2) After seeking in a VC-1 video stream, the very first frame output by your (software) decoder often contains block artifacts:


We've reproduced the block artifacts and determined that there are some things for us to fix to be compatible with the Haali Media Splitter and other possible VC-1 sources. Updates to the VC-1 decoder filter may be available in a future release, though a fix may not be prioritized immediately. In the meantime, the WM ASF reader was tested more thoroughly as a source for the VC-1 decoder filter sample and may provide better results. As a reminder, these filters are intended as samples and not as production-ready components.

Here is a workaround if you would like to try updating the VC-1 decoder filter yourself: comment out the copy of m_pVC1SeqHeader into the data buffer passed to m_pDecoder->RunDecode in CVC1DecVideoFilter::Receive (vc1_dec_filter.cpp). In my tests this ended up breaking compatibility with the WM ASF reader but enabling the Haali Media Splitter pipeline (intelvc1.mkv -> Haali Media Splitter -> Intel Media SDK VC-1 Decoder -> EVR) to work with playback from the beginning as well as arbitrary seeks.



I hope this helps. Please let us know if you have any more questions.

Regards,

Jeff

madshi_net
Beginner
Thanks for your reply!

DecodeHeader: In DirectShow the sequence headers are stored as part of the media type information structure. A DirectShow decoder filter has to decide whether to accept or decline a pin connection request, based only on the media type information. The DirectShow filter doesn't have access to the full video bitstream at this point in time. Which means that the DS filter has no way to provide MORE_DATA during pin connection negotiation. If you don't provide a way for DecodeHeader to work with just the sequence headers (as they are stored in the media type information) then every DirectShow filter using the Media SDK will have to either hack around the problem by adding an AUD (and this won't work for MPEG2 and VC-1), or implement its own video bitstream parsing code.

VC-1 decoding/seeking: There seems to be a misunderstanding here. I've reported two different problems:

(1) With some DirectShow splitters, injecting VC-1 headers results in heavy image corruption. This is IMHO a fault of the DirectShow splitters.
(2) After a seek, the Media SDK's VC-1 decoder sometimes produces block artifacts in the first decoded frame.

The bug I'm concerned with is (2) and not (1). If you re-read my earlier comments, I've already explained how to work around problem (1). You're right that commenting out "m_pVC1SeqHeader" fixes (1). But so does remuxing the m2ts file to MKV by using eac3to, as I suggested earlier.

Even after the "m_pVC1SeqHeader" fix, problem (2) still occurs. It is less obvious, though, and doesn't occur with every seek. Problem (2) has nothing to do with your DirectShow VC-1 decoder sample or with the way VC-1 headers are injected. It is a problem with the Media SDK's VC-1 decoder itself.

IDZ_A_Intel
Employee

Thanks for your clarification about where the current behavior of DecodeHeader is problematic. We will look deeper into the pin negotiation scenario you've described.

Yes, there are multiple issues with VC1 decode seeking. We are currently discussing ways to make the filters more general as well as ways to improve our stream reposition testing of the decoders themselves.

Since the DirectShow filters are samples we only expose them to limited testing. In the case of VC-1 our development has only focused on one pipeline:

VC1 in WMV container -> (WM ASF Reader) -> (MSDK VC-1 Decoder) -> (Enhanced Video Renderer)

The filter samples are not intended to be production-ready general-purpose solutions, though we are interested in making them better.

Leaving aside the seek issues in the VC1 decoder itself (which are being investigated separately), are these adequate summaries of the problems you've described so far?

1) You need a way for DecodeHeader to work with meaningful error codes in an environment such as pin negotiation, where the rest of the stream is not available yet.

2) The header injection model currently used by the MSDK VC-1 decode filter (likely an adaptation to the WM ASF Reader) does not work well with other sources.

3) There may be some additional parsing issues, where we require a bitstream element order not required by the spec. Just to clarify -- have you seen this only with VC-1 or with Mpeg2 and H.264 as well?

Is there anything else?

We will probably have more questions as we investigate further. Thanks for your help with identifying these issues.

Jeff

madshi_net
Beginner
> are these adequate summaries of the problems
> you've described so far?

1) and 2): yes.

Personally, I can work around the issues 1) and 2), though. For me the most crucial problem is 3), because I don't know how to work around it.

> 3) There may be some additional parsing issues
> where we require a bitstream element order not
> required by the spec.

I think 3) is an additional issue, but I'm not sure what the cause is. It's not really a seeking issue, I think. I've tried to completely destroy and recreate all MSDK stuff with every seek and the problem still occurs. So I think the real issue is that sometimes the very first decoded frame shows blocks. Not sure why.

> Just to clarify -- have you seen this only with
> VC-1 or with Mpeg2 and H.264 as well?

I've not seen it with Mpeg2. I have seen a similar issue with H.264, though. It could actually be exactly the same issue. It's also always the first frame which shows blocks; the very next frame is always perfectly clean.

> Is there anything else?

There might be an MPEG2 problem, totally separate from the issues mentioned so far, but I've not yet analyzed it in more detail. Basically at the start of one MPEG2 sample I'm getting visible artifacts with the Intel software decoder, but not with libav/ffmpeg. But let me double check this before "officially" reporting it as a bug.

> Thanks for your help with identifying these issues.

Thanks for your support - I appreciate it!!

Best regards, Mathias.

madshi_net
Beginner
Ok, I've double checked the MPEG2 issue. Here's a sample and two screenshots:

sample: http://madshi.net/intelMpeg2.mkv
screenshot first frame Intel decoder: http://madshi.net/intel.jpg
screenshot first frame libav decoder: http://madshi.net/libav.jpg

I have to say, though, that the Microsoft MPEG2 decoder shows the same artifacts as yours does. So it's possible that there's some kind of problem with the MPEG2 bitstream itself. However, libav/ffmpeg handles this very nicely, with no visible artifacts. I'm not sure if this is worth investigating on your side. I'll leave that up to you.

madshi_net
Beginner
Sorry for triple posting, but I think I found a good way for you to reproduce the VC-1 problem I'm most concerned with. I've uploaded a new sample for you here:

http://madshi.net/intelvc1.mkv

This MKV seems to start with a key frame. When using the MSDK software decoder, the first 5 decoded frames are these:

frame 1: http://madshi.net/intel1.jpg
frame 2: http://madshi.net/intel2.jpg
frame 3: http://madshi.net/intel3.jpg
frame 4: http://madshi.net/intel4.jpg
frame 5: http://madshi.net/intel5.jpg

As you can see, the first two frames come out garbled. Now the interesting bit: If I playback this video with the Microsoft VC-1 decoder, the first 3 frames are these:

frame 1: http://madshi.net/ms1.jpg == intel3.jpg
frame 2: http://madshi.net/ms2.jpg == intel4.jpg
frame 3: http://madshi.net/ms3.jpg == intel5.jpg

Hope this makes it easier for you to pinpoint the cause of the problem.

Nina_K_Intel
Employee
Hi Mathias,

Thanks for such accurate bug reports and help with reproduction. Jeff and I will look into this.

Regards,
Nina

madshi_net
Beginner
I've some new information. I've been working on implementing the libav/ffmpeg decoders into my DirectShow renderer, and I've learned a few things that might be helpful for you, too:

(1) In DirectShow, splitters often deliver a couple of "preroll" samples with a negative timestamp before the real seek point sample is delivered with a positive timestamp. The preroll samples are meant to be decoded (e.g. in order to initialize the decoder properly), but they're usually not displayed. Your DirectShow filter samples in the MSDK have this code in frame_constructors.cpp:

rtStart = (rtStart < 0) ? 0 : rtStart;

Basically, samples with a negative timestamp are converted to a 0 timestamp. That's a bad idea, because it effectively converts all preroll samples to real samples. So the preroll samples will probably be displayed by the video renderer, even though they shouldn't be. I would suggest simply removing the code quoted above. Works fine for me.

(2) The libav decoder also sometimes outputs a first corrupted frame, followed by proper frames, similar to what the MSDK decoder does. The first corrupted frame might even have a positive timestamp. However, the libav decoder marks frames as being key frames or not. I've found that if the first decoded frame is corrupted, it's usually not a key frame. So I've simply added code to my renderer to start displaying frames only with the first decoded key frame which has a positive timestamp. This seems to work very well. Unfortunately I don't see any way to implement a similar solution with the MSDK, because I don't see how I could know whether a decoded frame is a key frame or not. So my suggestion is this: at the start of decoding and after a reset/seek, either let the MSDK silently drop all decoded frames until the first key frame with a positive timestamp is decoded, or alternatively offer a way for MSDK users to find out whether a decoded frame is a key frame or not. E.g. you could add a "bool key_frame" to the mfxFrameInfo structure. Doing either of these might already fix the VC-1 problem "3)" we talked about earlier. Not sure, though. The VC-1 problem might still be something different.
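The two points above can be sketched together as a small "display gate". This is only an illustration under stated assumptions: `DecodedFrame`, `DisplayGate` and the `keyFrame` flag are hypothetical names (the thread notes that the MSDK does not report key frames), and timestamps are assumed to be DirectShow reference times that are passed through unclamped, so preroll samples stay negative.

```cpp
#include <cstdint>

using ReferenceTime = std::int64_t;  // DirectShow time in 100 ns units

struct DecodedFrame {
    ReferenceTime start;  // left negative for preroll, not clamped to 0
    bool keyFrame;        // hypothetical flag, e.g. as reported by libav
};

// Decides which decoded frames may be shown after a start or a seek.
class DisplayGate {
public:
    // Call at the start of decoding and after every reset/seek.
    void reset() { open_ = false; }

    // Frames are always decoded; this only says whether to display them.
    // The gate opens at the first key frame with a non-negative timestamp.
    bool shouldDisplay(const DecodedFrame& frame) {
        if (!open_ && frame.keyFrame && frame.start >= 0) {
            open_ = true;
        }
        return open_;
    }

private:
    bool open_ = false;
};
```

A renderer would call `reset()` on each seek and skip presentation of every frame for which `shouldDisplay` returns false. The sketch treats a timestamp of exactly 0 as displayable, which is a choice made here and not something stated in the thread.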

Nina_K_Intel
Employee
Hi Mathias,

Many thanks for sharing these additional results! I will check your suggestions and follow up.

Nina

Nina_K_Intel
Employee
Hi Mathias,

Sorry for the delayed answer. It seems we missed your first question behind the reposition discussion, the one about DecodeHeader. I have a suggestion for you: if you are sure that you are providing a full frame or a full header in the input bitstream, you may set the flag DataFlag = MFX_BITSTREAM_COMPLETE_FRAME. Then the decoder will not ask for more data. Otherwise it needs the next start code to understand that the header is finished.
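As a rough sketch of this suggestion, the snippet below fills a bitstream descriptor for header bytes taken from a DirectShow media type. To stay self-contained it uses a stand-in struct (`BitstreamStandIn`, a hypothetical name) that mirrors only the relevant mfxBitstream fields. Real code would include the Media SDK headers, use mfxBitstream and MFX_BITSTREAM_COMPLETE_FRAME directly, and pass the result to DecodeHeader. The numeric flag value below is an assumption about the SDK headers and should be checked against them.

```cpp
#include <cstdint>
#include <vector>

// Stand-in for the SDK's mfxBitstream, reduced to the fields used here.
struct BitstreamStandIn {
    std::uint8_t* Data = nullptr;
    std::uint32_t DataOffset = 0;
    std::uint32_t DataLength = 0;
    std::uint32_t MaxLength = 0;
    std::uint16_t DataFlag = 0;
};

// Assumed to match MFX_BITSTREAM_COMPLETE_FRAME; verify against the SDK headers.
constexpr std::uint16_t kCompleteFrame = 0x0001;

// Wraps header bytes (e.g. SPS/PPS from the media type) and marks them as
// complete, so the parser need not wait for a following start code.
// The descriptor only points at headerBytes; the vector must outlive it.
BitstreamStandIn wrapCompleteHeader(std::vector<std::uint8_t>& headerBytes) {
    BitstreamStandIn bs;
    bs.Data = headerBytes.data();
    bs.DataOffset = 0;
    bs.DataLength = static_cast<std::uint32_t>(headerBytes.size());
    bs.MaxLength = bs.DataLength;
    bs.DataFlag = kCompleteFrame;
    return bs;
}
```

This would let a filter call DecodeHeader during pin connection with just the sequence headers, without appending an artificial AUD.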

We will also do our best to include some fixes for the repositioning into our next release.

Thank you,
Nina

madshi_net
Beginner
Thanks Nina,

"DataFlag = MFX_BITSTREAM_COMPLETE_FRAME" sounds like a good solution to me!

Looking forward to the fixes you're mentioning. I'm not in a hurry, though, so no problem if it takes some time.

FWIW, I've one more piece of information to share: while implementing the libav/ffmpeg MPEG2 decoder, I've found that some transport streams have pretty bad (swapped) timestamps. As a result, both the libav/ffmpeg and MSDK decoders output swapped timestamps too, which makes playback non-smooth. If you are interested in this problem, I can provide you with a sample.

Looking through already existing open source MPEG2 decoder implementations, it seems that the usual way to work around this issue is to use only the I-frame timestamps and to ignore (= interpolate) all other timestamps. That's the approach used by ffdshow, at least. For that to work it might be useful for the MSDK user to be able to find out which decoded frame is an I-frame and which is not. Or alternatively, you could probably also do the timestamp dropping & interpolation inside of MSDK, if you prefer that. Or maybe MSDK users could already solve this right now by feeding only I-frame timestamps to MSDK and feeding e.g. "-1" for P-frame and B-frame timestamps to MSDK? Would the MSDK then already do the interpolation for the missing timestamps? Haven't tried that yet...
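The "use only I-frame timestamps and interpolate the rest" idea can be sketched as follows. This is not ffdshow's code, just a minimal illustration under assumptions: frames arrive in presentation (output) order, the frame duration is constant and known, and an `isIFrame` flag is available per decoded frame (hypothetical, since the MSDK does not expose it according to this thread). `OutFrame` and `retime` are illustrative names.

```cpp
#include <cstdint>
#include <vector>

struct OutFrame {
    bool isIFrame;              // hypothetical per-frame flag
    std::int64_t containerPts;  // timestamp as delivered, possibly swapped
};

// Returns one timestamp per frame: I-frames keep their container timestamp,
// all other frames get the previous timestamp plus one frame duration.
std::vector<std::int64_t> retime(const std::vector<OutFrame>& frames,
                                 std::int64_t frameDuration) {
    std::vector<std::int64_t> out;
    out.reserve(frames.size());
    bool anchored = false;  // becomes true once the first I-frame is seen
    std::int64_t next = 0;
    for (const OutFrame& f : frames) {
        if (f.isIFrame) {
            next = f.containerPts;  // trust I-frame timestamps only
            anchored = true;
        } else if (!anchored) {
            next = f.containerPts;  // nothing to interpolate from yet
        }
        out.push_back(next);
        next += frameDuration;
    }
    return out;
}
```

For an input such as I@0, B@800000, B@400000, P@1200000 with a duration of 400000, the two swapped B-frame timestamps come out as 400000 and 800000.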

Best regards, Mathias.

Markus_Pingel
New Contributor I
Why should the MSDK mess with the time stamps? Normally, a decoder core does not need to care about external timestamps; it just decodes the frames. You can decode an H264 elementary stream with the MSDK, and there is obviously no timestamp info from a container header.
I think the MSDK just passes the time stamps you set.

madshi_net
Beginner
@Markus, there's a difference between decoding order and presentation order. Video frames are usually not decoded in presentation order, but in decoding order. You feed MSDK (or libav/ffmpeg) with the bitstream in decoding order, but with presentation timestamps. Because decoding and presentation order differ, the presentation timestamps you feed MSDK/libav/ffmpeg with are not continuous. However, you need to feed the video renderer with frames in presentation order and with continuous presentation timestamps. The frame reordering is done by MSDK/libav/ffmpeg, including reordering of timestamps. And this already works just fine in most cases, just not with some MPEG2 transport streams, which sometimes have screwed up container timestamps.

Markus_Pingel
New Contributor I
I know the difference between decoding and presentation order, but that order is not defined by the container timestamp, but by the GOP structure. The MSDK decoder, like most decoders, delivers the frames in presentation order as it is derived from the elementary stream.

If you don't believe it, just set the timestamp you pass to DecodeFrameAsync in mfxBitstream to 0 (or any other value) for all frames. And by some miracle from Intel, they come out in presentation order and mfxFrameSurface1.Data.TimeStamp has the same value for all frames.

So again, I would rather not have the MSDK mess with the TimeStamp.

madshi_net
Beginner
Personally, I don't care much if the MSDK "messes" with the timestamps or not. All I care about is that there must be *some* solution for me to achieve smooth playback, even with broken transport streams. FWIW, in order to get continuous timestamps from the MSDK, I need to feed the MSDK with presentation timestamps for MPEG2 and h264, but with decoding timestamps for VC-1. If I feed the MSDK VC-1 decoder with presentation timestamps, I get swapped timestamps back from MSDK output. This behaviour matches the Microsoft VC-1 decoder, though, so it's probably "good" this way. Anyway, it seems that the MSDK already does some timestamp processing, or "messing", as you call it. But as I said, all I care about is that there should be some way for me to achieve smooth playback. With libav/ffmpeg that's possible because libav/ffmpeg reports which decoded frame is an I-frame, so I can post-process the timestamps accordingly. This doesn't seem to be possible with the MSDK at the moment.

Markus_Pingel
New Contributor I
If the container timestamps are wrong, but the stream does not have gaps, the safest way is to calculate the presentation timestamps yourself by counting the decoded frames and calculating the render time from the frame number and the frame rate. You have to reset the frame count, and possibly a start time offset, on the appropriate DirectShow calls.
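A minimal sketch of this frame-counting approach, assuming DirectShow reference times (100 ns units) and a constant frame rate given as a rational number. `FrameClock` is an illustrative name, and which DirectShow callbacks should trigger `reset` is left to the filter.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::int64_t kUnitsPerSecond = 10000000;  // 100 ns units per second

// Generates presentation timestamps from a frame counter instead of
// trusting container timestamps.
class FrameClock {
public:
    // Frame rate as a fraction, e.g. 24000/1001 for 23.976 fps.
    FrameClock(std::int64_t fpsNum, std::int64_t fpsDen)
        : fpsNum_(fpsNum), fpsDen_(fpsDen) {
        assert(fpsNum > 0 && fpsDen > 0);
    }

    // Call after a seek or a new segment: restart counting at startTime.
    void reset(std::int64_t startTime) {
        start_ = startTime;
        count_ = 0;
    }

    // Timestamp for the next decoded frame. It is computed from the frame
    // count each time, so rounding errors do not accumulate.
    std::int64_t next() {
        const std::int64_t t =
            start_ + count_ * kUnitsPerSecond * fpsDen_ / fpsNum_;
        ++count_;
        return t;
    }

private:
    std::int64_t fpsNum_;
    std::int64_t fpsDen_;
    std::int64_t start_ = 0;
    std::int64_t count_ = 0;
};
```

Computing each timestamp from the count, and not by repeatedly adding a rounded frame duration, keeps long 23.976 fps streams from drifting.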