(1) The "DecodeHeader" API returns MFX_ERR_MORE_DATA when being called with the following h264 header sequence:
00 00 00 01 67 64 00 33 AC 34 E6 02 C0 49 FB 84 00 00 0F A0 00 03 0D 42 3C 60 C6 68
00 00 00 01 68 EE BC B0
However, if I just add an AUD (00 00 00 01 09 10) to the end of that data, "DecodeHeader" is happy to do its job. Your own "MP4/AVC Decode using the Intel Media SDK" article/demo stumbles over the same problem. In that demo DecodeHeader always fails.
(2) After seeking in a VC-1 video stream, the very first frame output by your (software) decoder often contains block artifacts. The blocks are usually in areas where there's strong motion. The very next frame is clean. It's only the first frame which shows these artifacts. And it doesn't always occur, but very often. It occurs with and without "injected headers". If you need a sample to reproduce this problem, just let me know. I've seen the same problem with 2 very different files, though (one SD interlaced, the other Blu-Ray 24p), so I'm pretty sure it will happen with all files.
Here's a screenshot of the problem:
We have recieved your questions and are looking into them. For your second question on VC1 video seek, could you provide the sample you mentioned? It might also help to know a little more about howthe file was created(if there were any additional steps on your part) andwhich players display the problem. Also, do you see the same artifacts if you decode straight to a YUV?
thanks for the quick reply. The Intel Media SDK support is second to none!! :)
To answer your questions:
(1) The file is an interlaced SD m2ts, directly taken from a Blu-Ray. Of course I've verified that the problem only occurs with the Intel decoder. It does not occur with the Microsoft VC-1 decoder, so the file doesn't seem to be damaged.
(2) Here's a sample. I've used a hexeditor to cut out a section of the file (of course I've left the 192 byte m2ts container structure intact). Please let me know when you've downloaded the sample, so I can remove it from my server again. Thanks:
(3) The problem occurs with MPC-HC, PotPlayer and also with GraphEdit. So it's not player related.
(4) Most DirectShow splitters don't handle interlaced VC-1 streams in a way that the Intel decoder likes. As a result, if headers are injected after a seek, heavy image corruption can occur. I've worked around this issue by remuxing the m2ts file into an MKV file ("eac3to intelvc1.m2ts 1: intelvc1.mkv") and then splitting the MKV file with the Haali MKV Splitter. This way the Intel decoder is happy with the splitter output and injecting the headers doesn't have any negative side effects, anymore.
(5) I've been testing with my own VC-1 DirectShow decoder using the Media SDK. But the same problems also occur with the official Intel VC-1 DirectShow filter which is installed by the Media SDK. In GraphEdit: "Haali Media Splitter (intelvc1.mkv) -> Official Media SDK Intel VC-1 Decoder -> Video Renderer".
Of course you can also use a different splitter, but I've found that using anything other than described above increases the problems instead of reducing them. The Intel VC-1 decoder seems to be very picky about which bitstream elements are grouped together for one "Decode" call. Which is ok, I guess, it just makes testing more complicated.
In order to reproduce the problem, I'd suggest to remux the m2ts file to MKV, then use GraphEdit, with "Haali Media Splitter (mkv) -> Intel VC-1 DirectShow decoder -> Video Renderer". Start playback, then pause playback, then in paused state seek around. Sometimes you'll get a blocky picture, sometimes not.
> do you see the same artifacts if you decode straight to a YUV?
I've tested with 2 different video renderers, with NV12 output from the Intel decoder. The problem occurred with both renderers.
In response to (1) The "DecodeHeader" API returns MFX_ERR_MORE_DATA:
In this case DecodeHeaderappears to beworking as intended. MFX_ERR_MORE_DATA indicates that DecodeHeader needs more data to proceed. The parser does not have enough information to know that all of the bytes of the NAL unit at the end of the bitstream are present because there is no start code for the next NAL unit yet. (As you know, an AUD is one of many sequences that can indicate the end of one NAL unit and the beginning of another it is not required.) After reading more data the next call to DecodeHeader shouldhave enough information to proceed.
Though classified as an error, MFX_ERR_MORE_DATA might be better described as a status message indicating that something needs to be done in the program (i.e. read more data) before continuing.
We will to continue expanding and improving the SDK documentation to make it as clear as possible. Your feedback is appreciated and we will keep it in mind for future releases.
For this answer I assumed that your question was about general decode behavior. If there are concerns with not parsing a specific file correctly we will be glad to look into this further.
Regarding (2) After seeking in a VC-1 video stream, the very first frame output by your (software) decoder often contains block artifacts:
Weve reproduced the block artifacts and determined that there are some things for us to fix to be compatible with the Haali Media Splitter and other possible VC-1 sources. Updates to the VC-1 decoder filter may be available in a future release, though a fix may not be prioritized immediately. In the meantime, the WM ASF reader was tested more thoroughly as a source for the VC-1 decoder filter sample and may provide better results. As a reminder, these filters are intended as samples and not as production-ready components.
Here is a workaround if you would like to try updating the VC-1 decoder filter yourself: Comment out the copy of m_pVC1SeqHeader into the data buffer passed to m_pDecoder->RunDecode in CVC1DecVideoFilter::Receive (vc1_dec_filter.cpp). In my tests this ended up breaking compatibility with the WM ASF reader but enablingthe Haali Media Splitter pipeline(intelvc1.mkv->Haali Media Splitter->Intel Media SDK VC-1 Decoder->EVR) to work with playback from the beginning as well as arbitrary seeks.
I hope this helps. Please let us know if you have any more questions.
DecodeHeader: In DirectShow the sequence headers are stored as part of the media type information structure. A DirectShow decoder filter has to decide whether to accept or decline a pin connection request, based only on the media type information. The DirectShow filter doesn't have access to the full video bitstream at this point in time. Which means that the DS filter has no way to provide MORE_DATA during pin connection negotiation. If you don't provide a way for DecodeHeader to work with just the sequence headers (as they are stored in the media type information) then every DirectShow filter using the Media SDK will have to either hack around the problem by adding an AUD (and this won't work for MPEG2 and VC-1), or implement its own video bitstream parsing code.
VC-1 decoding/seeking: There seems to be a misunderstanding here. I've reported two different problems:
(1) With some DirectShow splitters, injecting VC-1 headers results in heavy image corruption. This is IMHO a fault of the DirectShow splitters.
(2) After a seek, the Media SDK's VC-1 decoder sometimes produces block artifacts in the first decoded frame.
The bug I'm concerned with is (2) and not (1). If you re-read my earlier comments, I've already explained how to work around problem (1). You're right that commenting out "m_pVC1SeqHeader" fixes (1). But so does remuxing the m2ts file to MKV by using eac3to, as I suggested earlier.
Even after the "m_pVC1SeqHeader" fix, problem (2) still occurs. It is less obvious, though, and doesn't occur with every seek. Problem (2) has nothing to do with your DirectShow VC-1 decoder sample or with the way VC-1 headers are injected. It is a problem with the Media SDK's VC-1 decoder itself.
Thanks for your clarification about where the current behavior of DecodeHeader is problematic.We will look deeper into the pin negotiation scenario you've described.
Yes, there aremultiple issues with VC1 decode seeking. We are currently discussing ways to make the filters more general as well as ways to improve our stream reposition testing of the decoders themselves.
Since the DirectShow filters are samples we only expose them to limited testing. In the case of VC-1 our development has onlyfocused onone pipeline:
VC1 in WMV container->(WM ASF Reader)->(MSDK VC-1 Decoder)->(Enhanced Video Renderer)
Thefilter samplesare not intended to be production ready general purpose solutions, though we are interested in making them better.
Leaving aside the seek issues in the VC1 decoder itself (which are being investigated separately), are these adequate summaries ofthe problems you've described so far?
1) You need a way for DecodeHeader towork with meaningful error codes in an environment such as pin negotiation, where the rest of the stream is not available yet.
2) The header injection model currently used by the MSDK VC-1 decode filter (likely an adaptation to theWM ASF Reader) does not work well with other sources.
3) There may be some additional parsing issues, wherewe require abitstream element order not required by the spec.Just to clarify -- have you seen this only with VC-1 or with Mpeg2 and H.264 as well?
Is there anything else?
Wewill probablyhave more questions as we investigate further. Thanks for your help with identifying these issues.
> you've described so far?
1) and 2): yes.
Personally, I can work around the issues 1) and 2), though. For me the most crucial problem is 3), because I don't know how to work around it.
> 3) There may be some additional parsing issues
> wherewe require abitstream element order not
> required by the spec.
I think 3) is an additional issue, but I'm not sure what the cause it. It's not really a seeking issue, I think. I've tried to completely destroy and recreate all MSDK stuff with every seek and the problem still occurs. So I think the real issue is that sometimes the very first decoded frame shows blocks. Not sure why.
> Just to clarify -- have you seen this only with
> VC-1 or with Mpeg2 and H.264 as well?
I've not seen it with Mpeg2. I do have seen a similar issue with H.264, though. It could actually be exactly the same issue. It's also always the first frame which shows blocks, the very next frame is always perfectly clean.
> Is there anything else?
There might be an MPEG2 problem, totally separate from the issues mentioned so far, but I've not yet analyzed it in more detail. Basically at the start of one MPEG2 sample I'm getting visible artifacts with the Intel software decoder, but not with libav/ffmpeg. But let me double check this before "officially" reporting it as a bug.
> Thanks for your help with identifying these issues.
Thanks for your support - I appreciate it!!
Best regards, Mathias.
screenshot first frame Intel decoder: http://madshi.net/intel.jpg
screenshot first frame libav decoder: http://madshi.net/libav.jpg
I have to say, though, that the Microsoft MPEG2 decoder shows the same artifacts as yours does. So it's possible that there's some kind of problem with the MPEG2 bitstream itself. However, libav/ffmpeg handles this very nicely, with no visible artifacts. I'm not sure if this is worth investigating on your side. I've leave that up to you.
This MKV seems to start with a key frame. When using the MSDK software decoder, the first 5 decoded frames are these:
frame 1: http://madshi.net/intel1.jpg
frame 2: http://madshi.net/intel2.jpg
frame 3: http://madshi.net/intel3.jpg
frame 4: http://madshi.net/intel4.jpg
frame 5: http://madshi.net/intel5.jpg
As you can see, the first two frames come out garbled. Now the interesting bit: If I playback this video with the Microsoft VC-1 decoder, the first 3 frames are these:
frame 1: http://madshi.net/ms1.jpg == intel3.jpg
frame 2: http://madshi.net/ms2.jpg == intel4.jpg
frame 3: http://madshi.net/ms3.jpg == intel5.jpg
Hope this makes it easier for you to pinpoint the cause of the problem.
(1) In DirectShow splitters often deliver a couple of "preroll" samples with a negative timestamp, before the real seek point sample is delivered with a positive timestamp. The preroll samples are meant to be decoded (e.g. in order to initialize the decoder properly), but they're usually not displayed. Your DirectShow filter samples in the MSDK have this code in frame_constructors.cpp:
rtStart = (rtStart < 0) ? 0 : rtStart;
Basically samples with a negative timestamp are converted to a 0 timestamp. That's a bad idea, because it practically converts all preroll samples to real samples. So the preroll samples will probably be displayed by the video renderer, even though they shouldn't. I would suggest to simply remove the code quoted above. Works fine for me.
(2) The libav decoder also sometimes outputs a first corrupted frame, followed by proper frames, similar to what the MSDK decoder does. The first corrupted frame might even have a positive timestamp. However, the libav decoder marks frames as being key frames or not. I've found that if the first decoded frame is corrupted, it's usually not a key frame. So I've simply added code to my renderer to start displaying frames only with the first decoded key frame which has a positive timestamp. This seems to work very well. Unfortunately I don't see any way with the MSDK to implement a similar solution because I don't see how I could know whether a decoded frame is a key frame or not. So my suggestion is this: At start of decoding and after a reset/seek, either let the MSDK silently drop all decoded frames, until the first key frame with a positive timestamp is decoded. Or alternatively offer a way for MSDK users to find out whether a decoded frame is a key frame or not. E.g. you could add a "bool key_frame" to the mfxFrameInfo structure. Doing either of this might already fix the VC-1 problem "3)" we talked about earlier. Not sure, though. The VC-1 problem might still be something different.
"DataFlag = MFX_BITSTREAM_COMPLETE_FRAME" sounds like a good solution to me!
Looking forward to the fixes you're mentioning. I'm not in a hurry, though, so no problem if it takes some time.
FWIW, I've one more information to share: While implementing the libav/ffmpeg MPEG2 decoder, I've found that some transport streams have pretty bad (swapped) timestamps, resulting in both libav/ffmpeg and MSDK decoders outputting swapped timestamps, too, resulting in non-smooth playback. If you are interesting in this problem, I can provide you with a sample.
Looking through already existing open source MPEG2 decoder implementations, it seems that the usual way to work around this issue is to use only the I-frame timestamps and to ignore (= interpolate) all other timestamps. That's the approach used by ffdshow, at least. For that to work it might be useful for the MSDK user to be able to find out which decoded frame is an I-frame and which is not. Or alternatively, you could probably also do the timestamp dropping & interpolation inside of MSDK, if you prefer that. Or maybe MSDK users could already solve this right now by feeding only I-frame timestamps to MSDK and feeding e.g. "-1" for P-frame and B-frame timestamps to MSDK? Would the MSDK then already do the interpolation for the missing timestamps? Haven't tried that yet...
Best regards, Mathias.
I think the MSDK just passes the time stamps you set.
If you don't believe it, just set the timestamp you pass to DecodeFrameAsync in mfxBitstream to 0 (or any other value)for all frames. And by some miracle from Intel, they come out in presentation order and mfxFrameSurface1.Data.TimeStamp has the same value for all frames.
So again, I would rather not have the MSDK mess with the TimeStamp.
If the container timestamps are wrong, but the stream does not have gaps, the safest way is to calculate the presentation timestamps yourself by counting the decoded frames and calculate the render time with the framenumber and the frame rate. You have to reset the frame count and possibly a start time offset on the appropriate DirectShow calls.