With the Media SDK 2012, it really seems that for most H.264 streaming video sources it is not possible to decode one frame at a time right from the start, even when the MFX_BITSTREAM_COMPLETE_FRAME flag is set and the AsyncDepth video parameter is set to 1.
After processing the header from a video stream, if I try to decode a single I-frame, I get MFX_ERR_MORE_DATA.
The behavior I see, if I let it advance the DataOffset value and modify the DataLength as it desires, is that after it decodes the header it moves the DataOffset back to the beginning of the stream. The first time DecodeFrameAsync is called, it advances the DataOffset past the header and I-frame in the bitstream and requests more surfaces (which is fine to request -- but why move the DataOffset and adjust the DataLength if it didn't decode anything because it needed more surfaces?). Then the next time DecodeFrameAsync is called (with sufficient surfaces available), it finally starts decoding -- the next frame, not the first one.
If I try to get it to decode an I-frame with no data after it, even with sufficient surfaces available, it returns MFX_ERR_MORE_DATA forever. It seems it has to have more than one frame in the bitstream before it is willing to do any decoding.
Is it really not possible to decode individual frames, as is claimed, or am I likely missing something? Any idea what it might be?
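For what it's worth, here is the kind of standalone sanity check I run on the buffer before submitting it. This is my own helper, not a Media SDK call; it just scans for Annex B start codes so I can confirm the buffer really holds the NAL units I think it does:

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Return the byte offsets of every Annex B start code (00 00 01 or
// 00 00 00 01) in the buffer. A buffer submitted with
// MFX_BITSTREAM_COMPLETE_FRAME should contain the start codes of all
// NAL units belonging to that single frame, and nothing beyond it.
std::vector<size_t> find_start_codes(const uint8_t* data, size_t size) {
    std::vector<size_t> offsets;
    for (size_t i = 0; i + 3 <= size; ++i) {
        if (data[i] == 0 && data[i + 1] == 0) {
            if (data[i + 2] == 1) {
                offsets.push_back(i);                 // 3-byte start code
            } else if (i + 4 <= size && data[i + 2] == 0 && data[i + 3] == 1) {
                offsets.push_back(i);                 // 4-byte start code
                ++i;  // skip the extra zero so the 3-byte check doesn't re-match
            }
        }
    }
    return offsets;
}
```

With this I can at least verify how many NAL units are in the buffer before blaming the decoder.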
There should be no issues decoding one frame at a time using the MFX_BITSTREAM_COMPLETE_FRAME method. This approach is illustrated in the SDK "sample_decode" sample. However, that approach is somewhat artificial since it does not include a demuxer. For a more realistic per-frame decode scenario, take a look at the Media SDK/FFmpeg integration code part of the Media SDK tutorial, here: http://software.intel.com/en-us/articles/intel-media-sdk-tutorial-simple-6-transcode-opaque-async-ffmpeg
When using the MFX_BITSTREAM_COMPLETE_FRAME flag you do not have to insert any "padding" after the frame. The bitstream buffer just needs to contain one complete frame.
It's good to have it confirmed that it should be possible. So if I initialized the decoder with AsyncDepth = 1, have successfully called DecodeHeader, and I'm certain I have an entire IFrame in the buffer (with MFX_BITSTREAM_COMPLETE_FRAME set), but still get MFX_ERR_MORE_DATA when I call DecodeFrameAsync, do you have any idea what I might be doing wrong?
I'll note that in order to decode the SPS/PPS header information, in isolation, I do seem to have to append a fake frame after it -- presumably so that the decoder knows where the header ends. If this is a problematic method, do let me know.
I've tested it with various video sources, including ones encoded by the Intel Media SDK itself. As I've mentioned, the IMSDK does decode them, but it refuses to return a successful decode status (or set the sync point) unless I continue adding frames to the bitstream buffer. If I keep trying to call DecodeFrameAsync on each frame (copying them individually into the buffer and adjusting the bitstream DataLength and DataOffset (= 0) accordingly), it keeps returning MFX_ERR_MORE_DATA.
Latency is crucial to our application, so any further insight you could give or direction you could point me would be appreciated. I switched to the Media SDK 2013 in case something was improved about that, but I'm seeing the same results. I noticed that the sample_decode with this version has changed slightly to always set MFX_BITSTREAM_COMPLETE_FRAME, but regardless, if it gets MFX_ERR_MORE_DATA it simply continues to append more frames, which is what I'm trying to avoid doing. I haven't looked closely at the Media SDK/FFmpeg integration code, and demuxing is not an issue for me. But I'll check it out to see if it offers any further insight.
Right, it's a known limitation that you have to add some "padding" after the SPS/PPS for the call to DecodeHeader. It's been discussed in a few earlier forum posts. Note, however, that you only have to add a few bytes. http://software.intel.com/en-us/forums/topic/328668
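For example, a minimal sketch of that padding as plain byte handling (the 0x65 byte is the IDR NAL header byte discussed in the linked post):

```cpp
#include <cstdint>
#include <vector>

// Append a minimal fake IDR NAL start (00 00 00 01 65) after the
// SPS/PPS bytes so DecodeHeader can find where the header data ends.
// Only these few extra bytes are needed.
std::vector<uint8_t> pad_header_for_decode(const std::vector<uint8_t>& sps_pps) {
    std::vector<uint8_t> padded(sps_pps);
    const uint8_t fake_idr_start[] = { 0x00, 0x00, 0x00, 0x01, 0x65 };
    padded.insert(padded.end(), fake_idr_start,
                  fake_idr_start + sizeof(fake_idr_start));
    return padded;
}
```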
I suspect the frame in the bitstream buffer is somehow not complete? And DataFlag is always set to MFX_BITSTREAM_COMPLETE_FRAME, right?
The default behavior of "sample_decode" is to read large chunks of data into the bitstream buffer. For low-latency, per-frame processing this is not what you want. If you look at the "low latency" option of the sample you can see one way to implement this. Again, this is quite artificial since there is no demuxer involved.
Note that for both sample_decode (with low latency option) and the Media SDK/FFmpeg interoperability sample part of the tutorial you should not see any MFX_ERR_MORE_DATA return code when calling DecodeFrameAsync.
Thanks for the quick reply.
Yes, I'd come across another post on the DecodeHeader subject, and the 0x00 0x00 0x00 0x01 0x65 NAL sequence you cite in the linked post is in fact exactly what I'm currently using. It decodes the header fine with that appended -- I just wanted to make sure that doing so wasn't somehow causing trouble with the ensuing frame-by-frame decoding attempt.
And right, I recognized the various options with which "sample_decode" runs. But I hadn't noticed before that there's a different incarnation of ReadNextFrame for the low-latency case. It does look like it feeds in only one frame at a time, waiting for the previous frame to be decoded before copying another into the bitstream being decoded. So since that case is shown to work, I suppose I can't think of anything to do except make absolutely certain that my frames are indeed complete (since yes, with great paranoia I am setting MFX_BITSTREAM_COMPLETE_FRAME every time before calling DecodeFrameAsync).
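Concretely, this is essentially what I now do before each DecodeFrameAsync call. FrameBuffer here is my own simplified stand-in for the mfxBitstream fields I touch, just to show the bookkeeping:

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Simplified stand-in for the mfxBitstream fields I manage manually.
struct FrameBuffer {
    std::vector<uint8_t> data;
    size_t data_offset = 0;   // mirrors DataOffset
    size_t data_length = 0;   // mirrors DataLength
};

// Copy exactly one complete frame into the buffer and reset the
// offset/length fields, mirroring what the low-latency path of
// sample_decode does one frame at a time.
void load_one_frame(FrameBuffer& fb, const uint8_t* frame, size_t size) {
    fb.data.assign(frame, frame + size);
    fb.data_offset = 0;      // DataOffset = 0
    fb.data_length = size;   // DataLength = frame size
}
```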
I will post back if and when I discover more about what's going on, for posterity (or further guidance).
Ok, I believe I've figured out my issue: I didn't realize that after DecodeHeader is called on the SPS/PPS information in an mfxBitstream, DecodeFrameAsync() (apparently) also has to be called on that SPS/PPS information!
Perhaps it seems obvious to you that this is the case, but it wasn't to me. I assumed that after a successful call to DecodeHeader (with the resulting mfxVideoParam of course being used to call Init() on the decoder object), the header was "decoded", and that I could subsequently begin feeding complete I- and P-frames to DecodeFrameAsync().
The unending MFX_ERR_MORE_DATA result I kept getting made me think it wanted more actual video frame data -- not that it was eternally pining for header info in the bitstream (which I thought it had already gotten via DecodeHeader). If I might make a suggestion, aside from perhaps documenting this more clearly somewhere (and I apologize if it is documented and I missed it), I think an MFX_ERR_NEED_HEADER return code or something similar would be appropriate, and much clearer in this case.
If, as in the examples, a user simply appends to an mfxBitstream and lets the calls to DecodeHeader() and DecodeFrameAsync() manage the DataOffset and DataLength values, then this detail is not important. But if one is attempting to manage the data in the bitstream manually during decoder initialization, then this is apparently vital to be aware of.
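In other words, when managing the buffer manually, the first DecodeFrameAsync() submission needs to contain the SPS/PPS followed by the first I-frame. A trivial sketch of how I now build that first buffer (plain byte handling, not Media SDK calls; the resulting bytes go into the mfxBitstream with DataOffset = 0 and DataLength = the buffer size):

```cpp
#include <cstdint>
#include <vector>

// Build the payload for the first DecodeFrameAsync() call: the SPS/PPS
// bytes followed immediately by the first complete I-frame.
std::vector<uint8_t> build_first_submission(const std::vector<uint8_t>& sps_pps,
                                            const std::vector<uint8_t>& idr_frame) {
    std::vector<uint8_t> bs(sps_pps);
    bs.insert(bs.end(), idr_frame.begin(), idr_frame.end());
    return bs;
}
```

After this first combined submission succeeds, subsequent frames can be fed one at a time as before.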
Thanks for answering my slightly misguided questions, so that I could eventually narrow things down and finally pin down the source of my problems.
Yes, I believe you are correct in your statement that the first call to DecodeFrameAsync() requires the SPS/PPS in the buffer. We apologize for not clarifying this sufficiently. We will work on improving the documentation for the next release.