Media (Intel® Video Processing Library, Intel Media SDK)

DecodeFrameAsync returning stagnant timestamps for certain files

jamesterm
Beginner

These flash files:
http://blog.iamjvn.com/2011/03/san-antonio-2011-video.html#more

exhibit stagnant timestamp behavior when calling DecodeFrameAsync(). I am using 2.0.12.24071 (2.0 gold).

The test case looks like this:
sts = m_pmfxDEC->DecodeFrameAsync(&m_mfxBS, &(m_pmfxSurfaces[nIndex]), &pmfxOutSurface, &syncp);

.
. (same is in pipeline sample code)
.

if (MFX_ERR_NONE == sts)
...
sts=WriteFrame(pmfxOutSurface);

What happens is that prior to a *MFX_WRN_DEVICE_BUSY interval, the DataLength in the m_mfxBS structure queues up about 4 frames. When this happens, and as the loop starts to process these, pmfxOutSurface->Data.TimeStamp will have the last PTS for all 4 frames. For example:

If m_pmfxSurfaces[] should have had:
[10] = 936
[11] = 969
[12] = 1003
[13] = 1036
[14] = 1070
[15] = 1103

This is what I get
[10] = 936 (good)
[11] = 1070 (bad)
[12] = 1070
[13] = 1070
[14] = 1070
[15] = 1103

Based on this forum entry:
http://software.intel.com/en-us/forums/showthread.php?t=70446

I'd expect pmfxOutSurface->Data.TimeStamp to work properly, and my code matches the example exactly. Is there something that I am missing? Are these files correct? They work with all other decoders, including ffmpeg. Is this PTS duplication expected behavior? Is there a way to make this work? I sure would hate to resort to hacking a solution! I hope to hear a reply. I hope Nina gets this message.



*MFX_WRN_DEVICE_BUSY - I determined this to be the trigger because I can manipulate the timing by injecting a sleep. When I do, the queuing of the frames may take place in a different frame range, and the same set of frames that I test (over and over again) succeeds just fine.

Nina_K_Intel
Employee
Hi James,

Could you try processing the MFX_WRN_DEVICE_BUSY status by simply waiting and repeating the same call of DecodeFrameAsync (without making changes to the input bitstream)? In your case you seem to read more data into the input bitstream and increase the bitstream.TimeStamp each time.

If you operate with timestamps, you should not store more than 1 frame in the input bitstream, because pmfxOutSurface->Data.TimeStamp is determined based on m_mfxBS.TimeStamp.
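
A minimal sketch of the "wait and repeat the same call" handling described above. It does not use the real SDK headers: `DecodeStatus` is a stand-in for the status codes named in this thread, and `decodeOnce` is a hypothetical callable that would wrap DecodeFrameAsync with unchanged arguments. The wait interval and retry limit are arbitrary assumptions.

```cpp
#include <chrono>
#include <functional>
#include <thread>

// Stand-in for the SDK status codes named in this thread (not the real mfxStatus enum).
enum class DecodeStatus { None, MoreData, DeviceBusy };

// Repeats the same decode call while the device reports busy.
// The caller must not touch the input bitstream between attempts.
// `decodeOnce` is a hypothetical wrapper around DecodeFrameAsync with fixed arguments.
inline DecodeStatus DecodeWithBusyRetry(
    const std::function<DecodeStatus()>& decodeOnce,
    std::chrono::milliseconds wait = std::chrono::milliseconds(1),
    int maxRetries = 1000) {
    DecodeStatus sts = decodeOnce();
    for (int i = 0; sts == DecodeStatus::DeviceBusy && i < maxRetries; ++i) {
        std::this_thread::sleep_for(wait);  // give the hardware time to free up
        sts = decodeOnce();                 // identical call, no new input data
    }
    return sts;  // still DeviceBusy if the retry limit was reached
}
```

The retry limit only exists so the sketch cannot spin forever; an application might prefer a timeout instead.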

Please let us know whether this helps or not.

Regards,
Nina

jamesterm
Beginner

I've attached a code snippet of the current function. It is virtually identical to the pipeline sample code and, to the best of my knowledge, upon receiving MFX_WRN_DEVICE_BUSY it waits and repeats the same call of DecodeFrameAsync without making changes to the input bitstream. It should be noted that the accumulation within the bitstream happens BEFORE I see this status, and consistently just before it.

I never explicitly store more than 1 frame in the input bitstream. Rather, this happens implicitly, because the frame will not be processed until 3 or 4 future frames have accumulated in the bitstream. This "implicit" functionality is identical to CSmplBitstreamReader::ReadNextFrame(), where DecodeFrameAsync() manages DataLength and DataOffset accordingly.

The key question is why DecodeFrameAsync does not consistently consume the input bitstream packet (advancing DataOffset) when I call it. Is this expected behavior? About every 10 frames or so, and not necessarily in the same group of frames, it skips this transfer and the new data adds to DataLength instead. The rest of what you said, regarding one frame and timestamp in the bitstream per frame successfully created by DecodeFrameAsync(), makes sense, as I can now see how to safely create a work-around.

For now I propose to make a work-around by assigning the *correct* timestamp prior to the DecodeFrameAsync call. I can determine this by keeping a count of how many frames have accumulated in the bitstream per frame session. This is probably not the cleanest solution, but it should be effective.

Thanks for telling me when the timestamp gets assigned to pmfxOutSurface->Data. I'll post back if this works. Let me know if you have any insight into why the input bitstream frames accumulate like this.

-James

jamesterm
Beginner

Here is a solution that is working against all my test media files. This should illustrate the problem, but it would also be good to review, as I am making some assumptions about the behavior of the input bitstream. The idea here is that when the input bitstream queues up, so do the timestamps. Ideally, it would be great if the SDK could manage this queue for me... assuming this can be expected behavior.
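
The attachment itself is not reproduced in this thread. As a rough illustration of the idea only, queueing timestamps alongside the frames that pile up in the bitstream could look like the sketch below. `TimestampQueue` and its method names are hypothetical, and the sketch assumes frames leave the decoder in the same order they were submitted, which would not hold for streams with frame reordering.

```cpp
#include <cstddef>
#include <cstdint>
#include <deque>

// Hypothetical helper (not part of the Media SDK): remembers one timestamp per
// compressed frame handed to the decoder and returns them in the same order.
class TimestampQueue {
public:
    // Call once for every compressed frame appended to the input bitstream.
    void OnFrameSubmitted(std::uint64_t pts) { pending_.push_back(pts); }

    // Call once for every decoded surface. Writes the oldest pending timestamp
    // to `pts` and returns true, or returns false if nothing is pending.
    bool OnFrameDecoded(std::uint64_t& pts) {
        if (pending_.empty()) return false;
        pts = pending_.front();
        pending_.pop_front();
        return true;
    }

    // Number of submitted frames that have not produced an output yet.
    std::size_t Pending() const { return pending_.size(); }

private:
    std::deque<std::uint64_t> pending_;
};
```

With the numbers from the first post, submitting 969, 1003 and 1036 while the bitstream accumulates and then popping one value per decoded surface would hand back 969, 1003 and 1036 rather than the last value three times.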


-James

Nina_K_Intel
Employee
Hi James,

I've studied the code you provided and I saw that the handling of the DEVICE_BUSY status is not the cause of the problem, as I thought before.

I'm really puzzled by the issue you observe; it doesn't look like correct behaviour. The workaround is fine, but it essentially means bypassing Media SDK in terms of timestamp setting. And as you say, the SDK must be able to handle timestamps properly - it is designed to do so.
Does the AVFrame *picture contain a whole frame each time? Could you debug to see which status is returned from DecodeFrameAsync when bitstream accumulation happens? The decoder would return MORE_DATA if there's not enough data to decode a frame. And in any case the data shouldn't accumulate in the bitstream; the decoder can leave only a few irrelevant bytes.
Below is the description of the expected decoder behaviour:
"
The input bitstream bs can be of any size. If there are not enough bits to decode a frame, the function returns MFX_ERR_MORE_DATA, and consumes all input bits except if a partial start code or sequence header is at the end of the buffer. In this case, the function leaves the last few bytes in the bitstream buffer. If there is more incoming bitstream, the application should append the incoming bitstream to the bitstream buffer. Otherwise, the application should ignore the remaining bytes in the bitstream buffer and apply the end of stream procedure described below.
If more than one frame is in the bitstream buffer, the function decodes until the buffer is consumed. The decoding process can be interrupted for events such as if the decoder needs additional working buffers, is readying a frame for retrieval, or encountering a new header. In these cases, the function returns appropriate status code and moves the bitstream pointer to the remaining data.
It is recommended that the application invoke the function repeatedly until the function returns MFX_ERR_MORE_DATA, before appending any more data to the bitstream buffer.
"
It would be great if you could model your application behaviour with sample_decode or maybe DirectShow filters. I need a reproducer to understand what exactly is happening. Could you please try?
Btw, you shouldn't alter working surface timestamps if the surface is locked - and likely it would be locked after the DecodeFrameAsync call. But this might not be relevant to the problem - just for your information.
Best regards,
Nina

jamesterm
Beginner
"
Does the AVFrame *picture contain a whole frame each time?
"
Yes. In fact, one of the tests that I have done is to physically copy the bitstream memory from a time it *fails and compare it against a time it succeeded, to verify they were identical and exonerate the FLV muxer.

*fails - meaning it accumulated when it should have consumed it.


"
Could you debug to see which status is returned from DecodeFrameAsync when bitstream accumulation happens?
"
It always returns MFX_ERR_NONE for all 4 frames (just verified as I type this).


"
And in any case the data shouldn't accumulate in the bitstream, decoder can leave only a few irrelevant bytes
"
So far, of all the files I have tested (e.g. other Flash files, mp4, m2t), I have not seen this problem anywhere else. There is something about this group of files (see initial post with link).


"
It would be great if you could model your application behaviour with sample_decode or maybe DirectShow filters. I need a reproducer to understand what exactly is happening. Could you please try?
"

I have modeled the application as closely on sample_decode as possible. If you need help getting together some code to reproduce this, let me know. I am not using any DirectShow code.


"
Btw, you shouldn't alter working surface timestamps if the surface is locked - and likely it would be locked after DecodeFrameAsync code. But this might be not relevant to the problem - just for your information.
"

I presume you saw this comment:
//This serves no purpose but is very useful for debugging

I have commented this out by default in my current running build... I just used it temporarily to help read what was going on during this issue.

-James

jamesterm
Beginner

I got to thinking that there may be some other variables involved which may make it harder to reproduce on your end (e.g. subtle differences in the AVCC -> Annex B conversion). As a fall-back, I could submit a raw elementary stream dump in Annex B form, which should be reproducible with the sample_decode executable, with the exception that its read-next-frame step may not advance in whole frame packets (this may be necessary). Let me know if we need to go that route. Basically, for me the ffmpeg FLV demux presents packets (i.e. the entire frame); the audio is split out to the AAC codec, and the video first gets converted to Annex B and is then submitted to DecodeFrameAsync(). The conversion to Annex B is straightforward, as it simply converts the entire frame. The logic for appending the frames is identical to sample_decode in regards to interpreting DataLength and DataOffset.
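
For readers unfamiliar with the conversion mentioned above, here is a minimal sketch of the usual AVCC to Annex B rewrite for one sample. It is not the poster's code. It assumes every NAL unit carries a 4-byte big-endian length prefix (other prefix sizes exist) and it does not handle SPS/PPS stored separately in the container's configuration record. `AvccToAnnexB` is a hypothetical name.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Rewrites one AVCC sample (4-byte big-endian length before each NAL unit,
// an assumption of this sketch) into Annex B form (00 00 00 01 start codes).
// Returns an empty vector if the input is truncated or malformed.
inline std::vector<std::uint8_t> AvccToAnnexB(const std::vector<std::uint8_t>& avcc) {
    static const std::uint8_t kStartCode[4] = {0, 0, 0, 1};
    std::vector<std::uint8_t> out;
    std::size_t pos = 0;
    while (pos < avcc.size()) {
        if (avcc.size() - pos < 4) return {};  // incomplete length field
        const std::uint32_t len = (static_cast<std::uint32_t>(avcc[pos]) << 24) |
                                  (static_cast<std::uint32_t>(avcc[pos + 1]) << 16) |
                                  (static_cast<std::uint32_t>(avcc[pos + 2]) << 8) |
                                  static_cast<std::uint32_t>(avcc[pos + 3]);
        pos += 4;
        if (len > avcc.size() - pos) return {};  // NAL unit runs past the sample
        const auto first = avcc.begin() + static_cast<std::ptrdiff_t>(pos);
        out.insert(out.end(), kStartCode, kStartCode + 4);
        out.insert(out.end(), first, first + static_cast<std::ptrdiff_t>(len));
        pos += len;
    }
    return out;
}
```

Because each 4-byte length is replaced by a 4-byte start code, the output has the same size as the input under this assumption.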

Nina_K_Intel
Employee
Hi James!

I think I now know what is wrong. According to the specification (and the Media SDK developers confirmed this), it is entirely legal for the decoder to return MFX_ERR_NONE without consuming the input bits. That's why the manual "recommends" appending data to the bitstream only when the decoder explicitly returns MFX_ERR_MORE_DATA. This recommendation is crucial for applications that deal with timestamps, and that's exactly your case. Could you try invoking your "ReadNextFrame" only if MFX_ERR_MORE_DATA is returned? This should fix the problem.
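
A minimal sketch of that loop shape, assuming two hypothetical callables: `readNextFrame` appends one compressed frame to the input bitstream (returning false at end of input) and `decodeFrame` wraps DecodeFrameAsync. `LoopStatus` is a stand-in for the SDK status codes; end-of-stream draining and other statuses are left out.

```cpp
#include <functional>

// Stand-in for the two SDK statuses relevant here (not the real mfxStatus enum).
enum class LoopStatus { FrameReady, MoreData };

// Feeds the decoder only when it explicitly asks for more data.
// Returns the number of frames the decoder reported as ready.
inline int RunDecodeLoop(const std::function<bool()>& readNextFrame,
                         const std::function<LoopStatus()>& decodeFrame) {
    int framesOut = 0;
    LoopStatus sts = LoopStatus::MoreData;  // nothing has been fed yet
    for (;;) {
        if (sts == LoopStatus::MoreData) {
            if (!readNextFrame()) break;  // input exhausted (draining omitted)
        }
        sts = decodeFrame();              // may succeed without new input
        if (sts == LoopStatus::FrameReady) ++framesOut;
    }
    return framesOut;
}
```

The point of the structure is that a successful decode is followed by another decode call on the same bitstream, so at most one frame's timestamp is pending in the bitstream at any time.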

On the other point, I would say that this behavior is not really well described in the Media SDK manual so I will work with the team to improve the documentation.

Thank you for tracking down this detail.

Best regards,
Nina

jamesterm
Beginner

Yes, this explains the issue, and I have "partially" confirmed that it works. I say partially because my workflow is currently incompatible with this suggested change, given its demux environment. My actual solution would involve either queueing timestamps (as I currently do) or queueing extended locked surfaces. I agree with the suggested action item listed above. Thanks for your help on this matter.

jamesterm
Beginner
I needed a little bit of time to think this through. I'd like to start out by saying that while the timestamp solution I presented earlier does work, it is a bit fragile, and I would like to end this discussion with something a bit more solid and robust.

I also want to present a bird's-eye view of my workflow here, in the hope that it gives some perspective on how I ran into this issue.

The current workflow in a simple model is the case where I simply wish to obtain the next frame sequentially as if I were to play the video:

UncompressedFrame=ReadNextFrame()

Inside this function it looks something like this:

while (!UncompressedFrame)
UncompressedFrame=decode (next compressed frame);

This is a really over-simplified model that is only somewhat accurate, where it always gives a compressed frame to get an uncompressed frame. With some MPEG-2 type codecs this model could work around P and B frames by keeping a lean queue internally.

As we see with this model, when using the Intel codec there are times when we need some form of input queue control until the codec says it is ready for more compressed frames.

I propose that the solution be a "smart" input bitstream that can manage this for me. I'd also attempt to encapsulate the data length and data offset, so that the client code wouldn't need to manage them. The interface would be as simple as adding "packets" to it, and it would work out when to submit them to the current working bitstream, along with the timestamps they correspond to.
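
One possible shape for such a packet-based interface, purely as an illustration and not the BitstreamManager class attached later in this thread. `Packet`, `WorkingBitstream` and `PacketFeeder` are hypothetical stand-ins; real code would manage an mfxBitstream rather than a std::vector.

```cpp
#include <cstddef>
#include <cstdint>
#include <deque>
#include <utility>
#include <vector>

// Hypothetical stand-ins, not SDK types.
struct Packet {
    std::vector<std::uint8_t> bytes;  // one compressed frame
    std::uint64_t pts;                // its presentation timestamp
};

struct WorkingBitstream {
    std::vector<std::uint8_t> bytes;  // data not yet consumed by the decoder
    std::uint64_t timeStamp = 0;      // timestamp the decoder would pick up
};

// Holds packets until the decoder asks for more data, so client code never
// touches lengths or offsets directly.
class PacketFeeder {
public:
    void AddPacket(Packet p) { queue_.push_back(std::move(p)); }

    // Call only after the decoder reported that it needs more data.
    // Appends the oldest packet after any leftover bytes and sets its timestamp.
    bool Feed(WorkingBitstream& bs) {
        if (queue_.empty()) return false;
        const Packet& p = queue_.front();
        bs.bytes.insert(bs.bytes.end(), p.bytes.begin(), p.bytes.end());
        bs.timeStamp = p.pts;
        queue_.pop_front();
        return true;
    }

    std::size_t Queued() const { return queue_.size(); }

private:
    std::deque<Packet> queue_;
};
```

The demuxer can keep calling AddPacket at its own pace, while the decode loop decides when Feed is allowed to run.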

If this solution seems to be in the right direction, and if others could benefit from this let me know, and I'd be happy to submit it.

-James

Nina_K_Intel
Employee
Hi James,

I think your solution would be a great value to our Media SDK developer community. I would appreciate if you post it here on the forum.

Thank you so much for your contribution!

Nina

jamesterm
Beginner

OK, will do, but at the moment I am going to have to put this on the back burner, as some new h264 stress media has come to me which challenges this codec. I'm not going to go into details about it here, except to say keep an eye out for me during this week (I'll want to confirm exactly where the problem is before I post). Once I get these resolved (hopefully), I can finish this.

jamesterm
Beginner

Attached is a class called BitstreamManager()

It does as I had hoped where the client code need not mess with the internal fields of the bitstream.

Please code review, and let me know if there are any proposed changes.
Thanks.

Nina_K_Intel
Employee
Hi James,

I reviewed the code and have a comment. In the AddToBitstream function you check for m_mfxBS.DataLength==0 as an indicator that the decoder has taken the data from the bitstream (only then do you add new data). This condition is not quite correct, as the decoder may take the frame data but leave a few bytes in the bitstream buffer (a partial start code). I would recommend relying on function return statuses (MFX_ERR_MORE_DATA), according to the spec, rather than on the values of the bitstream structure fields.

Regards,
Nina

jamesterm
Beginner
Yes, that line was making an assumption, and thanks for clearing this up with me... so in regards to the "partial start code" case, could it ever happen if the bitstream submitted to DecodeFrameAsync() always ended on frame boundaries?

jamesterm
Beginner
I have one other question
"
may take the frame data but leave a few bytes in the bitstream
"

When this happens will the mfxBS.DataLength = 0?

Nina_K_Intel
Employee
Hi James,

I made a small investigation regarding your questions. In the default mode, even if you submit frame-wise bitstreams, the decoder may leave several bytes which look like part of a start code (even if they do not actually belong to a start code). But there is another mode, which is actually preferred from the performance standpoint, in which you explicitly inform the decoder that you will feed whole frames. For that you need to set the flag on the bitstream: DataFlag = MFX_BITSTREAM_COMPLETE_FRAME.
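
A small sketch of per-frame submission with that flag set. To stay self-contained it uses a stand-in struct whose field names mirror the mfxBitstream fields discussed in this thread, and `kCompleteFrameFlag` is a placeholder for MFX_BITSTREAM_COMPLETE_FRAME. Real code would include the SDK headers and use the SDK's own types and constant. `PointAtCompleteFrame` is a hypothetical helper.

```cpp
#include <cstdint>

// Stand-in mirroring the mfxBitstream fields discussed in this thread.
struct BitstreamFields {
    std::uint8_t* Data = nullptr;
    std::uint32_t DataOffset = 0;
    std::uint32_t DataLength = 0;
    std::uint32_t MaxLength = 0;
    std::uint64_t TimeStamp = 0;
    std::uint16_t DataFlag = 0;
};

// Placeholder for MFX_BITSTREAM_COMPLETE_FRAME; use the SDK constant in real code.
constexpr std::uint16_t kCompleteFrameFlag = 0x0001;

// Points the bitstream at exactly one whole frame and marks it as complete.
// The caller keeps ownership of `frame`, which must outlive the decode call.
inline void PointAtCompleteFrame(BitstreamFields& bs, std::uint8_t* frame,
                                 std::uint32_t size, std::uint64_t pts) {
    bs.Data = frame;
    bs.DataOffset = 0;
    bs.DataLength = size;
    bs.MaxLength = size;
    bs.TimeStamp = pts;  // one frame in the buffer, so one timestamp
    bs.DataFlag = static_cast<std::uint16_t>(bs.DataFlag | kCompleteFrameFlag);
}
```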

Regards,
Nina

Nina_K_Intel
Employee
If the decoder leaves a few bytes in the bitstream, DataLength will not be 0 but will be equal to the number of bytes left.

jamesterm
Beginner
Thanks so much for mentioning the MFX_BITSTREAM_COMPLETE_FRAME mode. I have had a chance to test it, and for my collection of MP4 and Flash clips, where we convert to Annex B, it was successful. Unfortunately, clips from Canon Vixia cameras fail. :( Canon Vixia clips are natively Annex B and embed the SPS/PPS within the frames. When these fail, they show a mostly blank grey canvas with several macroblocks of real video flashing. It would be nice to know if anyone else can reproduce this with these files. Finally, just to be clear, Canon Vixia files work fine if I do not use this flag.

Let's talk for a minute about the case where a frame gets consumed and leaves several bytes that look like part of a start code. If I loop again and call DecodeFrameAsync with just this... is it safe to say that it should return MFX_ERR_MORE_DATA? And would it still keep this memory in the bitstream? The follow-up would be: what is the consequence of submitting these bytes, with the next frame appended, and calling DecodeFrameAsync()? I'd like to find such a file and test this. The reason why I have that logic in there is that every file I have (except for the Flash files in this case) is consumed on the first call to DecodeFrameAsync()... this means I save an extra memcopy of the input stream for most of the clips I have tested. Yes, nowadays it's probably not a significant performance gain, but it's the idea of saving extra work that has me fighting to keep it in our code. ;)

Nina_K_Intel
Employee
Hi James,

I think the problem with the Canon Vixia files you mention lies exactly in the SPS/PPS headers. If you set the COMPLETE_FRAME flag, you need to feed only frame data (or full start codes and full SPS/PPS). Those few bytes that the decoder may leave in regular mode can be not only a partial start code but also a partial sequence header. If you set the flag and feed data with headers, the decoder seems to simply use parts of the headers for decoding.

If you loop again after bytes are left (the decoder already returns MORE_DATA when leaving those bytes), the decoder will return MORE_DATA again. You should append new data to the bitstream; those few bytes will get consumed only with the new portion of data. The main idea is to not break data continuity, which means you should not remove any bytes from the bitstream.
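
A minimal sketch of appending without breaking continuity: the unconsumed bytes are moved to the front of the buffer and the new data is written directly after them. `BitstreamView` is a stand-in that mirrors the Data, DataOffset, DataLength and MaxLength fields, and `AppendKeepingLeftover` is a hypothetical helper name.

```cpp
#include <cstdint>
#include <cstring>

// Stand-in mirroring the buffer-related mfxBitstream fields.
struct BitstreamView {
    std::uint8_t* Data;
    std::uint32_t DataOffset;  // start of unconsumed bytes
    std::uint32_t DataLength;  // number of unconsumed bytes
    std::uint32_t MaxLength;   // capacity of Data
};

// Keeps whatever the decoder left behind and appends `size` new bytes after it.
// Returns false, leaving the bitstream untouched, if the fields are
// inconsistent or the new data does not fit.
inline bool AppendKeepingLeftover(BitstreamView& bs, const std::uint8_t* src,
                                  std::uint32_t size) {
    if (bs.DataOffset > bs.MaxLength) return false;
    if (bs.DataLength > bs.MaxLength - bs.DataOffset) return false;
    if (size > bs.MaxLength - bs.DataLength) return false;  // not enough room
    // Move leftover bytes to the front (regions may overlap, hence memmove).
    std::memmove(bs.Data, bs.Data + bs.DataOffset, bs.DataLength);
    bs.DataOffset = 0;
    if (size > 0) std::memcpy(bs.Data + bs.DataLength, src, size);
    bs.DataLength += size;
    return true;
}
```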

At this stage my general recommendation will be to rely on spec and function return statuses rather than investigating these complicated details. You may still miss some corner cases. Programming based on return statuses is way simpler and more reliable.

Nina

jamesterm
Beginner
Thanks for the quick turn-around reply :)

I agree with your stance on sticking with the spec, as I'd probably do the same if I were you, and therefore agree that this aspect of the case is closed. The questions you have answered have been of great value to me and our company, and I'll take responsibility for any corner cases that may crop up.


I am concerned about the Canon Vixia files, as I have spent the past hour verifying exactly what I have submitted to DecodeFrameAsync (i.e. full complete frames). Let me know if we should open a separate case for this.