Media (Intel® Video Processing Library, Intel Media SDK)
Access community support with transcoding, decoding, and encoding in applications using media tools like Intel® oneAPI Video Processing Library and Intel® Media SDK
Announcements
The Intel Media SDK project is no longer active. For continued support and access to new features, Intel Media SDK users are encouraged to read the transition guide on upgrading from Intel® Media SDK to Intel® Video Processing Library (VPL), and to move to VPL as soon as possible.
For more information, see the VPL website.

TimeStamp field usage

innes
Beginner
652 Views

Hi,

I am trying to associate private data (such as timestamps) with frames in a decoding application, and I am a bit confused by the Media SDK's behavior on this topic.

My video stream is an H.264 encoded stream (the same one I used to experiment with interlacing options) with hierarchical B-frames. The GOP looks like this:

I P B B B P B B B P B B B P B B B P B B B ...

Just before submitting a frame to DecodeFrameAsync in the sample_decode reference software, I set a simple counter as the timestamp:

m_mfxBS.TimeStamp = dwCount++;

dwCount is a static unsigned int variable initialized to 0.

Given the structure of the GOP, I expect to extract the following timestamps from the decoded frames:

0 3 2 4 1 7 6 8 5 11 10 12 9 15 14 16 13 19 18 20 17 ...

However, I have very different results in pmfxOutputSurface->Data.TimeStamp.

If I run sample_decode h264 -i stream.264 -r (d3d allocations), I get the following sequence:

0 0 0 0 0 0 0 0 0 0 0 0 0 4 2 4 0 8 6 8 6 12 10 16 14 16 14 20 ...

If I run sample_decode h264 -i stream.264 -o out.yuv (RAM allocations), I get the following sequence:

0 2 1 3 0 6 5 7 4 10 9 11 8 14 13 15 12 18 17...

This second result looks closer (though not identical) to the order in which an encoder would consume frames than to the order a decoder outputs them. I do not understand the first result, in the d3d case, at all.

I used sample_decode.exe from the 2013 release, with the software implementation, to run my tests.

Am I missing something about the correct way to use the TimeStamp field?

Regards,

0 Kudos
6 Replies
Petter_L_Intel
Employee

Hi,

The Intel Media SDK decoder reads several compressed frames from the bitstream (if available) when DecodeFrameAsync() is called, so in the default usage mode of sample_decode the selected timestamp may be assigned to several frames, leading to the result you are seeing.

To ensure correct timestamp handling, the bit stream container should be configured to indicate to the SDK that it only contains one frame (DataFlag = MFX_BITSTREAM_COMPLETE_FRAME). This is common for most real-life media pipeline implementations, e.g. container-to-container transcode, in which case the demuxer delivers one frame at a time to the decoder.

By studying the timestamp order using this approach, you will see the ordering match the expected order for your GOP pattern.

For an example on how this can be implemented please check out the "-low_latency" option in sample_decode.

Also, for a full end-to-end transcode pipeline using the same approach, check out the Media SDK-FFmpeg integration section of the new comprehensive Intel Media SDK tutorial article: http://software.intel.com/en-us/articles/intel-media-sdk-tutorial

Regards,
Petter 

innes
Beginner

Hi Petter,

Of course, setting DataFlag to MFX_BITSTREAM_COMPLETE_FRAME makes sense in this case... However, the behavior of the TimeStamp field when this flag is not set still seems very surprising to me.

Analyzing your FFmpeg integration case, I have one more question: the first step in several of the sample codes is to feed a temporary buffer with the elementary stream (using memmove and memcpy). Assuming that the AVPackets are held in memory long enough during decoding, would it be safe to replace this code:

[cpp]

memmove(pBS->Data, pBS->Data + pBS->DataOffset, pBS->DataLength);
pBS->DataOffset = 0;
memcpy(pBS->Data + pBS->DataLength, packet.data, packet.size);
pBS->DataLength += packet.size;

[/cpp]

with this one:

[cpp]

pBS->DataOffset = 0;
pBS->Data = packet.data;

pBS->DataLength = packet.size;

pBS->MaxLength = packet.size;

[/cpp]

In my application, the demuxer delivers one buffer per frame, and I would like to avoid the memcpys. It seems to work, and I can manage the alignment requirements, but the decoder crashes if I also update pBS->MaxLength. I have to leave it at its former value (from when I was still using a temporary buffer) to make the whole application work.

Regards,

Petter_L_Intel
Employee

Hi,

For the case of using MFX_BITSTREAM_COMPLETE_FRAME, a complete frame is extracted from the bit stream on every call to DecodeFrameAsync(), so the memmove is not really necessary; memcpy was used for simplicity. It should be safe to optimize the buffer access in the way you describe. I experimented a bit by modifying MaxLength before each call but did not encounter any issues. Can you provide some more details about the crash?

Regards,
Petter 

innes
Beginner

Hi Petter,

Before being more explicit about the crash, I wanted to look further into my code... and I have no crash anymore. Since then, I have only added code to properly handle the alignment of the input bitstream buffers, so I guess the issue was bad alignment of the input buffer. My mistake...

Just a comment about the reference manual: I may be missing something, but I did not understand at first glance that the decoder interpolates PTS values. In the mfxBitstream structure, DecodeTimeStamp is mentioned as being calculated by the encoder, but in the mfxFrameData structure there is no explanation of how timestamps are computed on the decoder side. I guess the TimeStampCalc field in the mfxInfoMFX structure and the requirement for PTS on a 90 kHz clock basis give a hint, but in my opinion it is hard to know from the manual alone in which cases timestamps are definitely calculated: decoding, encoding, frame rate conversion...? However, it is a very interesting and useful feature. I hope you won't take this comment badly, because I refer to the manual often and find it quite good.

Thanks for your help.

Regards,

Petter_L_Intel
Employee

Thanks for the feedback. We will work on improving future revisions of the Intel Media SDK manual.

vinay_k_1
Beginner

Hi Petter,

I am trying to extract Closed Caption data from the SEI messages in an AVC stream. I used the sample code (simple_decode project), to which I added the following snippet after mfxDEC.DecodeFrameAsync to check the SEI data:

[cpp]

dec_payload.NumBit = 100;

while (dec_payload.NumBit != 0) {
    mfxStatus mfx_param_flag = mfxDEC.GetPayload(&ts, &dec_payload);

    if ((mfx_param_flag == MFX_ERR_NONE) && (dec_payload.Type == 4))
        printf("");

    dec_payload.Type = 0;
}

[/cpp]

The added variables are:

[cpp]

mfxPayload dec_payload;
mfxU64 ts;

dec_payload.Data = new mfxU8[1024];
dec_payload.BufSize = 1024;

[/cpp]
 

When I check the dec_payload.Data memory by breaking at the printf(""), I see different data than I expected. I am not really sure whether there is a format issue here.

I do get the ATSC identifier, country code, provider code, and user identifier properly (0xB5, 0x0031, "GA94", etc.), but once the CC data starts after 0xFF, that is where I see different data. So I am getting the first 12 bytes of each payload properly, but the remaining actual CC data does not match my reference. Also, I am not getting the timestamp that is a parameter to this function, as it always returns 0.

Please let me know if I have done anything wrong or if there is an interpretation problem on my side. I did not find this information in the manual, and since this forum thread talked about timestamps, I am asking the related question here. Please help me find the answer. I am attaching my code as well for your reference.

 

Regards

Vinay

 
