MFX decoder and random access

stevz
Beginner

I thought I already managed to get random access into my elementary h264 streams working. But looking more closely I found that this is not fully true :(

Here is what I do:
Analyze an H.264 elementary stream file and log random access points. I identify random access points as IDR or non-IDR slices that contain an I-frame with the first_mb_in_slice field set to 0. During logging I keep track of random access start positions and sizes, along with all header pieces preceding the slice.
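The scanning step described above can be sketched roughly as follows. This is a minimal illustration, assuming an Annex B byte stream in which every NAL unit is introduced by a 00 00 01 start code; the function name `iter_nal_units` is made up for this example and is not part of any Intel API. It reports the offset of the 3-byte start code, which appears to match how the offsets in the log further down are counted.

```python
from typing import Iterator, Tuple


def iter_nal_units(data: bytes) -> Iterator[Tuple[int, int]]:
    """Yield (start_code_offset, nal_unit_type) for each NAL unit found.

    A 4-byte start code (00 00 00 01) ends with the 3-byte form, so
    searching for 00 00 01 finds both variants.
    """
    pos = 0
    while True:
        pos = data.find(b"\x00\x00\x01", pos)
        # Stop when no start code is left or no header byte follows it.
        if pos < 0 or pos + 3 >= len(data):
            return
        header = data[pos + 3]
        # The low 5 bits of the NAL header byte hold nal_unit_type.
        yield pos, header & 0x1F
        pos += 3


# Tiny hand-made stream: AUD (9), SPS (7), IDR slice (5).
sample = (b"\x00\x00\x00\x01\x09\xf0"
          b"\x00\x00\x00\x01\x67\x42"
          b"\x00\x00\x01\x65\x88")
for offset, nal_type in iter_nal_units(sample):
    print(f"0x{offset:016x} nal_unit_type={nal_type}")
```

Because emulation prevention bytes keep 00 00 01 from appearing inside a NAL payload, a plain byte search like this should be enough for locating NAL boundaries.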

Then, upon decoding with random access, I reset my MFX decoder and feed it the whole chunk of the bitstream that contains the Group of Pictures which in turn contains the frame I want to decode. By Group of Pictures I mean all slices following a random access I-frame slice until the next I-frame slice (and its preceding header data such as AUD, SEI, PPS, and SPS) is found.
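One way to turn that GOP definition into byte ranges is sketched below. It assumes that every access unit starts with an AUD (as in the test stream in this post) and that the caller has already flagged the first slice of each random access I-frame. The names `gop_ranges` and `is_entry_slice`, and the stream size used in the usage lines, are hypothetical.

```python
from typing import List, Tuple

NAL_AUD = 9  # access unit delimiter


def gop_ranges(nals: List[Tuple[int, int, bool]],
               stream_size: int) -> List[Tuple[int, int]]:
    """Return [start, end) byte ranges, one per GOP.

    nals holds (offset, nal_unit_type, is_entry_slice) in stream order.
    A GOP starts at the AUD of the access unit that contains an entry
    slice, so the SPS/PPS/SEI headers in front of the slice are included.
    """
    starts = []
    au_start = None
    for offset, nal_type, is_entry_slice in nals:
        if nal_type == NAL_AUD:
            au_start = offset  # remember where the current access unit begins
        if is_entry_slice:
            # Fall back to the slice offset if no AUD was seen at all.
            starts.append(au_start if au_start is not None else offset)
    # Each GOP ends where the next one starts; the last ends at end of stream.
    return list(zip(starts, starts[1:] + [stream_size]))


# Usage with a few offsets from the log in this post (size is made up).
nals = [(0x1, 9, False), (0x7, 7, False), (0x3d, 8, False),
        (0xe9, 5, True), (0xe793b, 9, False), (0xe79d3, 1, False),
        (0xea906, 9, False), (0xea90c, 7, False), (0xea9f7, 1, True)]
ranges = gop_ranges(nals, stream_size=0x200000)
```

The end offsets are exclusive, so the first GOP of the example covers bytes 0x1 through 0xea905.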

The result is that I get the requested frames from the decoder. Extracting some random frames throughout the stream looked OK at first, so I'm able to decode single frames or a range of frames from my stream without error. But now I noticed that some GOPs produce decoded frames with artifacts and I cannot nail down a reason. Nor does the decoding process give me any kind of error :(

My test stream looks like this:

0x0000000000000001 NALU_TYPE_AUD (9)
0x0000000000000007 NALU_TYPE_SEQ_PARAM_SET (7)
0x000000000000003d NALU_TYPE_PIC_PARAM_SET (8)
0x0000000000000046 NALU_TYPE_SEI (6)
0x0000000000000057 NALU_TYPE_SEI (6)
0x00000000000000dd NALU_TYPE_SEI (6)
0x00000000000000e9 NALU_TYPE_SLICE_IDR (5)
0x0000000000014f10 NALU_TYPE_SLICE_IDR (5)
frame[0]: slice_type SLICE_TYPE_I_FRAME (7)
... (more slices, AUDs, and SEIs here)
...
0x00000000000e9c92 NALU_TYPE_SLICE_NON_IDR (1)
frame[23]: slice_type SLICE_TYPE_B_FRAME (6)
0x00000000000ea906 NALU_TYPE_AUD (9)
0x00000000000ea90c NALU_TYPE_SEQ_PARAM_SET (7)
0x00000000000ea942 NALU_TYPE_PIC_PARAM_SET (8)
0x00000000000ea94b NALU_TYPE_SEI (6)
0x00000000000ea95c NALU_TYPE_SEI (6)
0x00000000000ea9e2 NALU_TYPE_SEI (6)
0x00000000000ea9ee NALU_TYPE_SEI (6)
0x00000000000ea9f7 NALU_TYPE_SLICE_NON_IDR (1)
0x00000000001051ed NALU_TYPE_SLICE_NON_IDR (1)
frame[24]: slice_type SLICE_TYPE_I_FRAME (7)
... (more slices, AUDs, and SEIs here)
...


The stream (from a video camera) contains an I-frame every 24 frames, and hence each group of pictures contains 24 frames preceded by SPS, PPS, and some SEI NALUs.

Now I want to decode just the 25th frame (that is, frame[24]), which in this case is the I-frame starting the 2nd GOP in my stream. To do that I feed the stream from the AUD at 0x0ea906 to the decoder, with a size that extends to the start of the AUD NALU preceding the next I-frame (the start of the 3rd GOP) some 24 frames later. The decoding does not throw any errors but the resulting frame contains some blocky artifacts. Only some, though: it's about 90% OK.

Next I tried to additionally decode the 24th frame (so that is frame[23] and frame[24]). To get frame[23] I have to feed the whole 1st GOP to the decoder, starting at the AUD at 0x000001 and ranging to 0x0ea905. Once I have that frame I continue decoding with the whole 2nd GOP, again starting from the AUD at 0x0ea906 and ranging to the end of this GOP. Now the result is OK and I get both frame[23] and frame[24] without any artifacts.

I already double-checked file positions and GOP ranges with a hex editor, making sure that my file analysis and log are correct. So I can be quite sure that I feed the correct bitstream chunks to the decoder.

At this point I have no idea what could go wrong here. I tried to decode some other frames, especially the first ones of various GOPs inside my stream. It seems that I can only get rid of artifacts for I-frames starting a GOP if I also feed the preceding GOP to the decoder. So if I want to get frame[24] starting the 2nd GOP I have to feed the 1st GOP as well.

Now I think this narrows it down to 2 possible problems:

1) My assumption that a slice starting a new I-frame can be used for random access is wrong.
2) The Intel MFX decoder requires some information from the preceding data to actually decode a single I-frame but it does not report any errors and decodes the frame if it does not get the additional information - but with artifacts in this case.

Finally I hope to get some input here as to what is going wrong :(

PS: I also have to add that decoding the whole stream from start to finish (feeding one GOP at a time to the decoder) produces no artifacts at all.

PPS: I'm also quite sure that I use the decoder correctly, that is, passing NULL as the bitstream to drain any frames still buffered internally, and the like.

PPPS: I'm using the decoder in SOFTWARE mode on Windows 7 64-bit with a Core i7 870.
Markus_Pingel
New Contributor I

Streams from video cameras normally contain open GOPs (starting with a non-IDR frame). I guess that the B frames after an I frame show artifacts because they actually reference frames in the previous GOP.
So for random access you have to add the previous GOP as "preroll" frames.
Only IDR frames are a point where decoding does not need frames from previous GOPs.
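The preroll idea can be sketched as a small lookup over a GOP index. This is only an illustration under stated assumptions: the `Gop` record, the `feed_range` helper, and the end offset of the second GOP are hypothetical, and the index is assumed to have been built by the application's own stream parser.

```python
from typing import List, NamedTuple, Tuple


class Gop(NamedTuple):
    start: int             # offset of the first header NAL unit of the GOP
    end: int               # exclusive end offset
    starts_with_idr: bool  # True if the entry picture is an IDR picture


def feed_range(gops: List[Gop], index: int) -> Tuple[int, int]:
    """Return the [start, end) byte range to feed for random access.

    A GOP that starts with an IDR picture is self-contained. For a GOP
    that starts with a non-IDR I-frame (a possibly open GOP), the previous
    GOP is included as preroll so that frames referencing it decode cleanly.
    """
    gop = gops[index]
    if gop.starts_with_idr or index == 0:
        return gop.start, gop.end
    return gops[index - 1].start, gop.end


# Usage: the second GOP of the stream discussed here starts with a
# non-IDR I-frame, so the first GOP is fed as preroll.
gops = [Gop(0x1, 0xea906, True), Gop(0xea906, 0x1d0000, False)]
start, end = feed_range(gops, 1)
```

Frames decoded from the preroll GOP would then be dropped by the application until the requested frame arrives.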

stevz
Beginner
Hi Markus,

thanks again for your input. If I were seeing artifacts in B-frames I would have guessed the same. But the artifacts already appear in the I-frame starting a new GOP if I do not provide the previous GOP as well.

Could there be I-frames starting a GOP that also require the previous GOP to be decoded without artifacts?

In my sample stream and test case, frame[24] is the one that I want to decode and it is an I-frame starting a new GOP. But it only decodes correctly if I provide the previous GOP as well. Unless my stream parsing is buggy and frame[24] is not really an I-frame. But from looking at the stream with a hex editor I'm pretty sure that my parsing is OK ... so is there any chance that the H.264 spec allows I-frames that reference previous GOPs?

regards,
Stevz

Markus_Pingel
New Contributor I
Hi Stevz,

did you check that the decoded frame is really the I frame? I think the decoder reorders the frames and delivers the B frames before the I frame. If it really is the I frame then I do not know what the problem might be.

Regards,

Markus

stevz
Beginner
Yeah, I actually checked that the decoded frame is the exact frame number I wanted. I double-checked against the complete decode I did of the whole stream using the Intel decoder, and I also checked by loading the original multiplexed file into Adobe After Effects, going to the frame in question, and comparing it to my decoded result.

But ... looking at the pieces of the stream I posted above, I just noticed that there seems to be another slice at 0x0ea9f7 starting the GOP and only the second slice is recognized as the I-frame. So there might be a B-frame involved here, marking the GOP as open to the previous one.

...
0x00000000000ea906 NALU_TYPE_AUD (9)
0x00000000000ea90c NALU_TYPE_SEQ_PARAM_SET (7)
0x00000000000ea942 NALU_TYPE_PIC_PARAM_SET (8)
0x00000000000ea94b NALU_TYPE_SEI (6)
0x00000000000ea95c NALU_TYPE_SEI (6)
0x00000000000ea9e2 NALU_TYPE_SEI (6)
0x00000000000ea9ee NALU_TYPE_SEI (6)
0x00000000000ea9f7 NALU_TYPE_SLICE_NON_IDR (1)
0x00000000001051ed NALU_TYPE_SLICE_NON_IDR (1)
frame[24]: slice_type SLICE_TYPE_I_FRAME (7)
...

So chances are that I actually do log the slice type wrong and that there is indeed a B-frame starting the GOP. I have to check this in the evening when I'm back in front of my code.
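A quick way to double-check the logged slice type is to decode the first two fields of the slice header directly. The sketch below assumes the NAL unit bytes are available with the start code already removed; the helper names are made up for this example. In H.264 the slice header begins with first_mb_in_slice and slice_type, both Exp-Golomb coded, and slice_type values 5 to 9 mean that all slices of the picture share that type (7 is I, 6 is B, which matches the numbers in the log).

```python
from typing import Tuple


class BitReader:
    """Reads bits MSB-first from a byte string."""

    def __init__(self, data: bytes) -> None:
        self.data = data
        self.pos = 0  # current bit position

    def read_bit(self) -> int:
        byte = self.data[self.pos >> 3]
        bit = (byte >> (7 - (self.pos & 7))) & 1
        self.pos += 1
        return bit

    def read_ue(self) -> int:
        """Decode one unsigned Exp-Golomb value, ue(v)."""
        zeros = 0
        while self.read_bit() == 0:
            zeros += 1
        value = 0
        for _ in range(zeros):
            value = (value << 1) | self.read_bit()
        return (1 << zeros) - 1 + value


def parse_slice_start(nal: bytes) -> Tuple[int, int, int]:
    """Return (nal_unit_type, first_mb_in_slice, slice_type) for a slice NAL."""
    nal_unit_type = nal[0] & 0x1F
    if nal_unit_type not in (1, 5):
        raise ValueError("not a coded slice NAL unit")
    # Remove emulation prevention bytes (00 00 03 -> 00 00) before parsing.
    payload = nal[1:16].replace(b"\x00\x00\x03", b"\x00\x00")
    reader = BitReader(payload)
    first_mb = reader.read_ue()
    slice_type = reader.read_ue()
    return nal_unit_type, first_mb, slice_type


def is_i_slice(slice_type: int) -> bool:
    # slice_type 2 and 7 both denote an I slice.
    return slice_type % 5 == 2
```

For example, a non-IDR slice whose payload starts with the byte 0x88 (bits 1 0001000) has first_mb_in_slice 0 and slice_type 7, so it is the first slice of an I picture.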

But thanks for pointing out the open GOP keyword. If I were looking at a B-frame starting the GOP in question it would make perfect sense, so I hope that it's still this sort of bug in my parsing :)

regards,
Stevz

stevz
Beginner
Unfortunately it was just my console output that was a bit misleading. The stream actually looks as I thought in the first place, which is like so:

...
0x00000000000e793b NALU_TYPE_AUD (9)
0x00000000000e7941 NALU_TYPE_SEI (6)
0x00000000000e79c7 NALU_TYPE_SEI (6)
0x00000000000e79d3 NALU_TYPE_SLICE_NON_IDR (1)
frame[23]: slice_type SLICE_TYPE_B_FRAME (6)
0x00000000000e854b NALU_TYPE_SLICE_NON_IDR (1)
0x00000000000e93e2 NALU_TYPE_SLICE_NON_IDR (1)
0x00000000000e9c92 NALU_TYPE_SLICE_NON_IDR (1)
0x00000000000ea906 NALU_TYPE_AUD (9)
0x00000000000ea90c NALU_TYPE_SEQ_PARAM_SET (7)
0x00000000000ea942 NALU_TYPE_PIC_PARAM_SET (8)
0x00000000000ea94b NALU_TYPE_SEI (6)
0x00000000000ea95c NALU_TYPE_SEI (6)
0x00000000000ea9e2 NALU_TYPE_SEI (6)
0x00000000000ea9ee NALU_TYPE_SEI (6)
0x00000000000ea9f7 NALU_TYPE_SLICE_NON_IDR (1)
frame[24]: slice_type SLICE_TYPE_I_FRAME (7)
0x00000000001051ed NALU_TYPE_SLICE_NON_IDR (1)
0x0000000000123d91 NALU_TYPE_SLICE_NON_IDR (1)
0x00000000001445ee NALU_TYPE_SLICE_NON_IDR (1)
0x0000000000164a54 NALU_TYPE_AUD (9)
0x0000000000164a5a NALU_TYPE_SEI (6)
0x0000000000164ae0 NALU_TYPE_SEI (6)
0x0000000000164aec NALU_TYPE_SLICE_NON_IDR (1)
frame[25]: slice_type SLICE_TYPE_B_FRAME (6)
0x0000000000165a19 NALU_TYPE_SLICE_NON_IDR (1)
0x0000000000166a51 NALU_TYPE_SLICE_NON_IDR (1)
0x000000000016726e NALU_TYPE_SLICE_NON_IDR (1)
...

The second GOP starts at 0x0ea906 and if I start decoding from there, trying to get just one frame from the decoder, then the decoded image has artifacts. If I supply the whole previous GOP then I can decode frame[24] without problems.

Unfortunately frame[24] is an I-frame so I would not expect the previous GOP to be required.

Hopefully someone can improve my understanding of H.264 streams and the MFX decoder requirements. As of now I cannot understand why the previous GOP should be required for decoding frame[24] :(

After all an I-frame should be able to decode without any other frames, right?

Anthony_P_Intel
Employee
Hi,

In your case, I believe the decoder requires the information provided as part of Frame[23].

0x00000000000ea90c NALU_TYPE_SEQ_PARAM_SET (7)
0x00000000000ea942 NALU_TYPE_PIC_PARAM_SET (8)

While the compressed data in an I-Frame does not require access to data from another frame, it does require correct/current information about the current I-Frame to achieve correct decoding.

-Tony

stevz
Beginner
Hi Tony

thanks for your reply. But I already do provide this data to the decoder. Actually I copy the data from location 0x0ea906 into an mfxBitstream. The size of the data spans from this location (the AUD NALU followed by the SPS, PPS, SEIs) to the end of the GOP with its 24 frames (frame[24] and following).

I do not consider the NALUs from the following AUD onward to be part of frame[23]. I already consider those to be the "header data" of the second GOP, whose first slice is the I-frame[24]. The above is just a paste from my debug output. 0x0ea906 is what I consider to be the start of the second GOP.

Here is what I roughly do with respect to the MFX decoder:

1. Initialize it from the start of the elementary stream until decoder::DecodeHeader is satisfied.
2. decoder::Reset
3. decoder::DecodeFrameAsync(bitstream from 0x0ea906 with whole GOP)
4. while (decoder::DecodeFrameAsync(NULL) != MFX_ERR_MORE_DATA)
   4.1 handle MFX errors
   4.2 break upon first sync point
I get no error whatsoever and I get to the first sync point (in this case frame[24]) and can extract the picture. But it contains artifacts. I have attached part of the decoded frame featuring the artifacts. It appears to me that only the areas showing moving objects have artifacts, the static background looks perfectly ok. So this might provide a lead to the problem? The motion prediction seems to go wrong here.

stevz
Beginner
Still no solution on my end but some more findings ...

1. Since I reached the point where I no longer fully trust my file demuxing and parsing, I downloaded the demo version of the Elecard Stream Analyzer (http://www.elecard.com/en/products/professional/analysis/stream-analyzer.html)

Now I have my confidence back because this tool confirms all of the NALU types and start positions inside the elementary stream that my own analyzer reports. So I'm finally 100 % sure that I know my stream inside out and do actually feed the decoder what I think that I'm feeding it.

2. The problem still exists and can be generalized as follows:

Each time I want to extract single frames from the start of a GOP (not necessarily the first frame but one of the first slices in there) I get artifacts as in the image attached to the previous post.

But each time I send not just the GOP in question but also the previous one to the decoder (skipping all decoded frames inside the previous GOP) I can extract single frames from the GOP in question without any artifacts.

That leads to the conclusion ...
... that the MFX decoder does actually require information from the previous GOP even if the GOP I actually want to decode starts with SPS and PPS headers followed by a non-IDR slice with an I-frame.

Without knowing why this is necessary I do not want to modify my random access decoding to fully decode and discard one full GOP preceding the GOP where I want to get a frame from.

Maybe someone has another idea what might be causing the decoder (running in SOFTWARE mode) to behave the way it obviously does ... ?

regards,
Stevz

Markus_Pingel
New Contributor I
Hi Stevz,

I just tested with an AVCHD stream with a timecode and I can see the artifacts in the two B frames only.
I sent an open GOP beginning at the SPS/PPS, and the first two frames the decoder delivered were the B frames, which had artifacts.
The AVCHD spec states that only the pictures in an open GOP preceding the non-IDR I frame in display order reference the previous GOP. So when you decode an open GOP you either discard the frames that reference the previous GOP or you send the previous GOP to the decoder as you currently do.
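The "discard" option can be illustrated with a toy model in which every decoded frame carries a picture order count (POC). The `Frame` record, the `drop_leading_frames` helper, and the POC values are hypothetical and only serve to show the filtering rule; how the ordering information is obtained from a real decoder is left out.

```python
from typing import List, NamedTuple


class Frame(NamedTuple):
    kind: str  # "I", "P" or "B"
    poc: int   # picture order count (display position)


def drop_leading_frames(decode_order: List[Frame]) -> List[Frame]:
    """Drop frames displayed before the entry I-frame of an open GOP.

    Assumes the first frame in decode order is the non-IDR I-frame used
    as the random access point. Frames with a smaller POC precede it in
    display order and may reference the missing previous GOP.
    """
    if not decode_order:
        return []
    entry_poc = decode_order[0].poc
    return [frame for frame in decode_order if frame.poc >= entry_poc]


# Usage: stream order I B B P B B, where the first two B frames are
# displayed before the I-frame.
frames = [Frame("I", 4), Frame("B", 0), Frame("B", 2),
          Frame("P", 10), Frame("B", 6), Frame("B", 8)]
kept = drop_leading_frames(frames)
```

With the sample values, the two B frames with POC 0 and 2 are dropped and the remaining four frames are kept.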

Regards,

Markus

stevz
Beginner
Hi Markus

thanks for spending time on this :)

By open GOP I think you refer to a stream layout such as:

SPS/PPS
B-Frame
B-Frame
I-Frame
...

In such a case I agree that I would need to either skip the leading B-Frames or send the previous GOP to the decoder as well.

But the GOPs I'm dealing with at the moment are as follows:

SPS/PPS
I-Frame
B-Frame
B-Frame
...

Am I right that this does not qualify as open GOP?

Or could it be that the decoder outputs the B-Frames of such a stream before the I-frame?

Doesn't the MFX decoder reach the sync points in the order in which the frames appear in the stream? Or could it send me the B-frames before the decoded I-frame ... ? I would hope not.

Thanks,
Stevz

Markus_Pingel
New Contributor I
Hi Stevz,

the stream layout of an open GOP in an AVCHD stream is normally like this (in stream order!):

SPS/PPS
I-Frame
B-Frame
B-Frame

but the output order is B B I. The decoder reorders the frames according to their picture order count (POC).
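The reordering can be pictured with a very simplified model of a reorder buffer. The sketch assumes each frame is a (poc, kind) pair with made-up POC values, and that the decoder holds back at most `max_reorder` frames before emitting the one with the lowest POC. Real decoders follow the more detailed output rules of the H.264 spec, so this is only meant to show why stream order I B B comes out as B B I.

```python
import heapq
from typing import Iterable, Iterator, List, Tuple


def reorder(decode_order: Iterable[Tuple[int, str]],
            max_reorder: int) -> Iterator[Tuple[int, str]]:
    """Emit (poc, kind) frames in display order using a small min-heap."""
    heap: List[Tuple[int, str]] = []
    for frame in decode_order:
        heapq.heappush(heap, frame)
        # Once more frames are buffered than allowed, output the earliest one.
        if len(heap) > max_reorder:
            yield heapq.heappop(heap)
    # End of stream: flush whatever is still buffered, lowest POC first.
    while heap:
        yield heapq.heappop(heap)


# Usage: stream order I B B P B B.
stream = [(4, "I"), (0, "B"), (2, "B"), (10, "P"), (6, "B"), (8, "B")]
display = "".join(kind for _, kind in reorder(stream, max_reorder=2))
```

With these sample values the display order is B B I B B P.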

Regards,

Markus

stevz
Beginner
Ok ... that would explain a lot :)

That means my stream sends the I-frame first but actually the B-frames succeeding the I-frame should be displayed before the I-frame?

So my homework for today is taking the pic_order field into account as well. The closer I look the more complex this AVC format becomes.

Thanks again and a lot for your patience with me :)

regards,
Stefan