Media (Intel® Video Processing Library, Intel Media SDK)
Access community support with transcoding, decoding, and encoding in applications using media tools like Intel® oneAPI Video Processing Library and Intel® Media SDK
Announcements
The Intel Media SDK project is no longer active. For continued support and access to new features, Intel Media SDK users are encouraged to read the transition guide on upgrading from Intel® Media SDK to Intel® Video Processing Library (VPL), and to move to VPL as soon as possible.
For more information, see the VPL website.

NAL unit types produced by the H264 encoder

Rūdolfs_B_
Beginner
3,005 Views

Hi,

I'm trying to use the Intel Media SDK to create an H264 stream, pack it into RTP, and play it back with gstreamer. So far so good, and a lot of answers on this forum have helped me. Currently I'm having trouble understanding whether the bitstream is valid in terms of the NAL units present, and I need some clarification on this topic.

I have the following questions:

1) The Annex B format is the only bitstream format produced by Intel Media SDK, right? It is just a little inconvenient to run through the bitstream looking for 0x00 0x00 0x00 0x01.

2) How do I disable NAL units of type 9 (access unit delimiter)? As far as I understand, I don't need them in the RTP stream; at least I haven't seen real-life H264 over RTP streams sending them. I tried passing an mfxExtCodingOption structure like this:

mfxExtCodingOption EncodingOptions;
memset(&EncodingOptions, 0, sizeof(EncodingOptions));  // zero every field before use
EncodingOptions.Header.BufferId = MFX_EXTBUFF_CODING_OPTION;
EncodingOptions.Header.BufferSz = sizeof(EncodingOptions);
EncodingOptions.AUDelimiter = MFX_CODINGOPTION_OFF;  // request no access unit delimiters

but the encoder initialization always returns that this is unsupported.

3) During encoding I get only SPS, PPS and SEI units. Is that normal? Shouldn't I get some coded slice NAL units as well? I've added some output to my program that lists the frame flags and the NAL units found in the bitstream. Can you tell me if this is the expected behavior? If so, maybe there is an issue in gstreamer, because it seems to be throwing a lot of errors.

[6.5.2014 23:2:52.319]: Bitstream generated, frame type: MFX_FRAMETYPE_I|MFX_FRAMETYPE_REF|MFX_FRAMETYPE_IDR
[6.5.2014 23:2:52.320]: NAL unit with, type: 9, size: 2
[6.5.2014 23:2:52.322]: NAL unit with, type: 7, size: 39
[6.5.2014 23:2:52.323]: NAL unit with, type: 8, size: 4
[6.5.2014 23:2:52.324]: NAL unit with, type: 6, size: 3926
...
[6.5.2014 23:2:52.443]: Bitstream generated, frame type: MFX_FRAMETYPE_P|MFX_FRAMETYPE_REF
[6.5.2014 23:2:52.446]: NAL unit with, type: 9, size: 2
[6.5.2014 23:2:52.449]: NAL unit with, type: 8, size: 4
[6.5.2014 23:2:52.451]: NAL unit with, type: 6, size: 23
...
[6.5.2014 23:2:52.913]: Bitstream generated, frame type: MFX_FRAMETYPE_P|MFX_FRAMETYPE_REF
[6.5.2014 23:2:52.916]: NAL unit with, type: 9, size: 2
[6.5.2014 23:2:52.919]: NAL unit with, type: 8, size: 4
[6.5.2014 23:2:52.925]: NAL unit with, type: 6, size: 303
...
[6.5.2014 23:2:52.976]: Bitstream generated, frame type: MFX_FRAMETYPE_P|MFX_FRAMETYPE_REF
[6.5.2014 23:2:52.980]: NAL unit with, type: 9, size: 2
[6.5.2014 23:2:52.983]: NAL unit with, type: 8, size: 4
[6.5.2014 23:2:52.986]: NAL unit with, type: 6, size: 122
...
[6.5.2014 23:2:53.4]: Bitstream generated, frame type: MFX_FRAMETYPE_P|MFX_FRAMETYPE_REF
[6.5.2014 23:2:53.6]: NAL unit with, type: 9, size: 2
[6.5.2014 23:2:53.9]: NAL unit with, type: 8, size: 4
[6.5.2014 23:2:53.10]: NAL unit with, type: 6, size: 94
...
[6.5.2014 23:2:53.48]: Bitstream generated, frame type: MFX_FRAMETYPE_P|MFX_FRAMETYPE_REF
[6.5.2014 23:2:53.48]: NAL unit with, type: 9, size: 2
[6.5.2014 23:2:53.50]: NAL unit with, type: 8, size: 4
[6.5.2014 23:2:53.52]: NAL unit with, type: 6, size: 118
...
[6.5.2014 23:2:53.90]: Bitstream generated, frame type: MFX_FRAMETYPE_P|MFX_FRAMETYPE_REF
[6.5.2014 23:2:53.92]: NAL unit with, type: 9, size: 2
[6.5.2014 23:2:53.93]: NAL unit with, type: 8, size: 4
[6.5.2014 23:2:53.95]: NAL unit with, type: 6, size: 102
...
[6.5.2014 23:2:53.152]: Bitstream generated, frame type: MFX_FRAMETYPE_P|MFX_FRAMETYPE_REF
[6.5.2014 23:2:53.158]: NAL unit with, type: 9, size: 2
[6.5.2014 23:2:53.162]: NAL unit with, type: 8, size: 4
[6.5.2014 23:2:53.165]: NAL unit with, type: 6, size: 144
...
[6.5.2014 23:2:53.187]: Bitstream generated, frame type: MFX_FRAMETYPE_P|MFX_FRAMETYPE_REF
[6.5.2014 23:2:53.191]: NAL unit with, type: 9, size: 2
[6.5.2014 23:2:53.195]: NAL unit with, type: 8, size: 4
[6.5.2014 23:2:53.198]: NAL unit with, type: 6, size: 101

 

Rūdolfs_B_
Beginner
3,005 Views

Now that I look at the bitstream in memory, I also see byte sequences like 00 00 01 25. Does the encoder use both 00 00 01 and 00 00 00 01 as prefixes for the NAL units?
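
For reference, the byte that follows a start code is the one-byte NAL unit header from the H.264 spec: one forbidden_zero_bit, two bits of nal_ref_idc, and five bits of nal_unit_type. The sketch below decodes it and checks the 0x25 byte mentioned above. `NalHeader`, `ParseNalHeader` and `CheckThreadExample` are hypothetical helper names, not part of the Media SDK API.

```cpp
#include <cassert>
#include <cstdint>

// Fields of the one-byte H.264 NAL unit header.
// Hypothetical helper type, not a Media SDK structure.
struct NalHeader {
    bool forbidden_zero_bit;     // bit 7, must be 0 in a conforming stream
    std::uint8_t nal_ref_idc;    // bits 6..5, 0 means "not used for reference"
    std::uint8_t nal_unit_type;  // bits 4..0, e.g. 1 = non-IDR slice, 5 = IDR slice,
                                 // 6 = SEI, 7 = SPS, 8 = PPS, 9 = access unit delimiter
};

// Splits the header byte into its three fields.
inline NalHeader ParseNalHeader(std::uint8_t b) {
    return NalHeader{(b & 0x80) != 0,
                     static_cast<std::uint8_t>((b >> 5) & 0x03),
                     static_cast<std::uint8_t>(b & 0x1F)};
}

// Worked example: 0x25 = 0b0'01'00101, so nal_ref_idc = 1 and nal_unit_type = 5.
inline void CheckThreadExample() {
    const NalHeader h = ParseNalHeader(0x25);
    assert(!h.forbidden_zero_bit && h.nal_ref_idc == 1 && h.nal_unit_type == 5);
}
```

So a 00 00 01 25 sequence would be a 3-byte start code followed by an IDR coded slice, which is a NAL unit that a scan for only 00 00 00 01 would miss.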

Rūdolfs_B_
Beginner
3,005 Views

Just to clarify, so far I was looking only for the prefix 00 00 00 01, not 00 00 01. Looking at how Google Chrome creates the Annex B bitstream from H264 (http://src.chromium.org/svn/branches/1312/src/media/filters/h264_to_annex_b_bitstream_converter.cc), I understand that the PPS/SPS/SEI/AUD always have an extra zero byte, so I should be looking for both prefixes, right?
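
In the Annex B byte stream format the 4-byte prefix is the 3-byte start code 00 00 01 preceded by one zero_byte, so a scanner that looks for 00 00 01 and discards zero bytes in front of it handles both forms. A minimal sketch of such a splitter follows, assuming the whole encoded buffer is in memory. `NalSpan` and `SplitAnnexB` are hypothetical names, not Media SDK API. It relies on two properties of the spec: emulation prevention keeps 00 00 01 from appearing inside a NAL unit, and a NAL unit never ends in a zero byte.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// One NAL unit located inside an Annex B buffer (hypothetical type).
struct NalSpan {
    std::size_t offset;  // index of the NAL header byte, right after the start code
    std::size_t size;    // length in bytes, excluding zero bytes before the next start code
    std::uint8_t type;   // nal_unit_type, the low 5 bits of the header byte
};

std::vector<NalSpan> SplitAnnexB(const std::uint8_t* data, std::size_t size) {
    assert(data != nullptr || size == 0);
    std::vector<NalSpan> nals;
    bool open = false;  // true while the last entry still needs its size

    // Finishes the NAL unit being collected; `end` is one past its last candidate byte.
    auto close_last = [&](std::size_t end) {
        if (!open) return;
        NalSpan& last = nals.back();
        // Zero bytes directly before a start code belong to the prefix, not the NAL unit.
        while (end > last.offset && data[end - 1] == 0) --end;
        last.size = end - last.offset;
        if (last.size == 0) nals.pop_back();
        open = false;
    };

    std::size_t i = 0;
    while (i + 3 <= size) {
        if (data[i] == 0 && data[i + 1] == 0 && data[i + 2] == 1) {
            close_last(i);
            i += 3;  // skip the start code
            if (i < size) {
                nals.push_back(NalSpan{i, 0, static_cast<std::uint8_t>(data[i] & 0x1F)});
                open = true;
            }
        } else {
            ++i;
        }
    }
    close_last(size);
    return nals;
}
```

Each returned span can then be handed to the RTP packetizer without its start code; `nals.size()` gives the number of NAL units in the buffer.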

OTorg
New Contributor III
3,006 Views

Look at the H.264 standard: http://www.itu.int/rec/dologin_pub.asp?lang=e&id=T-REC-H.264-201304-S!!PDF-E&type=items

You can find almost all answers there.

 

Rūdolfs_B_
Beginner
3,005 Views

Yeah, thanks, I found everything I needed in the spec. Just to clarify the first question: are there any plans to support bitstream formats other than Annex B?

Jeffrey_M_Intel1
Employee
3,005 Views

Media SDK h264 encode is annex B only.  No plans to support other formats.

 

Rūdolfs_B_
Beginner
3,005 Views

Jeffrey Mcallister (Intel) wrote:

Media SDK h264 encode is annex B only.  No plans to support other formats.

 

Thanks for the reply. I was only asking because, at least in my environment, the encoder puts SPS and PPS together with I and P frames, so when performing RTP packetization I need to run through the buffer looking for start codes to split the bitstream into NAL units. Is there any way (that I am not aware of) to know how many NAL units are in a given bitstream returned in an mfxBitstream?

Kz_Liao
Beginner
3,005 Views

It looks like we are doing similar work! I have been analyzing the bitstream recently as well.

1) As described in the standard ISO/IEC 14496-10, we should look for the start code of a NALU by checking whether NextBits(24) equals 0x000001. I do this by first checking whether the current byte equals 0x00, which saves some time.
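
The "check for a zero byte first" idea can be sketched as follows. This variation uses `std::memchr` to jump to the next 0x00 and only then compares the two following bytes; it is one possible way to do it, not necessarily how the code described above works. `FindStartCode` is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Returns the offset of the next 00 00 01 sequence at or after `from`,
// or `size` if the buffer contains no further start code.
std::size_t FindStartCode(const std::uint8_t* data, std::size_t size, std::size_t from) {
    assert(data != nullptr || size == 0);
    std::size_t i = from;
    while (i + 3 <= size) {
        // Only a zero byte can begin a start code, so skip everything else.
        const void* hit = std::memchr(data + i, 0x00, size - i);
        if (hit == nullptr) break;
        i = static_cast<std::size_t>(static_cast<const std::uint8_t*>(hit) - data);
        if (i + 3 > size) break;  // not enough bytes left for a full start code
        if (data[i + 1] == 0x00 && data[i + 2] == 0x01) return i;
        ++i;  // this zero did not start a start code, try the next byte
    }
    return size;
}
```

For a 4-byte prefix 00 00 00 01 the function returns the offset of the second zero, because the first one is the extra zero_byte in front of the 3-byte start code.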

2) In my experiments, if the CPU supports hardware-accelerated encoding, the delimiter NALU (type == 9) is not present. I only found delimiters when using the software encoder.

3) When scanning the encoded stream output by the video conferencing sample, I got the following result. Note that I enabled the SVC temporal scalability feature.

Processing started
Frame    0, type=I, latency=16.52 ms, parse= 0.12 ms, length= 17482 B, nal[0](0,Delim), nal[1](1,SPS), nal[2](1,PPS), nal[3](0,SEI), nal[4](1,SVCPre){prid=0, tid=0}, nal[5](1,I)
Frame number: 1

Frame    1, type=P, latency=10.92 ms, parse= 0.03 ms, length=  4645 B, nal[0](0,Delim), nal[1](0,SEI), nal[2](0,SVCPre){prid=3, tid=3}, nal[3](0,P)
Frame    2, type=P, latency=15.59 ms, parse= 0.04 ms, length=  6004 B, nal[0](0,Delim), nal[1](0,SEI), nal[2](1,SVCPre){prid=2, tid=2}, nal[3](1,P)
Frame    3, type=P, latency=14.24 ms, parse= 0.03 ms, length=  4954 B, nal[0](0,Delim), nal[1](0,SEI), nal[2](0,SVCPre){prid=3, tid=3}, nal[3](0,P)
Frame    4, type=P, latency=15.87 ms, parse= 0.04 ms, length=  6900 B, nal[0](0,Delim), nal[1](0,SEI), nal[2](1,SVCPre){prid=1, tid=1}, nal[3](1,P)
Frame    5, type=P, latency=14.64 ms, parse= 0.03 ms, length=  4055 B, nal[0](0,Delim), nal[1](0,SEI), nal[2](0,SVCPre){prid=3, tid=3}, nal[3](0,P)
Frame    6, type=P, latency=13.72 ms, parse= 0.03 ms, length=  4120 B, nal[0](0,Delim), nal[1](0,SEI), nal[2](1,SVCPre){prid=2, tid=2}, nal[3](1,P)
Frame    7, type=P, latency=13.98 ms, parse= 0.03 ms, length=  3715 B, nal[0](0,Delim), nal[1](0,SEI), nal[2](0,SVCPre){prid=3, tid=3}, nal[3](0,P)
Frame    8, type=P, latency=12.87 ms, parse= 0.04 ms, length=  6571 B, nal[0](0,Delim), nal[1](0,SEI), nal[2](1,SVCPre){prid=0, tid=0}, nal[3](1,P)
......

This is part of the output from the software encoding process. I found that the last NALU (nal[5] in the I frame and nal[3] in the P frames) is the coded slice. If the demo runs on a PC that supports hardware encoding, the delimiter NALU (nal[0]) is not present.
