
Inconsistent 'inter_view_flag' from H.264 MVC encode



I am trying to encode an H.264 MVC stream from two 1920x1080 YV12 files using the 'sample_encode.exe' provided in Media SDK 2012 p_3.0.014_GOLD (which I believe is the latest Media SDK release as of this writing). The command line I am using is:

.\samples\_bin\win32\sample_encode.exe mvc -i view_0.yuv -i view_1.yuv -o test_mvc.264 -w 1920 -h 1080 -u quality

In the encoded bitstream, the first view component (base view) of the first IDR Access Unit has 4 slices. The NAL unit containing the first slice is immediately preceded by a Prefix NAL unit (nal_unit_type equal to 14) in which the 'inter_view_flag' bit in the nal_unit_header_mvc_extension() is coded to 0. But for the NAL units containing the other 3 slices, none of them is preceded by a Prefix NAL unit.
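For anyone who wants to check this in their own stream, here is a minimal C++17 sketch of reading 'inter_view_flag' from a Prefix NAL unit. It assumes the nal_unit_header_mvc_extension() field order as I recall it from the syntax table (non_idr_flag, priority_id, view_id, temporal_id, anchor_pic_flag, inter_view_flag, reserved_one_bit), so please verify the bit positions against the specification. The function name ParseInterViewFlag is hypothetical, and the input is assumed to start at the NAL header byte, with the start code already stripped.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>

// Sketch: extract inter_view_flag from a NAL unit that carries an MVC header
// extension (nal_unit_type 14 or 20). Returns std::nullopt when the NAL unit
// has no MVC extension or the buffer is too short.
// Assumed layout after the first header byte (24 bits in total):
//   svc_extension_flag(1) non_idr_flag(1) priority_id(6)
//   view_id(10) temporal_id(3) anchor_pic_flag(1)
//   inter_view_flag(1) reserved_one_bit(1)
std::optional<bool> ParseInterViewFlag(const std::uint8_t* nal, std::size_t size) {
    if (nal == nullptr || size < 4) {
        return std::nullopt;  // need the header byte plus 3 extension bytes
    }
    const int nal_unit_type = nal[0] & 0x1F;
    if (nal_unit_type != 14 && nal_unit_type != 20) {
        return std::nullopt;  // no header extension on this NAL unit type
    }
    if ((nal[1] & 0x80) != 0) {
        return std::nullopt;  // svc_extension_flag set: SVC, not MVC, extension
    }
    // inter_view_flag is the second-lowest bit of the last extension byte.
    return ((nal[3] >> 1) & 0x01) != 0;
}
```

Calling this on each Prefix NAL unit of the base view shows the coded value; slice NAL units (types 1 and 5) return no value because they carry no extension of their own.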

However, according to the 'inter_view_flag' semantics specified in sec. H. of the H.264 specification:

When nal_unit_type is equal to 1 or 5 and the NAL unit is not immediately preceded by a NAL unit with nal_unit_type equal to 14, inter_view_flag shall be inferred to be equal to 1.

The value of inter_view_flag shall be the same for all VCL NAL units of a view component.

the MVC stream encoded by Media SDK should be deemed illegal: the first VCL NAL unit of the base-view view component has 'inter_view_flag' coded as 0, while the other VCL NAL units of the same view component all have an inferred 'inter_view_flag' of 1. This violates the clause that "The value of inter_view_flag shall be the same for all VCL NAL units of a view component".
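The check described above can be sketched in a few lines of C++. This is only an illustration of the two quoted clauses, not the logic of any particular validation tool: NalInfo, EffectiveInterViewFlags and InterViewFlagConsistent are hypothetical names, the input is assumed to hold the NAL units of a single base-view view component in bitstream order, and a slice that directly follows a Prefix NAL unit is assumed to take that prefix's coded flag.

```cpp
#include <cstddef>
#include <vector>

// Simplified description of one NAL unit (hypothetical helper type).
struct NalInfo {
    int nal_unit_type;           // 1 or 5 = slice, 14 = Prefix NAL unit
    bool coded_inter_view_flag;  // meaningful only when nal_unit_type == 14
};

// Effective inter_view_flag of every slice NAL unit (type 1 or 5):
// taken from an immediately preceding Prefix NAL unit if there is one,
// otherwise inferred to be 1 as the quoted semantics require.
std::vector<bool> EffectiveInterViewFlags(const std::vector<NalInfo>& nals) {
    std::vector<bool> flags;
    for (std::size_t i = 0; i < nals.size(); ++i) {
        const int type = nals[i].nal_unit_type;
        if (type != 1 && type != 5) {
            continue;  // not a base-view slice
        }
        const bool has_prefix = i > 0 && nals[i - 1].nal_unit_type == 14;
        flags.push_back(has_prefix ? nals[i - 1].coded_inter_view_flag : true);
    }
    return flags;
}

// True when all slices of the view component share one effective value.
bool InterViewFlagConsistent(const std::vector<NalInfo>& nals) {
    const std::vector<bool> flags = EffectiveInterViewFlags(nals);
    for (bool flag : flags) {
        if (flag != flags.front()) {
            return false;
        }
    }
    return true;
}
```

For the stream in question (one Prefix NAL unit with the flag coded as 0, followed by four IDR slices) the effective flags come out as 0, 1, 1, 1, so the check fails. A stream with a single slice per view component passes trivially, which is why the one-slice setting hides the problem.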

Due to such inconsistency in 'inter_view_flag', the encoded stream failed to pass our bitstream validation tool and cannot be correctly decoded by our MVC decoder.

I know this can be worked around by having the encoder produce only one slice per view component (either by specifying '-u balanced' instead of '-u quality' on the sample_encode.exe command line, or by assigning 1 to mfxVideoParam.mfx.NumSlice before encoder init). However, with this workaround the encoder seems unable to use all of the CPU cores. In my experiments on a quad-core PC, only one CPU core appears to be used, which makes encoding much slower (I am using the SW implementation of the SDK).

I hope Intel will be able to fix this bug in the next SDK release. Meanwhile, is there any better workaround you may suggest? Thank you.


1 Reply
When the target usage is set to quality, the software encoder cannot encode macroblocks in parallel because it has to respect inter-macroblock dependencies. In this mode only slice-level parallelization is possible, and it is used when NumSlice == 0.

sample_encode encodes MVC streams in simulcast mode. This means the views are coded separately and no inter-view dependency is used. As another workaround, you could try modifying the CEncodingPipeline::AllocAndInitMVCSeqDesc() function in pipeline_encode.cpp to specify real inter-view dependencies for the second view. This should also produce better quality.

Meanwhile I am going to check this problem and try to let you know when it is fixed.