
Must the H264 encoder use B frames?

I've built a transcoder (mpeg2/mp3 -> h264/aac) using the Microsoft DirectShow filters for decoding and the Intel SDK DirectShow filters for encoding.
The encoded video and audio are then muxed (using the Intel mp2_mux filter), and I get a beautiful ts file.

I've tried it both ways: in C++ (yes, it takes some effort to build) and in GraphEdit. The flow is: TS file, Microsoft MPEG2 demux => Microsoft Mp3 Audio + Microsoft MPEG2 Video decoders => Intel AAC + Intel H264 encoder => Intel MPEG2 mux => file write.

All is good and nice. VLC can play the resulting ts file, everything is ok.

However, Windows Media Player cannot play it (only audio, and black video) unless I use B frames for the h264 encoder.
I've tried all sorts of configurations (I have some experience with transcoders and video standards): different profiles, slices, levels, etc. The only common parameter that makes Windows Media Player able to play the resulting TS file is the use of B frames. Unfortunately, this is out of spec for Baseline.

Also, the FFMPEG TS segmenter complains about the TS when not using B frames.

So, it might be a problem with timestamps in the encoder or in the mp2 muxer, or I'm doing something wrong.

I use the latest release of Media SDK, 1.6 Gold.

So, the questions are:

Is this a known problem? Are there some magic parameters to give to the h264 encoder DirectShow filter (with zero B frames) that will make Windows Media Player play the file?


3 Replies
I, P and B frames in the MediaSDK h264 encoder are controlled by the following two parameters passed to the encoder's Init method (from the MediaSDK manual):


GopPicSize
Number of pictures within the current GOP (Group of Pictures); if GopPicSize=0, then the GOP size is unspecified. If GopPicSize=1, only I-frames are used. See Example 12 for pseudo-code that demonstrates how SDK uses this parameter.

GopRefDist
Distance between I- or P- key frames; if it is zero, the GOP structure is unspecified. Note: If GopRefDist = 1, there are no B-frames used. See Example 12 for pseudo-code that demonstrates how SDK uses this parameter.

So, for example, if you specify GopRefDist = 1 you get only I and P frames. It is always better to specify explicit, reasonable values for these two parameters, because you never know what the default values are. For example, if you want two B frames between P frames, you need to specify GopRefDist = 3.
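
As a rough illustration of how these two parameters shape the frame sequence, here is a small self-contained C++ sketch. It is not the SDK's code (Example 12 in the manual is the reference for that). The helper names `frameTypeAt` and `gopPattern` are made up for this example, it only models frame types in display order, and treating GopRefDist = 0 as "no B frames" is an assumption, since the manual only says the structure is unspecified in that case.

```cpp
#include <string>

// Hypothetical helper, not part of Media SDK: returns 'I', 'P' or 'B' for the
// frame at display-order position `pos`, following the rules quoted above.
char frameTypeAt(unsigned pos, unsigned gopRefDist, unsigned gopPicSize) {
    // GopPicSize = 1: only I frames are used.
    if (gopPicSize == 1) return 'I';

    // GopPicSize = 0: GOP size unspecified, modelled here as one endless GOP.
    unsigned inGop = (gopPicSize == 0) ? pos : pos % gopPicSize;
    if (inGop == 0) return 'I';  // first frame of every GOP

    // GopRefDist = 1: no B frames. GopRefDist = 0 is treated the same way
    // (assumption made for this sketch only).
    if (gopRefDist <= 1) return 'P';

    // Otherwise every GopRefDist-th frame is a P frame, the rest are B frames.
    return (inGop % gopRefDist == 0) ? 'P' : 'B';
}

// Hypothetical helper: builds the display-order pattern for `count` frames.
std::string gopPattern(unsigned count, unsigned gopRefDist, unsigned gopPicSize) {
    std::string out;
    for (unsigned i = 0; i < count; ++i) {
        out += frameTypeAt(i, gopRefDist, gopPicSize);
    }
    return out;
}
```

With this model, `gopPattern(8, 3, 0)` gives `IBBPBBPB` (two B frames between P frames), and `gopPattern(9, 1, 8)` gives `IPPPPPPPI`, which is the I/P-only case.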

Setting the parameters is not the problem.
I have no trouble switching between all the configurations: number of B frames, key frames, etc.

The problem is WHY the current h264 encoder gives good results ONLY if B frames are used. Otherwise the resulting transcoded ts file is NOT playable by Windows Media Center, Windows Media Player, etc., but only by VLC.

Is this a known problem?
In other words, have you guys managed to correctly play, in Windows Media Player, a ts file (h264/aac) obtained with the Intel AAC Encoder + H264 Encoder + MPEG2 Muxer WITHOUT using B frames?

If yes, can you let me know which configuration (parameters) was used to do so?

Thank you.

Hello nills78

No, I don't think it is a known problem. Our testing includes many different scenarios, but it is never possible to test all of them.

In order to try to reproduce your problem, we need all the details that you can give us. What parameters do you use for encoding? How do you mux the resulting stream? What filters do you use for playback in Windows Media Player (you mentioned the FFMPEG TS segmenter)? What version of Media Player do you use? Please describe every step that you do in as much detail as possible.