I have been upgrading an H264 encoder to support Intel QuickSync Video, using the encoder and video conference samples as tutorials. I think I've got everything working correctly: initialisation, allocation and encoding frames all complete without any errors or warnings. However, when I extract the H264 CIF encoded bitstream, it doesn't appear to be in a linear format; it's split into blocks of 384 bytes, each containing some data followed by zero padding. I've run the Intel sample encoder and it outputs the bitstream as I would expect, in a linear format, with headers followed by frame NALs.
I've tried replicating all the initialisation and set-up from the tutorials but I can't seem to find what's causing the problem. I've attached a data file of the raw H264 elementary stream output for the first 5 encoded frames. Viewing it in a hex editor should show what I mean about data blocks of 384 bytes. The size of the data blocks changes with the encoded picture size; for example, it is 256 bytes for QCIF.
Does anybody have any idea what's going on?
I initialise the encoder as follows:
memset(&mfxVideoParam, 0, sizeof(mfxVideoParam));
mfxVideoParam.AsyncDepth = 1;
mfxVideoParam.IOPattern = MFX_IOPATTERN_IN_VIDEO_MEMORY;
mfxVideoParam.mfx.CodecId = MFX_CODEC_AVC;
mfxVideoParam.mfx.CodecProfile = MFX_PROFILE_AVC_BASELINE;
mfxVideoParam.mfx.TargetUsage = MFX_TARGETUSAGE_BALANCED;
mfxVideoParam.mfx.GopPicSize = 30;
mfxVideoParam.mfx.GopOptFlag = MFX_GOP_STRICT;
mfxVideoParam.mfx.GopRefDist = 1;
mfxVideoParam.mfx.RateControlMethod = MFX_RATECONTROL_CBR;
mfxVideoParam.mfx.TargetKbps = 384;
mfxVideoParam.mfx.NumSlice = 1;
mfxVideoParam.mfx.NumRefFrame = 1;
mfxVideoParam.mfx.FrameInfo.FourCC = MFX_FOURCC_NV12;
mfxVideoParam.mfx.FrameInfo.Width = 352;
mfxVideoParam.mfx.FrameInfo.Height = 288;
mfxVideoParam.mfx.FrameInfo.CropW = 352;
mfxVideoParam.mfx.FrameInfo.CropH = 288;
mfxVideoParam.mfx.FrameInfo.FrameRateExtN = 30;
mfxVideoParam.mfx.FrameInfo.FrameRateExtD = 1;
mfxVideoParam.mfx.FrameInfo.PicStruct = MFX_PICSTRUCT_PROGRESSIVE;
mfxVideoParam.mfx.FrameInfo.ChromaFormat = MFX_CHROMAFORMAT_YUV420;
mfxExtCodingOption.MaxDecFrameBuffering = 1;
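One thing worth noting about the last line: mfxExtCodingOption only takes effect if its Header is filled in and the structure is attached to mfxVideoParam.ExtParam before Init. The types below are simplified stand-ins, not the real mfxvideo.h definitions, and the buffer id value is made up; this is only a sketch of the wiring:

```c
#include <string.h>

/* Simplified stand-ins for the Media SDK types, just to illustrate
   the ExtParam wiring. The real definitions live in mfxvideo.h and
   the real id is MFX_EXTBUFF_CODING_OPTION; the value used here is
   hypothetical. */
typedef struct { unsigned BufferId; unsigned BufferSz; } ExtHeader;
typedef struct { ExtHeader Header; unsigned short MaxDecFrameBuffering; } CodingOption;
typedef struct { ExtHeader **ExtParam; unsigned short NumExtParam; } VideoParam;

#define EXT_CODING_OPTION_ID 0x4e4f5043u /* hypothetical id value */

/* Attach one ext buffer the way the SDK expects: fill in Header,
   point ExtParam at an array of header pointers, set the count. */
static void attach_coding_option(VideoParam *par, CodingOption *co,
                                 ExtHeader **slots)
{
    memset(co, 0, sizeof *co);
    co->Header.BufferId = EXT_CODING_OPTION_ID;
    co->Header.BufferSz = sizeof *co;
    co->MaxDecFrameBuffering = 1;
    slots[0] = &co->Header;
    par->ExtParam = slots;
    par->NumExtParam = 1;
}
```

If NumExtParam stays 0, the encoder silently ignores MaxDecFrameBuffering, which is easy to miss.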
I've also tried many different configurations of the above, to no effect.
The problem is observed with both MediaSDK 2013 and 2013 R2.
Any help would be greatly appreciated.
I would double-check your usage of the mfxBitstream members.
Pop open the docs and/or the sample_encode project and make sure you are using them correctly.
Hope that helps.
I've been through the mfxBitstream in the docs and the encoder sample and I can't see anything wrong.
The Bitstream Data pointer is 32-byte aligned and I get its size using the video output parameters from GetVideoParam; recently I've even multiplied it by 4 for good measure.
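For reference, a minimal sketch of the sizing and alignment described above, assuming the BufferSizeInKB * 1000 convention used in the SDK samples and C11 aligned_alloc (on Windows _aligned_malloc would be the equivalent):

```c
#include <stdint.h>
#include <stdlib.h>

/* The Media SDK samples size the output bitstream buffer as
   mfx.BufferSizeInKB * 1000 bytes, taken from GetVideoParam. */
static size_t bitstream_bytes(uint16_t buffer_size_in_kb)
{
    return (size_t)buffer_size_in_kb * 1000;
}

/* 32-byte aligned allocation; C11 aligned_alloc requires the size
   to be a multiple of the alignment, so round up first. */
static uint8_t *alloc_bitstream(size_t bytes)
{
    size_t rounded = (bytes + 31) & ~(size_t)31;
    return aligned_alloc(32, rounded);
}
```

The buffer from alloc_bitstream is released with plain free().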
Here is a frame encode sequence (CIF,30fps,384k) with output parameters displayed indented:
pQSV->EncBitstream.DataLength = 0;
memset(pQSV->EncBitstream.Data, 0, pQSV->EncBitstream.MaxLength);
sts = MFXVideoENCODE_EncodeFrameAsync(pQSV->Session, NULL, &pQSV->EncFrameSurface, &pQSV->EncBitstream, &EncSync);
EncodeFrameAsync (sts 0): Sync 0x201 BitStream Data 0x02D1D5C0 Offset 0 Length 0 MaxLength 296000 TimeStamp 0
sts = MFXVideoCORE_SyncOperation(pQSV->Session, EncSync, 30);
SyncOperation (sts 0): Sync 0x201 BitStream Data 0x02D1D5C0 Offset 0 Length 8694 MaxLength 296000 TimeStamp 4291285181
Bitstream: PicStruct 0x1 FrameType 0xC1
bytesWritten = fwrite(pQSV->EncBitstream.Data + pQSV->EncBitstream.DataOffset, 1, pQSV->EncBitstream.DataLength, pFileOut);
This snippet of code writes out the data that can be seen in the bitstream attachment of my first post. I still have no idea why I'm getting the bitstream output in blocks of 384 bytes with zero padding rather than a linear bitstream of NALs.
What does your file open call look like? Maybe you are using something like fopen("file", "w") (text mode) instead of fopen("file", "wb") (binary write mode), which would explain the issue.
To open a file to write the bitstream I use:
pFileOut = fopen("qsv-cif-384.264","wb");
This shouldn't be the problem as I use the code throughout all my encoders to write out bitstreams to supply them to other decoders for testing and debugging purposes.
When I first saw the bitstream output format, I tried to reconstruct the H264 bitstream from the blocks of 384 bytes, which seem to be split into segments of 16 bytes. I managed to reconstruct the sequence/picture headers and a little of the first intra picture, but after that I gave up as I couldn't identify the output pattern. With further research I found that everyone else (including sample_encode) appeared to be getting linear NALs, which is what I expected in the first place.
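For checking whether a buffer really is a linear Annex B stream, a simple start-code scan is enough. This minimal sketch just counts NAL units by looking for the 00 00 01 prefix (it also matches the long form 00 00 00 01, whose last three bytes are 00 00 01):

```c
#include <stddef.h>
#include <stdint.h>

/* Count Annex B NAL units by scanning for 00 00 01 start codes.
   A linear elementary stream from the encoder should begin with a
   start code and contain one per NAL unit; the blocked output I'm
   seeing would give nonsense counts here. */
size_t count_nal_units(const uint8_t *data, size_t len)
{
    size_t count = 0;
    for (size_t i = 0; i + 2 < len; ++i) {
        if (data[i] == 0x00 && data[i + 1] == 0x00 && data[i + 2] == 0x01) {
            ++count;
            i += 2; /* step past the start code */
        }
    }
    return count;
}
```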
I was thinking that it was something to do with the internal workings of the hardware encoder: splitting the picture into multiple segments and outputting the bitstream in 16-byte data chunks into blocks of 384 bytes until they were full, then adding more 384-byte blocks if excessive bits were generated. The output block sizes seemed to be related to the picture size; for CIF the blocks are always 384 bytes but for QCIF they are 256 bytes. Increasing the bitrate increases the amount of actual data within the blocks (less zero padding) and the total number of blocks in the bitstream output.
I am going through the sample encode to try and identify anything that I am doing differently, rather unsuccessfully at the moment. I can only imagine that it must be something in the initialisation that has put the encoder into some weird output mode.
Your issue is certainly quite odd. At the moment I have no clue what may be going on.
Just as a sanity check, can you also test if the output format is the same if you use the Media SDK SW encoder?
Please let us know what you find.
I've now implemented the Media SDK software encoder too, and it outputs NALs correctly in a linear format. If I switch back to hardware, I still get the multiple blocks of 384 bytes with data and zero padding.
It feels like it is so close to working properly, but I just can't seem to find what's wrong.
Does anybody at Intel have any idea why I'm getting the weird output from the hardware?
Even though the software output works, it's not going to be fast enough, I've got to get the hardware working.
I've been stepping through the encoder sample replicating every step in my code to make sure I'm setting up D3D9 and the mfx structures the same way. I've had no luck yet at getting the correct output.
I'm hoping for some insight into how the internal hardware encoder works so that I can focus on an area that might solve this problem.
Based on the info you provided I cannot think of any reason why you're encountering these issues, especially since it all works fine when you use the Intel encode sample code. Right? It seems the issue is somehow connected with either your output buffer handling or the way you write the encoded bitstream blocks to the file system.
We have not seen any issues similar to this before.
Would it be possible for you to share your code so that we can try to reproduce the issue here on our side?
My QSV code outputs bitstreams correctly if I set it to the Software implementation, but it does not output correctly when set to the Hardware implementation. The majority of the code is the same for both; the only difference is that the DirectX surfaces are set up for Hardware and not for Software.
Running the Intel sample_encode code outputs the bitstream correctly with both the Software and Hardware implementations, though I haven't found what the difference is between the Intel hardware set-up and mine.
I'm willing to send you my QSV code, however I'd rather not post it on this forum, is there a way to get it to you?
This forum supports sending private messages. Please create a new private post, then we can continue the discussion on how to best share the code.
I've managed to identify the problem with the H264 (AVC) encoder bitstream output being in blocks of 384 bytes with zero padding. I had been focusing on finding a problem with either the allocation of my picture input buffer or my bitstream output buffer; however, it was neither of these. The problem was occurring within MFXVideoENCODE_Init(). This function calls the FrameAllocator() function a number of times to allocate buffers for itself. I'd assumed (as I'd found no reference to any other buffer type) that the FrameAllocator would only allocate NV12 buffers, but that is not the case: it also allocates a D3DFMT_P8 buffer. I can't say for sure as I don't know the inner workings of the hardware, but this D3DFMT_P8 buffer may be where the encoder constructs the bitstream in video memory; once the frame has been encoded, it then copies it into the user-allocated mfxBitstream.Data buffer. Hence if it is set up in any format other than D3DFMT_P8, you will get a very strange output bitstream.
I have updated my frame allocator to support D3DFMT_P8 buffers and the bitstream output format from the hardware encoder is now correctly formatted.
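For anyone hitting the same thing, the decision my allocator now makes looks roughly like this. The size calculations are simplified (real code creates D3D9 surfaces through the device rather than computing byte counts), but the point is that a D3DFMT_P8 request from Init must be honoured alongside NV12:

```c
#include <stddef.h>

/* FourCC values as I understand them: MFX_FOURCC_NV12 is the
   four-character code 'NV12' packed little-endian, and D3DFMT_P8
   is 41 in d3d9types.h. Treat the sizes as illustrative only. */
enum {
    FOURCC_NV12 = 0x3231564e, /* 'NV12' */
    FOURCC_P8   = 41          /* D3DFMT_P8 */
};

/* Rough per-surface byte count for an allocation request.
   Returning 0 models the allocator rejecting the request, which
   is what made MFXVideoENCODE_Init misbehave in my case. */
static size_t surface_bytes(unsigned fourcc, unsigned w, unsigned h)
{
    switch (fourcc) {
    case FOURCC_NV12:
        return (size_t)w * h * 3 / 2; /* planar 4:2:0 */
    case FOURCC_P8:
        return (size_t)w * h;         /* opaque byte buffer */
    default:
        return 0;                     /* unsupported format */
    }
}
```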
Thanks to everyone that helped,