Hello! The Intel Media SDK seems like a promising technology. However, because it is so flexible, it is not easy to get into the internals. What I'd like to do is encode a larger sequence of images (raw RGB values). As far as I can see, I need a bitstream for encoding, and I assume I have to emulate this from my image sequence. However, I am a bit clueless about how to do this, or whether this is the right approach at all. Do I have to convert the raw RGB values to NV12? Would pointing the Data field of mfxBitstream at my new image array, calling EncodeFrameAsync, and repeating this for each image be a good way to do it? Thanks in advance for your help!
The bitstream is the output from the encoder, so it is not related to your input surfaces. See the Media SDK encode sample for details on how to set up and use mfxBitstream.
The Media SDK encode and decode components support only one format natively: NV12. So if your input surfaces are in another format, you will have to convert them to NV12 first. VPP supports some common color conversion routines.
For an example of how to encode RGB surfaces, you can also check out one of the samples in the Media SDK tutorial.
There is an example called "pipeline_encode.cpp" that I used as the basis for my code. All the Intel code uses virtual classes for file input and output, so I just implemented my own versions of these. This allows maximum reuse of the example code. The problem I found isn't so much the framework; it is getting all the initialisation parameters correct. I originally wrote code that would initialise on the software codec but not on the hardware codec. Using the Init function from the example code makes life much easier.
Thanks a lot for the suggestions!
The examples from the tutorial are easier to understand than the examples provided in the SDK. I managed to correctly encode an input file generated from the decode sample, and I got to dive into the internals a bit as well. However, I've got some problems figuring out the NV12 format.
I'm trying to convert my input video (raw RGB24 frames) to NV12. I am aware that VPP/IPP has conversion functionality for this. However, since I need to do quite a lot of processing and converting with my images anyway before pushing them through MSDK, I'd like to be able to handle this myself.
According to the YUV specification, I calculate the values for NV12 and order them as needed (Y plane, then UV plane interleaved beginning with V). The content in the encoded video is correct, but the coloring is wrong (too much green/cyan). I use the formulas Y=0.299*R+0.587*G+0.114*B, U=(B-Y)*0.493+128 and V=(R-Y)*0.877+128. As far as I know, setting both U and V to 128 while leaving Y as is should give me the video in greyscale. I tried this to verify that I'm not doing a wrong calculation for U/V. However, this does not seem to change much (different green/cyan colouring, but no greyscale).
Could it be that U/V does not span the whole byte range? (0-255) Any other details that I've missed?
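For reference, the conversion described above could be sketched roughly as follows. This is only an illustration, not code from the Media SDK or its samples: the function name `rgb24ToNv12`, the tightly packed buffers (no row padding), the R,G,B input byte order and the even frame dimensions are all assumptions. It uses the formulas from the post, averages each 2x2 block for chroma, and stores U before V in the interleaved plane, which is the order the NV12 definition uses (V first would be NV21). With these scaling factors a saturated red gives V of roughly 285, so the sketch clamps to 0..255.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Round a sample and clamp it to the 0..255 byte range.
static std::uint8_t clampToByte(double v) {
    return static_cast<std::uint8_t>(std::lround(std::min(255.0, std::max(0.0, v))));
}

// Hypothetical helper: converts a tightly packed RGB24 frame (R,G,B byte order,
// even width and height assumed) into a tightly packed NV12 buffer of
// width*height*3/2 bytes: full-resolution Y plane, then interleaved U,V pairs.
std::vector<std::uint8_t> rgb24ToNv12(const std::uint8_t* rgb, int width, int height) {
    const std::size_t ySize = static_cast<std::size_t>(width) * height;
    std::vector<std::uint8_t> nv12(ySize * 3 / 2);
    std::uint8_t* yPlane = nv12.data();
    std::uint8_t* uvPlane = yPlane + ySize;

    for (int by = 0; by < height; by += 2) {
        for (int bx = 0; bx < width; bx += 2) {
            double sumR = 0.0, sumG = 0.0, sumB = 0.0;
            for (int dy = 0; dy < 2; ++dy) {
                for (int dx = 0; dx < 2; ++dx) {
                    const std::size_t idx =
                        static_cast<std::size_t>(by + dy) * width + (bx + dx);
                    const std::uint8_t* p = rgb + idx * 3;
                    // Luma is stored for every pixel.
                    yPlane[idx] = clampToByte(0.299 * p[0] + 0.587 * p[1] + 0.114 * p[2]);
                    sumR += p[0];
                    sumG += p[1];
                    sumB += p[2];
                }
            }
            // Chroma is stored once per 2x2 block, from the averaged colour.
            const double r = sumR / 4.0, g = sumG / 4.0, b = sumB / 4.0;
            const double y = 0.299 * r + 0.587 * g + 0.114 * b;
            std::uint8_t* uv = uvPlane + static_cast<std::size_t>(by / 2) * width + bx;
            uv[0] = clampToByte((b - y) * 0.493 + 128.0);  // U (Cb) comes first in NV12
            uv[1] = clampToByte((r - y) * 0.877 + 128.0);  // V (Cr) comes second
        }
    }
    return nv12;
}
```

A quick sanity check along the lines of the greyscale experiment: a frame in which R, G and B are equal should produce 128 for every byte of the UV plane.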
Are you setting up the bitstream correctly, that is, using the Y and UV pointers instead of the R, G, B pointers in the bitstream structure?
Are you using Intel IPP? It's cheap and fast. There is a function, ippiYCbCr420ToRGB_8u_P2C3R, which will do NV12 to RGB, and I know there is an equivalent function for RGB to NV12.
For details about the NV12 color space format: http://www.fourcc.org/yuv.php#NV12
There is also a sample in the Media SDK tutorial that showcases how to use IPP for RGB24 to NV12 conversion plus encode.
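To make the NV12 layout from the link above concrete, here is a small sketch that computes where the Y, U and V samples for a given pixel live. It assumes a single contiguous buffer in which the UV plane starts directly after the Y plane and both planes share the same row pitch; `Nv12Offsets` and `nv12Offsets` are hypothetical names for illustration, not Media SDK types.

```cpp
#include <cstddef>

// Byte offsets of the samples that describe one pixel in an NV12 buffer.
struct Nv12Offsets {
    std::size_t y;  // luma sample for this pixel
    std::size_t u;  // U (Cb) sample shared by the pixel's 2x2 block
    std::size_t v;  // V (Cr) sample, stored right after U
};

// Hypothetical helper. Assumes the UV plane follows the Y plane immediately
// and uses the same pitch (bytes per row) as the Y plane.
Nv12Offsets nv12Offsets(int x, int y, int pitch, int height) {
    const std::size_t yOff = static_cast<std::size_t>(y) * pitch + x;
    const std::size_t uvBase = static_cast<std::size_t>(height) * pitch;
    // One UV row covers two pixel rows; one U,V pair covers two pixel columns.
    const std::size_t uOff = uvBase + static_cast<std::size_t>(y / 2) * pitch
                           + static_cast<std::size_t>(x / 2) * 2;
    return Nv12Offsets{yOff, uOff, uOff + 1};
}
```

For a 4x4 frame with a pitch of 4, the pixel at (3, 3) has its luma at offset 15 and its U and V samples at offsets 22 and 23.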
Thanks for the comments!
It turned out that I understood the NV12 color space format all right, but I had a bug in my conversion code. I noticed that in sample_encode_3 the input data is not expected to be in NV12 format but in YV12. While reading the raw frame the conversion between YV12->NV12 happens. After looking into the values that were read in at this section I noticed that my input data was not in the format I expected it to be!
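The YV12 to NV12 step mentioned above could look roughly like the following. This is a sketch of the general technique, not the tutorial's actual code: the function name `yv12ToNv12` and the tightly packed buffers with even dimensions are assumptions. YV12 stores a full Y plane followed by a quarter-size V plane and then a quarter-size U plane, while NV12 keeps the Y plane and interleaves the chroma as U,V pairs.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: repacks a tightly packed YV12 frame (Y plane, V plane,
// U plane; even width and height assumed) into a tightly packed NV12 frame
// (Y plane, then interleaved U,V pairs).
std::vector<std::uint8_t> yv12ToNv12(const std::uint8_t* yv12, int width, int height) {
    const std::size_t ySize = static_cast<std::size_t>(width) * height;
    const std::size_t chromaSize = ySize / 4;         // samples per chroma plane
    const std::uint8_t* vPlane = yv12 + ySize;        // V comes first in YV12
    const std::uint8_t* uPlane = vPlane + chromaSize; // U follows V

    std::vector<std::uint8_t> nv12(ySize + 2 * chromaSize);
    std::copy(yv12, yv12 + ySize, nv12.begin());      // Y plane is unchanged

    for (std::size_t i = 0; i < chromaSize; ++i) {
        nv12[ySize + 2 * i] = uPlane[i];      // U first in NV12
        nv12[ySize + 2 * i + 1] = vPlane[i];  // then V
    }
    return nv12;
}
```

Feeding such a routine data that is not actually YV12 (for example, planes in a different order) still yields a picture with correct luma but wrong colours, which matches the symptom described earlier in the thread.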
Everything works fine now. I will have to assess if I will use IPP for my application or if I do it myself with OpenCL. In any case, the speed of the encoding is quite impressive!