Media (Intel® Video Processing Library, Intel Media SDK)
Access community support with transcoding, decoding, and encoding in applications using media tools like Intel® oneAPI Video Processing Library and Intel® Media SDK

encode image sequence

jwregme
Beginner

Hello! The Intel Media SDK seems like a promising technology. However, because it is so flexible, it is not easy to get into the internals. What I'd like to do is encode a long sequence of images (raw RGB values). As far as I can see, I need a bitstream for encoding, and I assume I have to emulate this from my image sequence. However, I am a bit clueless about how to do this, or whether this is the right approach at all. Do I have to convert the raw RGB values to NV12? Would pointing the Data field of mfxBitstream at my image array, calling EncodeFrameAsync, and repeating this for each frame be a good way to do it? Thanks in advance for your help!

6 Replies
Petter_L_Intel
Employee

Hi,

The bitstream is the output of the encoder, so it is not related to your input surfaces. See the Media SDK encode sample for details on how to set up and use mfxBitstream.

The Media SDK encode and decode components natively support only one format, NV12. If your input surfaces are in another format, you will have to convert them to NV12 first. VPP supports some common color conversions.
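To make the NV12 requirement concrete, here is a minimal sketch of how a tightly packed NV12 frame is laid out in memory: a full-resolution Y plane followed by a half-height plane of interleaved U,V byte pairs. The `Nv12Layout` struct and `nv12Layout` function are hypothetical helpers for illustration, not Media SDK API, and the sketch assumes the row pitch equals the width, which real surfaces do not guarantee.

```cpp
#include <cassert>
#include <cstddef>

// Plane sizes and offsets for a tightly packed NV12 frame (pitch == width).
// Hypothetical helper for illustration; not part of the Media SDK API.
struct Nv12Layout {
    std::size_t ySize;     // bytes in the luma (Y) plane
    std::size_t uvSize;    // bytes in the interleaved chroma (UV) plane
    std::size_t uvOffset;  // byte offset where the UV plane starts
    std::size_t total;     // bytes in the whole frame
};

Nv12Layout nv12Layout(std::size_t width, std::size_t height) {
    // 4:2:0 subsampling needs even dimensions.
    assert(width % 2 == 0 && height % 2 == 0);
    Nv12Layout l;
    l.ySize = width * height;
    // Each chroma row holds width/2 U,V pairs (width bytes); there are height/2 rows.
    l.uvSize = width * (height / 2);
    l.uvOffset = l.ySize;
    l.total = l.ySize + l.uvSize;
    return l;
}
```

For a 1920x1080 frame this gives a Y plane of 2,073,600 bytes and a UV plane of 1,036,800 bytes, that is, 1.5 bytes per pixel overall.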

For an example of how to encode RGB surfaces, you can also check out one of the samples that are part of the Media SDK tutorial.
http://software.intel.com/en-us/articles/intel-media-sdk-tutorial
http://software.intel.com/en-us/articles/intel-media-sdk-tutorial-simple-6-encode-d3d-vpp-preproc 

Regards,
Petter 

andy4us
Beginner

There is an example called "pipeline_encode.cpp" that I used as the basis for my code. All the Intel code uses virtual classes for file input and output, so I just implemented my own versions of these. This allows maximum reuse of the example code. The problem I found isn't so much the framework; it is getting all the initialisation parameters correct. I originally wrote code that would initialise on the software codec but not on the hardware codec. Using the Init function from the example code makes life much easier.

jwregme
Beginner

Thanks a lot for the suggestions!
The examples from the tutorial are easier to understand than the examples provided in the SDK. I managed to correctly encode an input file generated from the decode sample. I got to dive into the internals a bit as well now. However, I've got some problems figuring out the NV12 format.

I'm trying to convert my input video (raw rgb24 frames) to NV12. I am aware that VPP/IPP has conversion functionality for this. However, since I need to do quite a lot of processing/converting with my images anyway before pushing them through MSDK, I'd like to be able to handle this myself.

According to the YUV specification I calculate the values for NV12 and order them as needed. (Y plane, UV plane interleaved beginning with V) The content in the encoded video is correct, but the coloring is wrong (too much green/cyan). I use the formula Y=0.299*R+0.587*G+0.114*B, U=(B-Y)*0.493+128 and V=(R-Y)*0.877+128. As far as I know, setting both U and V to 128 while leaving Y as is should give me the video in greyscale. I tried this to verify that I'm not doing a wrong calculation for U/V. However, this does not seem to change much. (different green/cyan colouring but no greyscale)
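For comparison, a hand-written RGB24-to-NV12 conversion could look roughly like the sketch below. It is not the code used in this thread. It assumes tightly packed buffers, R,G,B byte order, and full-range BT.601 chroma scaling (0.564 and 0.713 instead of the factors quoted above); whether the encoder expects full or studio range is not stated here. The names `toByte` and `rgb24ToNv12` are made up for illustration. Note that NV12 stores U before V in each chroma pair; V-first ordering is the NV21 layout.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Clamp to [0, 255] and round to the nearest byte value.
static std::uint8_t toByte(float v) {
    return static_cast<std::uint8_t>(std::min(255.0f, std::max(0.0f, v)) + 0.5f);
}

// Convert tightly packed RGB24 (R,G,B order) to tightly packed NV12.
// Assumption: full-range BT.601 coefficients.
std::vector<std::uint8_t> rgb24ToNv12(const std::vector<std::uint8_t>& rgb,
                                      std::size_t width, std::size_t height) {
    assert(width % 2 == 0 && height % 2 == 0);
    assert(rgb.size() == width * height * 3);
    std::vector<std::uint8_t> nv12(width * height * 3 / 2);
    std::uint8_t* yPlane = nv12.data();
    std::uint8_t* uvPlane = nv12.data() + width * height;

    for (std::size_t by = 0; by < height; by += 2) {
        for (std::size_t bx = 0; bx < width; bx += 2) {
            float sumR = 0.0f, sumG = 0.0f, sumB = 0.0f;
            // One luma sample per pixel of the 2x2 block.
            for (std::size_t dy = 0; dy < 2; ++dy) {
                for (std::size_t dx = 0; dx < 2; ++dx) {
                    const std::size_t x = bx + dx, y = by + dy;
                    const std::uint8_t* p = &rgb[(y * width + x) * 3];
                    const float r = p[0], g = p[1], b = p[2];
                    yPlane[y * width + x] = toByte(0.299f * r + 0.587f * g + 0.114f * b);
                    sumR += r; sumG += g; sumB += b;
                }
            }
            // One chroma pair per 2x2 block, computed from the averaged colour.
            const float r = sumR / 4.0f, g = sumG / 4.0f, b = sumB / 4.0f;
            const float yAvg = 0.299f * r + 0.587f * g + 0.114f * b;
            std::uint8_t* uv = &uvPlane[(by / 2) * width + bx];
            uv[0] = toByte(0.564f * (b - yAvg) + 128.0f);  // U (Cb) first in NV12
            uv[1] = toByte(0.713f * (r - yAvg) + 128.0f);  // V (Cr) second
        }
    }
    return nv12;
}
```

A quick sanity check is to feed in a uniform grey frame: every chroma byte should come out as 128, which is the greyscale test described above.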

Could it be that U/V does not span the whole byte range? (0-255) Any other details that I've missed?
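On the range question, one way to check is to evaluate the quoted formulas for saturated colours and see whether the results still fit in a byte. The sketch below does only that; `Yuv`, `quotedFormula`, and `fitsInByte` are hypothetical names for illustration. With these factors, pure red gives V of about 284.8 and cyan gives V of about -28.8, so V needs clamping or a smaller scale factor, while U stays inside 0-255.

```cpp
#include <cassert>

struct Yuv { double y, u, v; };

// The formulas quoted in the post above, evaluated without any clamping.
Yuv quotedFormula(double r, double g, double b) {
    assert(r >= 0.0 && r <= 255.0 && g >= 0.0 && g <= 255.0 && b >= 0.0 && b <= 255.0);
    const double y = 0.299 * r + 0.587 * g + 0.114 * b;
    return { y, (b - y) * 0.493 + 128.0, (r - y) * 0.877 + 128.0 };
}

// True when a value can be stored in an unsigned byte without clamping.
bool fitsInByte(double v) { return v >= 0.0 && v <= 255.0; }
```

If out-of-range values are cast straight to an 8-bit type without clamping, they wrap around, which is one plausible source of colour errors in a hand-written converter.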

andy4us
Beginner

Are you setting up the bitstream correctly, that is, using the Y and UV pointers instead of the R, G, B pointers in the bitstream structure?

Are you using Intel IPP? It's cheap and fast. There is a function, ippiYCbCr420ToRGB_8u_P2C3R, which will do NV12 to RGB, and I know there is an equivalent function for RGB to NV12.

Petter_L_Intel
Employee

Hi,

For details about the NV12 color space format: http://www.fourcc.org/yuv.php#NV12

There is also a sample in the Media SDK tutorial that shows how to use IPP for RGB24-to-NV12 conversion plus encode.
http://software.intel.com/en-us/articles/intel-media-sdk-tutorial-simple-6-encode-ipp-cc 

Regards,
Petter 

jwregme
Beginner

Thanks for the comments!

It turned out that I understood the NV12 color space format all right, but I had a bug in my conversion code. I noticed that sample_encode_3 expects its input data in YV12 format, not NV12; the YV12-to-NV12 conversion happens while the raw frame is read. After looking at the values that were read in at this point, I noticed that my input data was not in the format I expected it to be!
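For readers unfamiliar with the two layouts, the repacking step can be sketched as follows. This is an illustration, not the sample's actual code: YV12 stores three separate planes in the order Y, V, U, while NV12 stores the Y plane followed by interleaved U,V pairs. The function name `yv12ToNv12` is hypothetical and the buffers are assumed to be tightly packed.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Repack a tightly packed YV12 frame (Y plane, V plane, U plane)
// into NV12 (Y plane, then interleaved U,V byte pairs).
std::vector<std::uint8_t> yv12ToNv12(const std::vector<std::uint8_t>& yv12,
                                     std::size_t width, std::size_t height) {
    assert(width % 2 == 0 && height % 2 == 0);
    const std::size_t ySize = width * height;
    const std::size_t cSize = ySize / 4;  // each chroma plane is (width/2) x (height/2)
    assert(yv12.size() == ySize + 2 * cSize);

    std::vector<std::uint8_t> nv12(yv12.size());
    // The luma plane is identical in both formats.
    std::copy(yv12.begin(), yv12.begin() + static_cast<std::ptrdiff_t>(ySize), nv12.begin());

    const std::uint8_t* v = yv12.data() + ySize;          // V plane comes first in YV12
    const std::uint8_t* u = yv12.data() + ySize + cSize;  // U plane follows
    std::uint8_t* uv = nv12.data() + ySize;
    for (std::size_t i = 0; i < cSize; ++i) {
        uv[2 * i] = u[i];      // U first in NV12
        uv[2 * i + 1] = v[i];  // V second
    }
    return nv12;
}
```

Feeding data that is already NV12, or I420 with U and V swapped, through such a step is an easy way to end up with wrong colours while the picture content still looks correct.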

Everything works fine now. I will have to assess whether to use IPP for my application or do the conversion myself with OpenCL. In any case, the speed of the encoding is quite impressive!
