Media (Intel® Video Processing Library, Intel Media SDK)
Access community support for transcoding, decoding, and encoding in applications using media tools like the Intel® Video Processing Library and Intel® Media SDK
Announcements
The Intel Media SDK project is no longer active. For continued support and access to new features, Intel Media SDK users are encouraged to read the transition guide on upgrading from Intel® Media SDK to Intel® Video Processing Library (VPL), and to move to VPL as soon as possible.
For more information, see the VPL website.

Encoding directly from video memory

Jack_Chimasera
Beginner
693 Views

Hello

My current project requires me to compose images in GPU memory and then encode them using Quick Sync. I was wondering if, by using MFX_IOPATTERN_IN_VIDEO_MEMORY, I can avoid copying the images to system memory before encoding. If so, what command should I use to copy the render target I've composed into the encoder's surface? Will StretchRect do it?

regards

Jack Chimasera

0 Kudos
8 Replies
Petter_L_Intel
Employee
693 Views
Hi Jack, that approach should work just fine. Let us know if you encounter any issues. Regards, Petter
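For readers following along, a minimal sketch of what that approach looks like in Media SDK terms. The session creation and D3D frame-allocator plumbing are omitted, and the parameter values here are illustrative assumptions, not a complete configuration:

```cpp
// Sketch only: encoder init requesting input surfaces in video memory,
// so frames composed on the GPU can be consumed without a copy through
// system memory. Requires the Media SDK headers and a session with a
// D3D device and frame allocator already attached.
#include "mfxvideo.h"

mfxVideoParam par = {};
par.mfx.CodecId                = MFX_CODEC_AVC;
par.mfx.FrameInfo.FourCC       = MFX_FOURCC_NV12;
par.mfx.FrameInfo.ChromaFormat = MFX_CHROMAFORMAT_YUV420;
par.mfx.FrameInfo.Width        = 1920;  // must be 16-aligned in real code
par.mfx.FrameInfo.Height       = 1088;  // 1080 rounded up for alignment
par.IOPattern = MFX_IOPATTERN_IN_VIDEO_MEMORY;  // the key flag in this thread

// mfxStatus sts = MFXVideoENCODE_Init(session, &par);
```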
0 Kudos
Jack_Chimasera
Beginner
693 Views
Hello Petter, I will try this approach. Will a StretchRect from an RGB32 render target into QSV's NV12 surface perform the necessary colour conversion? regards Jack
0 Kudos
Jack_Chimasera
Beginner
693 Views
Hello Petter, another question, if possible: above I have assumed that the GPU will handle the RGB->YUV conversion necessary before compression. As my application requires very high accuracy, is it possible to tell me which conversion formula is used (i.e. what are the exact multipliers used by the GPU), and how exactly the 4:4:4->4:2:0 conversion is handled (i.e. where within the 2x2 pixel range the particular U,V values are sampled from)? regards Jack
0 Kudos
celli4
New Contributor I
693 Views
Jack, I may be able to offer some helpful input here.

1. If you are composing the images/surfaces in GPU memory, you may want to consider encoding straight from a Direct3D surface. See: "C:\Program Files\Intel\Media SDK 2012 R3\samples\sample_encode\readme-encode.rtf". This may add a lot of work and complexity though, so it all depends on your performance needs and how much effort you can expend to get it working right.

2. 'sample_encode' from Intel only accepts NV12 and YUV420 video. If you are composing 4:4:4 RGB images, they will need to be converted before calling EncodeFrame. Some options here are:
- use the VPP module as part of a pipeline to do RGBA->NV12 [I think it will work great on D3D surfaces] (see sample_vpp)
- call a routine from the Intel Integrated Performance Primitives (IPP) library before using a pipeline without VPP color conversion [best on system memory only]
- do the color conversion by hand using your own RGBA->NV12 converter (this may be really slow accessing GPU memory) [best for system memory only]
- use IDirect3DDevice9::StretchRect to do color conversion [I do not think this will work for you, as it appears to be YUV->RGBA only, not RGB->YUV, and without software emulation]

I hope this is helpful; please let us know what you get working. Regards, Cameron Elliott
0 Kudos
Jack_Chimasera
Beginner
693 Views
Hello Cameron,

Thank you for your input. My most urgent need right now is high performance: I need to encode 1920x1080 material at 60 fps, and if possible two streams of that form in parallel, on a modern Ivy Bridge CPU. I am prepared to put a great deal of effort into making this work properly. A previous implementation, which reads back every frame to the CPU and then uses IPP to convert it to NV12, achieved under 50% of the minimum necessary performance, on account of a slow readback by Direct3D9's GetRenderTargetData method.

I have read sample_encode's documentation and source, but it is plainly visible there that the GPU-bound surfaces are being filled by locking them and writing from the CPU, while I need to fill them with data from a surface on the GPU, as you have understood.

Regarding the options you have suggested:
- VPP: I will check if VPP can perform RGBA->NV12 without CPU intervention. If it can, this may just be my ticket.
- IPP: As mentioned above, IPP requires having the RGB32 frames in CPU-accessible memory, which is far too costly. The same goes for writing my own RGB32->NV12 routine.
- StretchRect: I was unsure if this method supports RGB32->NV12 on Intel's GPUs, which is why I posted the query to begin with.

Thank you for the effort; I will check VPP. regards Jack Chimasera
0 Kudos
Jack_Chimasera
Beginner
693 Views
VPP indeed appears to be the answer for RGB32 -> NV12 conversion. I just hope its input frames can be allocated in a way that will allow me to render into them using Direct3D.
0 Kudos
Petter_L_Intel
Employee
693 Views
Hi Jack,

Not sure if you were able to progress on this topic. As you concluded, Media SDK VPP is likely a good approach to take care of the RGB32->NV12 color conversion.

From your descriptions I now understand your environment better. StretchRect is in fact quite limited for DX9 and may not work for your purpose. We have found that for some situations a CPU-assisted GPU->GPU surface copy may be required. Locking the surface and performing a brute-force row-by-row copy of an RGB surface can be a very large bottleneck. In that case, please explore an efficient "fast copy" method such as the one described here: http://software.intel.com/en-us/articles/copying-accelerated-video-decode-frame-buffers

Using such an approach you will achieve much better performance vs. the brute-force approach. Regards, Petter
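The core idea of the linked "fast copy" article can be sketched portably: read each chunk of the slow (uncacheable, write-combined) GPU surface into a small cache-friendly staging buffer, then write it out to the destination from cache. The article's real implementation uses SSE4.1 streaming loads (`_mm_stream_load_si128`) for the first step; plain `memcpy` stands in for them here so the sketch compiles anywhere.

```cpp
#include <cstdint>
#include <cstring>

// Portable sketch of a staged GPU->system copy. In the real fast-copy
// code, the first memcpy is replaced by 16-byte streaming loads from the
// locked (USWC) surface into the staging buffer; the second step is an
// ordinary cached write to the destination.
void staged_copy(const uint8_t* src, uint8_t* dst, size_t size) {
    alignas(64) uint8_t staging[8192];  // 8K, as Jack used for RGB32 rows
    size_t done = 0;
    while (done < size) {
        size_t chunk = size - done;
        if (chunk > sizeof(staging)) chunk = sizeof(staging);
        std::memcpy(staging, src + done, chunk);  // streaming load in real code
        std::memcpy(dst + done, staging, chunk);  // cached write to destination
        done += chunk;
    }
}
```

The staging buffer is kept small so it stays resident in L1 cache between the load and store halves of each chunk, which is where the speedup over a direct row-by-row copy comes from.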
0 Kudos
Jack_Chimasera
Beginner
693 Views
Thank you, Petter! I've used the code from the article you linked (after changing the buffer size from 4K to 8K, due to the very big render-target row size at RGB32), and now I finally managed to do 1920x1080 at 60 fps. I still plan to try the VPP path, but it's a good deal less urgent now!
0 Kudos
Reply