Hi guys,
I am working on hardware decoding. The decoder initializes fine with MFX_IMPL_HARDWARE2|MFX_IMPL_VIA_D3D9, and I am using VPP for the colour-space conversion NV12->RGB4. My system is an i7-2600K and I am using the 2013 R2 SDK. Everything works fine except for the CPU usage, which stays at about 32% throughout the decoding process. I am decoding 1920x1080 at 25 fps.
Here is how I am setting up the VPP structure:
m_VPPParams.vpp.In.FourCC = MFX_FOURCC_NV12;
m_VPPParams.vpp.In.ChromaFormat = MFX_CHROMAFORMAT_YUV420;
m_VPPParams.vpp.In.CropX = 0;
m_VPPParams.vpp.In.CropY = 0;
m_VPPParams.vpp.In.CropW = 1920;
m_VPPParams.vpp.In.CropH = 1080;
m_VPPParams.vpp.In.PicStruct = MFX_PICSTRUCT_FIELD_TFF;
m_VPPParams.vpp.In.FrameRateExtN = 240000;
m_VPPParams.vpp.In.FrameRateExtD = 9600;
// width must be a multiple of 16
// height must be a multiple of 16 in case of frame picture and a multiple of 32 in case of field picture
m_VPPParams.vpp.In.Width = MSDK_ALIGN16(m_VPPParams.vpp.In.CropW);
m_VPPParams.vpp.In.Height =
(MFX_PICSTRUCT_PROGRESSIVE == m_VPPParams.vpp.In.PicStruct) ?
MSDK_ALIGN16(m_VPPParams.vpp.In.CropH) :
MSDK_ALIGN32(m_VPPParams.vpp.In.CropH);
// Output data
m_VPPParams.vpp.Out.FourCC = MFX_FOURCC_RGB4;
m_VPPParams.vpp.Out.ChromaFormat = MFX_CHROMAFORMAT_YUV420;
m_VPPParams.vpp.Out.CropX = 0;
m_VPPParams.vpp.Out.CropY = 0;
m_VPPParams.vpp.Out.CropW = m_VPPParams.vpp.In.CropW ;
m_VPPParams.vpp.Out.CropH = m_VPPParams.vpp.In.CropH ;
m_VPPParams.vpp.Out.PicStruct = MFX_PICSTRUCT_FIELD_TFF;
m_VPPParams.vpp.Out.FrameRateExtN = 240000;
m_VPPParams.vpp.Out.FrameRateExtD = 9600;
m_VPPParams.vpp.Out.Width = MSDK_ALIGN16(m_VPPParams.vpp.Out.CropW);
m_VPPParams.vpp.Out.Height =(MFX_PICSTRUCT_PROGRESSIVE == m_VPPParams.vpp.Out.PicStruct) ?
MSDK_ALIGN16(m_VPPParams.vpp.Out.CropH) :
MSDK_ALIGN32(m_VPPParams.vpp.Out.CropH);
m_VPPParams.IOPattern = MFX_IOPATTERN_IN_SYSTEM_MEMORY | MFX_IOPATTERN_OUT_SYSTEM_MEMORY;
m_VPPParams.AsyncDepth = 1;
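The alignment comments in the snippet above (width a multiple of 16, height a multiple of 16 for frame pictures and 32 for field pictures) can be checked with a small standalone sketch. The helpers `align16`, `align32` and `alignedHeight` are hypothetical stand-ins written for this example; they are assumed to round up the same way the MSDK_ALIGN16 / MSDK_ALIGN32 sample macros do, and they do not use the SDK headers.

```cpp
#include <cassert>
#include <cstdint>

// Round v up to the next multiple of 16 (hypothetical stand-in for MSDK_ALIGN16).
constexpr std::uint32_t align16(std::uint32_t v) { return (v + 15u) & ~15u; }

// Round v up to the next multiple of 32 (hypothetical stand-in for MSDK_ALIGN32).
constexpr std::uint32_t align32(std::uint32_t v) { return (v + 31u) & ~31u; }

// Height rule from the comments: 16-aligned for progressive frames,
// 32-aligned for field (interlaced) pictures.
constexpr std::uint32_t alignedHeight(std::uint32_t cropH, bool progressive) {
    return progressive ? align16(cropH) : align32(cropH);
}

// A few worked values for the resolutions discussed in this thread.
void checkAlignmentExamples() {
    assert(align16(1920) == 1920);              // 1920 is already a multiple of 16
    assert(alignedHeight(1080, true) == 1088);  // progressive 1080 -> 1088
    assert(alignedHeight(1080, false) == 1088); // field 1080 -> 1088 as well
    assert(alignedHeight(720, false) == 736);   // field 720 -> 736, unlike progressive
}
```

For 1080 lines both rules happen to give 1088, so the PicStruct choice only changes the allocated height for resolutions such as 720 lines.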
Hi Ankush,
You can use video memory surfaces instead of system memory surfaces, which avoids copying the images to system memory:
VPPParams.IOPattern = MFX_IOPATTERN_IN_VIDEO_MEMORY | MFX_IOPATTERN_OUT_VIDEO_MEMORY;
This can reduce CPU usage. Let us know if that works. If you still encounter the issue, can you also let us know the driver version on your system?
Thanks,
-Surbhi
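To get a feel for why the copies matter, here is a rough back-of-the-envelope sketch of how many bytes per second would move through system memory for this stream. It assumes, purely for illustration, that each decoded NV12 frame and each converted RGB4 frame is copied exactly once; the thread does not confirm how many copies the SDK actually performs. The function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Size in bytes of one frame with the given dimensions and bits per pixel.
constexpr std::uint64_t frameBytes(std::uint32_t width, std::uint32_t height,
                                   std::uint32_t bitsPerPixel) {
    return static_cast<std::uint64_t>(width) * height * bitsPerPixel / 8;
}

// Bytes per second if one NV12 frame (12 bpp) and one RGB4 frame (32 bpp)
// are each copied once per decoded frame. This is an assumed copy count.
constexpr std::uint64_t copyBytesPerSecond(std::uint32_t width, std::uint32_t height,
                                           std::uint32_t fps) {
    return (frameBytes(width, height, 12) + frameBytes(width, height, 32)) * fps;
}

void checkCopyEstimate() {
    // 1920x1088 is the aligned size of a 1920x1080 frame.
    assert(frameBytes(1920, 1088, 12) == 3133440ull);  // NV12, about 3 MB
    assert(frameBytes(1920, 1088, 32) == 8355840ull);  // RGB4, about 8 MB
    assert(copyBytesPerSecond(1920, 1088, 25) == 287232000ull);
}
```

Under these assumptions the copies alone come to roughly 287 MB per second at 25 fps, which is consistent with the suggestion to keep surfaces in video memory where possible.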
Hi Ankush,
It is strange that the RunFrameVPPAsync() function returns NV12 when video memory is used. The behavior should be the same irrespective of the memory used. To investigate this further I need the params, the code and the input you are using. If you can reproduce the behavior with the existing tutorial simple_6_decode_vpp_postproc (which seems to match your pipeline), that would help us replicate the issue.
> QueryIOSurf() for the VPP request returns MFX_WRN_PARTIAL_ACCELERATION irrespective of the memory I use
This warning should not depend on the memory used. It depends on your hardware capabilities: whether the hardware supports the requested operation or the SDK falls back to software. I tried to reproduce this issue on my system but did not encounter the warning. It would be good to try the same code on another system and see whether you get the same problem. Also, if you can send your hardware capabilities using the Media SDK system analyzer (details over here), I can dig in and check whether there is any limitation on the system you are using.
> if I try
> m_pmfxDEC->Query(&m_VPPParams, &m_VPPParams);
> it returns MFX_ERR_UNSUPPORTED.
You are passing the VPP params to a decoder query. To get the number of surfaces required for decode, the right call is mfxDEC.QueryIOSurf(&mfxVideoParams, &DecRequest); You can find more details in the existing tutorial simple_6_decode_vpp_postproc (can be downloaded from here).
-Surbhi
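The two status values discussed above behave differently: Media SDK reports errors as negative codes and warnings as positive codes, so MFX_WRN_PARTIAL_ACCELERATION still lets the call proceed while MFX_ERR_UNSUPPORTED does not. The sketch below illustrates one way to keep track of the warning instead of silently discarding it. `Status`, `Outcome` and `classify` are hypothetical names defined only for this example, and the numeric values are assumed to mirror the SDK headers; real code should use the SDK's own mfxStatus type.

```cpp
#include <cassert>

// Local stand-ins for the status codes mentioned in this thread.
// Assumption: values mirror the SDK headers (errors < 0, warnings > 0).
enum Status {
    ERR_UNSUPPORTED = -3,
    ERR_NONE = 0,
    WRN_PARTIAL_ACCELERATION = 4
};

// Result of inspecting a status: can we continue, and did the SDK
// report that it is not fully hardware accelerated?
struct Outcome {
    bool ok;
    bool partialAcceleration;
};

// Treat negative codes as fatal, keep going on warnings,
// but remember the partial-acceleration warning so it can be logged.
Outcome classify(Status sts) {
    if (sts < ERR_NONE) {
        return {false, false};
    }
    return {true, sts == WRN_PARTIAL_ACCELERATION};
}

void checkClassify() {
    assert(classify(ERR_NONE).ok && !classify(ERR_NONE).partialAcceleration);
    assert(classify(WRN_PARTIAL_ACCELERATION).ok);
    assert(classify(WRN_PARTIAL_ACCELERATION).partialAcceleration);
    assert(!classify(ERR_UNSUPPORTED).ok);
}
```

Recording the flag makes it visible when a session is running partly in software, which is one possible explanation for unexpectedly high CPU usage.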
In my application, I split the MTS file using FFmpeg and then decode the video stream using IQSV.
When I decode a single MTS file, it decodes properly, but CPU utilization goes up to 25%.
Also, when I start a second instance of the IQSV decoder, all the init functions return MFX_WRN_PARTIAL_ACCELERATION.
Please find my sample code, my system analyzer output, and the trace log.
Here is my code:
CSmplBitstreamReader* m_FileReader;
std::shared_ptr<FFMPEGReader> m_ffmpegFR;
mfxU32 m_nFrameIndex; // index of processed frame
mfxBitstream m_mfxBS; // contains encoded data
MFXVideoSession m_mfxSession;
MFXVideoDECODE* m_pmfxDEC;
mfxFrameSurface1** m_pmfxSurfaces; // frames array
mfxFrameSurface1** m_pVppSurfaces; // frames array for vpp input
mfxFrameAllocResponse m_mfxResponse; // memory allocation response for decoder
mfxFrameAllocResponse m_VppResponse; // memory allocation response for vpp
mfxU8* m_surfaceBuffers;
mfxVideoParam m_mfxVideoParams;
mfxVideoParam m_VPPParams;
mfxStatus Init()
{
MSDK_CHECK_POINTER(pParams, MFX_ERR_NULL_PTR);
m_nAsyncDepth = 1;
// =========== ffmpeg splitter ============
MSDK_CHECK_POINTER(m_ffmpegFR, MFX_ERR_MEMORY_ALLOC);
m_FileReader = dynamic_cast<CSmplBitstreamReader*>(m_ffmpegFR.get());
sts = m_ffmpegFR->Init(pParams->strSrcFile, pParams->videoType);
MSDK_CHECK_RESULT(sts, MFX_ERR_NONE, sts);
m_width = m_ffmpegFR->m_pFormatCtx->streams[m_ffmpegFR->m_videoStreamIdx]->codec->width;
m_height = m_ffmpegFR->m_pFormatCtx->streams[m_ffmpegFR->m_videoStreamIdx]->codec->height;
// =========== ffmpeg splitter ============
// API version
mfxVersion version;
sts = DetermineMinimumRequiredVersion(*pParams, version);
MSDK_CHECK_RESULT(sts, MFX_ERR_NONE, sts);
mfxIMPL impl = MFX_IMPL_HARDWARE_ANY;
mfxVersion ver = {0, 1};
sts = m_mfxSession.Init(impl, &ver);
MSDK_CHECK_RESULT(sts, MFX_ERR_NONE, sts);
// Create Media SDK decoder
m_pmfxDEC = new MFXVideoDECODE(m_mfxSession);
// Create Media SDK VPP component
m_mfxVPP = new MFXVideoVPP(m_mfxSession);
memset(&m_mfxVideoParams, 0, sizeof(m_mfxVideoParams));
m_mfxVideoParams.mfx.CodecId = MFX_CODEC_AVC;
m_mfxVideoParams.IOPattern = MFX_IOPATTERN_OUT_SYSTEM_MEMORY;
m_mfxVideoParams.AsyncDepth = m_nAsyncDepth;
// Prepare Media SDK bit stream buffer
// - Arbitrary buffer size for this example
memset(&m_mfxBS, 0, sizeof(m_mfxBS));
m_mfxBS.MaxLength = 1024 * 1024;
m_mfxBS.Data = new mfxU8[m_mfxBS.MaxLength];
MSDK_CHECK_POINTER(m_mfxBS.Data, MFX_ERR_MEMORY_ALLOC);
// try to find a sequence header in the stream
// if header is not found this function exits with error (e.g. if device was lost and there's no header in the remaining stream)
for(;;)
{
// trying to find PicStruct information in AVI headers
if ( m_mfxVideoParams.mfx.CodecId == MFX_CODEC_JPEG )
MJPEG_AVI_ParsePicStruct(&m_mfxBS);
// parse bit stream and fill mfx params
sts = m_pmfxDEC->DecodeHeader(&m_mfxBS, &m_mfxVideoParams);
if (MFX_ERR_MORE_DATA == sts)
{
if (m_mfxBS.MaxLength == m_mfxBS.DataLength)
{
sts = ExtendMfxBitstream(&m_mfxBS, m_mfxBS.MaxLength * 2);
MSDK_CHECK_RESULT(sts, MFX_ERR_NONE, sts);
}
// read a portion of data
sts = m_FileReader->ReadNextFrame(&m_mfxBS);
MSDK_CHECK_RESULT(sts, MFX_ERR_NONE, sts);
continue;
}
else
{
// if input is interlaced JPEG stream
if ( m_mfxBS.PicStruct == MFX_PICSTRUCT_FIELD_TFF || m_mfxBS.PicStruct == MFX_PICSTRUCT_FIELD_BFF)
{
m_mfxVideoParams.mfx.FrameInfo.CropH *= 2;
m_mfxVideoParams.mfx.FrameInfo.Height = MSDK_ALIGN16(m_mfxVideoParams.mfx.FrameInfo.CropH);
m_mfxVideoParams.mfx.FrameInfo.PicStruct = m_mfxBS.PicStruct;
}
break;
}
}
MSDK_IGNORE_MFX_STS(sts, MFX_WRN_PARTIAL_ACCELERATION);
MSDK_CHECK_RESULT(sts, MFX_ERR_NONE, sts);
// Initialize VPP parameters
m_VPPParams.vpp.In.FourCC = MFX_FOURCC_NV12;
m_VPPParams.vpp.In.ChromaFormat = MFX_CHROMAFORMAT_YUV420;
m_VPPParams.vpp.In.CropX = 0;
m_VPPParams.vpp.In.CropY = 0;
m_VPPParams.vpp.In.CropW = m_mfxVideoParams.mfx.FrameInfo.CropW;
m_VPPParams.vpp.In.CropH = m_mfxVideoParams.mfx.FrameInfo.CropH;
m_VPPParams.vpp.In.PicStruct = /*MFX_PICSTRUCT_FIELD_TFF*/MFX_PICSTRUCT_PROGRESSIVE;
m_VPPParams.vpp.In.FrameRateExtN = 25;
m_VPPParams.vpp.In.FrameRateExtD = 1;
// width must be a multiple of 16
// height must be a multiple of 16 in case of frame picture and a multiple of 32 in case of field picture
m_VPPParams.vpp.In.Width = MSDK_ALIGN16(m_VPPParams.vpp.In.CropW);
m_VPPParams.vpp.In.Height = (MFX_PICSTRUCT_PROGRESSIVE == m_VPPParams.vpp.In.PicStruct)?
MSDK_ALIGN16(m_VPPParams.vpp.In.CropH) : MSDK_ALIGN32(m_VPPParams.vpp.In.CropH);
// Output data
m_VPPParams.vpp.Out.FourCC = MFX_FOURCC_RGB4/*MFX_FOURCC_NV12*/;
m_VPPParams.vpp.Out.ChromaFormat = MFX_CHROMAFORMAT_YUV420;
m_VPPParams.vpp.Out.CropX = 0;
m_VPPParams.vpp.Out.CropY = 0;
m_VPPParams.vpp.Out.CropW = m_VPPParams.vpp.In.CropW/*/2*/; // Resize to half size resolution
m_VPPParams.vpp.Out.CropH = m_VPPParams.vpp.In.CropH/*/2*/;
m_VPPParams.vpp.Out.PicStruct = /*MFX_PICSTRUCT_FIELD_TFF*/MFX_PICSTRUCT_PROGRESSIVE;
m_VPPParams.vpp.Out.FrameRateExtN = 25;
m_VPPParams.vpp.Out.FrameRateExtD = 1;
// width must be a multiple of 16
// height must be a multiple of 16 in case of frame picture and a multiple of 32 in case of field picture
m_VPPParams.vpp.Out.Width = MSDK_ALIGN16(m_VPPParams.vpp.Out.CropW);
m_VPPParams.vpp.Out.Height = (MFX_PICSTRUCT_PROGRESSIVE == m_VPPParams.vpp.Out.PicStruct)?
MSDK_ALIGN16(m_VPPParams.vpp.Out.CropH) : MSDK_ALIGN32(m_VPPParams.vpp.Out.CropH);
m_VPPParams.IOPattern = MFX_IOPATTERN_IN_SYSTEM_MEMORY | MFX_IOPATTERN_OUT_SYSTEM_MEMORY;
m_VPPParams.AsyncDepth = m_nAsyncDepth;
// Query number of required surfaces for decoder
mfxFrameAllocRequest DecRequest;
memset(&DecRequest, 0, sizeof(DecRequest));
sts = m_pmfxDEC->QueryIOSurf(&m_mfxVideoParams, &DecRequest);
MSDK_IGNORE_MFX_STS(sts, MFX_WRN_PARTIAL_ACCELERATION);
MSDK_CHECK_RESULT(sts, MFX_ERR_NONE, sts);
// Query number of required surfaces for VPP
mfxFrameAllocRequest VPPRequest[2];// [0] - in, [1] - out
memset(&VPPRequest, 0, sizeof(mfxFrameAllocRequest)*2);
sts = m_mfxVPP->QueryIOSurf(&m_VPPParams, VPPRequest);
MSDK_CHECK_RESULT(sts, MFX_ERR_NONE, sts);
// Determine the required number of surfaces for decoder output (VPP input) and for VPP output
nSurfNumDecVPP = DecRequest.NumFrameSuggested + VPPRequest[0].NumFrameSuggested;
nSurfNumVPPOut = VPPRequest[1].NumFrameSuggested;
// Allocate surfaces for decoder and VPP In
// - Width and height of buffer must be aligned, a multiple of 32
// - Frame surface array keeps pointers all surface planes and general frame info
mfxU16 width = (mfxU16)MSDK_ALIGN32(DecRequest.Info.Width);
mfxU16 height = (mfxU16)MSDK_ALIGN32(DecRequest.Info.Height);
mfxU8 bitsPerPixel = 12; // NV12 format is a 12 bits per pixel format
mfxU32 surfaceSize = width * height * bitsPerPixel / 8;
m_surfaceBuffers = (mfxU8 *)new mfxU8[surfaceSize * nSurfNumDecVPP];
m_pmfxSurfaces = new mfxFrameSurface1*[nSurfNumDecVPP];
MSDK_CHECK_POINTER(m_pmfxSurfaces, MFX_ERR_MEMORY_ALLOC);
for (int i = 0; i < nSurfNumDecVPP; i++)
{
m_pmfxSurfaces[i] = new mfxFrameSurface1;
memset(m_pmfxSurfaces[i], 0, sizeof(mfxFrameSurface1));
memcpy(&(m_pmfxSurfaces[i]->Info), &(m_mfxVideoParams.mfx.FrameInfo), sizeof(mfxFrameInfo));
m_pmfxSurfaces[i]->Data.Y = &m_surfaceBuffers[surfaceSize * i];
m_pmfxSurfaces[i]->Data.U = m_pmfxSurfaces[i]->Data.Y + width * height;
m_pmfxSurfaces[i]->Data.V = m_pmfxSurfaces[i]->Data.U + 1;
m_pmfxSurfaces[i]->Data.Pitch = width;
}
// Allocate surfaces for VPP Out
// - Width and height of buffer must be aligned, a multiple of 32
// - Frame surface array keeps pointers all surface planes and general frame info
width = (mfxU16)MSDK_ALIGN32(VPPRequest[1].Info.Width);
height = (mfxU16)MSDK_ALIGN32(VPPRequest[1].Info.Height);
bitsPerPixel = 32; // RGB4 format is a 32 bits per pixel format
surfaceSize = width * height * bitsPerPixel / 8;
m_surfaceBuffers2 = (mfxU8 *)new mfxU8[surfaceSize * nSurfNumVPPOut];
m_pVppSurfaces = new mfxFrameSurface1*[nSurfNumVPPOut];
MSDK_CHECK_POINTER(m_pVppSurfaces, MFX_ERR_MEMORY_ALLOC);
for (int i = 0; i < nSurfNumVPPOut; i++)
{
m_pVppSurfaces[i] = new mfxFrameSurface1;
memset(m_pVppSurfaces[i], 0, sizeof(mfxFrameSurface1));
memcpy(&(m_pVppSurfaces[i]->Info), &(m_VPPParams.vpp.Out), sizeof(mfxFrameInfo));
m_pVppSurfaces[i]->Data.B = &m_surfaceBuffers2[surfaceSize * i];
m_pVppSurfaces[i]->Data.G = m_pVppSurfaces[i]->Data.B + 1;
m_pVppSurfaces[i]->Data.R = m_pVppSurfaces[i]->Data.B + 2;
m_pVppSurfaces[i]->Data.A = m_pVppSurfaces[i]->Data.B + 3;
m_pVppSurfaces[i]->Data.Pitch = width * 4;
}
// Initialize the Media SDK decoder
sts = m_pmfxDEC->Init(&m_mfxVideoParams);
MSDK_IGNORE_MFX_STS(sts, MFX_WRN_PARTIAL_ACCELERATION);
MSDK_CHECK_RESULT(sts, MFX_ERR_NONE, sts);
// Initialize Media SDK VPP
sts = m_mfxVPP->Init(&m_VPPParams);
MSDK_IGNORE_MFX_STS(sts, MFX_WRN_PARTIAL_ACCELERATION);
MSDK_CHECK_RESULT(sts, MFX_ERR_NONE, sts);
}
Regards,
Ankush
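The surface allocation loops in the code above carve one large buffer into per-surface plane pointers. The following standalone sketch shows that indexing for the NV12 case only. `PlanePtrs` and `carveNv12` are hypothetical names used as a simplified stand-in for the SDK's surface structure; the sketch does not touch any Media SDK API and uses `std::vector` instead of raw `new` so that cleanup is automatic.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Simplified, hypothetical stand-in for the plane pointers of one surface.
struct PlanePtrs {
    std::uint8_t* Y;      // start of the luma plane
    std::uint8_t* UV;     // start of the interleaved chroma plane
    std::uint16_t pitch;  // bytes per row
};

// Resize `pool` to hold `count` NV12 surfaces and return one PlanePtrs per
// surface, each pointing into its own slice of the pool.
std::vector<PlanePtrs> carveNv12(std::vector<std::uint8_t>& pool,
                                 std::uint16_t width, std::uint16_t height,
                                 std::size_t count) {
    // NV12 uses 12 bits per pixel: a full-size Y plane plus a half-size UV plane.
    const std::size_t lumaSize = static_cast<std::size_t>(width) * height;
    const std::size_t surfaceSize = lumaSize * 12 / 8;
    pool.assign(surfaceSize * count, 0);

    std::vector<PlanePtrs> surfaces(count);
    for (std::size_t i = 0; i < count; ++i) {
        surfaces[i].Y = pool.data() + surfaceSize * i;  // slice i of the pool
        surfaces[i].UV = surfaces[i].Y + lumaSize;      // chroma follows luma
        surfaces[i].pitch = width;
    }
    return surfaces;
}
```

Each element is addressed through its index `i`, so every surface gets a distinct slice of the pool. The pool must outlive the returned pointers.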
Hi Ankush,
One thing to consider in your code is to use video memory instead of system memory, so that no extra copy is involved; this reduces CPU utilization. Your decoder currently uses system memory:
m_mfxVideoParams.IOPattern = MFX_IOPATTERN_OUT_SYSTEM_MEMORY;
The same goes for the VPP IO pattern:
m_VPPParams.IOPattern = MFX_IOPATTERN_IN_SYSTEM_MEMORY | MFX_IOPATTERN_OUT_SYSTEM_MEMORY;
Another important point from the system analyzer output is to update the driver to the latest version. There have been a lot of fixes in the latest drivers, so it is best to keep the system updated. You can check for and download the driver from here - downloadcenter.intel.com. I am hoping the partial acceleration warning will not be seen after that.
Also, it would be best to use the Media SDK 2014 R2 release. In the past, such issues have been fixed by upgrading to the latest driver and Media SDK release.
If the problem still exists, please send us the complete code in a file, along with the input and directions on how to run it.
Thanks,
-Surbhi