Media (Intel® Video Processing Library, Intel Media SDK)

How to set the value of "AsyncDepth", and how to calculate the number of surfaces used by VPP out & ENC in?

Zheng_L_
Beginner
1,489 Views

What is the meaning of the "AsyncDepth" parameter of "mfxVideoParam"? Does it mean how many asynchronous tasks there are in a session? For example, if I create a session that has VPP (doing color conversion & image scaling) and ENC tasks, should I set "AsyncDepth" to 2 (or 4?), and if the session has DEC, VPP & ENC tasks, should I set it to 3? Is that right?

The SDK documentation says that if a session has more than one asynchronous task, the tasks should share the same surface pool to get better performance. My session has VPP and ENC tasks, so I allocate one surface pool for VPP out & ENC in. According to <<mediasdk-man.pdf>>, the size of the common surface pool should be calculated like this:
mfxU16 async_depth = 4;
mfxFrameAllocRequest response_v[2]; /* [0] = VPP in, [1] = VPP out */
mfxFrameAllocRequest response_e;
init_param_v.AsyncDepth = async_depth;
MFXVideoVPP_QueryIOSurf(session, &init_param_v, response_v);
init_param_e.AsyncDepth = async_depth;
MFXVideoENCODE_QueryIOSurf(session, &init_param_e, &response_e);
num_surfaces = response_v[1].NumFrameSuggested + response_e.NumFrameSuggested - async_depth; /* double counted in ENCODE & VPP */

but the demo "sample_encode" calculates it in another way:
nEncSurfNum = EncRequest.NumFrameSuggested + MSDK_MAX(VppRequest[1].NumFrameSuggested, 1) - 1 + (m_nAsyncDepth - 1);

The manual says to subtract "async_depth" (because it is double counted), but the demo adds "(m_nAsyncDepth - 1)". Which one is right? Which formula should I use?

6 Replies
Petter_L_Intel
Employee
1,488 Views

Hi Zheng,

The intent of AsyncDepth is to allow Media SDK to process several tasks (frames) in parallel. To do this, the user sets the desired AsyncDepth and implements a task handling mechanism such as the one illustrated in the "sample_encode" sample.
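In a minimal C++ sketch, that pattern looks roughly like the lines below (a simplification, not the exact sample code: bitstream buffer allocation, surface locking, error handling, and MFX_ERR_MORE_DATA handling are all omitted, and GetFreeInputSurface / WriteBitstreamToFile are hypothetical helpers):

std::vector<mfxBitstream> bs(asyncDepth);    // one output buffer per in-flight task
std::vector<mfxSyncPoint> syncp(asyncDepth); // one sync point per in-flight task
for (int i = 0; i < asyncDepth; ++i) {
    mfxFrameSurface1* in = GetFreeInputSurface();  // hypothetical helper: returns an unlocked raw surface
    MFXVideoENCODE_EncodeFrameAsync(session, NULL, in, &bs[i], &syncp[i]);
}
for (int i = 0; i < asyncDepth; ++i) {
    MFXVideoCORE_SyncOperation(session, syncp[i], 60000); // wait for the oldest task first
    WriteBitstreamToFile(bs[i]);                          // hypothetical helper
}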

Asynchronous usage via AsyncDepth is also illustrated in a simpler way as part of the Media SDK tutorial samples, http://software.intel.com/en-us/articles/intel-media-sdk-tutorial. I encourage you to take a look at these samples, since they should help explain the intent and usage of AsyncDepth.

The number of surfaces needed for processing with an asynchronous pipeline is admittedly a bit confusing. The reason is that the meaning of AsyncDepth has changed compared to the very early releases of the Media SDK API, so the sample tries to cover both the older and the more recent behavior. We plan to revisit this topic as part of sample improvements for the next release of Media SDK.
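To make the "shared surface pool" idea concrete, here is a rough sketch of one frame going through VPP and then ENCODE using a surface from the same pool (parameter setup, surface allocation, error handling, and the MFX_ERR_MORE_* return codes are all omitted; GetFreeSurfaceIndex is a hypothetical helper that returns the index of an unlocked surface):

std::vector<mfxFrameSurface1> pool(num_surfaces); // sized per the QueryIOSurf-based formula above
// ... set pool[i].Info from the VPP output FrameInfo and allocate pool[i].Data planes ...
mfxSyncPoint syncpVpp = NULL, syncpEnc = NULL;
int idx = GetFreeSurfaceIndex(pool);              // hypothetical helper
MFXVideoVPP_RunFrameVPPAsync(session, rawSurface, &pool[idx], NULL, &syncpVpp);    // VPP writes into the pool
MFXVideoENCODE_EncodeFrameAsync(session, NULL, &pool[idx], &bitstream, &syncpEnc); // ENCODE reads the same surface
MFXVideoCORE_SyncOperation(session, syncpEnc, 60000); // syncing the last stage syncs the whole chain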

Regards,
Petter 

Zheng_L_
Beginner
1,488 Views

Thanks Petter, I have downloaded the tutorial samples.

Zheng_L_
Beginner
1,488 Views

Hi Petter,

I have read the code in "intel_media_sdk_tutorial_041813", and I think the meaning of "asyncDepth" is the maximum number of frames that an MFX task can hold before I call "SyncOperation()" to get the processing result. For example, if I set "asyncDepth" to 4 for an encoding task, I can call "EncodeFrameAsync()" 4 times in a row and then call "SyncOperation()" to get the bitstream output. Is my understanding right?

My app that integrates MFX is an inline capture & encode application. It captures a YV12 image from a card every 40 ms (25 fps), pushes it to a VPP task for CSC & DI operations, and then does H.264 encoding. In contrast, the tutorial samples (including the samples in MediaSDK) are all offline programs: they read source images from a disk file, and the file reading speed can be faster than the encoding, decoding, or transcoding speed, so they set "asyncDepth" to 4 to let the MFX task hold more data (like a cache). In my program the source image comes from a capture card every 40 ms, so in each capture cycle there is only one image to process, and I create a dedicated thread just to read out the encoder output. Can I set "asyncDepth" to 1, or just set it to zero (and ignore the parameter)? Also, can I create only one, or two, encoder output buffers?

Petter_L_Intel
Employee
1,488 Views

Hi Zheng,

Your interpretation is correct. And if you use several tasks you must also use several output bitstream buffers, as illustrated in the sample code.

But based on your described use case it seems the most important performance metric is latency. Is that correct? If so, then I suggest you avoid parallel processing and instead set AsyncDepth to 1, and also follow the other suggested encoder/decoder configurations for low latency. This use case is illustrated in the Media SDK "videoconf" sample and in the low latency Media SDK tutorial samples.
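For reference, the low-latency setup in those samples boils down to roughly the following sketch (H.264 assumed; resolution, rate control, the other mfxVideoParam fields, and error handling are omitted):

mfxVideoParam encParams = {};
encParams.AsyncDepth      = 1;             // one task in flight: minimal internal buffering
encParams.mfx.CodecId     = MFX_CODEC_AVC;
encParams.mfx.GopRefDist  = 1;             // no B-frames, so no frame reordering delay
encParams.mfx.NumRefFrame = 1;             // single reference frame
// ... plus the usual FrameInfo, target bitrate, etc. ...
MFXVideoENCODE_Init(session, &encParams);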

Regards,
Petter 

Zheng_L_
Beginner
1,488 Views

Hi Petter,

The central performance metric of my app is not latency; it writes the bitstream to a disk file, not to a network, so latency is not a problem. What I am concerned with is how to set "asyncDepth" and the number of in/out buffers so that the encoder (including VPP processing) runs in its best condition, because my app will do at least 4 channels of HD stream encoding, or one 4K stream encoding.

Yabo_W_
Beginner
1,488 Views

Thanks Zheng and Petter,

This topic solved my application's problem. My problem was that the DecodeTimeStamp was non-monotonic after MFXVideoCORE_SyncOperation() calls during QSV encode.
