
What's the best strategy to fully utilize a computer with hardware/software encoding?

Kelvin
Beginner
Hi,

My objective is to fully utilize the computer to encode as many frames as possible using a combination of the graphics card and the CPU. The input is already decoded to NV12 and is waiting to be distributed to the encode threads described below; there is no VPP stage. Each encode call is followed immediately by a sync that waits for the result.

Assume the computer has N cores and uses an Intel graphics card. Here are a couple of approaches I can think of:

(1) Create N+1 threads using MFX_IMPL_AUTO with shared sessions, and hope that Media SDK is smart enough to distribute the work between software and hardware encoding. The encoding parameters are set to use a single thread for every session.

(2) Create N+1 threads: one uses MFX_IMPL_HARDWARE, and the other N use MFX_IMPL_SOFTWARE with shared sessions. The encoding parameters are set to use a single thread for every session.

(3) Create 2 threads: one uses MFX_IMPL_HARDWARE with the encoding parameters set to a single thread, and the other uses MFX_IMPL_SOFTWARE with the encoding parameters set to N threads.

(4) { Fill in your own here :) }

Which one do you think is best?

Thanks.


- Kelvin

IDZ_A_Intel
Employee
Hi Kelvin,

By graphics card, you mean Intel processor graphics, not a discrete card, right?

Some comments on the options you listed:

(1) If Intel Media SDK is used with the AUTO implementation, the actual implementation (HW or SW) is selected when the session is initialized. There is no logic for load balancing between HW and SW.

(2) This may be the best approach for what you're trying to do. However, if you want to encode multiple streams on the HW, you will likely need a corresponding number of threads.
The trick to this approach is knowing how much load you can put on the HW while maintaining your target fps and without saturating the GPU. If your workloads are uniform, this may be a simple calculation (based on your own workload metrics) to determine the number of HW encodes that can run simultaneously. If your workloads are not uniform, inferring GPU load will be complicated. Instead, you may want to explore the new capabilities of Intel Graphics Performance Analyzer (GPA) 4.1, to which we have added an API for querying the GPU's momentary load.

(3) No
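
For the uniform-workload case in (2), the "simple calculation" could look roughly like the sketch below. It is only an illustration: the function name, the `headroom` margin, and the idea of measuring one aggregate HW throughput figure are assumptions, not anything Media SDK provides. The inputs must come from your own measurements.

```python
import math


def max_hw_sessions(hw_total_fps: float,
                    target_fps_per_stream: float,
                    headroom: float = 0.1) -> int:
    """Estimate how many uniform streams the HW encoder can sustain.

    hw_total_fps          -- aggregate HW encode throughput you measured yourself
    target_fps_per_stream -- frame rate each stream must maintain
    headroom              -- assumed safety margin (fraction of throughput
                             left unused so the GPU is not saturated)
    """
    if hw_total_fps <= 0 or target_fps_per_stream <= 0:
        raise ValueError("throughput and target fps must be positive")
    if not 0.0 <= headroom < 1.0:
        raise ValueError("headroom must be in [0, 1)")
    usable_fps = hw_total_fps * (1.0 - headroom)
    return math.floor(usable_fps / target_fps_per_stream)
```

This only holds while every stream costs about the same; with mixed resolutions or bitrates the estimate stops being meaningful, which is where querying the GPU load directly becomes more useful.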


Note that the NumThread parameter is deprecated starting from Media SDK 3.0. This parameter has no real relevance in the HW codec case. In the SW codec case, make sure to use Join and Disjoin to enable efficient sharing of Media SDK resources (avoiding thread oversubscription).

Regards,
Petter
Kelvin
Beginner
Hi Petter,

Yes, using Intel processor graphics.
Thanks for your advice. In response to (2): the input queue feeds work to the HW or SW threads whenever their previous task is done, so the workload should balance automatically. That's why I think one HW thread is enough, unless the HW can run multiple encodes in parallel.
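
A minimal sketch of the shared-input-queue idea described here, using only the Python standard library. The `encode_hw` and `encode_sw` functions are hypothetical stand-ins for a synchronous encode call on a hardware or software session; no real Media SDK calls are made.

```python
import queue
import threading


def encode_hw(item):
    """Hypothetical stand-in for a blocking encode on a HW session."""
    return ("hw", item)


def encode_sw(item):
    """Hypothetical stand-in for a blocking encode on a SW session."""
    return ("sw", item)


def worker(encode, work, results):
    # Each worker pulls the next item as soon as its previous one is done,
    # so faster workers naturally take on more of the load.
    while True:
        item = work.get()
        if item is None:  # sentinel: no more work for this worker
            break
        results.put(encode(item))


def run_pool(items, num_hw, num_sw):
    """Encode all items with num_hw HW workers and num_sw SW workers."""
    work = queue.Queue()
    results = queue.Queue()
    encoders = [encode_hw] * num_hw + [encode_sw] * num_sw
    threads = [threading.Thread(target=worker, args=(enc, work, results))
               for enc in encoders]
    for t in threads:
        t.start()
    for item in items:
        work.put(item)
    for _ in threads:
        work.put(None)  # one sentinel per worker
    for t in threads:
        t.join()
    out = []
    while not results.empty():
        out.append(results.get())
    return out  # completion order, not input order
```

Results come back in completion order, so anything that needs the original order has to carry an index with each item. The items are treated as opaque here; in a real encoder a work item would probably need to be a whole stream or segment rather than a single frame, since an encode session keeps state between frames.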

Thanks !

- Kelvin
IDZ_A_Intel
Employee
The HW can certainly handle multiple encode sessions in parallel.
Regards,
Petter
Kelvin
Beginner
Hi Petter,

Good to know. Then I will run a simulation with N SW threads and M HW threads to figure out the optimal value of M! If I knew how many parallel encode sessions the HW can handle, I could set M equal to that.
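
One way such a sweep over M might be organized is sketched below. The `measure_fps` callback is hypothetical: it is assumed to run the actual benchmark with M HW threads and return the total throughput. The 5% `min_gain` cutoff is likewise an arbitrary assumption.

```python
from typing import Callable


def pick_hw_thread_count(measure_fps: Callable[[int], float],
                         max_m: int,
                         min_gain: float = 0.05) -> int:
    """Return the largest M reached before throughput stops improving.

    measure_fps -- hypothetical callback: runs the benchmark with M HW
                   threads and returns the measured total fps
    max_m       -- largest M to try
    min_gain    -- assumed minimum relative improvement that justifies
                   adding one more HW thread
    """
    if max_m < 1:
        raise ValueError("max_m must be at least 1")
    best_m = 1
    best_fps = measure_fps(1)
    for m in range(2, max_m + 1):
        fps = measure_fps(m)
        if fps < best_fps * (1.0 + min_gain):
            break  # one more HW thread no longer pays off
        best_m, best_fps = m, fps
    return best_m
```

The sweep stops at the first M that fails to improve on the previous one, so it assumes throughput rises and then flattens; noisy measurements would need several runs per M.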

Thanks a lot !

- Kelvin

