Media (Intel® Video Processing Library, Intel Media SDK)
Access community support with transcoding, decoding, and encoding in applications using media tools like Intel® oneAPI Video Processing Library and Intel® Media SDK

Low latency H264 decoding not working in 2nd or later sessions in multi-threaded decoding apps

Joon_C_1
Beginner
316 Views

I have been developing a low-latency H.264 player with multiple sessions (aiming at 36 sessions, 1280x720p at 30 Hz) using the 2013 SDK.

Our code is based on sample_decode, and I followed some low-latency tips from the forum (setting AsyncDepth to 1 and DataFlag to MFX_BITSTREAM_COMPLETE_FRAME).
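
For readers following along, the two settings named above can be sketched roughly as follows. So that the snippet compiles without the SDK, `VideoParamStandIn` and `BitstreamStandIn` are hypothetical stand-ins for the SDK's `mfxVideoParam` and `mfxBitstream`, and `kCompleteFrameFlag` stands in for `MFX_BITSTREAM_COMPLETE_FRAME` with an assumed value; only the field names `AsyncDepth` and `DataFlag` are taken from the post. Real code would include the SDK headers and use the real types.

```cpp
#include <cstdint>

// Hypothetical stand-ins so the sketch builds without the Media SDK headers.
// In a real application these would be mfxVideoParam and mfxBitstream.
struct VideoParamStandIn { std::uint16_t AsyncDepth = 0; };
struct BitstreamStandIn  { std::uint16_t DataFlag = 0; };

// Stand-in for MFX_BITSTREAM_COMPLETE_FRAME; the numeric value is an
// assumption, take the real one from the SDK header.
constexpr std::uint16_t kCompleteFrameFlag = 0x0001;

// Apply the two low-latency settings to ONE session's parameters.
void configure_low_latency(VideoParamStandIn& par, BitstreamStandIn& bs) {
    par.AsyncDepth = 1;                 // no extra frames queued in flight
    bs.DataFlag |= kCompleteFrameFlag;  // each input buffer holds a whole frame
}
```

In a multi-session player, a helper like this would presumably be called once per session, on that session's own parameter and bitstream structures, rather than only on the first one.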

But I found that low latency works only in the first thread. In the second and later threads, decoding is delayed by 17 or more frames.
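
To put the frame counts in time units, a delay in frames can be converted to milliseconds at the stream's frame rate. The helper name `frames_to_ms` is made up for this sketch.

```cpp
// Convert a delay measured in frames to milliseconds at a given frame rate.
constexpr double frames_to_ms(int frames, double fps) {
    return 1000.0 * frames / fps;
}
```

At the 30 Hz rate mentioned above, a 17-frame delay works out to roughly 567 ms.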

Any fixes to the problem?

Thanks in advance for looking into this.

Joon

4 Replies
Petter_L_Intel
Employee

Hi Joon,

For multi-channel usage you will not achieve the same low latency as you get when running a single session. This is not a bug but a direct result of the fact that applications share a finite amount of HW resources in the graphics device. If a single HW-accelerated workload is running, it can use all of the HW resources. If more than one workload is running, the underlying framework shares the resources evenly between them, so each individual workload executes more slowly (resulting in longer latency).

Regards,
Petter 

Joon_C_1
Beginner

Thanks Petter.

At the beginning of development I ran multiple executables of the low-latency player, each with a single session (up to 36 sessions in total).

Each session was fed its H.264 stream one frame at a time at 30 Hz, so as not to use too many HW resources.

The results were OK. Each session (in a different process) showed a 2-frame decoding delay. Video was fine.

GPU usage was about 26% in the monitoring tool, so I thought the Intel GPU had enough capacity for this case.

Then I moved to a multi-threaded player and got the different results described in the post above.

I am wondering: are HW resources divided differently among processes than among threads? Or are there limits on using threads?

Regards,

Joon

Petter_L_Intel
Employee

Hi Joon,

There are no issues with using sessions via separate processes (as when using individual executables) or via threads. The performance should be nearly the same.

I suspect you may have some issues with your threaded implementation, possibly a shared resource between the threads that results in the delay. I suggest you revisit your code to make sure each session executes without any such dependencies.
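
As a rough illustration of what "no such dependencies" can look like, the sketch below gives every session its own state and runs each one in its own thread, with no locks or buffers shared between threads. It does not use the Media SDK: `SessionState`, `decode_one_frame`, `run_session`, and `run_all_sessions` are hypothetical names, and the decode step is a placeholder that only counts frames.

```cpp
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical per-session state: everything one decode session touches.
struct SessionState {
    int id = 0;
    std::vector<unsigned char> bitstream_buffer;  // private input buffer
    int frames_decoded = 0;
};

// Placeholder for one decode call. Real code would submit the frame to this
// session's own decoder instance; here we only count it.
void decode_one_frame(SessionState& s) {
    ++s.frames_decoded;
}

// Thread body: touches only its own SessionState, so threads never contend.
void run_session(SessionState& s, int frame_count) {
    for (int i = 0; i < frame_count; ++i) {
        decode_one_frame(s);
    }
}

// Launch one thread per session and wait for all of them to finish.
std::vector<SessionState> run_all_sessions(int session_count, int frame_count) {
    // Sized up front, so references handed to the threads stay valid.
    std::vector<SessionState> sessions(static_cast<std::size_t>(session_count));
    for (int i = 0; i < session_count; ++i) {
        sessions[static_cast<std::size_t>(i)].id = i;
    }

    std::vector<std::thread> workers;
    workers.reserve(sessions.size());
    for (auto& s : sessions) {
        workers.emplace_back(run_session, std::ref(s), frame_count);
    }
    for (auto& w : workers) {
        w.join();
    }
    return sessions;
}
```

Calling `run_all_sessions(36, 30)` would mirror the 36-session target from the question. When auditing a real player against this pattern, candidates to check are anything reachable from more than one thread: a global mutex, a common input queue, a shared allocator or surface pool, or a single render path.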

There is a sample in the Media SDK tutorial that shows how to launch sessions in separate threads. It might be helpful to you.

Regards,
Petter 

Joon_C_1
Beginner

Thanks. I am going to revisit my code.

Regards,

Joon
