Media (Intel® Video Processing Library, Intel Media SDK)

Advice on decoder

Mihail_P_

Greetings

I'm working on a multithreaded H.264 decoder/renderer. It is based on the sample_decode example, adapted for multithreading. I'm restricted to Direct3D9 and a 32-bit build. Moreover, I need to have the decoded frames in system memory. My configuration is Core i7-4790 (HD 4600, latest drivers), Windows 8.1 x64, INDE 2015 Pro, MediaSDK 6.0.0.388.

I've tried several approaches:

1. IOPattern = MFX_IOPATTERN_OUT_VIDEO_MEMORY. This works well: I am able to decode and show 20+ streams of 1080p 25 Hz H.264. But when I need the decoded images in system memory, I have problems. Locking surfaces via LockRect is extremely slow; I'm not able to handle even one Full HD stream. I've redesigned it to use an OffscreenPlainSurface combined with GetRenderTargetData, as Microsoft suggests, and was able to achieve about 85-90 fps (3 streams work fine; with 4, decoding slows down). Is it possible to speed this up? One more thing: while decoding 22-23 Full HD streams, my decoder consumes about 600 MB of system memory. Adding any more streams leads to a crash at a random location in the code, which I think is caused by running out of some resource, but I have no clue which one. Any ideas about this?
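
As a rough sanity check on the readback figures in approach 1, the sketch below estimates how many bytes per second have to move from video memory to system memory. It assumes the decoder output is NV12 (1.5 bytes per pixel) and that the frame height is padded to a multiple of 16, so 1080 becomes 1088. The function names are illustrative only and are not part of Media SDK or Direct3D.

```cpp
#include <cassert>
#include <cstdint>

// Round v up to the next multiple of `align`; `align` must be a power of two.
inline std::uint32_t AlignUp(std::uint32_t v, std::uint32_t align) {
    assert(align != 0 && (align & (align - 1)) == 0);
    return (v + align - 1) & ~(align - 1);
}

// Bytes in one NV12 frame: a full-resolution Y plane plus a half-height
// interleaved UV plane. Assumption: both dimensions are padded to 16 and
// the row pitch equals the padded width (real pitches may be larger).
inline std::uint64_t Nv12FrameBytes(std::uint32_t width, std::uint32_t height) {
    const std::uint64_t w = AlignUp(width, 16);
    const std::uint64_t h = AlignUp(height, 16);
    return w * h * 3 / 2;
}

// Bytes per second that must be read back for `streams` streams at `fps`.
inline std::uint64_t ReadbackBytesPerSecond(std::uint32_t width,
                                            std::uint32_t height,
                                            std::uint32_t streams,
                                            std::uint32_t fps) {
    return Nv12FrameBytes(width, height) * streams * fps;
}
```

Under these assumptions one 1080p frame is 3,133,440 bytes, so 85-90 fps of readback corresponds to roughly 266-282 MB/s, while four 25 fps streams would need about 313 MB/s. That is consistent with three streams working and the fourth one slowing decoding down.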

2. IOPattern = MFX_IOPATTERN_OUT_SYSTEM_MEMORY. This works well while the stream count is relatively small (~10-12). With more streams, my machine can hang at a random moment without any warning or error. Rendering is difficult too. The main question is: how can I create a Direct3D surface from mfxFrameData? Right now I'm using an OffscreenPlainSurface, whose contents I update through a D3DLOCKED_RECT. Then I call UpdateSurface to copy it to GPU memory, and so on. But this approach involves a lot of memory copies, and I see very high CPU load with more than 5 streams (compared with the case without rendering). Is there a more efficient way to create a Direct3D surface from mfxFrameData?
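
The copy step in approach 2, from the decoded frame into the locked surface, has to respect two different row pitches. The sketch below shows one way to do that for NV12 using plain pointers. The parameters stand in for the Y, UV and Pitch fields of mfxFrameData and for the pBits and Pitch fields of D3DLOCKED_RECT; the struct and function names are hypothetical. It also assumes the UV plane of the locked surface starts at pBits + Pitch * height, which should be verified for the surface format in use. It does not remove the copy, it only avoids per-row overhead when the pitches match.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical views over one image plane (not Media SDK or Direct3D types).
struct SrcPlane { const std::uint8_t* data; std::uint32_t pitch; };
struct DstPlane { std::uint8_t* data; std::uint32_t pitch; };

// Copy `rows` rows of `rowBytes` bytes each between planes whose pitches may differ.
inline void CopyPlane(SrcPlane src, DstPlane dst,
                      std::uint32_t rowBytes, std::uint32_t rows) {
    assert(src.pitch >= rowBytes && dst.pitch >= rowBytes);
    if (rows == 0 || rowBytes == 0) return;
    if (src.pitch == dst.pitch) {
        // Same layout: one contiguous copy, stopping at the end of the last row.
        std::memcpy(dst.data, src.data,
                    static_cast<std::size_t>(src.pitch) * (rows - 1) + rowBytes);
        return;
    }
    for (std::uint32_t r = 0; r < rows; ++r) {
        std::memcpy(dst.data + static_cast<std::size_t>(r) * dst.pitch,
                    src.data + static_cast<std::size_t>(r) * src.pitch,
                    rowBytes);
    }
}

// Copy an NV12 frame (Y plane plus interleaved UV plane at half height).
// Assumption: the destination UV plane begins at dstBits + dstPitch * height.
inline void CopyNv12(const std::uint8_t* srcY, const std::uint8_t* srcUV,
                     std::uint32_t srcPitch,
                     std::uint8_t* dstBits, std::uint32_t dstPitch,
                     std::uint32_t width, std::uint32_t height) {
    CopyPlane({srcY, srcPitch}, {dstBits, dstPitch}, width, height);
    CopyPlane({srcUV, srcPitch},
              {dstBits + static_cast<std::size_t>(dstPitch) * height, dstPitch},
              width, height / 2);
}
```

In a real application the call would sit between locking and unlocking the offscreen surface, with the source pointers taken from the decoded frame.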

Mihail_P_

The most annoying part is that the PC hangs while decoding into system memory. Are there any tips or solutions?

Sravanthi_K_Intel

Hi Mihail, sorry for the delayed response on this one. Unfortunately, DirectX-based questions are better suited to a Microsoft forum than to this one. The MSDK provides you with APIs to copy data between system and video memory and callbacks to the decoder logic, among other things, but we do not provide support or expertise on how to use these surfaces to render the frames on screen. That is where DirectX comes in, and we leave it to the application developer to handle the player implementation.

Having said that, if you are looking at pure decode performance (without rendering or players), our AVC decoders can comfortably sustain more than 20 1080p streams in real time on the system you are looking at. But once you add the render/player logic, the number of concurrent streams that can be decoded will depend mostly on the player implementation. Our implementation of a simple player is available in sample_decode, and you are already familiar with it. Not sure if I can offer more help here.

Mihail_P_

Hi Sravanthi, thank you for your answer.

I've found the cause of the freezes; it was not MSDK, so you can close this thread.

Sravanthi_K_Intel

Thanks for getting back to us, Mihail - will close the thread.
