
Decompress H.264 1080p stream into NV12 960 x 540 resolution

KK
Beginner

We are planning to use the latest Intel Media SDK (2013 release) to decode an H.264 1080p stream into NV12 frames at 960 x 540 resolution.

Please confirm whether VPP is required to scale the output to half size, or whether the decoder alone can produce the output stream at the reduced resolution (960 x 540).

If the decoder alone can do this, which decoder parameter must be set?

Petter_L_Intel
Employee

Hi,

VPP must be used after the decode stage to perform resolution scaling. The components can be asynchronously pipelined for optimal performance.
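As a rough illustration of the frame sizes involved in such a VPP stage, the sketch below computes the aligned dimensions for the 1080p input and the 960 x 540 output. `FrameDims` is a hypothetical stand-in for the size fields of `mfxFrameInfo` (`Width`, `Height`, `CropW`, `CropH`), and the code assumes progressive content, for which the SDK expects width and height to be multiples of 16. It does not call the SDK.

```cpp
#include <cstdint>

// Hypothetical stand-in for the size fields of mfxFrameInfo
// (Width, Height, CropW, CropH); this is not the SDK type itself.
struct FrameDims {
    std::uint16_t Width;   // allocated width, rounded up to a multiple of 16
    std::uint16_t Height;  // allocated height, rounded up to a multiple of 16
    std::uint16_t CropW;   // visible width
    std::uint16_t CropH;   // visible height
};

// Round v up to the next multiple of 16.
constexpr std::uint16_t AlignUp16(std::uint16_t v) {
    return static_cast<std::uint16_t>((v + 15) & ~15);
}

// Build aligned dimensions for a progressive frame of the given visible size.
constexpr FrameDims MakeProgressiveDims(std::uint16_t w, std::uint16_t h) {
    return FrameDims{AlignUp16(w), AlignUp16(h), w, h};
}

// VPP input matches the decoder output: 1080p is stored as 1920 x 1088.
constexpr FrameDims kVppIn = MakeProgressiveDims(1920, 1080);
// VPP output: 960 x 540 visible, stored as 960 x 544.
constexpr FrameDims kVppOut = MakeProgressiveDims(960, 540);
```

In a real pipeline these values would be copied into the `vpp.In` and `vpp.Out` frame info of the VPP parameters, with the crop fields carrying the visible 960 x 540 region.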

Regards,
Petter

KK
Beginner

Hi Petter,

Thanks for the clarification. We have another query...

We are decoding 16 independent H.264 streams (1080p) using 16 threads running in the same process.

However, after 9 or 10 streams the allocator can no longer allocate video memory. The Alloc() method returns error code -4 (MFX_ERR_MEMORY_ALLOC).

The same workload succeeds when each stream runs in a separate process. Is there a per-process limit? Please advise.

regards,

KK

Petter_L_Intel
Employee

Hi,

I suspect the limitation you are encountering is due to a lack of available graphics memory (not your threading approach). If you are using a 32-bit OS, switching to a 64-bit OS will free up more graphics memory resources. You could also try modifying the decode configuration slightly, for example by setting AsyncDepth=1, to reduce the number of internally buffered surfaces.

Regards,
Petter

KK
Beginner

Thanks Petter.

The configuration we are currently using is:

  • 64-bit OS with AsyncDepth=1.
  • We are running the application on an i7-4770K processor with Intel HD Graphics 4600, which reports the following graphics memory:
    • Total available graphics memory : 1760 MB
    • Dedicated Video Memory : 128 MB
    • Shared Video Memory: 1632 MB

Problem Statement:

We have modified the Intel Media SDK’s ‘sample_decode’ so that each salvo (window) runs in the same process but on its own dedicated thread. In total we want to run 16 salvos (windows).

DecRequest.NumFrameSuggested is 6 frames when m_pMFXAllocator->Alloc(…) is called.

Alloc(…) is therefore called 16 times (once per session), but it fails at around the 9th or 10th instance.

On further debugging we found that the m_decoderService->CreateSurface(…) call fails with MFX_ERR_MEMORY_ALLOC.

We have also observed that if m_pMFXAllocator->Alloc(…) is called once for 96 frames, the allocation succeeds. We therefore assume that memory availability is not the issue. Please advise.

It seems that the problem occurs when there are multiple allocators, one per window, and that it should not occur when a single allocator allocates all the surfaces. Please confirm. If this is correct, please explain how to use the same allocator for all sessions and windows.
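For reference, the raw size of the 96 surfaces can be estimated with simple arithmetic. The sketch below assumes NV12 at 12 bits per pixel and decode surfaces stored at the macroblock-aligned size of 1920 x 1088; it ignores driver pitch padding and any per-surface overhead, so the real footprint is likely somewhat higher.

```cpp
#include <cstdint>

// NV12 stores a full-resolution luma plane plus a half-height interleaved
// chroma plane: 12 bits per pixel, i.e. width * height * 3 / 2 bytes.
constexpr std::uint64_t Nv12Bytes(std::uint64_t width, std::uint64_t height) {
    return width * height * 3 / 2;
}

// Assumption: 1080p decode surfaces are stored at the aligned size 1920 x 1088.
constexpr std::uint64_t kSurfaceBytes = Nv12Bytes(1920, 1088);

// Figures taken from the post: 6 suggested surfaces per session, 16 sessions.
constexpr std::uint64_t kSurfacesPerSession = 6;
constexpr std::uint64_t kSessions = 16;

// Raw pixel data for all 96 surfaces, in bytes and in whole MiB.
constexpr std::uint64_t kTotalBytes =
    kSurfaceBytes * kSurfacesPerSession * kSessions;
constexpr std::uint64_t kTotalMiB = kTotalBytes / (1024 * 1024);
```

Under these assumptions each surface is about 3 MB and the 96 surfaces come to roughly 287 MiB, well below the 1760 MB of total graphics memory reported above.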

Regards,

KK

Petter_L_Intel
Employee

Hi,

Based on your description it's clear that the DirectX framework, at the point of calling CreateSurface, determines that there is not enough available memory. This is outside the control of Intel Media SDK.

One thing you could try is to encapsulate the DirectX and allocation part of the initialization stage and treat it as a shared resource locked by a critical section. It may help DirectX handle underlying resources in a more efficient way.
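A minimal sketch of that idea, using standard C++ threading rather than any SDK or DirectX calls: every window thread runs its initialization through one mutex, so only one thread at a time is inside the device setup and allocation stage. `InitWindowResources` and the counters are hypothetical names for illustration; the real DirectX and `Alloc()` calls would replace the placeholder comment.

```cpp
#include <mutex>
#include <thread>
#include <vector>

std::mutex g_initMutex;     // guards the device setup / allocation stage
int g_activeInits = 0;      // threads currently inside the guarded stage
int g_maxActiveInits = 0;   // highest concurrency observed inside the stage
int g_completedInits = 0;   // windows that finished initialization

// Hypothetical placeholder for per-window initialization.
void InitWindowResources(int /*windowIndex*/) {
    std::lock_guard<std::mutex> lock(g_initMutex);
    ++g_activeInits;
    if (g_activeInits > g_maxActiveInits) g_maxActiveInits = g_activeInits;
    // ... the real DirectX setup and surface allocation would go here ...
    --g_activeInits;
    ++g_completedInits;
}

// Start one thread per window; each runs its init through the critical section.
void InitAllWindows(int windowCount) {
    std::vector<std::thread> workers;
    for (int i = 0; i < windowCount; ++i)
        workers.emplace_back(InitWindowResources, i);
    for (auto& t : workers) t.join();
}
```

Calling `InitAllWindows(16)` runs all 16 initializations, but never more than one at a time. Only initialization is serialized here; decoding itself would still run concurrently on the per-window threads.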

Also, if you are not already doing so, try reusing the same DirectX device for all the concurrent workloads/windows (if that is feasible for your application). Doing this will save you some memory.
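One possible way to structure the sharing, sketched with standard C++ only: a single device object is created on first use and the same instance is handed to every window thread. `SharedDevice` and `GetSharedDevice` are hypothetical names standing in for the application's own DirectX device wrapper. With the Media SDK, the shared device manager handle would then be passed to each session through `MFXVideoCORE_SetHandle`, which is not shown because it needs the SDK headers.

```cpp
#include <memory>
#include <mutex>

// Hypothetical stand-in for the application's DirectX device wrapper.
struct SharedDevice {
    int id;
};

int g_devicesCreated = 0;   // how many devices were actually created

// Return the single process-wide device, creating it on first use.
// std::call_once makes the creation safe when many threads call this at once.
std::shared_ptr<SharedDevice> GetSharedDevice() {
    static std::once_flag once;
    static std::shared_ptr<SharedDevice> device;
    std::call_once(once, [] {
        device = std::make_shared<SharedDevice>(SharedDevice{0});
        ++g_devicesCreated;
    });
    return device;
}
```

Each of the 16 window threads would call `GetSharedDevice()` during its initialization instead of creating a device of its own.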

Regards,
Petter
