We are using an Intel Media SDK pipeline to decompress an H.264 1080p stream into NV12 (YUV 4:2:0) and then convert it to RGB4 (1080p).
We use one pipeline per stream, each initialized to run with the hardware library and VIDEO_MEMORY.
Currently we run this as a 32-bit process on an i7 4770K machine.
We are aware that we hit a memory limit at around 6 streams, beyond which we cannot initialize further pipelines using VIDEO_MEMORY.
What we are not clear on is why, even though the pipeline is initialized to use VIDEO_MEMORY, RAM usage is ~1.5 GB (private bytes, observed with Perfmon).
Beyond this point, as a fallback mechanism, we wanted to use SYSTEM_MEMORY for further streams.
However, we observed that for any new stream after this, Init() fails even when we initialize the Media SDK pipeline with the software library (the error returned is MFX_ERR_UNSUPPORTED).
On a trial basis, we tried using our existing IPP libraries once Media SDK Init() starts failing (i.e., after ~6 streams). We observed that even `new` allocations start failing at that point.
When we run this as a 64-bit process, we hit the memory limit and face the same bottleneck after ~20-21 streams.
We are unable to understand the following:
- Even when the pipeline is initialized to use VIDEO_MEMORY, why is RAM being used (~1.5 GB for 6 streams)?
- We want to scale up to 64 parallel 1080p streams, using VIDEO_MEMORY until it hits its limit and then falling back to SYSTEM_MEMORY. How can this fallback be achieved?
- Is it possible to decode 64 parallel 1080p streams using Media SDK?
- Our i7 4770K machine reports 128 MB Dedicated Video Memory and 1632 MB Shared System Memory. How does this memory get used by Media SDK in a 32-bit process versus a 64-bit process?
The current driver has a known issue when using System Memory, and you may want to check behavior with the latest "Beta" driver available on intel.com (assuming you are using Intel 3rd or 4th generation Core Processor supported by the Beta graphics driver).
Also, the default configuration of a single Media SDK session is optimized for the performance of a single stream, including the number of buffers used. For example, setting AsyncDepth = 1 will minimize unnecessary buffering.
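In a typical decode setup, this is a single field on the `mfxVideoParam` structure passed to Init(); a minimal fragment, assuming an otherwise standard AVC decode configuration:

```cpp
#include <cstring>
// Requires the Media SDK headers (mfxvideo.h) and dispatcher library.

mfxVideoParam par;
std::memset(&par, 0, sizeof(par));
par.mfx.CodecId = MFX_CODEC_AVC;                 // H.264 decode
par.IOPattern   = MFX_IOPATTERN_OUT_VIDEO_MEMORY;
par.AsyncDepth  = 1;  // minimize internal buffering; the default trades
                      // memory for single-stream throughput
```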
Also, the usage of Video and System memory can be very different when using D3D9 vs. D3D11.
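The infrastructure can be selected explicitly when the session is created; a sketch of requesting the D3D11 path first and falling back to D3D9 (this is a configuration fragment, assuming the Media SDK headers and dispatcher are available):

```cpp
// Requires mfxvideo.h and the Media SDK dispatcher library.

mfxVersion ver = {{0, 1}};  // {Minor, Major}: request API 1.0 or later
mfxSession session;

// Prefer the D3D11 infrastructure; memory behavior may differ from D3D9.
mfxStatus sts = MFXInit(MFX_IMPL_HARDWARE_ANY | MFX_IMPL_VIA_D3D11,
                        &ver, &session);
if (sts != MFX_ERR_NONE) {
    // Fall back to the D3D9 path if D3D11 is unavailable.
    sts = MFXInit(MFX_IMPL_HARDWARE_ANY | MFX_IMPL_VIA_D3D9,
                  &ver, &session);
}
```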