We are using the latest Intel Media SDK (2013 release) to do the following:
We are decoding 32 independent H.264 streams (1080p) using 32 threads running in the same process, with a separate pipeline for each stream.
However, after 20-21 streams, the allocator is no longer able to allocate video memory.
We observe that the Alloc() method returns error code -4 (MFX_ERR_MEMORY_ALLOC).
We have set AsyncDepth=1, and it is a 64-bit process running on an i7-4770K machine, yet we still hit this limit.
We suspect that this limitation is due to a lack of available graphics memory.
The latest Media SDK is 2014, not 2013. That's a lot of 1080p streams to be decoding in parallel; I'm impressed you can do this at all o.0. Anyhow, you might be able to change the amount of memory available to the iGPU in the BIOS.
Yes, I believe you are hitting the limit of video graphics memory. We are consistently working to improve memory use. I'll discuss your usage model with some engineers and report back here, but I believe using a single memory pool versus multiple pools will not make much difference.
Given that the bottleneck is memory, I do not believe you can expect the number of streams to scale up simply by lowering the frame rate. Frames from every stream need to remain in memory for use by other frames, regardless of how quickly they are needed, so when working with 40 streams you will see enough frames in memory to support 40 streams, whether they are needed at 15 or 30 fps.
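To put rough numbers on the point above, here is a minimal sketch of the arithmetic, assuming NV12 decode surfaces with width and height each padded to a multiple of 16. The function names are hypothetical, and the surfaces-per-stream count is only a placeholder: the real figure comes from the decoder (NumFrameSuggested in the request returned by MFXVideoDECODE_QueryIOSurf), and the driver may pad surfaces further than assumed here.

```cpp
#include <cstdint>

// Round v up to the next multiple of a (a must be a power of two).
inline std::uint64_t AlignUp(std::uint64_t v, std::uint64_t a) {
    return (v + a - 1) & ~(a - 1);
}

// Bytes in one NV12 surface: a full-resolution luma plane plus an
// interleaved chroma plane of half that size (12 bits per pixel).
// Assumption: both dimensions are padded to a multiple of 16.
inline std::uint64_t Nv12SurfaceBytes(std::uint64_t width, std::uint64_t height) {
    return AlignUp(width, 16) * AlignUp(height, 16) * 3 / 2;
}

// Total surface memory for all decoders. Frame rate does not appear
// anywhere: the pool size depends only on how many surfaces each
// decoder keeps alive, not on how fast they are consumed.
inline std::uint64_t DecodePoolBytes(std::uint64_t streams,
                                     std::uint64_t surfacesPerStream,
                                     std::uint64_t width,
                                     std::uint64_t height) {
    return streams * surfacesPerStream * Nv12SurfaceBytes(width, height);
}
```

A 1920x1080 frame is padded to 1920x1088, so Nv12SurfaceBytes(1920, 1080) gives 3,133,440 bytes, about 3 MB per surface. With a purely hypothetical 10 surfaces per stream, DecodePoolBytes(32, 10, 1920, 1080) comes to 1,002,700,800 bytes, roughly 1 GB, and that figure is the same at 15 fps as at 30 fps.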
I suspect this won't help you, but I'm going to ask anyway just in case it does.
Is your application a low-latency one? Do you need to get your frames out of the decoder within 50, 100, or 200 ms?
Or could you live with 1000, 2000, or 3000 ms of latency?
You may well have thought all this through, and it may not be a possibility, but if your application does not require low latency, you could potentially multiplex different streams through a single decoder. You would need to do some H.264 parsing and pay attention to other details, such as the decoder format. You may also have to make sure you have closed-GOP streams.
Basically, you would parse your bitstreams into chunks, each running from one H.264 SPS/PPS/IDR header up to the next SPS/PPS/IDR header.
It could be some work, but if your streams meet some conditions, it is something to think about. [You might also have to insert SPS/PPS, etc.]
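As a rough illustration of the chunking described above, here is a minimal sketch that splits an H.264 Annex B byte stream at every SPS. It assumes each IDR picture is preceded by its own SPS/PPS and that the stream uses closed GOPs, so each chunk can be decoded on its own. The names Chunk, FindStartCodes and SplitAtSps are hypothetical helpers, not part of the Media SDK, and feeding the chunks to a shared decoder is not shown.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Half-open byte range [begin, end) inside the input bitstream.
struct Chunk {
    std::size_t begin;
    std::size_t end;
};

const std::uint8_t kNalSps = 7;  // nal_unit_type of a sequence parameter set

// Offsets of every 00 00 01 start code that is followed by a NAL header byte.
// Emulation prevention guarantees this pattern never occurs inside a NAL payload.
std::vector<std::size_t> FindStartCodes(const std::vector<std::uint8_t>& bs) {
    std::vector<std::size_t> pos;
    for (std::size_t i = 0; i + 3 < bs.size(); ++i) {
        if (bs[i] == 0 && bs[i + 1] == 0 && bs[i + 2] == 1) {
            pos.push_back(i);
            i += 2;  // skip the rest of this start code
        }
    }
    return pos;
}

// Split the stream into chunks that each start at an SPS and run up to the
// next SPS (or the end of the buffer). Data before the first SPS is dropped,
// since it cannot be decoded without parameter sets.
std::vector<Chunk> SplitAtSps(const std::vector<std::uint8_t>& bs) {
    std::vector<Chunk> chunks;
    bool open = false;
    std::size_t begin = 0;
    for (std::size_t p : FindStartCodes(bs)) {
        const std::uint8_t nalType = bs[p + 3] & 0x1F;  // low 5 bits of the header
        if (nalType != kNalSps) continue;
        // Keep the extra leading zero of a 4-byte start code with its NAL unit.
        const std::size_t start = (p > 0 && bs[p - 1] == 0) ? p - 1 : p;
        if (open) chunks.push_back({begin, start});
        begin = start;
        open = true;
    }
    if (open) chunks.push_back({begin, bs.size()});
    return chunks;
}
```

Each chunk then holds SPS, PPS, the IDR picture and every following picture up to the next SPS. If a stream only sends SPS/PPS once at the start, this split yields a single chunk, which is where the note about inserting SPS/PPS yourself comes in.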