Media (Intel® oneAPI Video Processing Library, Intel Media SDK)

Multi-channel video decoding with Media SDK


Dear all, 

I have single-channel video decoding working, and it works fine.

However, I have no idea how to implement concurrent multi-channel (16-channel) video decoding with Media SDK hardware acceleration.

Currently, I spawn several threads, and each thread creates its own decode device, video session, allocated buffers, etc.

Unfortunately, creation of the hardware decoder, session, or buffers sometimes fails (especially when the channel count exceeds 12).

Is this caused by the graphics (VGA) resources running out?

Could you suggest how to develop 16-channel video decoding with Media SDK hardware acceleration?

I've noticed there is a "session join" mechanism, but I don't know how to use it.

(Does it mean that 15 child sessions join 1 parent session, and 16 hardware decoders are then created from these sessions?)

Many thanks!


Thanks for your question. This is a gap in our examples which we're hoping to fill. Media SDK hardware decode is very fast: hundreds of FPS at HD resolution, and proportionally more at lower resolutions. You should be able to do many more than 16 simultaneous real-time decodes, especially with smaller resolutions. The bottleneck isn't decode itself. Raw surfaces are relatively large, so even heavily optimized copies to CPU memory are very expensive. The best case is to decode and composite directly to a larger surface/texture without ever leaving the GPU. Joining sessions isn't required for this scenario. The main reason for joining is to avoid thread oversubscription with multiple software sessions. Please watch for more documentation on this topic in the future.
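To put the cost of copying raw surfaces to CPU memory in numbers, here is a rough back-of-the-envelope sketch. The helper names are hypothetical and not part of Media SDK, and the 30 fps frame rate is an assumption (the thread does not state one). It only multiplies frame size by frame rate and channel count, ignoring pitch and alignment padding.

```cpp
#include <cstdint>

// Hypothetical helpers for illustration only; not part of Media SDK.

// Size in bytes of one raw frame. Bytes per pixel is given as a fraction:
// NV12 is 3/2, YUY2 is 2/1. Pitch and alignment padding are ignored.
constexpr std::uint64_t RawFrameBytes(std::uint64_t width, std::uint64_t height,
                                      std::uint64_t bppNum, std::uint64_t bppDen) {
    return width * height * bppNum / bppDen;
}

// Bytes per second that must be moved if every decoded frame of every
// channel is copied from video memory to CPU memory.
constexpr std::uint64_t CopyBytesPerSecond(std::uint64_t channels,
                                           std::uint64_t fps,  // assumed frame rate
                                           std::uint64_t frameBytes) {
    return channels * fps * frameBytes;
}
```

Under these assumptions, CopyBytesPerSecond(16, 30, RawFrameBytes(1920, 1080, 3, 2)) gives about 1.5 GB/s for NV12, and about 2.0 GB/s for YUY2, before any rendering work is done.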



Dear Jeffery, 

Thank you very much for your reply.

Today, using the Intel sample decode testbed, I ran an experiment that spawns several threads, each launching a decode device with its own session, buffers, and other resources.

However, a few decode devices sometimes fail to be created when the thread count exceeds 12.

Is this a limitation of a 32-bit process (since the memory available to a single OS process isn't enough)?

Or is there anything I missed? 


Dear Jeffery, 

Let me clarify some potentially confusing points.

All of the video streams are Full HD (1920x1080).

Therefore, I want to play back 16 Full-HD channels concurrently with Intel Media SDK.

Thank you : )


Dear Jeffery, 

This afternoon I kept trying to play back 16-channel Full-HD video in one process (codec: H.264; color format: YUY2).

I found that if we create 16 processes to play back 16 video clips, each Media SDK hardware device is created properly.

But if we try to create 16 Media SDK hardware devices in one process, creation may fail.

The error code is MFX_ERR_MEMORY_ALLOC, but GPU-Z reports only about 700 MB of dynamic memory in use (the maximum is around 1024 MB).

The number of video memory surfaces for decoding and rendering is 7 (nSurfNum = 7).


    // Request nSurfNum (7) video-memory surfaces for decoding and rendering.
    Request.NumFrameMin       = nSurfNum;
    Request.NumFrameSuggested = nSurfNum;

    sts = m_pMFXAllocator->Alloc(m_pMFXAllocator->pthis, &Request, &m_mfxResponse);
    nSurfNum = m_mfxResponse.NumFrameActual;

    sts = AllocBuffers(nSurfNum);
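For reference, the raw size of the surface pools implied by these numbers can be estimated as below. This is only a sketch: the helper names are hypothetical, width and height are assumed to be padded up to a multiple of 16, and the driver's real allocation (pitch, extra padding, render targets) may well be larger. It is an estimate of the decode surfaces alone, not a diagnosis of the MFX_ERR_MEMORY_ALLOC failure.

```cpp
#include <cstdint>

// Hypothetical helpers for illustration only; not part of Media SDK.

// Round value up to the next multiple of alignment.
constexpr std::uint64_t AlignUp(std::uint64_t value, std::uint64_t alignment) {
    return (value + alignment - 1) / alignment * alignment;
}

// Bytes in one surface, assuming width and height are padded to a multiple
// of 16 (an assumption; real pitch and padding are driver-dependent).
// Bytes per pixel is a fraction: YUY2 is 2/1, NV12 is 3/2.
constexpr std::uint64_t SurfaceBytes(std::uint64_t width, std::uint64_t height,
                                     std::uint64_t bppNum, std::uint64_t bppDen) {
    return AlignUp(width, 16) * AlignUp(height, 16) * bppNum / bppDen;
}

// Total bytes for all sessions, each owning its own pool of surfaces.
constexpr std::uint64_t PoolBytes(std::uint64_t sessions,
                                  std::uint64_t surfacesPerSession,
                                  std::uint64_t surfaceBytes) {
    return sessions * surfacesPerSession * surfaceBytes;
}
```

With 16 sessions, 7 surfaces each, and 1920x1080 YUY2, PoolBytes(16, 7, SurfaceBytes(1920, 1080, 2, 1)) comes to 467,927,040 bytes, roughly 446 MiB, all of which lives in a single process in the one-process case.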


My platform's graphics (VGA) is Intel Ivy Bridge, and the CPU is an Intel Core i7-3770 @ 3.4 GHz.

Please kindly give me some hints. Many thanks!