Hi all,
We ran into a problem while implementing multi-channel H.264 decoding.
We measured multi-channel decoding performance using a partially modified version of the Media SDK 'sample_decode' sample.
Each thread in the application creates one session and decodes one stream.
When the number of threads exceeds 48, an access violation occurs in 'MFXVideoDECODE_Init' on a random session.
If we ignore the error and proceed, 'MFXVideoDECODE_Init' returns 'MFX_ERR_MEMORY_ALLOC'.
(Total memory: 16 GB, about 6 GB in use)
What is the maximum number of channels that can be decoded at once?
If there is no limit, what could be causing this problem?
Is a session thread-safe?
[System]
OS : Windows 7 Professional
CPU : i7-4790
Graphics : Intel(R) HD Graphics 4600 (driver version : 10.18.14.4578, date : 2017-01-04)
Memory : DDR3 8 GB x 2 (16 GB total)
SDK : Intel(R) Media SDK 2016 R2
Compiler : Visual Studio 2008
[Input]
H.264 1080p raw data (one file is copied to system memory and shared by all threads)
* System Analyzer result is attached.
I guess you have already found the number. If your asyncdepth = 1 and the stream is AVC at Level 5.1, each decoder requires 16 frames in the DPB. Each 1080p decoder then consumes ~75 MB of video memory, so about 54 decoders fit in 4 GB of video memory. If your asyncdepth is higher than 1, each decoder consumes more memory. Try decreasing the Level to 4, which relaxes the DPB size requirements, and set asyncdepth = 1, then check whether the number of decoders goes up. Also check your BIOS settings to see what the limit for video memory is.
Yours, ViCue
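The arithmetic in the reply above can be sketched roughly as follows. This is only an estimate under stated assumptions: the ~75 MB per decoder and the 4 GB budget are taken from the reply, the helper names (`nv12FrameBytes`, `rawDpbBytes`, `decodersThatFit`) are hypothetical, and NV12 surfaces with dimensions aligned to 16 are assumed. The reply does not say how the ~75 MB breaks down, so the raw DPB size is shown only as a lower bound.

```cpp
#include <cassert>
#include <cstdint>

// Round v up to the next multiple of 16 (assumed surface alignment).
inline std::uint64_t alignUp16(std::uint64_t v) { return (v + 15) / 16 * 16; }

// Bytes in one NV12 frame: a full-size luma plane plus a half-size
// interleaved chroma plane, i.e. 1.5 bytes per pixel.
inline std::uint64_t nv12FrameBytes(std::uint64_t width, std::uint64_t height) {
    assert(width > 0 && height > 0);
    return alignUp16(width) * alignUp16(height) * 3 / 2;
}

// Raw size of a DPB holding `frames` NV12 frames.
// Driver and implementation overhead are NOT included (assumption).
inline std::uint64_t rawDpbBytes(std::uint64_t width, std::uint64_t height,
                                 std::uint64_t frames) {
    return nv12FrameBytes(width, height) * frames;
}

// Number of decoders that fit in a memory budget for a given per-decoder cost.
inline std::uint64_t decodersThatFit(std::uint64_t budgetBytes,
                                     std::uint64_t perDecoderBytes) {
    assert(perDecoderBytes > 0);
    return budgetBytes / perDecoderBytes;
}
```

With the reply's figures, `decodersThatFit(4096 MiB, 75 MiB)` gives 54. For comparison, `rawDpbBytes(1920, 1080, 16)` is 50,135,040 bytes (about 48 MiB), which is below the ~75 MB quoted per decoder.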
Thank you for your quick reply.
After creating the session, I ran MFXVideoDECODE_DecodeHeader on the H.264 bitstream; the reported 'mfxVideoParam.mfx.CodecLevel' is 42.
The surfaces created for decoding are in system memory, not video memory.
My system memory is large enough (16 GB total, about 6 GB in use).
When I change 'mfxVideoParam.mfx.CodecLevel' to 40 and call 'MFXVideoDECODE_Init', it returns 'MFX_ERR_INCOMPATIBLE_VIDEO_PARAM'.
How do I reduce the level to 4.0?
Does the input bitstream need to be transcoded?
The asyncdepth currently in use is 1.
When mfxVideoParams.mfx.NumThread was set to 0, the maximum number of decoding channels did not exceed 32.
However, when mfxVideoParams.mfx.NumThread is set to 1, up to 48 channels can be decoded.
As you suggested, I increased the video memory to 1 GB in the BIOS and tested again, but the result was the same.
I also changed the input bitstream to D1 (704x480) and tested.
The maximum still did not exceed 48 channels.
So it is hard to tell whether this is a memory capacity problem.
Is there anything else to consider?
Thanks, ViCue
You are not permitted to change the level to 4.0 when the input bitstream is encoded at 4.2, so it is correct behavior for the decoder to return an incompatible-parameter error. In any case, Level 4.2 and 4.0 should allocate the same DPB size, so there is no need to transcode in your case.
Regarding NumThread, the manual marks it as deprecated (p. 142): https://software.intel.com/sites/default/files/managed/47/49/mediasdk-man.pdf
"NumThread Deprecated; Used to represent the number of threads the underlying implementation can use on the host processor. Always set this parameter to zero."
I do not know why it would make a difference.
When you say the decoding surfaces are created in system memory, do you initialize a software decoder implementation or a hardware one? With a hardware implementation there is an additional copy to your system memory surface, but internally the hardware decoder keeps using video memory surfaces for DPB frames.
As for your CPU, see Intel's product page: https://ark.intel.com/products/80806/Intel-Core-i7-4790-Processor-8M-Cache-up-to-4_00-GHz
If it is not a memory limit, it could be an internal handle limit.
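The point that Level 4.0 and 4.2 lead to the same DPB size at 1080p can be checked against the H.264 level limits. Below is a minimal sketch, assuming the MaxDpbMbs values from Table A-1 of the H.264 specification and the rule MaxDpbFrames = min(MaxDpbMbs / frame size in macroblocks, 16). The function names are hypothetical, levels are keyed by the numeric value reported in 'CodecLevel' (42 for Level 4.2), and what a given decoder actually allocates may differ from this bound.

```cpp
#include <algorithm>
#include <cassert>

// MaxDpbMbs per level (H.264 Table A-1), keyed by level * 10.
// Only the levels discussed in this thread are listed; returns 0 if unknown.
inline int maxDpbMbs(int codecLevel) {
    switch (codecLevel) {
        case 40: return 32768;
        case 41: return 32768;
        case 42: return 34816;
        case 50: return 110400;
        case 51: return 184320;
        default: return 0;
    }
}

// Upper bound on DPB frames for a level and frame size, capped at 16.
inline int maxDpbFrames(int codecLevel, int width, int height) {
    assert(width > 0 && height > 0);
    const int widthInMbs = (width + 15) / 16;   // 16x16 macroblocks per row
    const int heightInMbs = (height + 15) / 16; // macroblock rows per frame
    return std::min(maxDpbMbs(codecLevel) / (widthInMbs * heightInMbs), 16);
}
```

For 1920x1080 (120 x 68 = 8160 macroblocks), `maxDpbFrames` gives 4 frames for both level 40 and level 42, and 16 frames for level 51, which matches the 16-frame figure quoted earlier for Level 5.1.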
Thank you very much, Vicubi.
I've found that it works well with more than 90 channels on other systems (i7 6700).
Have a good day~