We have a transcoding application that uses the mediaSDK to both decode and encode from/to H.264. We're seeing access violations (in the VS debugger output window) when the system has more than 16 cores, after which the mediaSDK engine just stops. If we reduce the number of cores to 16 or fewer in the BIOS, the application runs fine. (By "cores" here I mean the number of "CPUs" displayed in the Windows Task Manager's Performance tab.)
I can run the sample decode application that comes with the mediaSDK with no errors with more than 16 cores, so I suspect we have resource issues.
However, has anyone else seen this type of behavior?
I attached the tracer output in case it helps.
From the log I noticed that you're not running in HW acceleration mode. Can you please share more information about the error you're encountering on a machine with 16 cores (logical cores, I assume)?
And, what is the specific platform you are using? Xeon?
Yes, logical cores.
To be clear, 16 works OK; with more than that I get access violations on the first call to MFXVideoDECODE_DecodeFrameAsync. It seems the mediaSDK initialization spawns a thread for each core? In the case of the 32-core system, we get 16 access violation messages when the function call happens.
This has happened on two different customer machines:
HP Z820 with dual Intel Xeon E5 2650 (8 core) processors
HP Z820 with dual Intel Xeon E5 2660 V2 (10 core) processors.
Ok, thanks for clarifying regarding behavior and system configuration.
There were some changes to the SW encoder implementation for the 2014 SDK that may resolve/remove these access violation indications. Can you please try it to see if there is a difference?
So to confirm, you are not experiencing any errors in execution or crashes, right?
I will try the 2014 SDK as soon as possible. I'm waiting for a machine to arrive - I had to borrow a customer's to do the initial investigation.
Correct, the application runs without issue on 16 cores or less, and the only indication when we have more cores is that the access violations appear.
I received the machine and tried the 2014 SDK, with the same results. With the cores set to 16 in the BIOS it works fine; with 32 cores we get access violations. One customer had similar results when setting it to 20 cores, i.e. access violations.
The call stack indicates that the problem occurs in libmfxsw32.dll.
Thanks for letting us know the observed behavior when using Media SDK 2014.
Would you be able to explore if the same behavior can be reproduced using the Media SDK "sample_multi_transcode" sample?
I compiled and ran the Full Transcoding Sample 5.0.337.78585 with the machine set to 32 cores and got the same result as with our application: access violations.
When I set the machine to 16 cores, the sample ran fine and produced correct output. Note that I compiled as x86/Win32 with the 2014 SDK.
Earlier I had run the 2013 SDK decode only sample and that worked.
I'm not sure where this issue would be in the priorities of the mediaSDK developers, so as a workaround, is it possible for us to limit the number of cores to 16 programmatically in the mediaSDK?
You mentioned earlier that even though you see the access violation events, there is no impact on your application, such as a crash. Did I get that right? If that is the case, why do you need a workaround?
There is currently no way to limit the number of threads used by the SDK SW codec execution.
We do treat this issue report as high priority and there is an effort ongoing to find the root of the issue.
Sorry, I wasn't clear about what happens in the application. Once the access violations occur, the status returned from the call to DecodeFrameAsync is MFX_WRN_DEVICE_BUSY forever, and eventually the access violations stop. Our application still responds, and there is no Windows-type error saying that the application is not responding, etc.
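For anyone trying to detect this state in their own code, the "busy forever" behavior described above can be caught with a bounded retry instead of an open-ended one. The following is only a sketch under stated assumptions: the Status enum, RetryResult, and retry_while_busy are hypothetical stand-ins invented for illustration, not Media SDK types, and op is assumed to be an application wrapper around the real decode call that maps its return status onto these stand-ins.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <thread>

// Stand-in status codes for this sketch only (hypothetical). A real
// application would inspect the mfxStatus returned by the SDK call instead.
enum class Status { Ok, DeviceBusy, Error };

// Outcome of a bounded retry: last status seen and number of calls made.
struct RetryResult {
    Status status;
    int attempts;
};

// Calls `op` until it returns something other than DeviceBusy, or until
// `max_attempts` calls have been made, sleeping `wait` after each busy result.
// `op` is a hypothetical wrapper around the actual decode call.
inline RetryResult retry_while_busy(const std::function<Status()>& op,
                                    int max_attempts,
                                    std::chrono::milliseconds wait) {
    assert(max_attempts > 0);
    RetryResult result{Status::DeviceBusy, 0};
    while (result.attempts < max_attempts) {
        result.status = op();
        ++result.attempts;
        if (result.status != Status::DeviceBusy) {
            break;  // success or a genuine error: stop retrying
        }
        std::this_thread::sleep_for(wait);
    }
    return result;
}
```

If the result still carries DeviceBusy after max_attempts calls, the application can log the condition and abort the session rather than spin indefinitely. The attempt limit and wait time are arbitrary here and would need tuning.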
With all 32 cores enabled, we have tried setting the processor affinity to 1 core; we still get the access violations, but the data does get processed.
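For reference, the same affinity experiment can be done from code rather than from Task Manager. This is a minimal sketch, not a confirmed workaround: first_n_processors_mask and limit_process_to_first_n are hypothetical helper names, and it is untested whether restricting affinity to 16 logical processors avoids the access violations, since the SDK may size its thread pool from the system processor count regardless of affinity. The only real API used is the Win32 SetProcessAffinityMask call, guarded so the mask logic also compiles elsewhere.

```cpp
#include <cassert>
#include <cstdint>
#ifdef _WIN32
#include <windows.h>
#endif

// Builds a bitmask with the lowest `count` bits set, i.e. logical
// processors 0 .. count-1. Valid for 1 <= count <= 64.
inline std::uint64_t first_n_processors_mask(unsigned count) {
    assert(count >= 1 && count <= 64);
    return count == 64 ? ~std::uint64_t{0}
                       : (std::uint64_t{1} << count) - 1;
}

// Restricts the current process to the first `count` logical processors.
// Returns true on success. On non-Windows builds it does nothing and
// returns false. In a 32-bit (x86) process DWORD_PTR is 32 bits wide, so
// only counts up to 32 can be represented.
inline bool limit_process_to_first_n(unsigned count) {
#ifdef _WIN32
    const DWORD_PTR mask =
        static_cast<DWORD_PTR>(first_n_processors_mask(count));
    return SetProcessAffinityMask(GetCurrentProcess(), mask) != 0;
#else
    (void)count;
    return false;
#endif
}
```

Calling limit_process_to_first_n(16) before the SDK session is initialized would be the programmatic equivalent of the manual test, with the caveat above that the outcome on the affected machines is unknown.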
If it helps, we can put our machine directly on the internet - outside of our network.
The workaround we currently have is for customers with these machines to limit the cores to 16 or fewer in the BIOS.
Thanks for your prompt response and attention.
John, thanks for clarifying. That makes it way more critical to fix this quickly.
I will make sure this issue gets high priority. I'll provide feedback here as soon as I have more info to share.