I have been working on integrating the Media SDK / GPU into our real-time streaming media server. Most of the functionality is now working at low density (2 to 4 parallel processing channels) and we are attempting to ramp things up. Unfortunately, we start seeing some serious issues as we hit 6 and 8 channels. The device continuously reports 'busy' and the error 'kernel drm stuck on bsd ring' was observed in /var/log/messages.
Searching for this in the forums yields nothing. Searching globally I found some similar issues pointing to the kernel version. Any insight from the minds at Intel?
Thanks - Bob / Dialogic
The system I am using (excerpt from /proc/cpuinfo) . .
bogomips : 7200.18
clflush size : 64
cache_alignment : 64
address sizes : 39 bits physical, 48 bits virtual
processor : 1
vendor_id : GenuineIntel
cpu family : 6
model : 60
model name : Intel(R) Xeon(R) CPU E3-1285 v3 @ 3.60GHz
stepping : 3
microcode : 0x1c
cpu MHz : 3437.296
cache size : 8192 KB
physical id : 0
siblings : 8
core id : 1
cpu cores : 4
apicid : 2
initial apicid : 2
fpu : yes
fpu_exception : yes
The actual error prints in the messages log . . .
Feb 5 07:51:07 sut-1300 kernel: [drm] stuck on bsd ring
Feb 5 07:51:07 sut-1300 kernel: [drm] GPU HANG: ecode 1:0xa8ffdf3e, in ssp_MDRSC Compo , reason: Ring hung, action: reset
Feb 5 07:51:09 sut-1300 kernel: [drm] Enabling RC6 states: RC6 on, RC6p off, RC6pp of
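For anyone who wants to track how often these hangs occur as channel density goes up, here is a minimal sketch that pulls the relevant fields out of log lines shaped like the ones above. The function names (`parse_gpu_hang`, `count_stuck_rings`) and the regular expressions are assumptions written against this excerpt only; other kernel versions may format these messages differently.

```python
import re

# Patterns written against the log excerpt above (an assumption: other
# kernel versions may word these messages differently).
STUCK_RE = re.compile(r"\[drm\] stuck on (?P<ring>\w+) ring")
HANG_RE = re.compile(
    r"\[drm\] GPU HANG: ecode (?P<ecode>[^,]+), in (?P<process>[^,]+), "
    r"reason: (?P<reason>[^,]+), action: (?P<action>\w+)"
)


def parse_gpu_hang(line):
    """Return the fields of a 'GPU HANG' line as a dict, or None if no match."""
    match = HANG_RE.search(line)
    if match is None:
        return None
    # Strip stray padding, e.g. the space after the truncated process name.
    return {key: value.strip() for key, value in match.groupdict().items()}


def count_stuck_rings(lines):
    """Count 'stuck on <ring> ring' messages per ring name."""
    counts = {}
    for line in lines:
        match = STUCK_RE.search(line)
        if match:
            ring = match.group("ring")
            counts[ring] = counts.get(ring, 0) + 1
    return counts


if __name__ == "__main__":
    sample = [
        "Feb 5 07:51:07 sut-1300 kernel: [drm] stuck on bsd ring",
        "Feb 5 07:51:07 sut-1300 kernel: [drm] GPU HANG: ecode 1:0xa8ffdf3e, "
        "in ssp_MDRSC Compo , reason: Ring hung, action: reset",
    ]
    print(count_stuck_rings(sample))
    print(parse_gpu_hang(sample[1]))
```

Feeding it the contents of /var/log/messages line by line would give a per-ring hang count that can be compared between the 4-channel and 8-channel runs.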
A few related posts indicate there may be some correlation to the RC6 settings. Here's what the system currently has . . .
[root@sut-1300 ~]# dmesg | grep RC6
[ 3.753211] [drm] Enabling RC6 states: RC6 on, RC6p off, RC6pp off
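If it helps when comparing RC6 settings across machines, the dmesg line above can be turned into a small dictionary. This is only a sketch: `parse_rc6_states` is a hypothetical helper, and it assumes the "Enabling RC6 states" wording shown here. Entries whose value is not exactly "on" or "off" (for example a truncated log line) are skipped rather than guessed.

```python
import re

# Matches the "[drm] Enabling RC6 states: ..." message as printed above.
RC6_RE = re.compile(r"Enabling RC6 states:\s*(?P<states>.*)")


def parse_rc6_states(line):
    """Map each RC6 state name to True (on) or False (off).

    Returns None if the line is not an 'Enabling RC6 states' message.
    """
    match = RC6_RE.search(line)
    if match is None:
        return None
    states = {}
    for item in match.group("states").split(","):
        parts = item.split()
        # Keep only well-formed "<name> on|off" pairs.
        if len(parts) == 2 and parts[1] in ("on", "off"):
            states[parts[0]] = parts[1] == "on"
    return states


if __name__ == "__main__":
    line = "[ 3.753211] [drm] Enabling RC6 states: RC6 on, RC6p off, RC6pp off"
    print(parse_rc6_states(line))
```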
We have seen a similar issue in the past. Our next release, which is scheduled in a week or more, has a fix for it. Can you try the latest release and confirm whether you can still reproduce the problem, or send us a reproducer to try at our end?
Another question: are you transcoding HD or 4K AVC streams in real time?
Media Server Studio 2016 is out. Please see this note for download instructions and this blog to find out what's new. Also, let us know whether you still see the error reported above after updating.
I seem to have resolved the issue with some refactoring of the decode methods. I have also installed the 2016 release and it is working as well as the 2015 release.