Media (Intel® oneAPI Video Processing Library, Intel Media SDK)
Access community support with transcoding, decoding, and encoding in applications using media tools from Intel. This includes Intel® oneAPI Video Processing Library and Intel® Media SDK.

Too high CPU load when using DG1 for video encoding

OTorg
New Contributor III
2,869 Views

Hi,

We have found excessive CPU consumption when using a discrete DG1 for video encoding (compared with integrated UHD 630 graphics).


The exact same task was run first on the UHD 630, then on the DG1. Execution was monitored with VTune 2021.3.0.
Here are screenshots for comparison:

igpu-dgpu.png
The task was as follows. Eight live uncompressed SD video streams (720x576i50, 4:2:0 NV12) were encoded in real time on the GPU into H.264. Input video frames were supplied to the encoders as D3D11 video-memory surfaces. All encoding ran within a single Win32 process, and the mfxSessions were joined. Each run lasted 60 seconds.
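For scale, the raw input rate of this workload can be estimated with a few lines of arithmetic. This is only a sketch: it assumes NV12 at 12 bits per pixel, that 576i50 means 50 fields (25 full frames) per second, and it ignores any pitch padding the driver may add to the surfaces. The names are illustrative and not part of the Media SDK.

```python
# Rough estimate of the uncompressed input data rate for the described task.
# Assumptions (not from the driver or SDK): NV12 = 1.5 bytes per pixel,
# 576i50 = 25 full frames per second, no surface pitch padding.

WIDTH, HEIGHT = 720, 576
BYTES_PER_PIXEL_NV12 = 1.5   # full-size 8-bit luma plane + half-size chroma plane
FRAMES_PER_SECOND = 25       # 50 interlaced fields per second
STREAMS = 8


def frame_bytes(width, height):
    """Size of one NV12 frame in bytes."""
    return int(width * height * BYTES_PER_PIXEL_NV12)


def total_bytes_per_second(streams=STREAMS):
    """Combined raw input rate of all streams in bytes per second."""
    return frame_bytes(WIDTH, HEIGHT) * FRAMES_PER_SECOND * streams


if __name__ == "__main__":
    print(frame_bytes(WIDTH, HEIGHT), "bytes per frame")
    print(total_bytes_per_second() / 1e6, "MB/s for all streams")
```

Under these assumptions the eight streams together come to roughly 124 MB/s of raw input.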

The issue was observed with gfx driver versions 9466, 9316, 9039 and 9667; other versions haven't been tested. The screenshots are from version 9667, running on a Supermicro X12SAE with an i7-10700 and an ASUS DG1-4G.


Below is a screenshot of the most CPU-consuming area (DG1):

dgpu_hardest_asm.png

 

Perhaps the reason is inefficient access to video memory. When I reduce the number of PCIe lanes for the DG1 from 8 to 4, the CPU consumption roughly doubles.
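A quick way to put the lane observation in context is to compare the raw payload with the nominal link capacity. The sketch below assumes a PCIe 3.0 link (8 GT/s per lane, 128b/130b encoding) and ignores protocol overhead; the link generation is an assumption and was not stated in the post. The payload figure is eight 720x576 NV12 streams at 25 frames per second.

```python
# Nominal PCIe link capacity versus the raw video payload (sketch only).
# Assumption: PCIe 3.0, 8 GT/s per lane, 128b/130b line coding,
# packet and protocol overhead ignored.

PCIE3_TRANSFERS_PER_LANE = 8e9      # transfers (bits) per second per lane
ENCODING_EFFICIENCY = 128 / 130     # 128b/130b line coding

# Eight 720x576 NV12 streams (1.5 bytes per pixel) at 25 frames per second.
PAYLOAD = 720 * 576 * 3 // 2 * 25 * 8


def link_bytes_per_second(lanes):
    """Nominal one-direction capacity of a PCIe 3.0 link with this many lanes."""
    return PCIE3_TRANSFERS_PER_LANE * ENCODING_EFFICIENCY / 8 * lanes


def utilization(payload_bytes_per_second, lanes):
    """Fraction of nominal link capacity the payload would occupy."""
    return payload_bytes_per_second / link_bytes_per_second(lanes)


if __name__ == "__main__":
    for lanes in (8, 4):
        print(f"x{lanes}: {utilization(PAYLOAD, lanes):.1%} of nominal capacity")
```

With these assumptions the payload occupies under 2% of an x8 link and under 4% of an x4 link, so the amount of data alone would not saturate either configuration. That would fit with the cost coming from how the memory is accessed rather than how much is moved, but this is an inference, not something measured here.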

 

When I run the same task on two DG1 cards simultaneously (in parallel), the CPU load reaches 100% and encoding can't keep up with real time.

 

I can't post the application the tests were made with, because it is too large. But you can reproduce the issue using the native Intel Media SDK samples; here is a reproducer:

https://drive.google.com/file/d/1Z6m4kYcpk6lhurI4S8AqHRG7IfT3mPqP/view?usp=sharing

 

30 Replies
Mark_L_Intel1
Moderator
625 Views

Hi Olek,

 

You are partly right: I am focusing on support and not investigation, so I am transferring this request to the dev team. The previous post was to make sure I had all the correct context so I could explain the situation to them. My apologies that I didn't say this explicitly; I can understand your frustration.

 

I had a discussion with the dev team last week, and they had difficulty finding an Ice Lake + DG1 platform. Their tests are based on a Tiger Lake + DG1 platform, which has DeepLink enabled. In general, the high CPU utilization is mainly caused by memory sharing at the driver layer, which is related to DeepLink; this is why I mentioned it.

 

I saw your suggestion. If you can offer remote access and the dev team agrees, we could set up a session. If you agree, I will send the email address from your registration to the dev team so they can contact you directly. Does this work?

 

Mark

OTorg
New Contributor III
602 Views

Hi Mark,

Sorry for the late answer, I was away for 2 weeks.

Yes, you can send my email address to the dev team.

Mark_L_Intel1
Moderator
442 Views

Hi Olek,


Sorry for the late response.


At the early stage, I didn't quite understand that DeepLink only supports new platforms. After a discussion with the dev team, they told me Ice Lake doesn't have this feature enabled.


So it doesn't make sense to do further debugging. My apologies for the ambiguous messages. I will close this if you don't have further questions.


Mark Liu


OTorg
New Contributor III
432 Views

Hi Mark,


You said that a certain DeepLink technology isn't supported on Ice Lake.

But how does this relate to solving the problem I described?

The video driver consumes a lot of CPU when interacting with the DG1 card.
I would like to sell DG1-based solutions to my clients, but I can't because of this bug. And we have made no progress during the quarter, because we cannot even properly convey the problem to the developers :(

 

Mark_L_Intel1
Moderator
429 Views

Hi Olek,


Thanks for the quick response.


Sorry I didn't explain the dev team's response very well. As part of DG1 support, we only planned for the following platforms:

  • Tiger Lake U + DG1 onboard
  • Coffee Lake S or Comet Lake S + DG1 add-in card.


As you can see, Ice Lake is not on the list, so the CPU issue you reported is a consequence of that.


Apologies again for the unclear communication about this issue.


Mark Liu


OTorg
New Contributor III
407 Views

Mark,

Doesn't the i7-10700 processor (on which I see the high load) belong to the Comet Lake S platform? :)

OTorg
New Contributor III
320 Views

And silence fell... :))

OTorg
New Contributor III
151 Views

Hi,

I've tested gfx-driver version 9955.

The high CPU load problem has been fixed. Finally!

Tested on:
- Supermicro X12SAE, i7-10700, two ASUS DG1-4G cards
- ASUS Z590-E, i7-11700, two ASUS DG1-4G cards

But another issue arose.

The system can crash with a BSOD when exiting the decoding/encoding application.
The error code is VIDEO_MEMORY_MANAGEMENT_INTERNAL.

I'll collect the necessary information and open a new topic about this.

OTorg
New Contributor III
418 Views

UPD:

I saw the BSODs on Win10 IoT Enterprise LTSC 2019 (Windows version 1809).

But the specification of the 9955 driver says it needs Windows version 2004 or later.

So I tested it on Win10 Pro 2004 and didn't catch any BSODs.

 

I can't test it on Win10 IoT Enterprise LTSC 2021 RTM, because it isn't available to me yet.
I hope the 9955 driver will work normally on it too.
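For anyone deploying the 9955 driver, a simple pre-flight check on the Windows build could catch the unsupported-OS case before it shows up as a BSOD. This is a minimal sketch: the "2004 or later" requirement is taken from the post above, the build numbers are the public ones for Windows 10 versions 1809 (17763) and 2004 (19041), and the function names are hypothetical.

```python
# Pre-flight check: does this Windows build satisfy the "version 2004+"
# requirement quoted for the 9955 driver? (Sketch; names are illustrative.)
import platform

MIN_BUILD = 19041  # Windows 10 version 2004; version 1809 is build 17763


def build_number(version_string):
    """Extract the build number from a string such as '10.0.17763'."""
    return int(version_string.split(".")[2])


def meets_driver_requirement(version_string, min_build=MIN_BUILD):
    """True if the given Windows version string is at or above min_build."""
    return build_number(version_string) >= min_build


if __name__ == "__main__":
    if platform.system() == "Windows":
        # On Windows, platform.version() returns a string like '10.0.19041'.
        ok = meets_driver_requirement(platform.version())
        print("OS build OK for driver" if ok else "OS build too old for driver")
    else:
        print("Not running on Windows; nothing to check")
```

On the two systems mentioned here, the LTSC 2019 machine (build 17763) would fail this check and the Win10 Pro 2004 machine (build 19041) would pass.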

 

So, I can tag the topic as solved.


AthiraM_Intel
Moderator
33 Views

Hi,


Glad to know that your issue is resolved. If you need any additional information, please post a new question as this thread will no longer be monitored by Intel. 



Thanks


