Media (Intel® Video Processing Library, Intel Media SDK)

VAAPI MSDK CPU usage too heavy

duncanchou
New Contributor I
2,740 Views

Hello Sir,

We decoded an H.264 video file to RGB using sample_decode from Media SDK and found that the CPU usage is very high (100.7%). Is this expected, or did we miss a setting?

duncanchou_0-1600825796325.png

 

 

Thanks

Michael Wu

7 Replies
Mark_L_Intel1
Moderator
2,719 Views

Hi Michael,


It seems you are running the open-source Media SDK 20.2.0 or 20.2.1. Yes, I agree the CPU usage is too high, but I am not sure whether this is related to I/O or to hardware codec enabling.


What's your environment: a Linux host, a VM, or Docker?

Which Intel processor are you using?


You can also try the following:

  • Remove the output and see if the CPU usage improves: ./sample_decode h264 -i <input.h264> -rgb4 -hw -vaapi
  • Change the output format: ./sample_decode h264 -i <input.h264> -o output.yuv -hw -vaapi
  • Enable GPU copy: ./sample_decode h264 -i <input.h264> -o output.yuv -hw -vaapi -gpucopy::on
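
To compare these variants side by side, a small wrapper along the following lines may help. This is only a sketch: the variant_cmd and run_variant helpers and the SAMPLE_DECODE and INPUT variables are hypothetical names made up for illustration, not part of the Media SDK samples. Only the sample_decode options themselves come from the commands above.

```shell
#!/usr/bin/env bash
# Sketch: run each sample_decode variant and report wall-clock and CPU time.
# SAMPLE_DECODE and INPUT are hypothetical variables; adjust them to your setup.
SAMPLE_DECODE="${SAMPLE_DECODE:-./sample_decode}"
INPUT="${INPUT:-input.h264}"

# Build the command line for a named variant (the variant names are illustrative).
variant_cmd() {
    case "$1" in
        no_output)  echo "$SAMPLE_DECODE h264 -i $INPUT -rgb4 -hw -vaapi" ;;
        yuv_output) echo "$SAMPLE_DECODE h264 -i $INPUT -o output.yuv -hw -vaapi" ;;
        gpucopy)    echo "$SAMPLE_DECODE h264 -i $INPUT -o output.yuv -hw -vaapi -gpucopy::on" ;;
        *)          echo "unknown variant: $1" >&2; return 1 ;;
    esac
}

# Print the command for one variant and, if the binary is present, time it.
run_variant() {
    cmd=$(variant_cmd "$1") || return 1
    echo "== $1: $cmd"
    if [ -x "$SAMPLE_DECODE" ]; then
        # "time -p" prints real, user and sys seconds to stderr.
        # Word splitting of $cmd is intentional here.
        time -p $cmd
    fi
}

for v in no_output yuv_output gpucopy; do
    run_variant "$v"
done
```

If user + sys is close to (or above) real for a variant, the CPU is saturated for that run, which is the symptom reported in the first post.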


Mark


duncanchou
New Contributor I
2,708 Views

Hello Mark,

We found that removing "-vaapi" and adding the "-gpucopy::on" parameter improves performance a lot; the result is very good.

By the way, what is the difference between running with and without "-vaapi"?

And to answer your questions:

# What's your environment: a Linux host, a VM, or Docker?

====> Linux host

# Which Intel processor are you using?

====> Atom E3950

Some test screenshots are attached (Test_Record_0924_1.doc).

 
 

Thanks for your support.

Michael Wu

 
Mark_L_Intel1
Moderator
2,699 Views

Hi Michael,


Thanks for the detailed test results and comparison.


It is good to see that performance improved, but I still do not understand why removing "-vaapi" would improve it. The fact that you are using Apollo Lake (E3950) should limit the usage of video memory. This is what '-vaapi' means: "work with vaapi surfaces". Here it should mean video memory, which should improve the copying speed in general, but let me ask the dev team.


Did you also try removing "-rgb4" to see whether the performance improves?


By the way, you can find the online sample documentation in the "sample" subdirectory here:

https://github.com/Intel-Media-SDK/MediaSDK/tree/master/doc


Mark


duncanchou
New Contributor I
2,680 Views

Hello Mark,

If we remove -rgb4, the performance improves a little: FPS goes from 15 to 22, and CPU usage goes from 100% to 99%.

Our test screenshots are attached (Test_Record_0925.doc).

Thanks

Michael Wu

Mark_L_Intel1
Moderator
2,652 Views

Hi Michael,


Thanks for testing. It seems -rgb4 is not the main bottleneck here.


Regarding your question about the difference between running with and without "-vaapi", we have the following analysis of sample_decode.

The expected results (from best to worst) are:

1) “-vaapi”: output surfaces are in video memory and are not touched on the CPU (for example, they are not dumped to a file).

2) “-o”: output surfaces are in system memory and are touched on the CPU (the video->system memory copy happens inside MSDK).

3) “-o -vaapi”: output surfaces are in video memory and are touched on the CPU (the video->system memory copy happens at the application level).


So your first result is expected, since it is the worst case in the list above.
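
As a quick reference, the ranking above can be restated as a small lookup on the command-line options. This is an illustrative sketch only: decode_mode is a hypothetical helper, not part of sample_decode, and it does nothing more than check whether -o and -vaapi are present and name the matching case from the list.

```shell
# Hypothetical helper: map a sample_decode option list to one of the three cases.
decode_mode() {
    has_o=0
    has_vaapi=0
    for arg in "$@"; do
        case "$arg" in
            -o)     has_o=1 ;;
            -vaapi) has_vaapi=1 ;;
        esac
    done
    if [ "$has_vaapi" -eq 1 ] && [ "$has_o" -eq 0 ]; then
        echo "1: video memory, no CPU access (expected best)"
    elif [ "$has_o" -eq 1 ] && [ "$has_vaapi" -eq 0 ]; then
        echo "2: system memory, copy inside MSDK"
    elif [ "$has_o" -eq 1 ] && [ "$has_vaapi" -eq 1 ]; then
        echo "3: video memory, copy at app level (expected worst)"
    else
        echo "not covered by the list above"
    fi
}

# With both -o and -vaapi, this falls under case 3.
decode_mode h264 -i input.h264 -o output.yuv -hw -vaapi
# Without -vaapi but with -o, this falls under case 2.
decode_mode h264 -i input.h264 -o output.yuv -hw -gpucopy::on
```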


Mark


duncanchou
New Contributor I
2,641 Views

Hello Mark,

I understand the difference clearly now from your answer.

Thank you for your help.

Michael Wu

Kimi1
Beginner
2,221 Views

Hi Mark,

 

I have the same problem, and I want to ask you about the third case: “-o -vaapi”, where the output surfaces are in video memory and are touched on the CPU (the video->system memory copy happens at the application level).

In this case, how can we improve the efficiency of the copy from video memory to system memory? We need this because we have to use the data in system memory.

Thanks

James
