Media (Intel® oneAPI Video Processing Library, Intel Media SDK)

VAAPI MSDK CPU usage too heavy

duncanchou
New Contributor I
862 Views

Hello Sir,

We decoded an H.264 video file to RGB with sample_decode from Media SDK and found that CPU usage is very high (100.7%). Is this expected, or did we miss a setting?

[Attached screenshot: duncanchou_0-1600825796325.png]

Thanks

Michael Wu

7 Replies
Mark_L_Intel1
Moderator
833 Views

Hi Michael,


It seems you are running open-source Media SDK 20.2.0 or 20.2.1. Yes, I think the CPU usage is too high, but I am not sure whether this is related to I/O or to hardware codec enabling.


What's your environment, a Linux host, VM or Docker?

Which Intel processor are you using?


You can also try the following:

  • Remove the output and see if the CPU usage improves: ./sample_decode h264 -i <input.h264> -rgb4 -hw -vaapi
  • Change the output format: ./sample_decode h264 -i <input.h264> -o output.yuv -hw -vaapi
  • Enable GPU copy: ./sample_decode h264 -i <input.h264> -o output.yuv -hw -vaapi -gpucopy::on
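
To compare the three variants on equal terms, it may help to time each run and compute CPU utilisation as (user + sys) / elapsed. The following is only a sketch: it assumes GNU time is installed at /usr/bin/time, and SAMPLE_DECODE, INPUT, cpu_percent and measure are names made up for this example, not part of Media SDK. The flags are the ones from the list above.

```shell
#!/bin/sh
# Sketch: time the three suggested variants and print CPU utilisation for each.
# Assumptions (not from the thread): GNU time at /usr/bin/time, and the
# SAMPLE_DECODE / INPUT defaults below. Adjust both to your own setup.
SAMPLE_DECODE="${SAMPLE_DECODE:-./sample_decode}"
INPUT="${INPUT:-input.h264}"

# cpu_percent USER SYS REAL -> (user + sys) / real * 100, one decimal place.
cpu_percent() {
    awk -v u="$1" -v s="$2" -v r="$3" \
        'BEGIN { if (r > 0) printf "%.1f\n", (u + s) / r * 100; else print "n/a" }'
}

# measure LABEL ARGS... runs sample_decode with ARGS under GNU time.
measure() {
    label="$1"; shift
    # %U = user seconds, %S = system seconds, %e = elapsed wall-clock seconds.
    # stderr (where time writes its stats) goes to the pipe; stdout is dropped.
    stats=$(/usr/bin/time -f '%U %S %e' "$SAMPLE_DECODE" h264 -i "$INPUT" "$@" \
        2>&1 >/dev/null | tail -n 1)
    set -- $stats
    echo "$label: $(cpu_percent "$1" "$2" "$3")% CPU"
}

# Only run the comparison when both tools are actually present.
if [ -x "$SAMPLE_DECODE" ] && [ -x /usr/bin/time ]; then
    measure "no output"  -rgb4 -hw -vaapi
    measure "yuv output" -o output.yuv -hw -vaapi
    measure "gpucopy on" -o output.yuv -hw -vaapi -gpucopy::on
fi
```

With this formula, a value above 100% means the process kept more than one core busy on average.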


Mark


duncanchou
New Contributor I
822 Views

Hello Mark,

We found that removing "-vaapi" and adding the "-gpucopy::on" parameter improves performance considerably.

By the way, what is the difference between running with and without "-vaapi"?

To answer your questions:

#What's your environment, a Linux host, VM or Docker?

====> Linux host

#Which Intel processor are you using?

====> Atom E3950

Some test screenshots are attached (Test_Record_0924_1.doc).

Thanks for your support.

Michael Wu

 
Mark_L_Intel1
Moderator
813 Views

Hi Michael,


Thanks for the detailed test results and comparison.


It is good to see that performance improved, but I still do not understand why removing "-vaapi" would improve it. The fact that you are using Apollo Lake (E3950) should limit the usage of video memory. '-vaapi' means "work with vaapi surfaces", which here should mean video memory. That should improve copying speed in general, but let me ask the dev team.


Did you also try removing "-rgb4" to see if the performance improves?


By the way, the sample documentation is available online in the "sample" subdirectory here:

https://github.com/Intel-Media-SDK/MediaSDK/tree/master/doc


Mark


duncanchou
New Contributor I
794 Views

Hello Mark,

If we remove -rgb4, performance improves a little: FPS goes from 15 to 22, and CPU usage goes from 100% to 99%.

Our test screenshots are attached (Test_Record_0925.doc).

Thanks

Michael Wu

Mark_L_Intel1
Moderator
766 Views

Hi Michael,


Thanks for testing. It seems -rgb4 is not the main bottleneck here.


Regarding your question about the difference between running with and without "-vaapi", we have the following analysis of sample_decode:

The expected results (from best to worst) are:

1) “-vaapi” – output surfaces stay in video memory and are not touched on the CPU (e.g. not dumped to a file).

2) “-o” – output surfaces are in system memory and are touched on the CPU (the video->system memory copy happens inside MSDK).

3) “-o -vaapi” – output surfaces are in video memory and are touched on the CPU (the video->system memory copy happens at the app level).


So your first result is expected, since it is the worst case in the list above.
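
For reference, the three cases can be written out as sample_decode command lines. This is a rough sketch that reuses only flags already mentioned in this thread; the binary path and the file names are placeholders, case_args is a made-up helper name, and the assumption that "-hw" without "-vaapi" selects system memory follows from the description of case 2 above rather than from a separate check.

```shell
#!/bin/sh
# Sketch: print the sample_decode command line for each of the three cases.
# Flags are the ones used earlier in this thread; the binary path and the
# input/output file names are placeholders.
case_args() {
    case "$1" in
        1) echo "-hw -vaapi" ;;                # video memory, not touched on CPU
        2) echo "-o output.yuv -hw" ;;         # system memory, copy inside MSDK
        3) echo "-o output.yuv -hw -vaapi" ;;  # video memory, copy in the app
        *) echo "unknown case: $1" >&2; return 1 ;;
    esac
}

for n in 1 2 3; do
    echo "case $n: ./sample_decode h264 -i input.h264 $(case_args "$n")"
done
```

Running each printed command against the same clip should make the ordering easy to check on a given machine.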


Mark


duncanchou
New Contributor I
755 Views

Hello Mark,

I understand it clearly from your answer now.

Thank you for your help.

Michael Wu


Kimi1
Beginner
335 Views

Hi Mark,

 

I have the same problem and want to ask about the third case (“-o -vaapi” – output surface in video memory with touching them on CPU; the video->system memory copy is at the app level).

In this case, how can we improve the efficiency of the copy from video memory to system memory, given that it happens at the app level? We need to use the data in system memory.

Thanks

James
