Hi,
I ran sample_decode_x11 with the -r option to decode a 1080p H.264 stream at 25 fps. It shows 92.7% CPU usage. My CPU is an Intel(R) Core(TM) i5-4590 CPU @ 3.30GHz.
Why is the CPU usage so high, and which part of the pipeline is consuming it?
Thanks!
First, can you give some more details on the experiment you are running? You can use the following format for that: https://software.intel.com/en-us/forums/topic/531083
May I ask why you are using X11 instead of the DRM method? We highly recommend using DRM, and you can find more information on that here: https://software.intel.com/en-us/articles/using-drmserver-with-media-sdk-for-linux-servers-applications
We do not expect (and have not observed) such high CPU usage when running the decode sample. On systems with hardware acceleration we expect it to be minimal (comfortably <10%) with the render option. If you can send us more details using the format above, we can try to identify the issue.
Processor Type: Intel(R) Core(TM) i5-4590 CPU @ 3.30GHZ.
Driver Version: MediaSDK version 1.11
Operating System: CentOS Linux release 7.0.1406 (Core)
Media SDK System Analyzer: This will give above three information and more about the system related capabilities
Quick Reproducer Code: sample_decode_x11
Concise Description of the Issue:
Priority: High
Input File: A 1080p H.264 stream. The file is very large (if you really need it, I will find a way to extract part of it and post it on the forum).
Tracer log(if required):
libva info: VA-API version 0.35.0
libva info: va_getDriverName() returns 0
libva info: User requested driver 'iHD'
libva info: Trying to open /opt/intel/mediasdk/lib64/iHD_drv_video.so
libva info: Found init function __vaDriverInit_0_32
libva info: va_openDriver() returns 0
Decoding Sample Version 0.0.000.0000
Input video AVC
Output format YUV420
Resolution 1920x1088
Crop X,Y,W,H 0,0,0,0
Frame rate 0.00
Memory type d3d
MediaSDK impl hw
MediaSDK version 1.11
On your second question, why I use X11 instead of DRM:
I want to display the video. With DRM nothing is displayed, even when I pass the -r option on the command line. I don't know why.
Below is the output of top -d 1:
top - 16:55:51 up 5 days, 6:48, 4 users, load average: 0.43, 0.22, 0.12
Tasks: 202 total, 1 running, 201 sleeping, 0 stopped, 0 zombie
%Cpu(s): 25.2 us, 3.8 sy, 0.0 ni, 71.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem: 3753612 total, 3245472 used, 508140 free, 2108 buffers
KiB Swap: 3948540 total, 0 used, 3948540 free. 1978088 cached Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
12585 sample 20 0 301816 13796 5596 S 96.6 0.4 0:05.96 sample_decode_x
10726 sample 20 0 1571288 113684 33732 S 10.0 3.0 0:27.70 gnome-shell
10257 root 20 0 153268 15992 7208 S 7.0 0.4 0:17.52 Xorg
11102 sample 20 0 629652 20740 12256 S 2.0 0.6 0:05.83 gnome-terminal-
10629 sample 20 0 999544 26160 15104 S 1.0 0.7 0:01.00 gnome-settings-
1 root 20 0 143232 6948 3776 S 0.0 0.2 0:12.50 systemd
2 root 20 0 0 0 0 S 0.0 0.0 0:00.25 kthreadd
3 root 20 0 0 0 0 S 0.0 0.0 0:00.14 ksoftirqd/0
5 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 kworker/0:0H
7 root rt 0 0 0 0 S 0.0 0.0 0:00.11 migration/0
8 root 20 0 0 0 0 S 0.0 0.0 0:00.00 rcu_bh
9 root 20 0 0 0 0 S 0.0 0.0 0:00.00 rcuob/0
10 root 20 0 0 0 0 S 0.0 0.0 0:00.00 rcuob/1
11 root 20 0 0 0 0 S 0.0 0.0 0:00.00 rcuob/2
12 root 20 0 0 0 0 S 0.0 0.0 0:00.00 rcuob/3
13 root 20 0 0 0 0 S 0.0 0.0 1:18.69 rcu_sched
14 root 20 0 0 0 0 S 0.0 0.0 0:37.20 rcuos/0
15 root 20 0 0 0 0 S 0.0 0.0 0:14.72 rcuos/1
16 root 20 0 0 0 0 S 0.0 0.0 0:20.27 rcuos/2
17 root 20 0 0 0 0 S 0.0 0.0 0:17.39 rcuos/3
18 root rt 0 0 0 0 S 0.0 0.0 0:01.76 watchdog/0
19 root rt 0 0 0 0 S 0.0 0.0 0:01.73 watchdog/1
20 root rt 0 0 0 0 S 0.0 0.0 0:00.10 migration/1
21 root 20 0 0 0 0 S 0.0 0.0 0:00.02 ksoftirqd/1
23 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 kworker/1:0H
24 root rt 0 0 0 0 S 0.0 0.0 0:01.56 watchdog/2
25 root rt 0 0 0 0 S 0.0 0.0 0:00.12 migration/2
26 root 20 0 0 0 0 S 0.0 0.0 0:00.05 ksoftirqd/2
28 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 kworker/2:0H
29 root rt 0 0 0 0 S 0.0 0.0 0:01.56 watchdog/3
30 root rt 0 0 0 0 S 0.0 0.0 0:00.11 migration/3
31 root 20 0 0 0 0 S 0.0 0.0 0:00.01 ksoftirqd/3
33 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 kworker/3:0H
34 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 khelper
35 root 20 0 0 0 0 S 0.0 0.0 0:00.00 kdevtmpfs
36 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 netns
37 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 writeback
38 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 kintegrityd
39 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 bioset
40 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 kblockd
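A note on reading these figures: by default, top reports a process's %CPU relative to a single core, so 96.6 means roughly one core kept fully busy rather than 96.6% of the whole machine. The per-CPU kernel threads listed above (watchdog/0 through watchdog/3) point to four logical CPUs, which is consistent with the 25.2% user time in the summary line. A minimal sketch of that arithmetic, assuming top's default per-core mode and four logical CPUs; the function name is illustrative:

```python
def machine_share(process_cpu_percent: float, logical_cpus: int) -> float:
    """Convert top's per-core %CPU figure into a share of the whole machine."""
    if logical_cpus <= 0:
        raise ValueError("logical_cpus must be positive")
    return process_cpu_percent / logical_cpus


# 96.6 is taken from the top output above; the CPU count of 4 is an
# assumption inferred from the watchdog/0..3 kernel threads.
share = machine_share(96.6, 4)
print(f"sample_decode_x11 uses about {share:.1f}% of the whole machine")
```

Either way, one core fully occupied is far above the <10% expected for hardware-accelerated decode.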
Hello Johnson,
Thank you for the detailed report on the issue. The Media Server Studio product is intended for server use cases and is not optimized for client use cases. That means we intend the samples to be run in headless mode (using DRM), with the output either streamed as UDP packets or written to a file and then played with a player.
In your case, you are rendering the output through X11, which we do not recommend because that path is not optimized. That is why you are observing such high CPU usage. In short, our recommendation is to run headless using DRM. Apologies for not making this clear from the start or in the documentation. Questions like this highlight the gaps in our documentation, and we will improve it in the future.
Thanks for your reply!
My team wants to use Media SDK in our products for local decoding and display. The use cases are decoding and displaying stream data that arrives over the network, or that is stored on local disks. Our product also does other things at the same time, such as recording.
So we want local decoding and display at very low CPU usage, to leave more CPU for other tasks, and at very low latency for network playback.
Does this describe our scenario clearly enough? If not, I will give more details.
Please help us find a way to apply Media SDK to this scenario. Thanks!
I found that vaPutSurface is the source of the high CPU usage.
If I comment out the call, the CPU usage drops.
Hello Johnson,
The vaPutSurface function renders the frame on the screen, and I am glad you identified it as the bottleneck. Commenting out that call disables drawing the decoded output on the screen. Here are my recommendations based on what you want to achieve:
1. Our decoder performance is very competitive, but we have not optimized the X11 interface. Our samples and tutorials are meant to be starting points for application development, not product-quality code. So the most efficient approach would be to write an optimized X11 interface for your system that displays the decoded frames.
2. Use a small circular buffer (or file I/O) to write the decoded frames at 30 fps (or your playback rate), and have an external player read from the buffer or file. This way you avoid writing large files and can also control the rate.
3. Stream the output over UDP locally and play it back with ffplay or another player (such as VLC).
Hope this helps. If I get more suggestions from my colleagues, I will let you know.
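As a rough illustration of recommendation 2, the sketch below shows a fixed-size circular buffer that keeps only the most recent frames, so a slow reader never forces the writer to grow memory without bound. It is a generic Python sketch, not Media SDK code: the `FrameRing` class, its capacity, and the byte strings standing in for decoded frames are all assumptions made for illustration.

```python
from collections import deque
from typing import Optional


class FrameRing:
    """Fixed-capacity circular buffer that overwrites the oldest frame when full."""

    def __init__(self, capacity: int) -> None:
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        # A deque with maxlen discards the oldest item automatically on overflow.
        self._frames: deque = deque(maxlen=capacity)
        self.dropped = 0

    def push(self, frame: bytes) -> None:
        """Store a decoded frame, counting a drop if the buffer was already full."""
        if len(self._frames) == self._frames.maxlen:
            self.dropped += 1
        self._frames.append(frame)

    def pop(self) -> Optional[bytes]:
        """Return the oldest buffered frame, or None if the buffer is empty."""
        return self._frames.popleft() if self._frames else None


def frame_interval(fps: float) -> float:
    """Seconds a paced reader or writer should wait between frames."""
    return 1.0 / fps


ring = FrameRing(capacity=3)
for i in range(5):
    ring.push(f"frame-{i}".encode())  # placeholder for decoded frame data

print(ring.pop(), ring.dropped, frame_interval(25))
```

A real writer and reader running in separate threads or processes would additionally need locking or shared memory around the buffer; this sketch only shows the overwrite-oldest policy and the pacing interval.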
Thanks.
Now I want to display the decoded frames using OpenGL.
But I have a question: is there a way to transfer decoded frames (always NV12 format in Media SDK) to an OpenGL texture within the GPU, instead of copying from GPU to CPU and then back to GPU?
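To put the cost of that GPU to CPU to GPU round trip in numbers, the sketch below estimates the size of one NV12 frame and the copy bandwidth at the stream's frame rate. It relies on the standard NV12 layout (a full-resolution Y plane followed by an interleaved UV plane subsampled by two in each direction, 12 bits per pixel) and uses the 1920x1088 surface size and 25 fps reported earlier in the thread. It ignores any row pitch padding a driver might add, and the function name is illustrative.

```python
def nv12_frame_bytes(width: int, height: int) -> int:
    """Size in bytes of one tightly packed NV12 frame (no row padding)."""
    if width % 2 or height % 2:
        raise ValueError("NV12 needs even width and height")
    luma = width * height                      # one byte per pixel
    chroma = (width // 2) * (height // 2) * 2  # interleaved U and V, subsampled 2x2
    return luma + chroma


frame = nv12_frame_bytes(1920, 1088)
per_second = frame * 25      # bytes copied per second in ONE direction
round_trip = per_second * 2  # download to CPU plus upload back to GPU

print(frame, per_second, round_trip)
```

That is about 3.1 MB per frame and roughly 157 MB/s of copying for the round trip at 25 fps, which is the overhead a GPU-side transfer would avoid.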
Hello johnson, Happy New Year!
We do not yet have an example or support for doing this easily on Linux, although on Windows we support DXVA+OGL surface sharing. We understand the need for OGL on Linux, but because the server product (for Linux) is usually run headless, we have not prioritized it. Your feedback is welcome, and we will plan to include this in future releases.
In the meantime, I recommend you look at the MMSF framework to achieve what you are looking for: https://software.intel.com/sites/landingpage/mmsf/documentation/index.html. It comes with samples to help you get started. For your use case, this sample is relevant: https://software.intel.com/sites/landingpage/mmsf/documentation/mmsf_example1.html. Hope this helps.
Thanks!
Hello johnson,
I have the same requirement as you! We also want local decoding and display at very low CPU usage, to leave more CPU for other tasks, and at very low latency for network playback.
Have you resolved the problem? Can you share your experience? Thank you in advance.