Hi,
I tested with sample_decode (./sample_decode_drm h264 -i intel_hw_test.h264 -o output.yuv -hw -vaapi) and the decoding performance is very low.
Test data:
- 720p video: decoding runs at 15 frames per second with 100% CPU usage (out of 400% total).
- With the file write operation (m_FileWriter.WriteNextFrame(frame)) removed: decoding runs at 1700 frames per second with 40% CPU usage.
- With the file write operation replaced by a memcpy: decoding runs at 80 frames per second with 100% CPU usage.
So I guess that copying each frame from video memory to system memory is the performance bottleneck.
My questions:
- Is my guess correct?
- Is there a good way to improve decoding efficiency? My requirement is to decode 200 frames per second while consuming less CPU.
Thanks,
hang.liu
Hi hang.liu,
Thank you for your question.
Your guess is partially correct. Yes, copying frames from video memory to system memory consumes CPU and reduces performance. But there is one more thing to consider: the color conversion from NV12 to YUV in sample_decode is currently not implemented in the most efficient way, and so it reduces performance considerably. To be clear, the samples do not provide complete solutions; they are only a starting point.
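To illustrate what the NV12-to-YUV conversion involves, here is a minimal sketch. It assumes the target is tightly packed planar I420 (Y plane, then U, then V) and that the decoded surface has a row pitch that may be wider than the frame. The function name `Nv12ToI420` and its parameters are hypothetical and this is not the code used in sample_decode.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Convert one NV12 frame to tightly packed I420 (Y plane, then U, then V).
// `pitch` is the byte stride of each NV12 row and may be larger than `width`.
// Assumes even width and height. Hypothetical helper, not part of Media SDK.
std::vector<std::uint8_t> Nv12ToI420(const std::uint8_t* y,
                                     const std::uint8_t* uv,
                                     std::size_t width,
                                     std::size_t height,
                                     std::size_t pitch) {
    const std::size_t chromaW = width / 2;
    const std::size_t chromaH = height / 2;
    std::vector<std::uint8_t> out(width * height + 2 * chromaW * chromaH);

    std::uint8_t* dstY = out.data();
    std::uint8_t* dstU = dstY + width * height;
    std::uint8_t* dstV = dstU + chromaW * chromaH;

    // Luma: one memcpy per row drops the pitch padding.
    for (std::size_t row = 0; row < height; ++row) {
        std::memcpy(dstY + row * width, y + row * pitch, width);
    }

    // Chroma: NV12 stores U and V interleaved (UVUV...), so split them
    // into two separate planes.
    for (std::size_t row = 0; row < chromaH; ++row) {
        const std::uint8_t* src = uv + row * pitch;
        for (std::size_t col = 0; col < chromaW; ++col) {
            dstU[row * chromaW + col] = src[2 * col];
            dstV[row * chromaW + col] = src[2 * col + 1];
        }
    }
    return out;
}
```

Building the whole frame in one buffer like this allows a single write call per frame. Whether that is enough to reach the target frame rate would need to be measured on the actual system.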
Depending on your pipeline, there may be more options for increasing decoding speed. Please let us know which pipeline you are looking at and what system configuration you are using.
Thanks,
-Surbhi
- For 720p video, 80 frames per second is equivalent to about 110 MB/s of data. Is that the I/O transfer limit of the GPU hardware?
- Are there any authoritative test reports on Intel hardware decode and encode performance?
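The 110 MB/s figure can be checked with a quick calculation. This sketch assumes 8-bit 4:2:0 output such as NV12 (1.5 bytes per pixel) at 1280x720; the function names are illustrative only.

```cpp
#include <cstdint>

// Bytes in one 8-bit 4:2:0 frame (e.g. NV12): a full-resolution luma plane
// plus chroma at half resolution in both directions, i.e. 1.5 bytes per pixel.
constexpr std::uint64_t FrameBytes420(std::uint32_t width, std::uint32_t height) {
    return static_cast<std::uint64_t>(width) * height * 3 / 2;
}

// Raw (uncompressed) data rate in bytes per second at a given frame rate.
constexpr std::uint64_t BytesPerSecond(std::uint32_t width,
                                       std::uint32_t height,
                                       std::uint32_t fps) {
    return FrameBytes420(width, height) * fps;
}
```

Under these assumptions one 720p frame is 1,382,400 bytes, so `BytesPerSecond(1280, 720, 80)` gives 110,592,000 bytes per second, and the 200 frames per second target corresponds to 276,480,000 bytes per second.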
This topic is being discussed through private message. For the rest of the folks who might be interested:
- Set the IOPattern to output to system memory; this way Media SDK will optimize the output as it is written to system memory. To understand this better, please look at the tutorials simple_decode and simple_decode_vmem, where the IOPattern is pretty well defined. Link to the tutorials: https://software.intel.com/en-us/intel-media-server-studio-support/training. Once the IOPattern is set to output to system memory, there is no need for a separate memory copy.
- Setting your BIOS to its boost setting can also help.
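As a rough sketch of the first suggestion, the output pattern is selected through the `IOPattern` field of `mfxVideoParam` before the decoder is initialized. The helper name `RequestSystemMemoryOutput` is hypothetical. The fallback branch exists only so the snippet compiles without the SDK installed: its stub struct is an assumption, and its constant values are given as recalled from the SDK headers, so check them against your SDK version.

```cpp
#include <cstdint>

#if defined(__has_include)
#  if __has_include(<mfxvideo.h>)
#    include <mfxvideo.h>  // real Media SDK header, used when available
#    define HAVE_MEDIA_SDK 1
#  endif
#endif

#ifndef HAVE_MEDIA_SDK
// Stand-ins (assumptions) so the sketch compiles without the SDK.
// The constants are meant to mirror the SDK's IOPattern flags; the struct
// is a stub holding only the single field used below.
enum : std::uint16_t {
    MFX_IOPATTERN_OUT_VIDEO_MEMORY  = 0x10,
    MFX_IOPATTERN_OUT_SYSTEM_MEMORY = 0x20
};
struct mfxVideoParam {
    std::uint16_t IOPattern;
};
#endif

// Ask the decoder to deliver decoded frames in system memory.
// Intended to be called on the parameters before they are passed to
// MFXVideoDECODE_Init. Hypothetical helper, not part of Media SDK.
inline void RequestSystemMemoryOutput(mfxVideoParam& par) {
    par.IOPattern = MFX_IOPATTERN_OUT_SYSTEM_MEMORY;
}
```

With this pattern the application supplies surfaces allocated in system memory, so decoded frames can be read directly from them; the simple_decode and simple_decode_vmem tutorials mentioned above are the places to see the two patterns in a full pipeline.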
Thanks,
-Surbhi