My CPU is a Core i5-2430M. I have downloaded the Media SDK (version 3.5.915.45249) and the driver (9.17.10.2875). I tried to run the application "...\Media SDK 2012 R3\samples\_bin\win32\sample_decode.exe" with the command below:
"sample_decode.exe h264 -i 264standard.264 -o yuvtest.yuv -hw"
While the application is running, I find that the EU of the GPU doesn't work, but the MFX works normally. My version of GPA is 12.5.0.187105.
To my surprise, both the EU and the MFX of the GPU work well when I run sample_encode.exe from the samples folder.
My question is: why doesn't the EU of the GPU work when I use the decode function in the samples folder?
6 Replies
Hi,
What you are observing is expected when only decoding. The decode is 'hardware accelerated', but simple decoding does not need both the MFX and EU engines.
A useful reference that discusses the various components of the graphics unit is here:
http://software.intel.com/en-us/articles/using-intel-graphics-performance-analyzer-gpa-to-analyze-intel-media-software-development
-Tony
Hi, Tony
I spent a whole day reading the PDF you pointed me to and using GPA to analyse my program. I have now learned a lot about GPA and Quick Sync Video technology. I think the Motion Estimation track and the Coding track are important for improving my application. Thank you very much!
But I still don't know why the decode sample consumes about 22% of the CPU, while the encode sample consumes only about 11%. I want to reduce the CPU usage. Can you tell me why sample_decode.exe in the samples folder consumes so much, and what I should do?
Hi,
For the usage you describe, the CPU can be busy doing many things. When you provide an output file to sample_decode, the decoded frames are written out as large YUV 4:2:0 data. Software tools (like Intel Parallel Studio) can analyze what the processor is doing, and you should see that it spends time in functions like "mfxStatus CSmplYUVWriter::WriteNextFrame(mfxFrameSurface1 *pSurface)". You can remove the file writing and decode to the screen by using the "-r" option.
-Tony
Hi, Tony
I just ran the sample following your suggestion, and the CPU usage is fine! But I don't think "mfxStatus CSmplYUVWriter::WriteNextFrame" consumes that much CPU, because when I comment out the WriteNextFrame calls, the CPU is still busy. I think the key problem is that I don't limit the decoding speed! I would like the decoding speed to be approximately equal to the frame rate read from the .264 file header. My question is: can I limit the decoding speed? My current solution is to add a "Sleep(40);" call before every MFXVideoDECODE_DecodeFrameAsync call.
Hi,
Yes, there are many ways to limit decoding speed. Media SDK executes operations when they are requested, so the application can decide when to request them.
If the application is used only for playback, it is common to decode into all available buffers while limiting display/presentation of those buffers to the desired playback rate.
-Tony
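Editor's note: the pacing idea discussed above can be sketched as follows. A fixed "Sleep(40);" adds the time spent decoding on top of the 40 ms delay, so the effective rate drifts below 25 fps; waiting on absolute deadlines avoids that. This is only a minimal sketch under assumptions: the FramePacer class and its methods are hypothetical names, not part of Media SDK, and the 25 fps value is simply what a 40 ms sleep implies.

```cpp
#include <cassert>
#include <chrono>
#include <thread>

// Hypothetical helper (not a Media SDK API): paces a loop to a target
// frame rate using absolute deadlines on a monotonic clock.
class FramePacer {
public:
    using Clock = std::chrono::steady_clock;

    explicit FramePacer(double framesPerSecond)
        : interval_(toInterval(framesPerSecond)),
          next_(Clock::now() + interval_) {}

    // Blocks until the next frame deadline, then schedules the following one.
    void waitForNextFrame() {
        std::this_thread::sleep_until(next_);
        next_ += interval_;
        // If the caller has fallen behind, resynchronize instead of
        // bursting through several frames to catch up.
        const Clock::time_point now = Clock::now();
        if (next_ < now) {
            next_ = now + interval_;
        }
    }

    Clock::duration interval() const { return interval_; }

private:
    // Converts a frame rate into the clock's native duration type.
    static Clock::duration toInterval(double framesPerSecond) {
        assert(framesPerSecond > 0.0);
        return std::chrono::duration_cast<Clock::duration>(
            std::chrono::duration<double>(1.0 / framesPerSecond));
    }

    Clock::duration interval_;   // time between two frames
    Clock::time_point next_;     // deadline of the next frame
};
```

In a decode loop, one would construct the pacer once with the stream's frame rate and call waitForNextFrame() where the "Sleep(40);" call was placed, before each MFXVideoDECODE_DecodeFrameAsync request. For playback, the same call could instead gate presentation, as suggested above.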
Hi, Tony
Now I know what to do. Thank you for your help.