____2
Beginner

How to improve performance of copy YUV from GPU memory to system memory


Hi, I'm working on a Haswell Core i5 CPU. When I copy 720p YUV data from GPU memory to CPU memory, I get only 80 frames per second.

Is this speed normal, and how can I improve the copy performance?

Thanks!

1 Solution
SRAVANTHI K. (Intel)

Hello there - Can you kindly give more details on what you are doing?

If you are using Media SDK, surfaces can live in system memory (on the CPU), in video memory (on the GPU), or in opaque memory (managed by the SDK). With system memory, performance will not be as good as with video memory because of the extra copy, but the SDK handles the copy between system and video memory internally, so the developer does not have to manage it.

You can look at the tutorials to understand how the SDK deals with system and video memory - https://software.intel.com/en-us/intel-media-server-studio-support/training

An article on the Media SDK (MSDK) framework that also covers surfaces - https://software.intel.com/en-us/articles/framework-for-developing-applications-using-media-sdk#Allocate Surfaces



____2
Beginner

In reply to SRAVANTHI K. (Intel):

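I am just using the system memory surfaces from the decode tutorial. I copy each frame like this:

    // Lock the surface so that frame->Data.Y and frame->Data.UV are valid.
    m_pMFXAllocator->LockFrame(frame->Data.MemId, &frame->Data);
    // Y plane: Height rows of Pitch bytes each.
    memcpy(m_gpu_Y, frame->Data.Y, frame->Info.Height * frame->Data.Pitch);
    // Interleaved UV plane: Height / 2 rows of Pitch bytes each.
    memcpy(m_gpu_UV, frame->Data.UV, frame->Info.Height * frame->Data.Pitch / 2);
    mfxU16 pitch = frame->Data.Pitch;
    m_pMFXAllocator->UnlockFrame(frame->Data.MemId, &frame->Data);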

 

How can I speed up the copy?
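One thing worth checking in a copy like the one above is how much padding it moves. The two memcpy calls copy Height * Pitch bytes, and Pitch can be larger than the visible row width. The following is a minimal sketch, assuming an 8-bit NV12 layout (a Y plane followed by an interleaved UV plane with half as many rows) and a tightly packed destination. CopyPlane and CopyNV12ToPacked are hypothetical helper names, not Media SDK functions, and whether this actually helps depends on how much padding the surface has.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Copy `rows` rows of `rowBytes` visible bytes from a source plane with
// stride `srcPitch` into a destination plane with stride `dstPitch`.
inline void CopyPlane(std::uint8_t* dst, std::size_t dstPitch,
                      const std::uint8_t* src, std::size_t srcPitch,
                      std::size_t rowBytes, std::size_t rows) {
    assert(rowBytes <= srcPitch && rowBytes <= dstPitch);
    if (srcPitch == rowBytes && dstPitch == rowBytes) {
        // No padding on either side: one bulk copy is enough.
        std::memcpy(dst, src, rowBytes * rows);
        return;
    }
    for (std::size_t r = 0; r < rows; ++r) {
        std::memcpy(dst + r * dstPitch, src + r * srcPitch, rowBytes);
    }
}

// Hypothetical helper (assumes NV12 with even width and height): copy a
// padded frame into tightly packed Y and UV buffers.
inline void CopyNV12ToPacked(std::uint8_t* dstY, std::uint8_t* dstUV,
                             const std::uint8_t* srcY, const std::uint8_t* srcUV,
                             std::size_t pitch, std::size_t width,
                             std::size_t height) {
    CopyPlane(dstY, width, srcY, pitch, width, height);       // Y: height rows
    CopyPlane(dstUV, width, srcUV, pitch, width, height / 2); // UV: height / 2 rows
}
```

With the names from the post, the call would look roughly like CopyNV12ToPacked(m_gpu_Y, m_gpu_UV, frame->Data.Y, frame->Data.UV, frame->Data.Pitch, frame->Info.CropW, frame->Info.CropH), assuming m_gpu_Y and m_gpu_UV are sized for a packed frame and that the crop fields describe the visible picture.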

SRAVANTHI K. (Intel)

Hello there - When you use system memory with the MSDK APIs, the SDK handles the copy automatically; the developer does not have to deal with it. For best performance, though, we recommend using video memory instead of system memory, which eliminates the copy entirely.

Let me know if you need more details.

____2
Beginner

In reply to SRAVANTHI K. (Intel):

Thanks for your reply!

Sorry, my earlier wording was wrong. In fact, I copied the data directly from video memory.

So what I am wondering is: what is the upper bound on the speed of copying data from video memory to my own memory? I get only 60-80 frames per second (720 * 1280 YUV video frames) this way. Obviously that is not the best possible performance. How can I make it faster?

 

167 Views

If I understand your question correctly, you want to know what performance to expect when "manually" copying data from video memory to system memory. That number certainly depends on your system configuration (bandwidth, memory, etc.).
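For a sense of scale, the frame rates reported in this thread can be converted into a byte rate. This is a rough sketch, assuming 8-bit 4:2:0 data (1.5 bytes per pixel) and ignoring pitch padding; FrameBytes420 and BytesPerSecond are illustrative names only.

```cpp
#include <cstdint>

// Bytes in one 8-bit 4:2:0 frame: a full-size Y plane plus half as much chroma.
constexpr std::uint64_t FrameBytes420(std::uint64_t width, std::uint64_t height) {
    return width * height * 3 / 2;
}

// Sustained copy rate in bytes per second at a given frame rate.
constexpr std::uint64_t BytesPerSecond(std::uint64_t frameBytes, std::uint64_t fps) {
    return frameBytes * fps;
}

// 1280 x 720 works out to 1,382,400 bytes per frame.
static_assert(FrameBytes420(1280, 720) == 1382400, "720p 4:2:0 frame size");
// 60 fps is about 83 MB/s; 80 fps is about 111 MB/s.
static_assert(BytesPerSecond(FrameBytes420(1280, 720), 60) == 82944000, "60 fps");
static_assert(BytesPerSecond(FrameBytes420(1280, 720), 80) == 110592000, "80 fps");
```

Under those assumptions, 60-80 frames per second corresponds to roughly 83-111 MB/s. That is well below the main-memory bandwidth one would normally expect from a desktop-class system, which suggests the limit is more likely the way the video memory is being read than raw bandwidth.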

The simplest way to test the upper bound is to try our tutorials: we have tutorials that compare system memory usage with video memory usage. Using the former, you can get the upper bound on the performance one could achieve. You can find the tutorials here - https://software.intel.com/en-us/intel-media-server-studio-support/training

If your question is on how to efficiently copy data from GPU to CPU memory, that is beyond the scope of Media SDK. 
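Outside of Media SDK itself, one commonly used technique for this situation is reading with streaming loads. A locked video memory surface is often mapped as uncacheable write-combining (USWC) memory, where ordinary loads, including those issued by memcpy, tend to be slow. The SSE4.1 MOVNTDQA instruction (the _mm_stream_load_si128 intrinsic) is intended for reading such memory. The sketch below is a minimal illustration, not a tuned implementation. It assumes the surface in this thread really is USWC-mapped, which the thread does not confirm. StreamingCopy is a hypothetical name, and the code falls back to plain memcpy when SSE4.1 is not enabled at compile time or the pointers are not 16-byte aligned.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#if defined(__SSE4_1__)
#include <smmintrin.h>  // _mm_stream_load_si128 (MOVNTDQA)
#endif

// Copy `bytes` bytes from `src` to `dst`. With SSE4.1 enabled and both
// pointers 16-byte aligned, whole 64-byte chunks are read with streaming
// loads; everything else (tail, unaligned input) goes through memcpy.
// Note: __SSE4_1__ is defined by GCC/Clang with -msse4.1; other compilers
// may need a different check.
inline void StreamingCopy(void* dst, const void* src, std::size_t bytes) {
    auto* d = static_cast<std::uint8_t*>(dst);
    const auto* s = static_cast<const std::uint8_t*>(src);
    std::size_t done = 0;
#if defined(__SSE4_1__)
    const bool aligned =
        reinterpret_cast<std::uintptr_t>(s) % 16 == 0 &&
        reinterpret_cast<std::uintptr_t>(d) % 16 == 0;
    if (aligned) {
        for (; done + 64 <= bytes; done += 64) {
            // Some headers declare the load with a non-const pointer parameter.
            __m128i* sp = reinterpret_cast<__m128i*>(
                const_cast<std::uint8_t*>(s + done));
            __m128i* dp = reinterpret_cast<__m128i*>(d + done);
            // Issue all four loads of the 64-byte line before storing.
            const __m128i a = _mm_stream_load_si128(sp + 0);
            const __m128i b = _mm_stream_load_si128(sp + 1);
            const __m128i c = _mm_stream_load_si128(sp + 2);
            const __m128i e = _mm_stream_load_si128(sp + 3);
            _mm_store_si128(dp + 0, a);
            _mm_store_si128(dp + 1, b);
            _mm_store_si128(dp + 2, c);
            _mm_store_si128(dp + 3, e);
        }
    }
#endif
    std::memcpy(d + done, s + done, bytes - done);  // remainder or fallback
}
```

On ordinary cached memory this should behave like a regular copy; any speed-up is only expected when the source is write-combining memory, so it is worth measuring both variants on the actual surface.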
