____2
Beginner
152 Views

Copy speed between video memory and system memory

Hi,

I tested the copy speed between video memory and system memory. These are the results:

   

Surface   Direction                          FPS    MB/s   GB/s
720p      system memory -> video memory     1100    1450    1.4
720p      video memory -> system memory      161     212    0.2
1080p     system memory -> video memory      420    1245    1.2
1080p     video memory -> system memory       72     213    0.2

Now, my question is: why is copying from video memory to system memory so slow?

thanks!

6 Replies
Shaojuan_Z_Intel
Employee
152 Views

Hi there,

Could you provide more information about your test environment? For example, what kind of system were the tests run on? What is the operating system? Did you use any Media SDK samples to measure the result? How was the data generated? Did you use any third-party tools in the experiment? It is hard to explain the results from those numbers alone. I look forward to more information. Thank you!

____2
Beginner
152 Views

Test environment: i7-4558U

Test method: memcpy between system memory and video memory (surfaces created with D3D9 functions)

Test result:   Copy from system memory to video memory: 1450 MB/s

               Copy from video memory to system memory: 212 MB/s

 

Shaojuan_Z_Intel
Employee
152 Views

Hi,

Different tools may use different mechanisms for copying between system and video memory. Can you explain more about your application? Are you using Media SDK? Inside Media SDK, the copy between system and video memory is handled internally and can be asymmetric. The latest Media SDK has a parameter called GPUCopy inside mfxInitParam, which enables or disables GPU-accelerated copying between video and system memory. Did you compare results on your system with GPUCopy turned on and off? I would like to know more details about your application. Thanks!

____2
Beginner
152 Views



Well, I will try to explain in more detail.

First I created IDirect3DSurface9 surfaces with the method

"IDirectXVideoAccelerationService::CreateSurface" (declared in the D3D9 header dxva2api.h),

decoded some video data into these surfaces, declared a D3DLOCKED_RECT (defined in <d3d9types.h>), and locked the surface with the method

"virtual HRESULT IDirect3DSurface9::LockRect(D3DLOCKED_RECT * pLockedRect, CONST RECT * pRect, DWORD Flags)".

At that point I had a valid pointer, D3DLOCKED_RECT::pBits, to the region holding the video data.

I believe this region is in video memory (or have I misunderstood something here?).

With this pointer, I copied data between system memory and video memory using memcpy() without any difficulty. That is all I did.

I am hoping for an explanation of the numbers listed above.

Thank you.

Shaojuan_Z_Intel
Employee
152 Views

Hi there,

It is hard to tell exactly what your test application is doing, but there are many reasons why you might see slow behavior when using (only) memcpy to transfer data from video memory to system memory. To transfer video/frame data to system memory, we recommend using the Media SDK API (with SYSTEM_MEMORY flags), as it contains optimized solutions that are much faster than a basic 'memcpy' operation. Thanks!

152 Views

Hi Chen - The question you posted is not really related to Media SDK; I believe it belongs in a DirectX forum. It's an interesting question nevertheless. May I ask whether you ensured the same experimental setup for both directions (no caching effects, for instance) when computing the bandwidth? It seems you first decoded the data onto a video surface, then memcpy'd it to system memory, then memcpy'd it back to video memory. I'd think the first transfer could be slower than the second because of caching. It also depends on the state of memory on the system and video sides: is your copy causing a lot of thrashing on one side compared to the other, or is one side hitting capacity while the other is not?

I understand this is probably not what you're looking for, but again, your question is more relevant to a DirectX forum than to this one. For Media SDK, the SDK handles system<->video copying internally. Hope this helps.
