drfrank
Beginner

Copying Accelerated Video Decode Frame Buffers, by Intel employee "Tom Craver"

"I belong to the FFDShow Tryout development team, and we are trying to reproduce your work.
For your information, we recently imported the MPC-HC DXVA implementation into our project.

The goal is to decode the frames with DXVA 1 and 2, copy them back into system memory for processing, and then write them back.

But we have a problem: we don't get the same speeds as you do (I have a Q9450 with a Radeon 5750 in a PCI Express x16 slot).
With either memcpy or the SSE4.1-optimized copy method, it takes 80 ms to copy one frame.

Do you have any idea what is wrong?"
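
To put the 80 ms figure in perspective, here is a rough back-of-the-envelope sketch. The frame size is an assumption, since the post does not state the resolution or surface format: it uses a 1920x1080 NV12 surface (12 bits per pixel), and real decode surfaces are often padded beyond that. The helper names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Bytes in one NV12 frame: a full-size luma plane plus a half-size
   interleaved chroma plane, i.e. 12 bits per pixel.
   ASSUMPTION: no row padding (pitch == width). */
static size_t nv12_frame_bytes(size_t width, size_t height)
{
    return width * height * 3 / 2;
}

/* Upper bound on frame rate if the copy were the only cost per frame. */
static double max_fps_from_copy_ms(double copy_ms)
{
    assert(copy_ms > 0.0);
    return 1000.0 / copy_ms;
}

/* Effective copy throughput in MB/s (1 MB = 10^6 bytes). */
static double copy_throughput_mb_s(size_t bytes, double copy_ms)
{
    assert(copy_ms > 0.0);
    return ((double)bytes / 1e6) / (copy_ms / 1000.0);
}
```

Under these assumptions, 80 ms per frame caps the pipeline at 12.5 frames per second from the copy alone, and the 3,110,400-byte frame works out to roughly 39 MB/s of effective read throughput, which is too slow for real-time playback at common frame rates.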

"Either we are doing something wrong (but I am beginning to doubt it), or else transfers in the GPU=>CPU direction are slow by design.

I hope someone will be able to get in touch with the Intel engineer who wrote this article (but I guess that he only tried low-resolution videos).
...Also note that we are talking about reading (GPU=>CPU); writing, by contrast, is very fast."

http://software.intel.com/en-us/articles/copying-accelerated-video-decode-frame-buffers/

http://forum.doom9.org/showthread.php?p=1367471#post1367471

http://forum.doom9.org/showthread.php?p=1367653#post1367653


I will check with Tom to see if he can comment on this thread, and will keep you posted.

Thomas_C_Intel1
Employee

drfrank:

I apologize for not responding sooner - somehow I missed seeing notification of your post.

My own testing involved copying hardware-decoded high-definition video frames from Intel integrated graphics USWC (Uncacheable Speculative Write Combining) memory back to conventional read/write memory on an Intel Core i5 processor.

I don't believe the MOVNTDQA instruction provides much, if any, benefit for discrete graphics processors with USWC memory mapped over PCI Express. The benefits mainly apply to system memory mapped as USWC, including system memory mapped for integrated graphics use.
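
For readers unfamiliar with the instruction, the following is a minimal sketch of a copy loop built on MOVNTDQA through the SSE4.1 intrinsic `_mm_stream_load_si128`. It is not the code from the article, only a simplified illustration. The function name `copy_frame_streaming_load` is hypothetical, both pointers are assumed to be 16-byte aligned, and in real use the source would be a locked decode surface in USWC memory. On ordinary cached memory the loop still works, but the streaming load behaves like a normal load.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#if defined(__x86_64__) || defined(__i386__) || defined(_M_X64) || defined(_M_IX86)
#include <immintrin.h>
#define COPY_HAS_SSE41 1
#else
#define COPY_HAS_SSE41 0
#endif

/* Lets GCC/Clang compile this one function for SSE4.1 without -msse4.1. */
#if COPY_HAS_SSE41 && (defined(__GNUC__) || defined(__clang__))
#define SSE41_TARGET __attribute__((target("sse4.1")))
#else
#define SSE41_TARGET
#endif

/* Hypothetical helper: copy `bytes` bytes from src to dst using
   MOVNTDQA streaming loads. ASSUMPTION: src and dst are 16-byte aligned. */
SSE41_TARGET
static void copy_frame_streaming_load(void *dst, const void *src, size_t bytes)
{
    assert(((uintptr_t)src & 15) == 0);
    assert(((uintptr_t)dst & 15) == 0);
#if COPY_HAS_SSE41
    /* Some older headers declare the intrinsic with a non-const pointer. */
    __m128i *s = (__m128i *)src;
    __m128i *d = (__m128i *)dst;
    size_t blocks = bytes / 64;          /* one 64-byte cache line per pass */

    for (size_t i = 0; i < blocks; ++i) {
        /* Issue all four loads for the line before any store. */
        __m128i r0 = _mm_stream_load_si128(s + 0);
        __m128i r1 = _mm_stream_load_si128(s + 1);
        __m128i r2 = _mm_stream_load_si128(s + 2);
        __m128i r3 = _mm_stream_load_si128(s + 3);
        _mm_store_si128(d + 0, r0);
        _mm_store_si128(d + 1, r1);
        _mm_store_si128(d + 2, r2);
        _mm_store_si128(d + 3, r3);
        s += 4;
        d += 4;
    }

    /* Copy any tail shorter than 64 bytes with plain memcpy. */
    size_t done = blocks * 64;
    memcpy((char *)dst + done, (const char *)src + done, bytes - done);
#else
    /* Non-x86 fallback so the sketch still builds and runs. */
    memcpy(dst, src, bytes);
#endif
}
```

Whether a loop like this helps depends on where the USWC memory physically lives, as described above: on a discrete card the reads still cross PCI Express, so a change of load instruction alone should not be expected to fix an 80 ms copy.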

-Tom Craver