A few months ago I posted about the difficulty of writing a matrix transposition code that performs well on Xeon Phi. Since then, I have done more homework and improved the code, which now reaches a satisfactory transposition rate of 113 GB/s on the 7110P (67% of the STREAM copy bandwidth). The white paper describing it is at: http://research.colfaxinternational.com/post/2013/08/12/Trans-7110.aspx The paper contains a discussion of the method, code snippets, compiler flags, benchmarks, and a comparison with MKL; the source code is publicly available.
1 Reply
Thank you, Andrey.
For those who want to see Andrey's original post, it is at: http://software.intel.com/en-us/forums/topic/391162.
