Hi,
You are right: mkl_?imatcopy is not optimized for non-square matrices, because even an optimized in-place version would be much slower than out-of-place transposition. In such situations we usually either use mkl_?omatcopy or, if it suits the algorithm, a gather-operation-scatter technique: copy a block of data to a temporary buffer, perform the needed operations, and scatter the data back to its place. This technique lets you reuse data in cache and generally improves performance.
The square case is well optimized, since that is where mkl_?imatcopy can really help.
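To make the out-of-place route concrete, here is a minimal sketch in plain C of transposing a rectangular matrix through a temporary buffer and copying the result back. The helper name transpose_via_buffer, the row-major layout and the double element type are assumptions for illustration only; this is not an MKL routine and not how MKL implements anything internally. With MKL available, the loop nest is the step where a call to mkl_domatcopy would presumably go instead.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not part of MKL): transposes a row-major
 * rows x cols matrix stored in `a`, so that on return `a` holds the
 * cols x rows transpose. Works through a temporary buffer.
 * Returns 0 on success, -1 if the buffer cannot be allocated. */
int transpose_via_buffer(double *a, size_t rows, size_t cols)
{
    assert(a != NULL);
    if (rows == 0 || cols == 0)
        return 0;

    double *tmp = malloc(rows * cols * sizeof *tmp);
    if (tmp == NULL)
        return -1;

    /* Out-of-place transpose into tmp (cols x rows, row-major).
     * Assumption: with MKL, this loop nest is what an out-of-place
     * routine such as mkl_domatcopy would replace. */
    for (size_t i = 0; i < rows; ++i)
        for (size_t j = 0; j < cols; ++j)
            tmp[j * rows + i] = a[i * cols + j];

    /* Copy the transposed data back over the original storage. */
    memcpy(a, tmp, rows * cols * sizeof *tmp);
    free(tmp);
    return 0;
}
```

The cost is one extra buffer the size of the matrix, which is the trade-off against a true in-place transposition.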
Hi,
Thanks for the quick answer :)
I will switch to copy/omatcopy for now. imatcopy has really impressive performance for square matrices.
Even if it is not fully optimized for the rectangular case, I would have expected better performance than my naive algorithm.
