1. ifort vectorizes only inner loops.
2. ifort does not vectorize loops with a low trip count (e.g., 2 iterations).
Can I conclude that small (2x2) matrix operations (sum, multiply) will not be auto-vectorized, although there are SSE instructions that perform similar operations?
1 Reply
(1) This is generally true, although transformations such as loop interchange and loop collapsing can involve outer loops too. Conversely, loop materialization enables vectorization of straight-line code.
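To make the loop collapsing idea concrete, here is a minimal sketch in C (the question is about ifort, but the same reasoning applies to contiguous Fortran arrays). The function names `add2x2_nested` and `add2x2_collapsed` are hypothetical, and the collapse is written by hand here; whether a given compiler performs it automatically has to be checked in its vectorization report.

```c
#include <stddef.h>

/* Nested form: a 2x2 matrix stored row-major in 4 doubles.
 * The inner loop has a trip count of only 2. */
void add2x2_nested(const double *a, const double *b, double *c)
{
    for (size_t i = 0; i < 2; ++i)
        for (size_t j = 0; j < 2; ++j)
            c[2 * i + j] = a[2 * i + j] + b[2 * i + j];
}

/* Collapsed form: because the storage is contiguous, the two loops
 * can be merged into a single loop of 4 iterations, which gives the
 * vectorizer a longer, simpler loop to work with. */
void add2x2_collapsed(const double *a, const double *b, double *c)
{
    for (size_t k = 0; k < 4; ++k)
        c[k] = a[k] + b[k];
}
```

Both functions compute the same element-wise sum; only the loop structure presented to the compiler differs.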
(2) For double-precision data, loops with just two iterations can still be vectorized, although the compiler may, by default, reject such short-running loops for efficiency reasons, such as misaligned data or too much setup/cleanup overhead. Use the vectorization diagnostics (-vec_report2) to inspect the reasons why loops are not vectorized, and experiment with pragmas and other compiler hints to see if efficient vectorization is possible. Perhaps you can post an example?
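As a rough sketch of what such hints might look like, the C function below multiplies two 2x2 double-precision matrices with an inner loop of exactly two iterations, which matches the two doubles that fit in a 128-bit SSE register. The name `mul2x2` is hypothetical. The `restrict` qualifiers and the Intel-specific `#pragma vector always` are only suggestions to the compiler (in Fortran the corresponding directive would be `!DIR$ VECTOR ALWAYS`); whether they lead to vectorization, and whether that is actually faster, should be confirmed with the vectorization report.

```c
#include <stddef.h>

/* c = a * b for 2x2 matrices stored row-major in 4 doubles.
 * restrict promises the compiler that the arrays do not overlap. */
void mul2x2(const double *restrict a, const double *restrict b,
            double *restrict c)
{
    for (size_t i = 0; i < 2; ++i) {
        /* Only the Intel compiler is asked to override its own
         * profitability heuristics; other compilers skip the pragma. */
#if defined(__INTEL_COMPILER)
#pragma vector always
#endif
        for (size_t j = 0; j < 2; ++j)
            /* b[j] and b[2 + j] are contiguous in j, as is c[2*i + j] */
            c[2 * i + j] = a[2 * i] * b[j] + a[2 * i + 1] * b[2 + j];
    }
}
```

The inner loop reads and writes contiguous elements in `j`, which is the access pattern a vectorizer handles most easily.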
Hope this helps.
Aart Bik
http://www.aartbik.com/