Thanks a lot, I'll try it.
By the way, I used -O3 -xN for P4 optimization. Are there any more
optimized (even aggressive) options I could try?
Depending on your application, options such as -O3, -ip, and -ipo may or may not be beneficial. They can get aggressive enough to lead to multi-hour compiles.
If you're interested in vectorization, you should look at the loop-level directives. The 8.0 compilers changed recently so that arrays are set up with 16-byte alignment when that is possible. This gave more opportunities to use the VECTOR ALIGNED directive, except that it may have been disabled in 8.0.046. That directive is aggressive: your code breaks if the data aren't aligned as you said they would be.
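For illustration, here is a minimal sketch of what the VECTOR ALIGNED directive looks like on a loop. The program name, array names, and sizes are made up for the example. The `ATTRIBUTES ALIGN` line is an assumption on my part: it is the way later Intel Fortran versions let you request alignment explicitly, and it may not be available in 8.0, where you would be relying on the compiler's default 16-byte alignment instead. Other compilers treat both `!DIR$` lines as plain comments.

```fortran
program vector_aligned_sketch
  implicit none
  integer, parameter :: n = 1024   ! illustrative size
  real :: a(n), b(n), c(n)
  integer :: i

  ! Assumption: explicit alignment request as in later Intel Fortran
  ! versions; with 8.0 you may have to rely on the default alignment.
  !DIR$ ATTRIBUTES ALIGN : 16 :: a, b, c

  b = 1.0
  c = 2.0

  ! Promise to the compiler that every array access in the next loop
  ! is 16-byte aligned, so it can use aligned vector loads and stores.
  ! If that promise is false, the program can fault at run time.
  !DIR$ VECTOR ALIGNED
  do i = 1, n
     a(i) = b(i) + c(i)
  end do

  ! Both values should be 3.0
  print *, a(1), a(n)
end program vector_aligned_sketch
```

The directive applies only to the loop that immediately follows it, so each loop you want treated this way needs its own copy.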
If your loop alternates writing among more than 4 cache lines (for Northwood, HT inactive), and this is not taken care of by partial loop vectorization splitting, the !DIR$ DISTRIBUTE POINT directive may be effective: it asks the compiler to split your loop at the point where you put the directive.
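A rough sketch of that placement, with made-up array names and an arbitrary choice of six output arrays: the loop below writes to six separate arrays per iteration, and the directive in the middle asks the compiler to distribute it into two loops that each write to three. Whether the split actually helps depends on your data and your CPU, so treat it as something to measure rather than a guaranteed win.

```fortran
program distribute_point_sketch
  implicit none
  integer, parameter :: n = 1024   ! illustrative size
  real :: a(n), b(n), c(n), d(n), e(n), f(n), s(n)
  integer :: i

  s = 1.0

  ! Six write streams in one loop body: more than the 4 cache lines
  ! mentioned above for Northwood with HT inactive.
  do i = 1, n
     a(i) = s(i) + 1.0
     b(i) = s(i) + 2.0
     c(i) = s(i) + 3.0
     ! Ask the compiler to split the loop here: the statements above
     ! go into one loop, the statements below into a second one.
     !DIR$ DISTRIBUTE POINT
     d(i) = s(i) + 4.0
     e(i) = s(i) + 5.0
     f(i) = s(i) + 6.0
  end do

  ! Should print 2.0 and 7.0
  print *, a(n), f(n)
end program distribute_point_sketch
```

The result is the same as writing two separate loops by hand, which is also a reasonable fallback if the compiler ignores the directive.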