Is the situation different for the Intel Compiler?
Many thanks!
/QxN does, I am told, enable some additional optimizations over /QxW, but not enough to be worth discarding non-Intel processor compatibility.
Dear Forall,
Note the difference between vectorization for the multimedia extensions SSE/SSE2/SSE3 (enabled by default with any of the switches /QxK, /QxW, /QxN, or /QxP) and parallelization for, e.g., dual cores and/or Hyper-Threading Technology (enabled with /Qparallel for implicit parallelism or /Qopenmp for explicit parallelism). I initially recommended the former, and then answered your question about the latter, assuming that was what you were referring to.
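To make the distinction concrete, here is a minimal C sketch of the same loop written twice. The function names `scale_add_vec` and `scale_add_omp` are hypothetical and chosen only for illustration. The first loop is the kind of code an automatic vectorizer can typically turn into SSE instructions on a single core; the second carries an OpenMP directive that requests explicit thread-level parallelism. Whether a given compiler version actually vectorizes or parallelizes either loop depends on the switches and on its own analysis, so treat this as an illustration rather than a guarantee.

```c
#include <stddef.h>

/* Candidate for automatic vectorization: unit-stride accesses and no
   dependence between iterations. The restrict qualifiers (C99) promise
   that x and y do not overlap, which removes one common obstacle. */
void scale_add_vec(float *restrict y, const float *restrict x,
                   float a, size_t n)
{
    for (size_t i = 0; i < n; i++)
        y[i] = a * x[i] + y[i];
}

/* Explicit parallelism: the OpenMP directive asks the compiler to divide
   the iterations among threads (for example, one per core). If OpenMP is
   not enabled, the pragma is ignored and the loop runs serially with the
   same result. */
void scale_add_omp(float *restrict y, const float *restrict x,
                   float a, long n)
{
    long i;
    #pragma omp parallel for
    for (i = 0; i < n; i++)
        y[i] = a * x[i] + y[i];
}
```

The two techniques are independent: one works inside a single core, the other across threads, so they can in principle be applied to the same loop.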
First, I would just see how well automatic vectorization works for your applications (for example, vectorizing an EXP function has great performance potential when it occurs in a hot spot!). If you are interested in more background on vectorization for multimedia extensions (including programming guidelines on how to make the compiler more effective in extracting vector instructions from your code), see the Software Vectorization Handbook at http://www.intel.com/intelpress/sum_vmmx.htm (it uses the C programming language for most examples, but similar concepts apply to Fortran).
Aart
PS. I hope we are not overwhelming you with too much information.
