I have a numerically intensive application which would appear to benefit from parallelisation through OpenMP. At a couple of key loops I have applied the appropriate PARALLEL DO directives, and after a bit of troubleshooting everything seems to be working nicely and the resulting application is noticeably faster. Wonderful stuff.
However, I am seeing an ever so slight difference (in the fifth significant figure, after a thousand-odd iterations) in the calculated results between a version compiled with OpenMP active and a straight optimised release version. This difference is present even when the OpenMP version is restricted to one thread (e.g. via the OMP_NUM_THREADS environment variable). It is quite conceivable that the slight difference is simply due to differences in the evaluation order of expressions, rounding, etc., but I need to check in case there's something else astray.
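To make the "evaluation order" suspicion concrete, here is a minimal sketch in Python (not the Fortran in question, and not a model of what the compiler actually does). It adds the same double-precision values once strictly left to right and once as per-chunk partial sums, the way a reduction split across workers might group them. The function names and the data are hypothetical, chosen only so that the rounding is visible.

```python
def sequential_sum(values):
    """Add the values strictly left to right, as a serial loop would."""
    total = 0.0
    for v in values:
        total += v
    return total


def chunked_sum(values, n_chunks):
    """Sum contiguous chunks separately, then add the partial sums.

    This mimics how a reduction split across n_chunks workers might
    group the additions; it is an illustration, not compiler behaviour.
    """
    if not values:
        return 0.0
    size = -(-len(values) // n_chunks)  # ceiling division
    partials = [sequential_sum(values[i:i + size])
                for i in range(0, len(values), size)]
    return sequential_sum(partials)


# Hypothetical data chosen to make the rounding visible in double precision.
data = [1e16, 1.0, -1e16, 1.0]

print(sequential_sum(data))   # one grouping of the additions
print(chunked_sum(data, 2))   # a different grouping of the same additions
print(chunked_sum(data, 1))   # a single chunk reproduces the serial order
```

With a single chunk the grouping is identical to the serial loop, so this particular effect would not by itself account for a difference that persists with one thread. In that case any difference would have to come from how the code is compiled rather than from how the work is divided.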
The only difference in compiler options is the /Qopenmp switch, i.e.:
/QaxP /Qopenmp /real_size:64 /fpe:0 /libs:static /threads
versus
/QaxP /real_size:64 /fpe:0 /libs:static /threads
Can the introduction of OpenMP change the order of evaluation of expressions *within* a parallelised DO loop, or do you just get exactly the same sequence of instructions as in the non-OpenMP case, only running side by side? All this using 10.1.024.
Thanks for any input,
IanH
While it is not likely to be the cause of the differences you observe, /QaxP generates two code paths with different numerical properties, and the path is chosen at run time according to the CPU. That seems at odds with your desire to get nearly identical results.
Compiling with /QxP /Qopenmp /real_size:64 /assume:protect_parens,minus0 /Qprec-div /Qprec-sqrt affords fewer opportunities for unexpected numerical differences. I'm somewhat concerned about the implication of /fpe:0 when combined with OpenMP.
The option /Qauto (implied by /Qopenmp) would place local arrays on the stack.
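As a rough illustration of why options such as /assume:protect_parens and /Qprec-div can matter, the following Python sketch (ordinary IEEE double arithmetic, not compiler output) shows two rewrites that are algebraically neutral but not bit-for-bit neutral: regrouping a sum, and replacing a division with multiplication by a reciprocal. Whether the compiler applies either rewrite to a given loop is an assumption here, not something the thread establishes.

```python
# Grouping: the same three operands, two parenthesisations.
a, b, c = 0.1, 0.2, 0.3
left_grouped = (a + b) + c
right_grouped = a + (b + c)

# Division: a true divide versus multiplying by a precomputed reciprocal.
x = 49.0
true_divide = x / x
via_reciprocal = x * (1.0 / x)

print(left_grouped == right_grouped)   # do the two groupings agree exactly?
print(true_divide == via_reciprocal)   # do the two division forms agree exactly?
```

Each discrepancy is on the order of one unit in the last place, which is the sort of seed that can grow into a visible difference over many iterations.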
tim18 wrote: "I'm somewhat concerned about the implication of /fpe:0 when combined with OpenMP."
Could you elaborate? I selected that option simply to make things explode when a dodgy floating-point operation occurs.
