Dear Colleagues,
I have a serial Fortran code that works fine. Once I compile the same code with ifort -parallel and run it, it gives wrong results and overflows. I would expect that with the "-parallel" flag the Intel compiler selects only the loops that are safe to parallelize, so I should get exactly the same results as with the serial code, which did not happen. Even stranger, I then disabled parallelization of every DO loop in my code using !DEC$ NOPARALLEL, compiled with ifort -parallel to make sure that none of the loops was parallelized, and ran it again. Surprisingly, I got the same wrong results and overflow, although this build should be exactly equivalent to the serial code.
Can anyone explain this behaviour, or is it just an Intel compiler deficiency?
Greetings.
It looks to be a deficiency in the Intel compiler. A basic rule is that if the serial code runs fine, then using -parallel should yield the same results in less time.
Could an Intel developer pick up this point and join the discussion?
Given what you described (using !DEC$ NOPARALLEL), what Tim suggests is plausible and is seen in many first attempts at using auto-parallelization. It is also possible that -parallel (which requires a higher optimization level, -O2 or -O3) produced incorrect code, causing the incorrect results.
So as Tim suggests:
Can you compile with -auto (w/o -parallel) to see if there are any side-effects?
If you compile with -check all and execute, are there any run-time diagnostics reported?
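To illustrate the kind of side effect the -auto test can expose, here is a minimal sketch. It assumes the problem is a local variable that the code relies on to keep its value between calls; the program and the names `accumulate` and `running_total` are hypothetical, not taken from your code. With -auto, local variables without the SAVE attribute are placed on the run-time stack, so code that silently depended on them being static can start returning garbage.

```fortran
program saved_local
  implicit none
  integer :: i

  do i = 1, 3
     call accumulate(real(i))
  end do

contains

  subroutine accumulate(v)
    real, intent(in) :: v
    ! Fragile variant (do not use):
    !     real :: running_total
    ! with no SAVE and no initialization. It may appear to work while the
    ! variable happens to live in static storage, and break once it is
    ! moved to the stack.
    !
    ! Robust variant: state explicitly that the value must persist between
    ! calls and give it a defined starting value.
    real, save :: running_total = 0.0

    running_total = running_total + v
    print *, 'running total:', running_total
  end subroutine accumulate

end program saved_local
```

If the results change when you add -auto to the serial build, looking for locals like the fragile variant above (and for variables used before they are set) is a reasonable next step.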
You did not indicate your compiler version. If you are not using the most recent release, 11.1 Update 2 (11.1.067 for Mac OS, 11.1.056 for Linux), it may be worth obtaining and trying it. If that does not help, we would appreciate a reproducing test case if possible.
I would recommend reading this article on "Why doesn't my application always give the same answer?":
http://software.intel.com/en-us/articles/consistency-of-floating-point-results-using-the-intel-compiler/
This discusses optimization but many of the same arguments hold true for parallel optimizations. As a short example, the order of calculations performed in a reduction is quite different between a serial and parallel case.
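A small sketch of the reduction point, under stated assumptions: the array contents, the size `n`, and the four chunks (standing in for four threads) are made up for illustration, and the chunked loop only imitates the order in which a threaded reduction would add the terms. It is not what the compiler actually generates.

```fortran
program reduction_order
  implicit none
  integer, parameter :: n = 1000000   ! hypothetical problem size
  integer, parameter :: nchunks = 4   ! stands in for 4 threads
  real :: x(n), partial(nchunks)
  real :: total_serial, total_chunked
  integer :: i, k, lo, hi

  ! Fill the array with values of widely varying magnitude.
  do i = 1, n
     x(i) = 1.0 / real(i)
  end do

  ! Serial order: one running sum from the first element to the last.
  total_serial = 0.0
  do i = 1, n
     total_serial = total_serial + x(i)
  end do

  ! Chunked order: each chunk is summed on its own, then the partial
  ! sums are combined, as a threaded reduction would do.
  partial = 0.0
  do k = 1, nchunks
     lo = (k - 1) * (n / nchunks) + 1
     hi = k * (n / nchunks)
     do i = lo, hi
        partial(k) = partial(k) + x(i)
     end do
  end do
  total_chunked = 0.0
  do k = 1, nchunks
     total_chunked = total_chunked + partial(k)
  end do

  print *, 'serial sum  :', total_serial
  print *, 'chunked sum :', total_chunked
  print *, 'difference  :', total_serial - total_chunked
end program reduction_order
```

Both sums are mathematically identical, but in single precision the rounding differs with the order of the additions, so the two printed values will typically not agree to the last digit. An optimizing build may also reorder the serial loop (for example when vectorizing), which is the effect the article describes.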
ron