I'm noticing a rather worrisome bug in the Intel OpenMP implementation for Fortran, and wanted to know whether it is already known. (I hope this is the right forum to post it in.)
The bug is quite simply that there is a difference in the computational output between whether I use:
Or whether I use:
and then manually specify all the variables.
I noticed it because I originally had default(Shared) in some code, but it kept giving me results that were slightly off from what I got when running without parallelization. So I started the tedious task of specifying all the shared variables and going through them one by one, to figure out which one was wrong and should have been private instead. In the end I found none that needed to be changed, and thus just ended up specifying all the variables as shared, with the result that the small computational difference disappeared.
I then went on to a second parallel region in my code that suffered the same problem and got the same result.
Is this a well-known issue?
It seems I spoke too early. I did indeed see the behaviour described above, but further testing revealed that the code is sometimes producing the right answer and sometimes wrong answers. What I observed was just an unlucky correlation rather than causation.
Sorry for the false alarm.
>> further testing revealed that the code is sometimes producing the right answer and sometimes wrong answers
Have you considered that both answers may be correct (or that both may be wrong)?
Parallel code producing the same answers as serial code is not confirmation that the answers are correct. It simply means the results are "consistent with" the serial code.
Also, bear in mind that operations producing rounding errors often give slightly different results depending on the order in which the errors are accumulated. For example, an array whose values contain round-off errors (different values, different round-off errors) will typically produce different sums, even in serial code, when you sum it from low index to high index as compared to high index to low index. IOW, order can cause deviations in the accumulation of errors. As you add threads, the order of accumulation changes, so the results may change (and may additionally change depending on thread completion order).
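To make the ordering point concrete, here is a minimal sketch in plain Python (not OpenMP, and not the original poster's code). The helper names `naive_sum` and `chunked_sum` and the test array are made up for illustration. `chunked_sum` only imitates what a two-thread sum would do: each "thread" sums its own contiguous chunk and the partial sums are then added together.

```python
import math


def naive_sum(values):
    """Left-to-right accumulation in double precision, like a plain serial loop."""
    total = 0.0
    for v in values:
        total += v
    return total


def chunked_sum(values, n_chunks):
    """Sum contiguous chunks separately, then add the partial sums in chunk order.

    This imitates a static split of the loop across n_chunks threads.
    """
    size = (len(values) + n_chunks - 1) // n_chunks
    partials = [naive_sum(values[i:i + size]) for i in range(0, len(values), size)]
    return naive_sum(partials)


# Hypothetical data: the exact mathematical sum is 4.0.
values = [1.0, 1e16, 1.0, 1.0, 1.0, -1e16]

forward = naive_sum(values)        # low index to high index
reverse = naive_sum(values[::-1])  # high index to low index
chunked = chunked_sum(values, 2)   # two "threads", three elements each
exact = math.fsum(values)          # correctly rounded reference sum

print(forward, reverse, chunked, exact)  # prints: 0.0 1.0 2.0 4.0
```

All three accumulation orders give a different answer, and none of them matches the exact sum. The array is deliberately extreme so the effect is easy to see; with typical data the differences show up only in the last few digits, which is the kind of "slightly off" result described in the original post.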