Analyzers
Support for Analyzers (Intel VTune™ Profiler, Intel Advisor, Intel Inspector)

Accuracy of parallel vs. sequential computations

Brynne_N_
Beginner

I have a finite element code in which I am trying to parallelize a subroutine. The difference in the results between parallel and sequential computation is around 1E-7. I read in the StackOverflow post linked below that floating-point addition is not associative (the result depends on the order in which terms are summed), so one should not expect identical results when performing calculations in multithreaded codes. How large can this type of error become? After several thousand time steps, would an error of 1E-7 be understandable?

Another issue we have considered is the precision of different threads. Are all the threads in a given computer guaranteed to have the same precision? Or could differing precision be contributing to the difference in the results? 

Thank you for any information you can provide.

StackOverflow: https://stackoverflow.com/questions/13937328/division-of-floating-point-numbers-on-gpu-different-fro...

1 Solution
TimP
Black Belt
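Single precision carries only about 7 significant decimal digits, so it would be unusual to see better overall agreement than 1E-7 in a finite element code compiled in single precision. The usual reason for roundoff that differs between threaded and sequential runs is a parallel reduction, which changes the order in which the terms are summed. Some applications provide a build option that avoids the reduction so that its effect can be isolated.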
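To illustrate the reduction effect, here is a minimal sketch in Python, used only as a stand-in because the thread does not show the original code. It emulates single-precision arithmetic with the standard `struct` module and compares a left-to-right sum with a chunked sum, where each chunk plays the role of one thread's partial sum. The helper names (`to_f32`, `sum_f32`, `chunked_sum_f32`), the random data, and the choice of 4 chunks are all assumptions made for illustration, not anything taken from the poster's finite element code.

```python
import math
import random
import struct


def to_f32(x):
    """Round a Python float (double) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def sum_f32(values):
    """Sum left to right, rounding to single precision after every addition."""
    total = 0.0
    for v in values:
        total = to_f32(total + v)
    return total


def chunked_sum_f32(values, n_chunks):
    """Emulate a parallel reduction: sum each chunk, then combine the partial sums."""
    size = math.ceil(len(values) / n_chunks)
    partials = [sum_f32(values[i:i + size]) for i in range(0, len(values), size)]
    return sum_f32(partials)


# Hypothetical data: 100000 single-precision values in [0, 1).
random.seed(0)
values = [to_f32(random.uniform(0.0, 1.0)) for _ in range(100_000)]

exact = math.fsum(values)                # accurately rounded double-precision reference
sequential = sum_f32(values)             # "serial" summation order
chunked = chunked_sum_f32(values, 4)     # "4-thread" summation order

print("sequential relative error:", abs(sequential - exact) / exact)
print("chunked relative error:   ", abs(chunked - exact) / exact)
print("sequential vs chunked:    ", abs(sequential - chunked) / exact)
```

Both orderings are equally valid single-precision results; they simply accumulate different rounding errors. Comparing each against a higher-precision reference, rather than against each other, is one way to judge whether a discrepancy of a given size is plausible roundoff.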


2 Replies
Brynne_N_
Beginner

Thank you for your reply, this makes sense as we are using parallel reduction.
