
Intel Community › Software › Intel oneAPI Toolkits › Intel® oneAPI HPC Toolkit


toastnmaker

Beginner


01-28-2008
09:16 AM

183 Views

Different math results in openMP depended on number of processes

Hi,

I've encountered a very strange issue in OpenMP.

I wrote (with the help of tutorials) a simple program which computes Pi (and sums up the numbers from 0 to N):

```c
#include <stdio.h>
#include <stdlib.h>
#include <math.h>

#define PI25DT 3.141592653589793238462643

static long nbins = 1<<30;

int main(int argc, char *argv[])
{
    long int i;
    double x, pi, sum = 0.0, sum2 = 0.0, bin;

    if (argc > 1) nbins = atol(argv[1]);
    bin = 1.0/(double) nbins;

#pragma omp parallel for reduction(+:sum,sum2) private(x)
    for (i = 1; i <= nbins; i++)
    {
        x = (i-0.5)*bin;
        sum += 4.0/(1.0+x*x);
        sum2 += i;
    }

    pi = bin * sum;
    printf("Int test sum %f\n", sum2);
    printf("%10ld steps %.15f diff: %.4g\n", nbins, pi, fabs(pi-PI25DT));
    return 0;
}
```

It gives surprisingly different results depending on

1) the schedule used in the omp parallel for pragma (static|dynamic|guided)

2) the number of threads used (set by set OMP_NUM_THREADS=1|2|3|4)

OMP_NUM_THREADS=2

C:\toastpi>pi-omp.exe 1000000

Static (default) explicit schedule:

Int test sum 500000500000.000000

1000000 steps 3.141592653589916 diff: **1.23e-013**

Dynamic explicit schedule:

Int test sum 500000500000.000000

1000000 steps 3.141592653589932 diff: **1.39e-013**

Guided explicit schedule:

Int test sum 278328760174.000000

1000000 steps 3.141592653589938 diff: **1.448e-013**

OMP_NUM_THREADS=3

C:\toastpi>pi-omp.exe 1000000

Static (default) explicit schedule:

Int test sum 500000500000.000000

1000000 steps 3.141592653589883 diff: **8.971e-014**

Dynamic explicit schedule:

Int test sum 500000500000.000000

1000000 steps 3.141592653589906 diff: **1.132e-013**

Guided explicit schedule:

Int test sum 173337045946.000000

1000000 steps 3.141592653589877 diff: **8.349e-014**

OMP_NUM_THREADS=4

C:\toastpi>pi-omp.exe 1000000

Static (default) explicit schedule:

Int test sum 500000500000.000000

1000000 steps 3.141592653589878 diff: **8.482e-014**

Dynamic explicit schedule:

Int test sum 500000500000.000000

1000000 steps 3.141592653589903 diff: **1.097e-013**

Guided explicit schedule:

Int test sum 169875681895.000000

1000000 steps 3.141592653589882 diff: **8.882e-014**

Diff gives the difference between the computed value and the Pi constant defined in the source. For a higher number of steps the differences get even worse. The computer physically has 4 cores. Will someone explain this behavior to me, please? I had always thought that the parallel for computes over all the iterations and that the floating-point rounding error should be the same, or at least very similar. I am quite shocked that even a sum of integer numbers gives different results...

Why?

Thank you,

quite confused Martin


1 Reply

Georg_Bisseling

Beginner


08-21-2008
06:34 AM

182 Views

The effect you observe has nothing to do with threading as such.

If you sum up floating-point numbers, the result can depend heavily on the exact sequence of operations. When summing up an array of numbers of widely differing magnitude, it is common practice to sort the array by ascending magnitude and then sum it, to minimize roundoff errors.

The different scheduling for the reduction in the program of course changes the sequence of operations.

I recommend http://docs.sun.com/source/806-3568/ncg_goldberg.html for a much better explanation than I could give.


For more complete information about compiler optimizations, see our Optimization Notice.