This one surprised us during a workshop underway at PKU in Beijing. The pi program is a standard teaching intro to OpenMP; you get to use private variables and a reduction (code included below; the OpenMP pragma is the "#pragma omp parallel for" line).
Here's the puzzle: when "private(x)" is omitted, no race condition is observed (!). We tried several settings (varying both num_steps and the number of threads), but the binary stubbornly kept returning the same correct answer every time.
Recompile the bad code (no private variable) with gcc, and the race condition shows up as expected.
Context is compiler l_cproc_p_11.1.072, RHEL 5.4, 32-core NHM.
Is it possible the compiler is doing some auto-magic here, masking the race? Forty-eight students (and their instructor) remain intrigued.
Thanks!
Michael
#include <stdio.h>
#include <omp.h> //required for the omp_get_wtime() api
long long num_steps = 100000000;
double step;
int main(int argc, char* argv[])
{
    double start, stop;
    double x, pi, sum=0.0;
    int i;
    step = 1./(double)num_steps;
    start = omp_get_wtime(); //uses openmp's timer api
    #pragma omp parallel for private(x) reduction(+:sum)
    for (i=0; i<num_steps; i++)
    {
        x = (i + .5)*step;
        sum = sum + 4.0/(1.+ x*x);
    }
    pi = sum*step;
    stop = omp_get_wtime(); //uses openmp's timer api
    printf("The value of PI is %15.12f\n",pi);
    printf("The time to calculate PI was %15.3f seconds\n",stop - start);
    return 0;
}
3 Replies
It's likely that normal optimization will result in each thread getting one or more private register copies of x. Note that icc optimizes by default (unless -g is set), whereas gcc optimizes only when given a -O option.
It should be well known that threading errors may be exposed or masked with optimization.
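One way to make the privatization explicit, instead of relying on the optimizer to keep x in a register, is to declare x inside the loop body, since variables declared inside a parallel region are private to each thread. The following is a minimal sketch of that idea; the function name pi_block_scoped is made up for illustration and is not part of the original program.

```c
/* Midpoint-rule estimate of pi, as in the workshop program.
 * pi_block_scoped is a hypothetical name used only for this sketch.
 * x is declared inside the loop body, so every thread gets its own
 * copy without needing a private(x) clause. */
double pi_block_scoped(long num_steps)
{
    double step = 1.0 / (double)num_steps;
    double sum = 0.0;
    long i;

    #pragma omp parallel for reduction(+:sum)
    for (i = 0; i < num_steps; i++) {
        double x = (i + 0.5) * step;   /* block scope: private per thread */
        sum += 4.0 / (1.0 + x * x);
    }
    return sum * step;
}
```

Calling pi_block_scoped(100000000) should give the same estimate as the private(x) version, whether or not the compiler is optimizing. If the file is built without OpenMP enabled, the pragma is ignored and the loop simply runs serially.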
If you are interested in observing the race condition, try:
volatile double x, pi, sum=0.0;
NOTE
I suggest you show your students the assembly listing of the serial build with and without volatile. This shows that without volatile the code keeps x in a register, and with volatile it keeps x in memory. Then show the parallel build with volatile and private(x), which uses a separate memory location for x in each thread.
The reason for this exercise is you do not want your students to falsely learn "volatile breaks code".
Your real purpose was to illustrate proper usage of private (and reduction).
A method without using disassembly can be achieved by inserting into the top of the loop
if(x == 0.0)
cout << &x << endl;
This would illustrate
same location for x when NOT using private(x)
different location for x when using private(x)
Note, non-volatile compilation with cout code might move x out of register and back to memory.
Jim Dempsey
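The address check suggested above can also be written in plain C, to match the pi program. This is a rough sketch under a few assumptions: the names count_x_addresses, my_thread, team_size, and the TEAM size of 4 are all invented for the illustration, and the OpenMP calls are guarded so the file still builds (with a single thread) when OpenMP is not enabled.

```c
#include <stdio.h>
#ifdef _OPENMP
#include <omp.h>
#endif

#define TEAM 4  /* hypothetical team size requested for the demo */

/* Thread id and team size, with serial fallbacks when OpenMP is off. */
static int my_thread(void)
{
#ifdef _OPENMP
    return omp_get_thread_num();
#else
    return 0;
#endif
}

static int team_size(void)
{
#ifdef _OPENMP
    return omp_get_num_threads();
#else
    return 1;
#endif
}

/* Records &x in every thread and returns how many distinct addresses
 * were seen; *threads receives the number of threads that actually ran. */
int count_x_addresses(int use_private, int *threads)
{
    double x = 0.0;
    void *addr[TEAM] = {0};
    int n = 1, distinct = 0, i, j;

    if (use_private) {
        #pragma omp parallel private(x) num_threads(TEAM)
        {
            if (my_thread() == 0) n = team_size();
            addr[my_thread()] = (void *)&x;   /* each thread's own x */
        }
    } else {
        #pragma omp parallel shared(x) num_threads(TEAM)
        {
            if (my_thread() == 0) n = team_size();
            addr[my_thread()] = (void *)&x;   /* the one shared x */
        }
    }

    for (i = 0; i < n; i++) {
        int seen = 0;
        printf("thread %d: x at %p\n", i, addr[i]);
        for (j = 0; j < i; j++)
            if (addr[j] == addr[i]) seen = 1;
        if (!seen) distinct++;
    }
    *threads = n;
    return distinct;
}
```

Calling count_x_addresses(0, &n) should report a single address for x, while count_x_addresses(1, &n) should report one address per thread. Taking the address forces x into memory, so, as noted above, this check itself can change what the optimizer does with x.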
Jim -- excellent, thank you! We revisited the pi program in class today, covering the matter as you suggested; the cout trick was good for a quick "aha" (printf debugging lives on...).
cheers,
Michael
