why?
code1
#include "stdafx.h"
#include <stdio.h>
#include "omp.h"
#define N 100000
int _tmain(int argc, _TCHAR* argv[])
{
    static int arx[N],ary[N]; /* static keeps 800 KB of data off the stack */
    int i,max_num_x=-1,max_num_y=-1;
    for(i=0;i<N;i++) {
        arx[i]=i;   /* ascending: the largest value is at the end */
        ary[i]=N-i; /* descending: the largest value is at i=0 */
    }
    omp_set_num_threads(10);
    #pragma omp parallel for
    for(i=0;i<N;i++) {
        //#pragma omp critical(max_arx)
        if(arx[i]>max_num_x)
            max_num_x=arx[i];
        //#pragma omp critical(max_ary)
        if(ary[i]>max_num_y)
            max_num_y=ary[i];
    }
    printf("max_num_x=%d max_num_y=%d\n",max_num_x,max_num_y);
    return 0;
}
and
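code2
#include "stdafx.h"
#include <stdio.h>
#include "omp.h"
#define N 100000
int _tmain(int argc, _TCHAR* argv[])
{
    static int arx[N],ary[N]; /* static keeps 800 KB of data off the stack */
    int i,max_num_x=-1,max_num_y=-1;
    for(i=0;i<N;i++) {
        arx[i]=i;   /* ascending: the largest value is at the end */
        ary[i]=N-i; /* descending: the largest value is at i=0 */
    }
    omp_set_num_threads(10);
    #pragma omp parallel for
    for(i=0;i<N;i++) {
        /* each named critical section guards the if statement that follows it */
        #pragma omp critical(max_arx)
        if(arx[i]>max_num_x)
            max_num_x=arx[i];
        #pragma omp critical(max_ary)
        if(ary[i]>max_num_y)
            max_num_y=ary[i];
    }
    printf("max_num_x=%d max_num_y=%d\n",max_num_x,max_num_y);
    return 0;
}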
Please tell me why the two programs produce identical results. code1 has no #pragma omp critical, yet it does not seem to show a data race either, and I don't understand why.
Your compiler may choose atomic operations even though you did not specify them, as ICL would do when you allow vectorization, or it may optimize the loops away entirely, as gcc would do. I am assuming that the use of a Microsoft C-like language has no special implication here, other than that you exclude the use of a standard compiler.
As an aside: unless your system has at least 10 cores (hardware threads), you shouldn't request more threads than are available.
The parallel loop divides the iteration range into one chunk per thread, 10 in this case. With the usual static schedule, the 1st thread gets iterations 0 to N/10-1, the 2nd gets N/10 to 2*(N/10)-1, and so on.
ary is filled in descending order, so its maximum is ary[0], the very first element the 1st thread looks at. The moment the 1st thread stores that value in max_num_y, no thread (including the 1st) will ever find a larger value in ary.
arx is filled in ascending order, so every element in the last thread's chunk is larger than anything in the other chunks. The moment the last thread stores the 1st element of its chunk in max_num_x, no other thread will find a new max for arx. From then on, only the last thread finds a new max for arx, once per iteration.
Therefore you will observe an incorrect result only if one of your threads gets evicted (preempted) after its comparison finds a new max but before it stores that value, and the eviction lasts longer than the run time of the 1st or last thread, as the case may be. The race in code1 is still there; with this data it is just very unlikely to show.
Jim Dempsey
Dear Mr. Jim Dempsey:
Thank you very much for your reply.
