jan_barnholt
Beginner

Simultaneous access to arrays

Thread Checker reports several conflicts when our application
accesses arrays simultaneously from several threads.

I couldn't find a note in the documentation on whether the VTune TC
distinguishes between the individual array elements accessed, or whether
any concurrent access to the array (regardless of index) is reported
as a conflict.

In other words, if thread 1 always accesses the array at
index 1 and thread 2 concurrently accesses index 2, will this
be reported as a conflict?

regards,
J. Barnholt
TimP
Black Belt

I wouldn't be surprised if a potential conflict were always reported for simultaneous access to the same array. I can't imagine a useful case simple enough for easy diagnosis. Even if there is no actual conflict, but false sharing is likely, the warning could be helpful.
I also would be interested in a more authoritative answer.
TimP
Black Belt

I suppose this question might be appropriate for the threading forum.
Henry_G_Intel
Employee

Hello,
The Intel Thread Checker will only report a storage conflict if threads can simultaneously access the same array elements. In the following simple example, all threads access the same array but never the same elements, so Thread Checker will not report storage conflicts or errors:
#pragma omp parallel for
for (i = 0; i < N; i++)
    a[i] = 0;
However, Thread Checker will report errors for the following code because the threads can access the same elements of the array:
#pragma omp parallel for
for (i = 0; i < N; i++)
    for (j = 0; j < N; j++)
        a[j] = 0;
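As a rough illustration of that rule, here is a minimal sketch of a nested loop in which the threads write disjoint elements. The helper name zero_rows and the row-major n-by-n layout are assumptions made for the example, not part of the code above. Because each outer iteration writes only its own row, a checker that reports conflicts per element, as described above, should have nothing to flag here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: zero an n-by-n matrix stored row-major in a. */
void zero_rows(double *a, int n)
{
    int i, j;

    assert(a != NULL);

    /* Each outer iteration i writes only row i, so different threads
       never store to the same element. j is declared private so the
       threads do not race on the inner loop counter either. */
    #pragma omp parallel for private(j)
    for (i = 0; i < n; i++)
        for (j = 0; j < n; j++)
            a[(size_t)i * n + j] = 0.0;
}
```

If the compiler is run without OpenMP support, the pragma is ignored and the loop simply runs serially with the same result.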
Best regards,
Henry
Henry_G_Intel
Employee

Hi Tim,
Good point about false sharing, but it's a performance issue rather than a correctness issue. The following code will likely exhibit false sharing because the shared array is small enough to fit in a single cache line:
#define THREADS 4
double sum = 0.0, sum_local[THREADS];
#pragma omp parallel
{
    int id = omp_get_thread_num ();
    sum_local[id] = 0.0;          /* each thread owns one slot */
    #pragma omp for
    for (i = 0; i < N; i++)
        sum_local[id] += x[i] * y[i];
    #pragma omp atomic
    sum += sum_local[id];         /* combine the partial sums */
}
Performance may suffer because the threads repeatedly invalidate each other's cache lines, but the parallel code is correct, so Thread Checker will not report any errors. VTune is the tool to use to find false sharing.
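One common way to sidestep this kind of false sharing is to let OpenMP's reduction clause manage the partial sums instead of a shared per-thread array. The sketch below assumes the loop is a plain dot product; the function name dot and its signature are made up for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: dot product of x and y, both of length n. */
double dot(const double *x, const double *y, int n)
{
    double sum = 0.0;
    int i;

    assert(x != NULL && y != NULL);

    /* reduction(+:sum) gives every thread its own private copy of sum
       and combines the copies once at the end of the loop, so there is
       no shared sum_local array for the threads to contend over. */
    #pragma omp parallel for reduction(+:sum)
    for (i = 0; i < n; i++)
        sum += x[i] * y[i];

    return sum;
}
```

Padding each sum_local slot out to a full cache line is another option when the per-thread array has to stay, at the cost of some wasted memory.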
Best regards,
Henry