Intel® Moderncode for Parallel Architectures
Support for developing parallel programming applications on Intel® Architecture.

Simultaneous access to arrays

jan_barnholt
Beginner
Thread Checker reports several conflicts when our application
accesses arrays simultaneously from several threads.

I couldn't find a note in the documentation on whether the VTune
Thread Checker distinguishes between the array elements accessed, or
whether any concurrent access to the array (regardless of index) is
reported as a conflict.

In other words, if thread 1 always accesses the array at
index 1 and thread 2 concurrently accesses index 2, will this
be reported as a conflict?

regards,
J. Barnholt
TimP
Honored Contributor III
I wouldn't be surprised if a potential conflict were always reported for simultaneous access to the same array. I can't imagine a useful case simple enough for easy diagnosis. Even if there is no actual conflict, but false sharing is likely, the warning could be helpful.
I also would be interested in a more authoritative answer.
TimP
Honored Contributor III
I suppose this question might be appropriate for the threading forum.
Henry_G_Intel
Employee
Hello,
The Intel Thread Checker will only report a storage conflict if threads can simultaneously access the same array elements. In the following simple example, all threads access the same array but never the same elements, so Thread Checker will not report storage conflicts or errors:
#pragma omp parallel for
for (i = 0; i < N; i++)
    a[i] = 0;   /* iteration i touches only element i */
However, Thread Checker will report errors for the following code because the threads can access the same elements of the array:
#pragma omp parallel for
for (i = 0; i < N; i++)
    for (j = 0; j < N; j++)
        a[j] = 0;   /* every thread writes every element */
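The difference between the two loops can be made visible with a small bookkeeping sketch. This is not how Thread Checker works internally; it only counts, per element, how many distinct threads wrote it. The function name `max_writers_per_element` and the `nested` flag are hypothetical, introduced here for illustration, and the code falls back to a single thread when compiled without OpenMP.

```c
#include <assert.h>
#include <stdlib.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Hypothetical helper: for each element, count how many distinct threads
   wrote it, and return the largest count. nested == 0 mimics the first
   loop (a[i]); nested != 0 mimics the second loop (a[j] for every j). */
int max_writers_per_element(int n, int nested)
{
    int nthreads = 1;
#ifdef _OPENMP
    nthreads = omp_get_max_threads();
#endif
    assert(n > 0 && nthreads > 0);

    /* touched[t * n + k] != 0 means thread t wrote element k. Each thread
       writes only its own row, so this bookkeeping is itself race-free. */
    unsigned char *touched = calloc((size_t)nthreads * (size_t)n, 1);
    if (touched == NULL)
        return -1;

    #pragma omp parallel
    {
        int id = 0;
#ifdef _OPENMP
        id = omp_get_thread_num();
#endif
        unsigned char *mine = touched + (size_t)id * (size_t)n;

        #pragma omp for
        for (int i = 0; i < n; i++) {
            if (nested) {
                for (int j = 0; j < n; j++)
                    mine[j] = 1;    /* stands in for a[j] = 0 */
            } else {
                mine[i] = 1;        /* stands in for a[i] = 0 */
            }
        }
    }

    /* Sum the per-thread flags for each element and keep the maximum. */
    int worst = 0;
    for (int k = 0; k < n; k++) {
        int writers = 0;
        for (int t = 0; t < nthreads; t++)
            writers += touched[(size_t)t * (size_t)n + k];
        if (writers > worst)
            worst = writers;
    }
    free(touched);
    return worst;
}
```

Calling `max_writers_per_element(64, 0)` returns 1, because each element has exactly one writer. With `nested` set and several OpenMP threads running, the result is typically greater than 1, which is the situation that gets reported as a storage conflict.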
Best regards,
Henry
Henry_G_Intel
Employee
Hi Tim,
Good point about false sharing, but it's a performance issue rather than a correctness issue. The following code will likely exhibit false sharing because the shared array is small enough to fit in a single cache line:
#define THREADS 4
double sum = 0.0, sum_local[THREADS];
/* limit the team to THREADS so that id never indexes past sum_local */
#pragma omp parallel num_threads(THREADS)
{
    int id = omp_get_thread_num ();
    sum_local[id] = 0.0;
    #pragma omp for
    for (i = 0; i < N; i++)
        sum_local[id] += x[i] * y[i];   /* adjacent slots share a cache line */
    #pragma omp atomic
    sum += sum_local[id];
}
Performance may suffer because the threads repeatedly invalidate each other's cached copies of that line, but the parallel code is correct, so Thread Checker will not report any errors. VTune is the tool to use to find false sharing.
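Two common ways to avoid this kind of false sharing are sketched below, assuming the loop computes a dot product of `x` and `y`. The function names `dot_local` and `dot_reduction` are hypothetical. The first keeps the per-thread partial sum in a private local variable instead of a shared array slot; the second lets the OpenMP `reduction` clause manage the private copies. Neither is presented as the approach used in the original application.

```c
#include <assert.h>

/* Variant 1: each thread accumulates into a private local variable, so
   no two threads repeatedly write neighbouring slots of a shared array.
   Only the final combine touches shared memory. */
double dot_local(const double *x, const double *y, int n)
{
    assert(n >= 0);
    double sum = 0.0;

    #pragma omp parallel
    {
        double local = 0.0;     /* private to this thread */

        #pragma omp for
        for (int i = 0; i < n; i++)
            local += x[i] * y[i];

        #pragma omp atomic
        sum += local;           /* one shared update per thread */
    }
    return sum;
}

/* Variant 2: the reduction clause gives each thread its own copy of sum
   and combines the copies at the end of the loop. */
double dot_reduction(const double *x, const double *y, int n)
{
    assert(n >= 0);
    double sum = 0.0;

    #pragma omp parallel for reduction(+:sum)
    for (int i = 0; i < n; i++)
        sum += x[i] * y[i];

    return sum;
}
```

Both variants compute the same value as the `sum_local` version up to floating-point summation order. If a shared per-thread array is needed for other reasons, padding each slot to the cache line size is another option, though the line size is platform dependent.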
Best regards,
Henry