
OpenMP and Thread Checker question

Tim_Gallagher
New Contributor II
Hi,

I've just started using OpenMP in my code and the answers are correct (I guess that's a start!). But I ran the Thread Checker on it and I'm a bit confused by the results. I get many instances of things similar to:

Memory write of centroid at "structuredHex.f90":542 conflicts with a prior memory write of var$2919var$2921_dv_template.addr_a0 at "wenoAMR.F90":94 (output dependence)

The code at "wenoAMR.F90":94 is inside a parallel DO. The variable centroid is actually the RESULT of a function called from inside that parallel DO loop. Obviously var$2919var$2921_dv_template.addr_a0 is a compiler-created variable.

I'm not sure how to fix this, since I don't know what that compiler-created variable corresponds to.
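In case it helps to see the shape of it, here is a stripped-down illustration of the kind of pattern I mean (the names and the geometry are invented, not the actual wenoAMR.F90 code): an array-valued function whose RESULT gets assigned inside the parallel DO. My guess is that the temporary the compiler generates for that result is what shows up as var$2919var$2921_dv_template.addr_a0.

[fortran]
! Invented illustration only -- NOT the real wenoAMR.F90 code.
MODULE geom
   IMPLICIT NONE
CONTAINS
   FUNCTION cell_centroid(nodes) RESULT(centroid)
      REAL, INTENT(IN) :: nodes(:,:)   ! 3 x nnodes coordinates
      REAL :: centroid(3)
      centroid = SUM(nodes, DIM=2)/SIZE(nodes, DIM=2)
   END FUNCTION cell_centroid
END MODULE geom

PROGRAM pattern
   USE geom
   IMPLICIT NONE
   REAL :: nodes(3,8), centroids(3,100)
   INTEGER :: i
   nodes = 1.0
!$OMP PARALLEL DO SHARED(nodes, centroids)
   DO i = 1, 100
      ! the array-valued result comes back through a compiler-generated
      ! temporary before it is copied into centroids(:,i)
      centroids(:,i) = cell_centroid(nodes)
   END DO
!$OMP END PARALLEL DO
   PRINT *, centroids(:,1)
END PROGRAM pattern
[/fortran]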

A likely related question: how can I tell OpenMP that a shared variable is read-only during a certain section, so it doesn't need to be locked? For instance,

[fortran]
PROGRAM test
   IMPLICIT NONE

   REAL, DIMENSION(10,2) :: A
   INTEGER :: I, J

   A(:,1) = (/ (I, I=1,10) /)

!$OMP PARALLEL SHARED(A)
   DO J = 1, 10000000
!$OMP DO
   DO I = 1, SIZE(A)-1
      A(I,2) = A(I+1,1)-A(I-1,1)
   END DO
!$OMP END DO
   END DO
!$OMP END PARALLEL
   PRINT *, A(1:9,2)
END PROGRAM test
[/fortran]

In the first loop, I could understand the compiler not knowing that the accesses to A are independent, but they are. Thread Checker says there is a data race on the loop, but there isn't really one. As a side note, I ran some timing on this simple test and the OpenMP version takes 5.5 times longer than the serial version (compiled with -O0), so there's something strange there. Adding NOWAIT to the END DO cuts the run time in half, but it's still ~3 times slower with 2 threads. Maybe this is too simple a test case...
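To make the timing comparison concrete, here is a sketch of the kind of experiment I have in mind, using OMP_GET_WTIME (the array size and repeat count are placeholders, and it assumes the code is built with -openmp; the loop body is deliberately trivial, so thread startup and the implicit barrier at END DO dominate):

[fortran]
PROGRAM timing_sketch
   USE OMP_LIB
   IMPLICIT NONE

   INTEGER, PARAMETER :: N = 1000, NREP = 100000
   REAL, DIMENSION(N) :: A, B
   INTEGER :: I, J
   DOUBLE PRECISION :: T0, T1

   B = 1.0
   A = 0.0

   T0 = OMP_GET_WTIME()
!$OMP PARALLEL SHARED(A,B) PRIVATE(J)
   DO J = 1, NREP
!$OMP DO
      DO I = 1, N
         ! trivial work per iteration: the parallel overhead, not the
         ! arithmetic, is what gets measured here
         A(I) = 2.0*B(I)
      END DO
!$OMP END DO
   END DO
!$OMP END PARALLEL
   T1 = OMP_GET_WTIME()

   PRINT *, 'elapsed seconds:', T1 - T0
   PRINT *, A(1)
END PROGRAM timing_sketch
[/fortran]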

Thanks,

Tim
TimP
Honored Contributor III
If your usage of SIZE(A) means anything, it forces out-of-bounds access, so Thread Checker would be correct in pointing out the race condition. Did you mean SIZE(A,DIM=1)? An example of the distinction is shown in the ifort docs.
NOWAIT would aggravate the race condition; apparently it means processing of the next value of J can begin before the current J iteration is complete. As you apparently intend each value of J to overwrite the results from the previous value, whether or not you intended the out-of-bounds access, it's difficult to see what you wish to demonstrate.
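A minimal sketch of the distinction (not the docs example, just an illustration):

[fortran]
PROGRAM size_demo
   IMPLICIT NONE
   REAL, DIMENSION(10,2) :: A
   PRINT *, SIZE(A)          ! total element count: 20
   PRINT *, SIZE(A,DIM=1)    ! extent of the first dimension: 10
END PROGRAM size_demo
[/fortran]

With the 10x2 array, SIZE(A)-1 is 19, so the I+1 subscript runs well past the extent of the first dimension.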
Tim_Gallagher
New Contributor II
I suppose that's what happens when I write stuff really late at night... Yes, I meant SIZE(A,DIM=1).

What I'm trying to show is that the update of A(I,2) depends on the I+1 and I-1 entries of A(:,1), which would be a data race only if they were A(I+1,2) and A(I-1,2). But Thread Checker reports it as a data race anyway, even though each thread can safely update its own section of A(:,2) because A(:,1) is read-only.

You can ignore the J loop and the point remains; the J loop was just something I threw in so a timing run would take a measurable amount of time.

An example that Thread Checker reports no problems with is:

[fortran]
PROGRAM test
   IMPLICIT NONE

   REAL, DIMENSION(10) :: A, B
   INTEGER :: I

   B(:) = (/ (I, I=1,10) /)

   ! FIRSTPRIVATE gives every thread its own copy of B
!$OMP PARALLEL SHARED(A) FIRSTPRIVATE(B)
!$OMP DO
   DO I = 1, SIZE(A)-1
      A(I) = B(I+1)-B(I-1)
   END DO
!$OMP END DO NOWAIT
!$OMP END PARALLEL
   PRINT *, A(1:9)
END PROGRAM test
[/fortran]
This is functionally the same code (the answers will always be the same), but it is data-race free. The problem is making a copy of B in my actual application: copying the entire initial-conditions array for each thread would explode the memory usage to unacceptable levels.

Does that make more sense now that I'm a little less tired? Sorry for the bad example earlier...

Tim
Tim_Gallagher
New Contributor II
There's still an array out-of-bounds problem since I starts at 1; it should start at 2...

That's what happens when I compile without -C and the code never crashes...
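For completeness, here is a sketch of the version I think I'm after, with the lower bound fixed and B left SHARED (since it is only read, no per-thread copy should be needed):

[fortran]
PROGRAM test_fixed
   IMPLICIT NONE

   REAL, DIMENSION(10) :: A, B
   INTEGER :: I

   B(:) = (/ (I, I=1,10) /)
   A = 0.0

!$OMP PARALLEL SHARED(A,B)
!$OMP DO
   ! interior points only: B is read-only and each iteration writes
   ! a distinct A(I), so there is no data race
   DO I = 2, SIZE(A)-1
      A(I) = B(I+1)-B(I-1)
   END DO
!$OMP END DO
!$OMP END PARALLEL

   PRINT *, A(2:9)
END PROGRAM test_fixed
[/fortran]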

Tim