According to Intel's Thread Checker, the attached Fortran+OpenMP code produces a write->write data race. Can anyone say why this is happening? I don't understand why the race condition is only reported for the v variable in the subroutine and not in the other cases. Is this a problem with the code, the compiler or the thread checker?
Fabian
[cpp]
program iomp_itt
  implicit none

  type tvector
    real(kind=8), dimension(:), pointer :: u,v
  end type tvector

  type tgrid
    type(tvector) :: t
  end type tgrid

  type(tgrid) :: grid
  integer(kind=4) :: i

  allocate(grid%t%u(1:26))
  allocate(grid%t%v(1:26))

  !$omp parallel do shared(grid)
  do i=1,26
    ! no data race
    grid%t%u(i) = 0.0_8
    ! no data race
    grid%t%v(i) = 0.0_8
  end do
  !$omp end parallel do

  call test(grid)

  deallocate(grid%t%u)
  deallocate(grid%t%v)

contains

  subroutine test(g)
    type(tgrid) :: g
    integer(kind=4) :: i

    !$omp parallel do shared(g)
    do i=1,26
      ! no data race
      g%t%u(i) = 0.0_8
      ! the following line produces a write->write data-race
      g%t%v(i) = 0.0_8
    end do
    !$omp end parallel do
  end subroutine test

end program iomp_itt
[/cpp]
3 Replies
Aside from the fact that your loop and the work it contains are too small to benefit from parallelization, one would expect both loops to exhibit the same behavior. In the first loop grid is a local variable; in the second, g is a passed descriptor of grid (which generates a copy of the descriptor of grid). This may be a case of the bad luck of the draw.
As an experiment, change the iteration space (26) so that, when divided by the number of threads, the result is a multiple of 8 (8 real(kind=8) values fit in a cache line).
2 threads 32
3 threads 48
4 threads 32
5 threads 40
6 threads 48
7 threads 56
8 threads 64
This gives you the smallest number of iterations (not below the original 26) for which, when the loop is divided up amongst the threads, each thread works within its own cache lines.
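The iteration counts listed above can be reproduced with a small calculation. The following is only a sketch under stated assumptions: a 64-byte cache line (8 real(kind=8) values), a static schedule that splits the loop evenly across threads, and arrays that happen to start on a cache-line boundary, which allocate does not guarantee. The names per_line, n_min and stride are illustrative and do not come from the original program.

```fortran
program padded_iterations
  implicit none
  ! Assumption: 64-byte cache lines, i.e. 8 real(kind=8) values per line.
  integer, parameter :: per_line = 8
  ! Loop length used in the original test program.
  integer, parameter :: n_min = 26
  integer :: nthreads, stride, n

  do nthreads = 2, 8
    ! Every thread should own a whole number of cache lines, so the
    ! total count must be a multiple of per_line * nthreads.
    stride = per_line * nthreads
    ! Round n_min up to the next multiple of stride (integer division).
    n = ((n_min + stride - 1) / stride) * stride
    print '(i0, a, i0)', nthreads, ' threads ', n
  end do
end program padded_iterations
```

For 2 to 8 threads this yields 32, 48, 32, 40, 48, 56 and 64 iterations, matching the list above.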
Your test program might be of interest to Intel.
Jim Dempsey
The problem doesn't seem to depend on the length of the array. However, I figured out that Intel's Thread Checker only complains about the data race when the program is compiled with -tcheck.
Just compiling without -tcheck is also not an option because then the thread checker complains about data races when allocating a private array inside a parallel region. It's really a mess with OpenMP... :-(
I've seen a few anomalies with -tcheck which have been corrected in recent compilers. Going even further back, in case you are using a very old compiler, at one time -tcheck didn't set the other options it requires, such as debug symbols.
I've also received hints that major enhancements to support Parallel Studio have put some of the work on Fortran tcheck on hold.
