- 新着としてマーク
- ブックマーク
- 購読
- ミュート
- RSS フィードを購読する
- ハイライト
- 印刷
- 不適切なコンテンツを報告
I need to perform a reduction on an array that is inside a named COMMON block.
When the reduction is in the same unit as the one where the parallel reduction is declared, everything goes fine.
When the reduction has to occur in subroutines that are called from the parallel region, then the result is not predictable.
The problem is rooted in the fact that a reduction induces a per thread array, but when the program reaches a subroutine it sees the global array only. If the array is declared threadprivate in the subroutine, then the output is not gathered.
If I mix-up threadprivate/reduction, then I have to use 2 different names for the common, which I would like to avoid.
Is there a design pattern or an OpenMP instruction to make as though the subroutines were inlined ?
It can be funny to look at the provided code in redko.f90, and play with the subroutine being in the same file or not as the main program.
Of course, I would be very grateful to any help, since I dearly need to overcome that tricky problem.
Oliver.
- 新着としてマーク
- ブックマーク
- 購読
- ミュート
- RSS フィードを購読する
- ハイライト
- 印刷
- 不適切なコンテンツを報告
I took a closer look at making the common block 'reduction' array threadprivate:
subroutine setvalue(i,j)
IMPLICIT NONE
integer :: redarray(2)
common /comarray/ redarray
!$OMP threadprivate(/comarray/)
integer :: i,j
redarray(j) = redarray(j) + i
end subroutine setvalue
That works fine, you just need to add up the partial sums after the OpenMP loop calling the subroutine (similar to what Jim suggested). Something like:
integer :: redarray(2), redarray_copy(2)
common /comarray/ redarray
!$OMP threadprivate(/comarray/)
....
redarray = 0
!$OMP PARALLEL COPYIN(redarray)
!$OMP DO
do i=1, n
call setvalue(i,indir(i))
enddo
!$OMP END DO
!$OMP SINGLE
j = omp_get_num_threads()
redarray_copy = 0
!$OMP END SINGLE
!$OMP DO REDUCTION(+:redarray_copy)
do i=1,j
do k=1,2
redarray_copy(k) = redarray_copy(k) + redarray(k)
enddo
enddo
!$OMP END DO
!$OMP END PARALLEL
redarray = redarray_copy
Patrick
コピーされたリンク
- 新着としてマーク
- ブックマーク
- 購読
- ミュート
- RSS フィードを購読する
- ハイライト
- 印刷
- 不適切なコンテンツを報告
- 新着としてマーク
- ブックマーク
- 購読
- ミュート
- RSS フィードを購読する
- ハイライト
- 印刷
- 不適切なコンテンツを報告
If redarray was not a member of a common block, then you could pass it in the argument list of the subroutine, and the compiler would then know that it's a reduction variable, and the race conditional wouldelminated.
So, make a copy of redarray, pass the copy in the subroutine call argument list, and then update the common block array after the parallel region:
redarray_copy = 0
!$OMP PARALLEL
!$OMP DO REDUCTION(+:redarray_copy)
do i=1, n
call setvalue(i,indir(i),redarray_copy)
enddo
!$OMP END DO
!$OMP END PARALLEL
redarray = redarray_copy
The subroutine would then become
subroutine setvalue(i,j,subarray)
IMPLICIT NONE
integer, intent(inout) :: subarray(2)
integer :: i,j
subarray(j) = subarray(j) + i
end subroutine setvalue
This implementation is kinda ugly, but correct (race-free).
Patrick
- 新着としてマーク
- ブックマーク
- 購読
- ミュート
- RSS フィードを購読する
- ハイライト
- 印刷
- 不適切なコンテンツを報告
I wished there were some other means to tell setvalue that it handles a reduction array...
The closest thing to what I would like in terms of minimal code change looks like the threadprivate/reduction stuff as in reduc.f90/setval.f90 below, but I found no way to use only one common block. In the real code, setval may be in parallel or non parallel context, as being called from different code sections.
Oliver.
- 新着としてマーク
- ブックマーク
- 購読
- ミュート
- RSS フィードを購読する
- ハイライト
- 印刷
- 不適切なコンテンツを報告
Patrick
- 新着としてマーク
- ブックマーク
- 購読
- ミュート
- RSS フィードを購読する
- ハイライト
- 印刷
- 不適切なコンテンツを報告
After a short look at your code, you really do not have what constitutes a "reduction array".
The fly in your ointment is your indir(n) array containing 1,2,1,2,1,2,...
Resulting in a parallel loop, using x threads, xnot equal2, stomping on each others redarray(indir(i))
Consider something along the line of:
!$OMP PARALLEL
numThreads = omp_get_num_threads()
myThread = omp_get_thread_num()
if(myThread .lt. n-1) then
do i=1, n
if(mod(indir(i), numThreads) .eq. myThread) then
redarray(indir(i)) = redarray(indir(i))+ 1
endif
end do
endif
!$OMP END PARALLEL
Jim Dempsey
- 新着としてマーク
- ブックマーク
- 購読
- ミュート
- RSS フィードを購読する
- ハイライト
- 印刷
- 不適切なコンテンツを報告
I took a closer look at making the common block 'reduction' array threadprivate:
subroutine setvalue(i,j)
IMPLICIT NONE
integer :: redarray(2)
common /comarray/ redarray
!$OMP threadprivate(/comarray/)
integer :: i,j
redarray(j) = redarray(j) + i
end subroutine setvalue
That works fine, you just need to add up the partial sums after the OpenMP loop calling the subroutine (similar to what Jim suggested). Something like:
integer :: redarray(2), redarray_copy(2)
common /comarray/ redarray
!$OMP threadprivate(/comarray/)
....
redarray = 0
!$OMP PARALLEL COPYIN(redarray)
!$OMP DO
do i=1, n
call setvalue(i,indir(i))
enddo
!$OMP END DO
!$OMP SINGLE
j = omp_get_num_threads()
redarray_copy = 0
!$OMP END SINGLE
!$OMP DO REDUCTION(+:redarray_copy)
do i=1,j
do k=1,2
redarray_copy(k) = redarray_copy(k) + redarray(k)
enddo
enddo
!$OMP END DO
!$OMP END PARALLEL
redarray = redarray_copy
Patrick
- 新着としてマーク
- ブックマーク
- 購読
- ミュート
- RSS フィードを購読する
- ハイライト
- 印刷
- 不適切なコンテンツを報告
To initialize, do you reckon a COPYIN or a mere:
!$OMP DO
do i=1,j
redarray = 0
enddo
To finalize, do you reckon a REDUCTION or a
!$OMP DO
do i=1, j
!$OMP CRITICAL
redarray_copy = redarray_copy + redarray
!$OMP END CRITICAL
enddo
Thansk a lot
- 新着としてマーク
- ブックマーク
- 購読
- ミュート
- RSS フィードを購読する
- ハイライト
- 印刷
- 不適切なコンテンツを報告
Sorry, I should have pointed it out before. Cheers.
- 新着としてマーク
- ブックマーク
- 購読
- ミュート
- RSS フィードを購読する
- ハイライト
- 印刷
- 不適切なコンテンツを報告
I'd recommend both COPYIN and REDUCTION over loop-based initialization/finalization, sincethey might be a little bit more efficient, especially the REDUCTION. REDUCTIONs are optimised for the target machine and don't have the global overhead of unnamed CRITICAL sections. If you prefer a CRITICAL section, I suggest using a named CRITICAL section...but don't use the same name you've given to the COMMON block (comarray). That's because CRITICAL section namesare in the same namespace as COMMON block names, and conflicts can result in undefined behaviour.
Patrick Kennedy