- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
I need to perform a reduction on an array that is inside a named COMMON block.
When the reduction is in the same unit as the one where the parallel reduction is declared, everything goes fine.
When the reduction has to occur in subroutines that are called from the parallel region, then the result is not predictable.
The problem is rooted in the fact that a reduction induces a per thread array, but when the program reaches a subroutine it sees the global array only. If the array is declared threadprivate in the subroutine, then the output is not gathered.
If I mix-up threadprivate/reduction, then I have to use 2 different names for the common, which I would like to avoid.
Is there a design pattern or an OpenMP instruction to make as though the subroutines were inlined ?
It can be funny to look at the provided code in redko.f90, and play with the subroutine being in the same file or not as the main program.
Of course, I would be very grateful to any help, since I dearly need to overcome that tricky problem.
Oliver.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
I took a closer look at making the common block 'reduction' array threadprivate:
subroutine setvalue(i,j)
IMPLICIT NONE
integer :: redarray(2)
common /comarray/ redarray
!$OMP threadprivate(/comarray/)
integer :: i,j
redarray(j) = redarray(j) + i
end subroutine setvalue
That works fine, you just need to add up the partial sums after the OpenMP loop calling the subroutine (similar to what Jim suggested). Something like:
integer :: redarray(2), redarray_copy(2)
common /comarray/ redarray
!$OMP threadprivate(/comarray/)
....
redarray = 0
!$OMP PARALLEL COPYIN(redarray)
!$OMP DO
do i=1, n
call setvalue(i,indir(i))
enddo
!$OMP END DO
!$OMP SINGLE
j = omp_get_num_threads()
redarray_copy = 0
!$OMP END SINGLE
!$OMP DO REDUCTION(+:redarray_copy)
do i=1,j
do k=1,2
redarray_copy(k) = redarray_copy(k) + redarray(k)
enddo
enddo
!$OMP END DO
!$OMP END PARALLEL
redarray = redarray_copy
Patrick
Link Copied
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
If redarray was not a member of a common block, then you could pass it in the argument list of the subroutine, and the compiler would then know that it's a reduction variable, and the race conditional wouldelminated.
So, make a copy of redarray, pass the copy in the subroutine call argument list, and then update the common block array after the parallel region:
redarray_copy = 0
!$OMP PARALLEL
!$OMP DO REDUCTION(+:redarray_copy)
do i=1, n
call setvalue(i,indir(i),redarray_copy)
enddo
!$OMP END DO
!$OMP END PARALLEL
redarray = redarray_copy
The subroutine would then become
subroutine setvalue(i,j,subarray)
IMPLICIT NONE
integer, intent(inout) :: subarray(2)
integer :: i,j
subarray(j) = subarray(j) + i
end subroutine setvalue
This implementation is kinda ugly, but correct (race-free).
Patrick
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
I wished there were some other means to tell setvalue that it handles a reduction array...
The closest thing to what I would like in terms of minimal code change looks like the threadprivate/reduction stuff as in reduc.f90/setval.f90 below, but I found no way to use only one common block. In the real code, setval may be in parallel or non parallel context, as being called from different code sections.
Oliver.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Patrick
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
After a short look at your code, you really do not have what constitutes a "reduction array".
The fly in your ointment is your indir(n) array containing 1,2,1,2,1,2,...
Resulting in a parallel loop, using x threads, xnot equal2, stomping on each others redarray(indir(i))
Consider something along the line of:
!$OMP PARALLEL
numThreads = omp_get_num_threads()
myThread = omp_get_thread_num()
if(myThread .lt. n-1) then
do i=1, n
if(mod(indir(i), numThreads) .eq. myThread) then
redarray(indir(i)) = redarray(indir(i))+ 1
endif
end do
endif
!$OMP END PARALLEL
Jim Dempsey
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
I took a closer look at making the common block 'reduction' array threadprivate:
subroutine setvalue(i,j)
IMPLICIT NONE
integer :: redarray(2)
common /comarray/ redarray
!$OMP threadprivate(/comarray/)
integer :: i,j
redarray(j) = redarray(j) + i
end subroutine setvalue
That works fine, you just need to add up the partial sums after the OpenMP loop calling the subroutine (similar to what Jim suggested). Something like:
integer :: redarray(2), redarray_copy(2)
common /comarray/ redarray
!$OMP threadprivate(/comarray/)
....
redarray = 0
!$OMP PARALLEL COPYIN(redarray)
!$OMP DO
do i=1, n
call setvalue(i,indir(i))
enddo
!$OMP END DO
!$OMP SINGLE
j = omp_get_num_threads()
redarray_copy = 0
!$OMP END SINGLE
!$OMP DO REDUCTION(+:redarray_copy)
do i=1,j
do k=1,2
redarray_copy(k) = redarray_copy(k) + redarray(k)
enddo
enddo
!$OMP END DO
!$OMP END PARALLEL
redarray = redarray_copy
Patrick
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
To initialize, do you reckon a COPYIN or a mere:
!$OMP DO
do i=1,j
redarray = 0
enddo
To finalize, do you reckon a REDUCTION or a
!$OMP DO
do i=1, j
!$OMP CRITICAL
redarray_copy = redarray_copy + redarray
!$OMP END CRITICAL
enddo
Thansk a lot
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Sorry, I should have pointed it out before. Cheers.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
I'd recommend both COPYIN and REDUCTION over loop-based initialization/finalization, sincethey might be a little bit more efficient, especially the REDUCTION. REDUCTIONs are optimised for the target machine and don't have the global overhead of unnamed CRITICAL sections. If you prefer a CRITICAL section, I suggest using a named CRITICAL section...but don't use the same name you've given to the COMMON block (comarray). That's because CRITICAL section namesare in the same namespace as COMMON block names, and conflicts can result in undefined behaviour.
Patrick Kennedy
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Thanks !
Oliver.
- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page