Intel® Fortran Compiler

Parallelizing an early-out loop

David_DiLaura1
New Contributor I
692 Views
I'm working on radiative transfer code and am trying to take advantage of multiprocessing. A particularly sensitive (that is, expensive) spot in the code is a loop which, about half the time, can end early with an EXIT. Officially, such a loop can't be parallelized. I've tried removing the early-out EXIT and replacing it with a logical flag: the flag is checked at the start of each iteration and, if set, the iteration is cycled. Naturally, the flag is not set when the loop is entered. This altered loop works without parallelizing, but it is not as economical as the early-out. All of the iterations are still performed; after the flag is set there is (much) less computational work to do, but every iteration must nevertheless be passed through. However, parallelizing this version gains me nothing -- actually, it makes things worse.
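
Schematically, the altered loop with the attempted parallelization looks something like this (earlyOut, n, and EarlyOutCondition are placeholder names rather than my actual code):

[cpp]logical :: earlyOut
integer :: i
earlyOut = .false.
!$omp parallel do shared(earlyOut) private(i)
do i = 1, n
  if(earlyOut) cycle    ! flag already set: skip the heavy work for this iteration
  ! ... expensive radiative-transfer work for element i ...
  if(EarlyOutCondition(i)) earlyOut = .true.
end do
!$omp end parallel do
[/cpp]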

Any ideas? The flag set-and-check business is an uncraftsmanlike kludge, but I haven't found a way out. Perhaps this isn't an occasion where parallelization can work. The usual "re-work the algorithm" advice does not help in this case: early-out is early-out.

David
2 Replies
jimdempseyatthecove
Honored Contributor III
692 Views

David,

From your post I guess you are using OpenMP and are referring to a !$omp parallel do loop.

One option you have is to use a parallel region and then explicitly specify/determine the slices of the array.

[cpp]done = .false.
!$omp parallel shared(done, count) private(ThreadNum, nThreads, sliceSize, iBegin, iEnd, i)
ThreadNum = omp_get_thread_num()
nThreads = omp_get_num_threads()
sliceSize = count / nThreads
if(sliceSize .eq. 0) sliceSize = 1
iBegin = sliceSize * ThreadNum + 1 ! (1-based array)
if(iBegin .le. count) then
  ! this thread has at least one iteration to do
  iEnd = iBegin + sliceSize - 1
  ! let the last thread pick up any remainder when count is not
  ! an exact multiple of nThreads
  if(ThreadNum .eq. nThreads - 1) iEnd = count
  if(iEnd .gt. count) iEnd = count
  do i=iBegin, iEnd
    if(done) exit
    ...
    if(ConvergedOnSolution()) done = .true.
    ! do not use done = ConvergedOnSolution()
    ! as this will (can) unset done
  end do
end if
!$omp end parallel
[/cpp]
The do loop above is no longer a worksharing (parallel) do, so you can use exit inside it.

The code that performs the array slicing can be placed into functions:

[cpp]done = .false.
!$omp parallel shared(done, count) private(i)
do i=iBegin(count), iEnd(count)
  if(done) exit
  ...
  if(ConvergedOnSolution()) done = .true.
  ! do not use done = ConvergedOnSolution()
  ! as this will (can) unset done
end do
!$omp end parallel
...
! 1-based arrays with count number of cells
integer function iBegin(count)
  use omp_lib
  integer :: count
  integer :: sliceSize
  sliceSize = count / omp_get_num_threads()
  if(sliceSize .eq. 0) sliceSize = 1
  iBegin = (sliceSize * omp_get_thread_num()) + 1 ! (1-based array)
end function iBegin

integer function iEnd(count)
  use omp_lib
  integer :: count
  integer :: sliceSize
  sliceSize = count / omp_get_num_threads()
  if(sliceSize .eq. 0) sliceSize = 1
  iEnd = sliceSize * (omp_get_thread_num() + 1) ! (1-based array)
  ! last thread picks up any remainder
  if(omp_get_thread_num() .eq. omp_get_num_threads() - 1) iEnd = count
  if(iEnd .gt. count) iEnd = count
end function iEnd
[/cpp]
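
A caveat on the shared flag: done is written by one thread and read by the others without any synchronization, so the readers are not guaranteed to see the update promptly and may keep iterating in the meantime. If that matters, explicit flushes (or OpenMP atomic read/write) make the hand-off well defined. A sketch of the loop with flushes added, reusing the same placeholders as above (a fragment only, not complete code):

[cpp]do i=iBegin(count), iEnd(count)
  ! re-read the shared flag at each iteration
!$omp flush(done)
  if(done) exit
  ...
  if(ConvergedOnSolution()) then
    done = .true.
    ! publish the new value to the other threads
!$omp flush(done)
  end if
end do
[/cpp]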

Jim Dempsey
jimdempseyatthecove
Honored Contributor III
692 Views

Please test the functions for errors, as I have not run that code.
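
A minimal coverage check along these lines could serve as that test. It is only a sketch: the program name, the count of 10, and the claimed array are arbitrary stand-ins, and the slicing functions from the previous post are copied in as internal procedures (so omp_lib is available by host association), with the last thread picking up any remainder.

[cpp]program TestSlices
  use omp_lib
  implicit none
  integer, parameter :: count = 10     ! arbitrary stand-in for the real problem size
  integer :: claimed(count)
  integer :: i
  claimed = 0
!$omp parallel shared(claimed) private(i)
  do i = iBegin(count), iEnd(count)
!$omp atomic
    claimed(i) = claimed(i) + 1        ! atomic, in case slices overlap by mistake
  end do
!$omp end parallel
  if(all(claimed .eq. 1)) then
    print *, 'every iteration claimed exactly once'
  else
    print *, 'coverage problem, claim counts: ', claimed
  end if
contains
  integer function iBegin(count)
    integer :: count
    integer :: sliceSize
    sliceSize = count / omp_get_num_threads()
    if(sliceSize .eq. 0) sliceSize = 1
    iBegin = (sliceSize * omp_get_thread_num()) + 1
  end function iBegin
  integer function iEnd(count)
    integer :: count
    integer :: sliceSize
    sliceSize = count / omp_get_num_threads()
    if(sliceSize .eq. 0) sliceSize = 1
    iEnd = sliceSize * (omp_get_thread_num() + 1)
    ! last thread picks up any remainder
    if(omp_get_thread_num() .eq. omp_get_num_threads() - 1) iEnd = count
    if(iEnd .gt. count) iEnd = count
  end function iEnd
end program TestSlices
[/cpp]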

Jim