Dear all,
I recently ran into a problem with a parallel do loop using OpenMP in Fortran. The following code reproduces what I saw. I would appreciate any suggestions. I use Visual Studio 2017 + Intel Parallel Studio XE 2018 Update 1.

    program Rang_Test
    !$ use omp_lib
    implicit none
    integer, parameter :: ni = 10
    real :: rnd(8), fnl(ni)
    integer :: i, j
    real, allocatable :: rtmp

    allocate(rtmp)
    rtmp = 1.

    !$omp parallel default(none) private(i, rnd) firstprivate(rtmp) shared(fnl)
    !$omp master
    write(*, *) 'Initial values'
    !$omp end master
    write(*, '(*(g0))') 'Core=', omp_get_thread_num(), ' rTmp=', rtmp
    !$omp barrier
    !$omp master
    write(*, *) 'Inside the loop'
    !$omp end master
    !$omp do
    do i = 1, 10
        call RANDOM_NUMBER(rnd)
        rtmp = rnd(1)
        fnl(i) = rtmp
        !$omp critical
        write(*, '(*(g0))') 'i=',i,' rnd(1)=', rnd(1), ' rTmp=', rtmp
        !$omp end critical
    end do
    !$omp end do
    !$omp end parallel

    write(*, *) 'After omp region'
    do i = 1, 10
        write(*, '(*(g0))') 'i=', i, ' fnl(i)=', fnl(i)
    end do
    end program Rang_Test

In the example above, each iteration first does some calculations (call random_number(rnd)), then assigns the result to a firstprivate variable (rtmp), and then does some more calculations in the same iteration (fnl(i) = rTmp). In this sample code, inside the loop, the value of rTmp should always equal rnd(1). However, the values written to the screen were not what I expected. The following are the execution results on my computer:

    Initial values
    Core=2 rTmp=1.000000
    Core=0 rTmp=1.000000
    Core=3 rTmp=1.000000
    Core=5 rTmp=1.000000
    Core=4 rTmp=1.000000
    Core=7 rTmp=1.000000
    Core=6 rTmp=1.000000
    Core=1 rTmp=1.000000
    Inside the loop
    i=6 rnd(1)=.3001758 rTmp=.7522959
    i=8 rnd(1)=.7958636 rTmp=.7522959
    i=5 rnd(1)=.1966502E-01 rTmp=.7522959
    i=3 rnd(1)=.3920868E-06 rTmp=.7522959
    i=9 rnd(1)=.8392264 rTmp=.1941571
    i=7 rnd(1)=.4387013 rTmp=.1941571
    i=1 rnd(1)=.7522959 rTmp=.1941571
    i=10 rnd(1)=.7564077E-01 rTmp=.2656559
    i=4 rnd(1)=.1941571 rTmp=.2656559
    i=2 rnd(1)=.2656559 rTmp=.2656559
    After omp region
    i=1 fnl(i)=.7522959
    i=2 fnl(i)=.2656559
    i=3 fnl(i)=.3920868E-06
    i=4 fnl(i)=.1941571
    i=5 fnl(i)=.1966502E-01
    i=6 fnl(i)=.3001758
    i=7 fnl(i)=.4387013
    i=8 fnl(i)=.7958636
    i=9 fnl(i)=.8392264
    i=10 fnl(i)=.7564077E-01

As you can see, the final values (fnl) are fine, but the values of rTmp printed inside the loop often differ from rnd(1) of the same iteration. I have tried a few other things and found:
- If rTmp is a regular nonallocatable variable, then the program works fine.
- Or, if rTmp is declared as an allocatable array and only the first element is used (that is, rTmp(1) = rnd(1)), then it works fine.
- I tried to compile the same code with gfortran, and it worked fine.
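
For reference, the first two workarounds in the list above can be sketched as follows. This is only a minimal illustration, not the original program: the names rtmp_s, rtmp_a and nbad are made up for this example, and the mismatch counter is an addition that replaces the screen output. It counts the iterations in which either temporary differs from rnd(1) right after being assigned from it.

```fortran
program rtmp_workaround_sketch
  implicit none
  integer, parameter :: ni = 10
  real :: rnd(8), fnl(ni)
  real :: rtmp_s                  ! workaround 1: plain (non-allocatable) scalar
  real, allocatable :: rtmp_a(:)  ! workaround 2: allocatable array of length 1
  integer :: i, nbad              ! nbad is a hypothetical mismatch counter

  rtmp_s = 1.
  allocate(rtmp_a(1))
  rtmp_a = 1.
  nbad = 0

  !$omp parallel do default(none) private(i, rnd) firstprivate(rtmp_s, rtmp_a) shared(fnl) reduction(+:nbad)
  do i = 1, ni
    call random_number(rnd)
    rtmp_s = rnd(1)
    rtmp_a(1) = rnd(1)
    fnl(i) = rtmp_s
    ! Each thread works on its own copies, so both should still equal rnd(1)
    if (rtmp_s /= rnd(1) .or. rtmp_a(1) /= rnd(1)) nbad = nbad + 1
  end do
  !$omp end parallel do

  write(*, '(*(g0))') 'mismatches=', nbad
end program rtmp_workaround_sketch
```

If the private copies behave as intended, the count should stay at zero. Both variants keep firstprivate, so the initial value 1. is still copied into each thread's temporary.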
I would appreciate it if anyone could offer any suggestions. Thanks.
The behavior looks as if (this is an assumption) each thread is getting a firstprivate copy of the pointer to the allocatable scalar, not a private copy of the scalar itself.
To correct for this:

    !** allocate(rtmp)
    !** rtmp = 1.
    !$omp parallel default(none) private(i, rnd) firstprivate(rtmp) shared(fnl)
    allocate(rtmp) !** allocate inside parallel region (to private pointer to scalar)
    rtmp = 1.      !** assign here to private pointer to scalar
    !$omp master

Jim Dempsey
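
For readers who want to try the suggestion above in a complete program, here is one possible sketch. It is a variation rather than the exact code from the reply: it assumes that private(rtmp) can stand in for firstprivate(rtmp), since no value has to be copied in once the allocation happens inside the region, and it adds an explicit deallocate before the region ends. It has not been checked against the compiler version mentioned in the question.

```fortran
program rtmp_private_alloc_sketch
  implicit none
  integer, parameter :: ni = 10
  real :: rnd(8), fnl(ni)
  real, allocatable :: rtmp   ! left unallocated outside the parallel region
  integer :: i

  !$omp parallel default(none) private(i, rnd, rtmp) shared(fnl)
  allocate(rtmp)              ! each thread allocates its own private scalar
  rtmp = 1.
  !$omp do
  do i = 1, ni
    call random_number(rnd)
    rtmp = rnd(1)
    fnl(i) = rtmp
    !$omp critical
    write(*, '(*(g0))') 'i=', i, ' rnd(1)=', rnd(1), ' rTmp=', rtmp
    !$omp end critical
  end do
  !$omp end do
  deallocate(rtmp)            ! release the private copy before leaving the region
  !$omp end parallel

  write(*, *) 'After omp region'
  do i = 1, ni
    write(*, '(*(g0))') 'i=', i, ' fnl(i)=', fnl(i)
  end do
end program rtmp_private_alloc_sketch
```

Because every thread owns its allocation from the start, rTmp printed in each iteration should match rnd(1) of that iteration.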
Thanks, Jim. I will try this in my actual code.