Intel® Fortran Compiler
Build applications that can scale for the future with optimized code designed for Intel® Xeon® and compatible processors.

OpenMP data race when calling pure and recursive subroutine

Richard_Gordon
Beginner
Hi All,

I am trying to parallelize a loop that calls a subroutine "sub_VFI" that is declared pure and recursive.

[fortran]!$OMP PARALLEL DO DEFAULT(SHARED) PRIVATE(eind,mind)
do mind = 1,len_m
   do eind = 1,len_e
      call sub_VFI(vR(:,eind,mind),MPL(eind,mind),qapTmp(:,eind,mind),&
                   grid_a,grid_a,ap_eq0_ind,len_aTmp,len_apTmp,len_e,len_m)
   end do
end do
!$OMP END PARALLEL DO[/fortran]
The intents are as follows: vR is intent(out); everything else is intent(in). Also, all of the intent(in) arrays are automatic arrays.

It doesn't work in parallel (it does work in serial). Specifically, it seems to execute sub_VFI for every eind and mind but then never exits the loop. Intel Inspector reports a data race.

I have no idea what could be wrong with this code, so if you could give me some clues as to what could be going wrong, that would be very helpful. I thought this would be perfectly fine since sub_VFI is supposed to be free of side effects and each thread is given its own memory.

Thanks so much for your time!
Grey

Edit: Maybe I should add that sub_VFI does use the value of global variables, but does not modify them. I don't think this is a problem; please correct me if I'm wrong.
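For illustration, the access pattern is roughly like the toy sketch below. The module and variable names match my code (mod_common_var, discount), but the contents shown here are made up.

[fortran]! Illustrative sketch only: the real mod_common_var holds more than this,
! but the pattern is the same -- the parallel region only ever reads discount.
module mod_common_var
   implicit none
   real(8) :: discount = 0.96d0   ! set before the parallel loop, never modified inside it
end module mod_common_var

pure subroutine read_global(x, y)
   use mod_common_var, only: discount
   implicit none
   real(8), intent(in)  :: x
   real(8), intent(out) :: y
   y = discount*x                 ! read-only access to the module variable
end subroutine read_global[/fortran]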
9 Replies
mriedman
Novice
What type is vR? Is it a POINTER or an ALLOCATABLE? If so, you may try declaring it INOUT instead of OUT.

The INTENT attribute applies to the association status of a pointer, not to the data it points to. I have seen instances where the compiler would just reinitialize (i.e. nullify) a pointer in a subroutine if it is OUT. If it is INOUT, it won't touch it.
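For what it's worth, here is a minimal sketch of the difference (the names are made up): with intent(out), a pointer dummy's association status is undefined on entry and an allocatable dummy is deallocated on entry, while intent(inout) preserves whatever the caller passed in.

[fortran]module intent_demo
   implicit none
contains
   subroutine reset_out(p, a)
      real(8), pointer,     intent(out) :: p(:)   ! association status is undefined on entry
      real(8), allocatable, intent(out) :: a(:)   ! deallocated automatically on entry
      allocate(a(10)); a = 0.0d0                  ! must be re-established before use
      nullify(p)                                  ! must be re-associated (or nullified) before use
   end subroutine reset_out

   subroutine keep_inout(p, a)
      real(8), pointer,     intent(inout) :: p(:) ! association from the caller is preserved
      real(8), allocatable, intent(inout) :: a(:) ! existing allocation is preserved
      if (associated(p)) p = p + 1.0d0
      if (allocated(a))  a = a + 1.0d0
   end subroutine keep_inout
end module intent_demo[/fortran]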

Michael
Richard_Gordon
Beginner
Thanks for the reply Michael.

vR is an automatic array in both the calling and the receiving program, and it is neither allocatable nor a pointer.
TimP
Honored Contributor III
An automatic array has to be local to the subroutine. You probably don't mean automatic, but we can't know without seeing the code.
Automatic arrays big enough to be worth using OpenMP on aren't reliable, at least in the sense that you get no message should the allocation fail. Thus, allocatable would be preferred. But you may not mean automatic.
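To illustrate the difference (the names here are made up): an allocatable work array can report a failed allocation through stat=, whereas an automatic array that does not fit on the stack typically just crashes with no diagnostic.

[fortran]! Sketch only: a checkable allocation in place of an automatic work array.
subroutine make_work(n, ok)
   implicit none
   integer, intent(in)  :: n
   logical, intent(out) :: ok
   real(8), allocatable :: work(:)
   integer :: ierr
   allocate(work(n), stat=ierr)
   ok = (ierr == 0)
   if (.not. ok) return      ! the caller can react; a stack overflow gives no such chance
   work = 0.0d0
   ! ... use work ...
end subroutine make_work[/fortran]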
Richard_Gordon
Beginner
Thanks for the reply Tim.

sub_VFI looks like

[fortran]pure recursive subroutine sub_VFI(vR,MPL,qap_ap,grid_a,grid_ap,&
                                  ap_eq0_ind,len_a,len_ap,len_e,len_m)
   use mod_common_var, only: discount
   implicit none
   integer, intent(in) :: len_a,len_ap,len_e,len_m
   real(8), dimension(1:len_a), intent(out) :: vR
   real(8), intent(in) :: MPL
   real(8), dimension(1:len_ap), intent(in) :: qap_ap
   real(8), dimension(1:len_a), intent(in) :: grid_a
   real(8), dimension(1:len_ap), intent(in) :: grid_ap
   integer, intent(in) :: ap_eq0_ind

   ...
end subroutine sub_VFI[/fortran]

I thought those were automatic arrays. len_a and len_ap are both 300. Do you mean by "get a full allocation" that I might be running out of memory? When you say allocatable would be preferred, do you mean adding allocatable as an attribute to all the arrays and converting the intent(out) variables to intent(inout)?

Thank you,
Grey

TimP
Honored Contributor III
No, these aren't automatic arrays, at least not inside this subroutine. They are provided by the calling subroutine (argument association). They could conceivably be automatic in the caller, but we can't see.
If any of the arrays happen to overlap (including the possibility of running outside the dimension of the original declaration), that could create a race which the compiler can't expect when parallelizing.
A possibility is that multiple threads are using the same OUT array. Ideally, Inspector points out the problem, at least to the extent of identifying the offending array.
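To make the overlap scenario concrete, here is a small sketch with made-up sizes (not the actual code): a store one element past the dummy's declared bound lands in the next column of the caller's array, i.e. in another iteration's supposedly private slice. Compiling with -check bounds would flag it.

[fortran]! Sketch with made-up sizes: column 1 is passed to the subroutine, but the
! out-of-bounds store spills into column 2, another iteration's slice.
program overrun_demo
   implicit none
   real(8) :: vR(3,2)
   vR = 0.0d0
   call fill(vR(:,1), 3)
   print *, vR(:,2)          ! likely prints 99 0 0 instead of all zeros (undefined behaviour)
contains
   subroutine fill(v, n)
      integer, intent(in)  :: n
      real(8), intent(out) :: v(n)
      v = 1.0d0
      v(n+1) = 99.0d0        ! one past the declared bound: lands in vR(1,2) in the caller
   end subroutine fill
end program overrun_demo[/fortran]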
Richard_Gordon
Beginner
Thanks again for your response.

When you say there is a possibility "that multiple threads are using the same OUT array," I don't understand how that's possible. When I call sub_VFI with sub_VFI(vR(:,eind,mind),...), isn't it necessarily the case that each thread operates on a distinct subsection of vR? Is it a problem if the threads write to different sections of vR at the same time? I didn't think it was, but I don't know OpenMP very well. Or is your point just that if I have gone out of bounds, then I could be accessing the same portion of memory from different threads?

Separate question: is it a problem if two threads read from the same data at the same time, as in both reading from grid_a simultaneously? From what I've read, I didn't think so.
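For reference, the kind of simultaneous read I mean looks like this toy sketch (not my real code): every iteration reads the same shared grid_a, and nothing in the parallel region writes to it.

[fortran]! Toy sketch: concurrent reads of a shared array, distinct writes to total(i).
program shared_read_demo
   implicit none
   real(8) :: grid_a(300), total(8)
   integer :: i
   grid_a = 1.0d0
   total  = 0.0d0
!$OMP PARALLEL DO DEFAULT(SHARED) PRIVATE(i)
   do i = 1, 8
      total(i) = sum(grid_a)   ! every thread reads grid_a at the same time
   end do
!$OMP END PARALLEL DO
   print *, total
end program shared_read_demo[/fortran]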

Thank you,
Grey

PS I'm on version 12.0.3 and I'm using the following flags
-O3 -xHost -openmp -heap-arrays 100 -reentrancy threaded -mkl=sequential -fp-model source -gen-interfaces -g -debug minimal -traceback -check pointers -warn all -warn nounused
And I'm on a 64-bit AMD processor.
TimP
Honored Contributor III
Right, it does look like you are correctly sending separate sections of the array to each thread, provided that nothing over-runs the subscripts.
I think -heap-arrays is dangerous with OpenMP. I would think that if it were the problem, Inspector should have pointed out a specific point in the code; either way, you could see whether removing -heap-arrays helps.
-heap-arrays 100 puts all automatic and temporary arrays on the heap except those which the compiler knows at compile time will never exceed 100 KB.
Richard_Gordon
Beginner
Removing the -heap-arrays option did the trick. Thanks for all your help; I really appreciate it.
jimdempseyatthecove
Honored Contributor III
Tim,

Could you explain what is dangerous about OpenMP and -heap-arrays?
Other than the overhead of making a heap allocation/deallocation, that is.

The fact that removal of -heap-arrays 100 "fixed" the symptoms does not necessarily mean it fixed the problem. The placement in memory (and/or the overhead in thread-safe code) should not affect the results. There are two alternative potential causes:

One, the one you addressed: indexing out of bounds on the intent(out) array.
Two, -heap-arrays will likely change the state of uninitialized variables more than stack arrays do.

Make that three: an alignment issue exposing a compiler optimization bug.
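A tiny sketch of cause two (illustrative only, not Richard's code): an element that is never assigned takes whatever value happens to be in the memory the array occupies, so moving the array between stack and heap with -heap-arrays can change or hide the symptom without fixing anything.

[fortran]subroutine partial_init(n, total)
   implicit none
   integer, intent(in)  :: n
   real(8), intent(out) :: total
   real(8) :: work(n)           ! automatic array: -heap-arrays decides where it lives
   work(1:n-1) = 1.0d0          ! work(n) is never set
   total = sum(work)            ! indeterminate; the garbage can differ between stack and heap
end subroutine partial_init[/fortran]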

I would recommend that Richard investigate this further.

Jim Dempsey