- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
[fortran] module MyModule type :: MyType integer, dimension(:), pointer :: pIntList => null () end type end module MyModule program TestOpenMP_1_AllocationWithinSubroutines use MyModule use omp_lib implicit none type(MyType) :: instance integer :: threadId integer :: I !$omp parallel private (instance, threadId, I) threadId = OMP_GET_THREAD_NUM() allocate (instance%pIntList(200 + threadId)) CALL IterateList(instance) !$omp end parallel read (*,*) contains subroutine IterateList(aInstance) type(MyType) :: aInstance integer, dimension(:), pointer :: pTempIntList => null() integer :: threadId threadId = OMP_GET_THREAD_NUM() allocate (pTempIntList(size(aInstance%pIntList))) write (*,*) 'Thread ',threadId, ' - ', 'aInstance%pIntList: ', size(aInstance%pIntList) write (*,*) 'Thread ',threadId, ' - ', 'pTempIntListe: ', size(pTempIntList) end subroutine end program TestOpenMP_1_AllocationWithinSubroutines [/fortran]As the above code sample shows, the main program tries to utilize OpenMP to call a subroutine. In that subroutine, a local pointer variable is created (allocated). Its size depends on the actual argument. The output shows the local pointer variable does NOT have the same size as the actual argument, which however should not be the case based on the code.
Could you help to comment why the program gives such behavior? More importantly, could you help to comment the correct ways (best practices) concerning how to do allocation of pointers in subroutines when using OpenMP?
The version of Intel Fortran Compiler is:
[root@localhost new]# ifort --version ifort (IFORT) 12.1.0 20111011 Copyright (C) 1985-2011 Intel Corporation. All rights reserved.
The output using ifort is shown below:
[root@localhost new]# ifort omp.f90 -warn -check -g -trace -openmp -static [root@localhost new]# ./a.out Thread 5 - aInstance%pIntList: 205 Thread 5 - pTempIntListe: 207 Thread 1 - aInstance%pIntList: 201 Thread 1 - pTempIntListe: 207 Thread 3 - aInstance%pIntList: 203 Thread 3 - pTempIntListe: 207 Thread 0 - aInstance%pIntList: 200 Thread 0 - pTempIntListe: 207 Thread 2 - aInstance%pIntList: 202 Thread 2 - pTempIntListe: 207 Thread 4 - aInstance%pIntList: 204 Thread 4 - pTempIntListe: 207 Thread 7 - aInstance%pIntList: 207 Thread 7 - pTempIntListe: 207 Thread 6 - aInstance%pIntList: 206 Thread 6 - pTempIntListe: 207
Link Copied
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
ron
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Try using:
recursive subroutineIterateList(aInstance)
Compiling with openmp enabled should have effectively done this for you.
Reason being, you want the array descriptor for pTempIntListe to be on stack as opposed to SAVE.
If this fixes the problem, then the bug is -openmp not making local arrays stack based (i.e. leaving as SAVE)
Jim Dempsey
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
subroutine IterateList(aInstance)
type(MyType) :: aInstance
integer, dimension(:), pointer :: pTempIntList => null()
integer :: threadId
threadId = OMP_GET_THREAD_NUM()
!$OMP CRITICAL
write(*,*) 'loc(aInstance)', loc(aInstance)
write(*,*) 'associated(aInstance%pIntList)', associated(aInstance%pIntList)
allocate (pTempIntList(size(aInstance%pIntList)))
write (*,*) 'Thread ',threadId, ' - ', 'aInstance%pIntList: ', size(aInstance%pIntList)
write (*,*) 'Thread ',threadId, ' - ', 'pTempIntListe: ', size(pTempIntList)
end subroutine
--------------
If that does not disclose anything, then try
volatile integer :: myTurn
myTurn = 0
!$omp parallel private (instance, threadId, I)
threadId = OMP_GET_THREAD_NUM()
do while(myTurn .lt. threadId)
sleepqq(1)
end do
allocate (instance%pIntList(200 + threadId))
CALL IterateList(instance)
myTurn = myTurn + 1
!$omp end parallel
The second test will show if the array size is stored in a shared location (if the size printed out follows size expected).
This information may aid in finding the bug (be it in your program or in IVF)
Jim Dempsey
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Jim Dempsey
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
It's not an optimisation bug, nor a bug with -openmp not putting the array descriptor for pTempIntListon the stack. I verified that each thread gets a unique allocation for pTempIntList in the subroutine.
I believe this is a known bug with the Fortran front-end not putting arguments to size() on the private OpenMP list, in this case, aInstance%pIntList and pTempIntList.
Workaround is to put a critical section in the subroutine around the thread-specific code:
!$omp critical
threadId = OMP_GET_THREAD_NUM()
allocate (pTempIntList(size(aInstance%pIntList)))
write (*,*) 'Thread ',threadId, ' - ', 'aInstance%pIntList: ', size(aInstance%pIntList)
write (*,*) 'Thread ',threadId, ' - ', 'pTempIntListe: ', size(pTempIntList)
!$omp end critical
>ifort -Qopenmp U101975-critical.f90
Intel Visual Fortran Intel 64 Compiler XE for applications running on Intel 64, Version 12.1.2.278 Build 20111128
>U101975-critical.exe
Thread 2 - aInstance%pIntList: 202
Thread 2 - pTempIntListe: 202
Thread 0 - aInstance%pIntList: 200
Thread 0 - pTempIntListe: 200
Thread 1 - aInstance%pIntList: 201
Thread 1 - pTempIntListe: 201
Thread 3 - aInstance%pIntList: 203
Thread 3 - pTempIntListe: 203
>
Incidently, gfortran has a similar issue (with your original code); but the reason could be entirely different:
> gfortran --version
GNU Fortran (SUSE Linux) 4.5.0 20100604 [gcc-4_5-branch revision 160292]
Copyright (C) 2010 Free Software Foundation, Inc.
> gfortran -fopenmp U101975_gfortran.f90 && ./a.out
Thread 0 - aInstance%pIntList: 200
Thread 0 - pTempIntListe: 201
Thread 2 - aInstance%pIntList: 202
Thread 2 - pTempIntListe: 201
Thread 1 - aInstance%pIntList: 201
Thread 1 - pTempIntListe: 201
Thread 3 - aInstance%pIntList: 203
Thread 3 - pTempIntListe: 201
>
Once the developers confirm it's a known bug, I'll update this thread with the tracking number.
Patrick Kennedy
Intel Developer Support
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
However, if the critical section corrects the size(foo) issue, then this is indicative that the ALLOCATE(foo) code is inserting the allocation size into the out of private scope array descriptor allocation size location.
IOW you have two problems:
size(foo)uses allocation sizeof wrong descriptor
ALLOCATE(foo) is inserting allocation size into wrong descriptor
When looking at one problem, be aware of, and look for, other problem.
Jim Dempsey
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Indeed, therecould be multiple defects in play here. The Fortran developers were unable to confirm if this a known issue, so this has been reported as a new defect, tracking number DPD200177907. I'll keep this thread updated with news from the developers.
Patrick Kennedy
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
After a long investigation, it turns out this is a coding error. Jim was on the right track when he said "you want the array descriptor for pTempIntListe to be on stack as opposed to SAVE.". However, it's not the case that " the bug is -openmp not making local arrays stack based (i.e. leaving as SAVE)".
The explicit initialization of pTempIntList inside routine IterateList to null() forces it to be static, by Fortran language rules:
See 12-007 F2008 section 5.2.3 p3 n[89:12-13]
Explicit initialization of a variable that is not in a common block
implies the SAVE attribute, which may be confirmed by explicit
specification.
But when we're compiling this routine, there is no way that the FE can determine that the routine might be called from inside a parallel region in some other piece of code. As a result, we can't detect the error (i.e. using a static variable inside a parallel region - which is not thread-safe) and give a nice message.
The fix is simply to remove the initialization:
subroutine IterateList(aInstance)
type(MyType) :: aInstance
integer, dimension(:), pointer :: pTempIntList
With that change, all is good:
[U101975]$ ifort -V
Intel(R) Fortran Intel(R) 64 Compiler XE for applications running on Intel(R) 64, Version 15.0.1.133 Build 20141023
Copyright (C) 1985-2014 Intel Corporation. All rights reserved.
[U101975]$ ifort -openmp U101975.f90 -o U101975.x
[U101975]$ export OMP_NUM_THREADS=4
[U101975]$ ./U101975.x
Thread 0 - aInstance%pIntList: 200
Thread 0 - pTempIntListe: 200
Thread 1 - aInstance%pIntList: 201
Thread 1 - pTempIntListe: 201
Thread 2 - aInstance%pIntList: 202
Thread 2 - pTempIntListe: 202
Thread 3 - aInstance%pIntList: 203
Thread 3 - pTempIntListe: 203
[U101975]$
Patrick
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
ooh, yea, I should have seen that. I will remember this for next time. FWIW this should be a reminder to NOT initialize the variables in the declaration section when (if) the subroutine or function will ever be used in openmp or recursively.
Thanks Patrick for explaining this in further detail.
This would be a good example for one of those "what's wrong with this program" questions.
Jim Dempsey
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
The compiler could give a "usage informational" if it sees local variable initialization when options imply that the routine should be recursive (and in F2015, all routines are recursive by default.)
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
That would be great. Especially for those wanting to newer semantics. This diagnostic would be valuable in trying a newer level semantic (2008, 2015, ...) and being notified of statements to be examined for potential problems.
>>(and in F2015, all routines are recursive by default.)
Will this change the rule:
Explicit initialization of a variable that is not in a common block implies the SAVE attribute, which may be confirmed by explicit specification.
Jim Dempsey
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
No, it doesn't change that rule.
I'll comment that we don't have options to specify "level" of standard semantics. The compiler has a particular level it considers current. What the change will do is cause more incorrect programs with uninitialized variables to show errors.

- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page