The hardware is an SGI ICE system.
The code is Fortran with MPI+OpenMP.
There are multiple OpenMP regions.
There seems to be a problem with just one of them. If this particular routine is not compiled with OpenMP it runs fine on one thread or multiple threads.
With this region compiled with OpenMP:
- Code works with OMP_NUM_THREADS set to 1
- Crashes when set to anything other than 1
The region that has trouble has the following setup:
- There is a loop that has been parallelized using OpenMP.
- In the loop (hence in the parallel region) there is a chain of calls.
I want to know the scope of some of the variables in one of those routines.
I was wondering if it is possible to get this information.
Also, can tools such as Inspector be used in this situation (MPI+OpenMP codes)?
Any ideas on troubleshooting this problem?
Thanks
...OR...
The code previously relied on a local array or derived-type variable with no explicit attribute. Compiled without OpenMP and without the RECURSIVE attribute, such a variable behaves as if it had the SAVE attribute; compiled with OpenMP, it behaves as if it had the AUTOMATIC attribute (for non-master threads). The non-master threads of that region then end up with uninitialized variables.
...OR...
You are not using COPYIN in the appropriate place.
If nothing obvious turns up along these lines, I suggest inserting some sanity checks into your code (like the ASSERTs used in C/C++).
Jim Dempsey
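A minimal sketch of the kind of sanity check suggested above. The module name `sanity_checks`, the helper `assert`, and the demo loop are all hypothetical and not taken from the poster's code; the idea is only to fail loudly, with the thread number, when a thread-local variable does not hold what the code expects. Compile with `-openmp` (ifort) or `-fopenmp` (gfortran).

```fortran
module sanity_checks                 ! hypothetical module name
  use omp_lib                        ! declares omp_get_thread_num
  implicit none
contains
  ! Hypothetical helper: report the failing thread and stop.
  subroutine assert(cond, msg)
    logical, intent(in)          :: cond
    character(len=*), intent(in) :: msg
    if (.not. cond) then
      write (*,*) 'ASSERT failed on thread ', omp_get_thread_num(), ': ', msg
      stop 1
    end if
  end subroutine assert
end module sanity_checks

program demo_assert
  use sanity_checks
  implicit none
  integer :: i
  real    :: work(8)

  !$omp parallel do private(work)
  do i = 1, 100
    work = real(i)                   ! each thread initializes its own copy
    call assert(all(work == real(i)), 'work not initialized')
  end do
  !$omp end parallel do

  write (*,*) 'all checks passed'
end program demo_assert
```

Placing a check like this at the top of each routine in the call chain, before the first use of a local array, can help narrow down which variable a non-master thread sees uninitialized.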
Looking at the code some more, I see that the routines in question are "CONTAIN"ed in a module, and I do remember that a few months ago I saw problems with Intel compilers when using OpenMP with contained routines (this was in my previous job). In that case it was a simple code, and simply moving the "CONTAIN"ed routine outside the module solved the problem.
Is anyone aware of this issue? I wish I still had that simple code to post here.
I should clarify that in that simple example there weren't any "module variables" which would effectively be "save"d variables. In fact, in that simple case, it was simply a matter of moving that subroutine out of the module.
Thanks.
real :: array(123)
type(yourType) :: goo
In the above:
when NOT compiled with -openmp, array and goo are SAVE
when compiled with -openmp, array and goo are on the stack
(the same applies without/with RECURSIVE on subroutine foo)
Option switches can override this.
Check for any arrays and/or user defined types in the subroutine that is causing the issue.
There is another thread somewhere on this forum about a CONTAIN'ed subroutine having !$OMP PARALLEL... when the main program does not establish the OpenMP thread pool.
If this is your case, try adding to your main (PROGRAM) before you call the CONTAIN'ed subroutine:
!$OMP PARALLEL
if (omp_get_thread_num() .eq. 999999) STOP
!$OMP END PARALLEL
The if test should always fail (and never call STOP); it is there only to make sure the compiler does not optimize away an otherwise empty parallel region.
Jim Dempsey
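As a self-contained sketch of the warm-up suggestion above: the program name `warmup_demo` and the contained routine `work` are hypothetical stand-ins, and whether the early region actually helps depends on the compiler version. Note the `use omp_lib`, which `omp_get_thread_num` needs under `implicit none`.

```fortran
program warmup_demo                  ! hypothetical program name
  use omp_lib                        ! declares omp_get_thread_num
  implicit none

  ! Parallel region placed first so the thread pool is created by the
  ! main program before any CONTAIN'ed routine opens its own region.
  !$omp parallel
  if (omp_get_thread_num() == 999999) stop   ! never true; keeps the region from being removed
  !$omp end parallel

  call work()

contains

  subroutine work()                  ! stands in for the CONTAIN'ed routine
    integer :: i
    real    :: total
    total = 0.0
    !$omp parallel do reduction(+:total)
    do i = 1, 1000
      total = total + real(i)
    end do
    !$omp end parallel do
    write (*,*) 'sum = ', total
  end subroutine work

end program warmup_demo
```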
I've just been running Inspector on an application which was checked by its predecessor several years ago. Inspector flagged a few scalar variables which needed a private specification but hadn't been caught earlier, but it also missed some, and it missed local arrays which needed private.
The usual recommendation on the Fortran forums is to add the RECURSIVE qualifier to all subroutines which will be called in a parallel region. An alternative is to compile with options which imply similar status for local variables and arrays (/Qauto or /Qopenmp for ifort, no /Qsave).
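A sketch of the RECURSIVE recommendation, with hypothetical names (`kernels`, `smooth`, `recursive_demo`) that do not come from the original code. The assumption here is that, in the compiler's default mode, the local array `tmp` could be statically allocated and so shared by all threads; marking the routine RECURSIVE gives every call its own copy on the calling thread's stack.

```fortran
module kernels                       ! hypothetical module name
  implicit none
contains
  ! RECURSIVE asks for automatic (per-call, on-stack) storage for locals,
  ! so threads calling this routine concurrently do not share tmp.
  recursive subroutine smooth(n, x, y)
    integer, intent(in)  :: n
    real,    intent(in)  :: x(n)
    real,    intent(out) :: y
    real    :: tmp(64)               ! local scratch array: one copy per call
    integer :: m
    m = min(n, 64)
    tmp(1:m) = 0.5 * x(1:m)
    y = sum(tmp(1:m))
  end subroutine smooth
end module kernels

program recursive_demo
  use kernels
  implicit none
  integer, parameter :: n = 16, ncol = 1000
  real    :: a(n, ncol), r(ncol)
  integer :: j

  a = 1.0
  !$omp parallel do private(j) shared(a, r)
  do j = 1, ncol
    call smooth(n, a(:, j), r(j))    ! called from inside the parallel region
  end do
  !$omp end parallel do

  write (*,*) 'min/max of r: ', minval(r), maxval(r)
end program recursive_demo
```

The same reasoning applies to every routine further down the call chain: each one with local arrays or derived-type variables needs the qualifier (or the equivalent compiler option) for its locals to be per-thread.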
The example code segfaults with ifc/12-12.1.0_sp1.6.233 and also with ifc/12-12.0.4.191.
It runs successfully using gfortran.
The strange thing is function RMS is not even used!!!
Any suggestions? Thanks.
[bash]
program as43
implicit none
! ifort -openmp as43.f90 -o as43
! set n=4; setenv OMP_NUM_THREADS ${n} ; time ./as43
integer DIM
parameter (DIM=4000)
real v_old(0:DIM,0:DIM),v_new(0:DIM,0:DIM)
integer i, j, k
write (*,*) "DIM * DIM = ", DIM * DIM
do i=0,DIM
  do j=0,DIM
    v_old(i, j) = 0.25
  enddo
enddo
do k=1,20 !do-loop over iterations.
!$omp parallel
!$omp do
  do j=1,(DIM-1) !do-loops to generate new values for
    do i=1,(DIM-1) ! all interior points.
      v_new(i,j) = (v_old(i-1,j) + v_old(i+1,j) + v_old(i,j-1) + v_old(i,j+1))/4.0
    enddo
  enddo
!$omp end do
!$omp end parallel
enddo ! k loop
print *, 'zzzzzzzzzz ', v_new(1,1)
contains
real function rms()
  integer i, j
  real myrms
  myrms = 0.d0
  do i=1,DIM-1
    do j=1,DIM-1
      myrms = myrms + (v_old(i,j)-v_new(i,j))**2
    enddo
  enddo
  rms = sqrt(myrms)/(DIM-1)
  return
end function rms
end program as43
[/bash]
!$omp parallel private(j, i)
!$omp do
do j=1,(DIM-1) !do-loops to generate new values for
do i=1,(DIM-1) ! all interior points
But that does not change the result. I added this to the code, and it still fails the same way.
!$omp parallel private(j,i) shared(v_old,v_new)
!$omp do
In fact just commenting out the unused routine makes it work!
Thanks.
ulimit -s unlimited).
As for minor problems, it seems a little out of line to nest your non-openmp loops backwards; the compiler may switch them automatically, since they are outside the parallel region.
Also, I usually take the precaution of setting -prec-div -prec-sqrt, and ifort takes that as preventing optimization of /4.0.
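The `ulimit -s unlimited` remark points at the stack limit. Assuming the segfault does come from the two 4001 x 4001 arrays being placed on the stack when compiling with -openmp (an assumption this thread does not confirm), one way to avoid depending on the shell's stack limit is to make the arrays allocatable so they live on the heap. The sketch below is a hypothetical variant of as43, named `as43_heap`, with the unused rms function left out:

```fortran
program as43_heap                    ! hypothetical variant of as43
  implicit none
  integer, parameter :: DIM = 4000
  real, allocatable  :: v_old(:,:), v_new(:,:)
  integer :: i, j, k

  ! Heap allocation: about 64 MB per array, independent of the stack limit.
  allocate (v_old(0:DIM,0:DIM), v_new(0:DIM,0:DIM))
  v_old = 0.25
  v_new = 0.0

  do k = 1, 20                       ! loop over iterations
    !$omp parallel do private(i, j) shared(v_old, v_new)
    do j = 1, DIM-1                  ! new values for all interior points
      do i = 1, DIM-1
        v_new(i,j) = (v_old(i-1,j) + v_old(i+1,j) &
                    + v_old(i,j-1) + v_old(i,j+1)) / 4.0
      end do
    end do
    !$omp end parallel do
  end do

  print *, 'v_new(1,1) = ', v_new(1,1)
  deallocate (v_old, v_new)
end program as43_heap
```

If this variant runs where the original crashes, that supports the stack-size explanation; if it still fails, the cause lies elsewhere.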
Thanks again.