Intel® Fortran Compiler

ifort 11 (OS X): openmp causes array decl. problems (was ok with ifort 10)

burgel
Beginner
502 views

(I submitted this to premier support already, but just in case somebody else has run into this I thought I'd add a thread)

Briefly, I have a subroutine with hand-coded OpenMP directives and internal procedures in a CONTAINS section. A very strange thing happens to at least two passed-in array declarations when the OpenMP directives (even just one of them) are activated with -openmp.

SUBROUTINE ADVECT(s,s0,fs,u,v,w, &
                  gxt,gyt,gzt,rrp,rrm,dt, &
                  nx,ny,nz, &
                  svar,atype,izero0)
  USE GRID_MODULE
  USE CPUTIME_MODULE
  USE PARAM_MODULE
  implicit none
  integer, INTENT(IN) :: nx, ny, nz, atype, izero0
  real, INTENT(INOUT) :: s (-ng+1:nx+ng,-ng+1:ny+ng,-ng+1:nz+ng)
  real, INTENT(INOUT) :: s0(-ng+1:nx+ng,-ng+1:ny+ng,-ng+1:nz+ng)
  real, INTENT(INOUT) :: fs(-ng+1:nx+ng,-ng+1:ny+ng,-ng+1:nz+ng)
  real, INTENT(IN) :: u (-ng+1:nx+ng,-ng+1:ny+ng,-ng+1:nz+ng)
  real, INTENT(IN) :: v (-ng+1:nx+ng,-ng+1:ny+ng,-ng+1:nz+ng)
  real, INTENT(IN) :: w (-ng+1:nx+ng,-ng+1:ny+ng,-ng+1:nz+ng)
  real, INTENT(IN) :: rrp(-ng+1:nz+ng), rrm(-ng+1:nz+ng)
  real, INTENT(IN) :: dt

(ng is declared in a module)

At the beginning of the code I added this:

  write(0,*) 'ADVECTRK: nx,ny,nz,ng = ',nx,ny,nz,ng
  write(0,*) 'size of s: ',size(s,1),size(s,2),size(s,3)
  write(0,*) 'size of s0: ',size(s0,1),size(s0,2),size(s0,3)
  write(0,*) 'size of u: ',size(u,1),size(u,2),size(u,3)
  write(0,*) 'size of rrp,gxt: ',size(rrp,1),size(gxt,1),size(gxt,2)


And without -openmp, the sizes are correct:

  nx=41, ny=41, nz=41, ng=3
  size(s,1)=47, size(s,2)=47, size(s,3)=47

But with -openmp, somehow I get:

  ADVECTRK: nx,ny,nz,ng = 41 41 41 3
  size of s(1,2,3): 10 7 10
  size of s0(1,2,3): 10 7 10
  size of u(1,2,3): 10 7 10
  size of rrp,gxt(1,2): 10 10 4

That is, nx=41, ny=41, nz=41, ng=3 (all correct), but size(s,1)=10, size(s,2)=7, size(s,3)=10 (all incorrect).

So even though nx, ny, nz, and ng are correct, somehow the size of s is wrong, and I get BAD_ACCESS errors as a result. (This appears to be happening to all of the 3D arrays that I have checked.) Also, this is running just one thread. The problem is the same for both ia32 and 64-bit builds.

If I declare the u array arbitrarily, I can determine that constant values are being used in the declarations (nx=4, ny=1, nz=4), regardless of the actual values of nx, ny, and nz (which are always printed out correctly).

This code runs fine under ifort 10.1.014 (I haven't run the latest version 10). There seems to be something "special" about this subroutine because none of the other subroutines in our code base exhibits this problem. I suspect it has to do with containing a number of internal subprograms. Fun, eh?

The compile options are

ifort -openmp -O0 -zero -g -CB -align all -ftz -I../src/include -I/opt/local/netcdf4m32/include -I/opt/local/hdf5m64/include -c ../src/advectrk.F90

(I'm not using the Xcode environment -- everything is done in the terminal.)


4 Replies
jimdempseyatthecove
Honored Contributor III


Burgel

Something for you to try. On your array declarations, try adding automatic:

  real, automatic, INTENT(IN) :: rrp(-ng+1:nz+ng), rrm(-ng+1:nz+ng)

Do this to all the declarations.

The purpose is to ensure that the array descriptors are located on the stack (as opposed to static storage).

Alternatively, you can use /Qauto.

I had experienced a similar problem where the default for the array descriptor used to be stack local, but the version change (or my ineptitude) caused the descriptors to be placed in static storage. Although this won't matter for a single-threaded application, it does matter for multi-threaded applications.

Jim Dempsey

Steven_L_Intel1

-openmp implies -auto.

burgel
Beginner

Steven_L_Intel1 wrote: "-openmp implies -auto."

The reply on premier support also suggested testing with -automatic instead of -openmp. I tried that, and the code still runs correctly with -automatic.

I have a much smaller test program now that exhibits the problem, and it seems to be a problem with having subroutines within a CONTAINS section. It doesn't matter whether the OpenMP loop is in the contained subroutine or in the main subroutine. So there is some kind of bad interaction going on when the loop is parallelized.
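For readers following along, here is a minimal sketch of the general pattern described above: an explicit-shape dummy array whose bounds depend on a module variable, an OpenMP loop, and an internal procedure after CONTAINS. It is not the file submitted to premier support. The names grid_sketch, advect_sketch, scale_plane, and sketch_driver are all hypothetical, and there is no guarantee that this particular sketch triggers the bug. With correct behavior, each extent should be nx+2*ng = 47.

```fortran
! Hypothetical reproducer sketch; NOT the poster's actual test file.
module grid_sketch
  implicit none
  integer, parameter :: ng = 3   ! ghost-zone width, as reported in the thread
end module grid_sketch

subroutine advect_sketch(s, nx, ny, nz)
  use grid_sketch
  implicit none
  integer, intent(in) :: nx, ny, nz
  real, intent(inout) :: s(-ng+1:nx+ng, -ng+1:ny+ng, -ng+1:nz+ng)
  integer :: k

  ! Each extent should equal n + 2*ng (47 for n = 41, ng = 3).
  write(0,*) 'size of s: ', size(s,1), size(s,2), size(s,3)

  ! The loop index k is private; s is shared. Each iteration
  ! touches a different k-plane, so there is no data race.
  !$omp parallel do
  do k = 1, nz
     call scale_plane(k)
  end do
  !$omp end parallel do

contains

  ! Internal procedure that reaches s through host association.
  subroutine scale_plane(kk)
    integer, intent(in) :: kk
    s(:,:,kk) = 2.0 * s(:,:,kk)
  end subroutine scale_plane

end subroutine advect_sketch

program sketch_driver
  use grid_sketch
  implicit none
  integer, parameter :: nx = 41, ny = 41, nz = 41
  real, allocatable :: s(:,:,:)

  allocate(s(-ng+1:nx+ng, -ng+1:ny+ng, -ng+1:nz+ng))
  s = 1.0
  call advect_sketch(s, nx, ny, nz)
end program sketch_driver
```

Building this once with -openmp and once without, then comparing the printed extents, is the same check as the write statements in the original post.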

I uploaded the code example to premier support, but I've also tried to post a similar one here if it helps (advectrktest.F90) (I don't see it showing up anywhere, though ... It is in a folder called 'ted')

compiled with "ifort -openmp -O0 -g -o x.test advectrktest.F90" on OS X (10.5.6) and ifort 11.0.056

-- Ted

Steven_L_Intel1

See here for info on how to attach files - note steps 5-7.
