Intel® Fortran Compiler

OpenMP error

gib
New Contributor II
1,753 Views

I'm using the latest ifort on an Intel quad-core machine running Windows XP. My OpenMP program executes correctly most of the time, but occasionally it fails when it reaches the first parallel section. The error message is:

forrtl: severe (173): A pointer passed to DEALLOCATE points to an array that cannot be deallocated.

The error is shown to be associated with code in libiomp5md.dll.

The surprising thing is that there is no allocation or deallocation taking place anywhere near the failure point. The other odd thing is that a minor change in the code (e.g. a write statement) might make the program run OK. The parallel code being executed is quite simple, but it is embedded in a much larger program, making isolation of a simple example not easy enough for me to be motivated to do it just yet.

The OpenMP directive for the parallel section is:

!$omp parallel private(kcell,kpar,x_lo,x_hi,cell,site1,xlocal,indx,slot,go,cnt)

Most of the private variables are either scalars or arrays of scalars, but one, cell, is a more complex object:

type(cell_type) :: cell

type noncog_type
    sequence
    real :: entrytime
end type

type cog_type
    sequence
    real :: affinity            ! level of TCR affinity with DC
    real :: stimulation         ! TCR stimulation level
    real :: entrytime           ! time that the cell entered the paracortex (by HEV or cell division)
    real :: dietime             ! time that the cell dies
    real :: dividetime          ! time that the cell divides
    real :: stagetime           ! time that a cell can pass to next stage
    real :: IL_state(CYT_NP)    ! receptor model state variable values
    real :: IL_statep(CYT_NP)   ! receptor model state variable time derivative values
    integer :: status           ! holds data in bytes: 1=stage, 2=generation
    integer :: cogID            ! index in the list of cognate cells
end type

type cell_type
    sequence
    integer :: ID
    integer :: site(3)
    integer :: step
    integer(2) :: ctype
    integer(2) :: lastdir
    integer(2) :: DCbound(2)    ! DCbound(k) = bound DC, allow binding to MAX_BIND DC, MAX_BIND <= 2
    real :: unbindtime(2)
    type(noncog_type), pointer :: ptr1 => NULL()
    type(cog_type), pointer :: ptr2 => NULL()
    integer :: visits,revisits,ndclist
    integer(2), allocatable :: dclist(:)    ! list of DC visited by a T cell
end type

It occurs to me that the allocatable array dclist(:), or perhaps the pointers, might be the source of the trouble. Am I doing something illegal?

Thanks

Gib

15 Replies
gib
New Contributor II

I can report that the error is indeed related to the allocatable array dclist(:). When I replace this by a simple array the program runs successfully. In the cases that I've been running, the allocatable array was not used and not allocated. It seems that in parallelizing that section, and preparing the private variables, the compiler may be attempting to deallocate dclist(:), although it is not currently allocated.

Steven_L_Intel1
Employee

I think we need to see more of the code. The error is not happening within libiomp5 but in your code. The issue is not an "already deallocated" array, for which you'd get a different error, but that an attempt has been made to deallocate a pointer that does not point to something that can be deallocated (an array slice, for example). The only way you'd get this for an allocatable array is data corruption.

Can you identify the exact statement where the error occurs?

jimdempseyatthecove
Honored Contributor III


My guess is that "cell" is declared private for the parallel region and the !$omp directive does not include a copyin for cell, i.e. each thread gets an uninitialized cell. Try adding copyin(cell) to the !$OMP line.

Sometimes I find it easier to make a subroutine call out of the block of code within the parallel region and not pass the private variables (just declare them in the subroutine). This makes the program easier to understand and debug, at the minor expense of a subroutine call when compiling for a non-OpenMP environment. Often the subroutine call is faster than the jump-around code the compiler inserts in order to run the region with the appearance of being in-line.

Jim Dempsey
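
A minimal sketch of the subroutine approach, assuming a cut-down, hypothetical cell_type and a module-level cellist (the names sweep_mod and sweep_cells are illustrative, not taken from the original program). Unsaved variables declared inside a routine called from the parallel region are private to each calling thread, so they need no private clause, and an unsaved local with an allocatable component starts out unallocated on every call:

```fortran
module sweep_mod
    implicit none

    type cell_type                          ! hypothetical, cut-down version
        integer :: id = 0
        integer, allocatable :: dclist(:)
    end type cell_type

    type(cell_type) :: cellist(100)         ! shared list, only read by the threads

contains

    subroutine sweep_cells(nvisited)
        integer, intent(out) :: nvisited
        type(cell_type) :: cell             ! local, so each calling thread has its own
        integer :: kcell

        nvisited = 0
        do kcell = 1, size(cellist)
            cell = cellist(kcell)           ! cell%dclist had a defined (unallocated) status
            if (cell%id == 0) cycle         ! skip gaps in the list
            nvisited = nvisited + 1
        end do
    end subroutine sweep_cells

end module sweep_mod

program outlined_region
    use sweep_mod
    implicit none
    integer :: n

    cellist(1:10)%id = 1                    ! mark ten cells as live; the rest stay 0

    !$omp parallel private(n)
    call sweep_cells(n)
    !$omp critical
    write(*,*) 'live cells seen by this thread:', n
    !$omp end critical
    !$omp end parallel
end program outlined_region
```

Each thread should report 10 live cells. The only variable left in the private clause is the one used to receive the result.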

gib
New Contributor II

I think we need to see more of the code. The error is not happening within libiomp5 but in your code. The issue is not an "already deallocated" array, for which you'd get a different error, but that an attempt has been made to deallocate a pointer that does not point to something that can be deallocated (an array slice, for example). The only way you'd get this for an allocatable array is data corruption.

Can you identify the exact statement where the error occurs?

There is no allocation/deallocation either in the subroutine containing the parallel section, or within the parallel section (or anything called from there). When I do this in the definition of type cell_type:

! integer(2), allocatable :: dclist(:)
integer(2) :: dclist(2)

the code executes without error. As I mentioned, in the runs I'm testing with the member dclist(:) is never allocated for any of the cells, and, of course, dclist(:) is never used. For these reasons I believe the error is not in my code. I will try to pinpoint the statement generating the error. It is definitely near the '!$omp parallel' statement, since I see the result of writes before this.

I have now turned on traceback, and see that the error occurs in the execution of the statement at line 249:

cell = cellist(kcell)

(note that cellist(:) is an array of objects of type cell_type). I have attached the whole subroutine, and the output. I haven't attached subroutine jumper - I can if you wish, or you can take my word for it that it does not refer to dclist(:). The error is in the first call to subroutine mover, and based on the time, I'm pretty sure it is the first execution of line 249 that causes the error. It's hard to be certain, because uncommenting any of the three write statements makes the error go away.

It is clear to me that the problem is with the member dclist(:) of cell, which as I have said is never allocated or accessed. My question then is this: should it be legal to use an instance of a derived type with an unallocated member in this way?

subroutine mover
    integer :: kpar
    integer :: site1(3), kcell, indx(2), slot, x, xlocal, cnt
    logical :: go
    type(cell_type) :: cell
    integer :: x_lo,x_hi,sweep
    integer :: i, nslice, sum, sump, nsweeps
    integer, save :: xlim(8)
    integer :: xtotal(0:8)
    integer :: xcount(NX)

    if (istep == 1) then    ! must be executed when the blob size changes
        xcount = 0
        do kcell = 1,nlist
            cell = cellist(kcell)
            if (cell%ID == 0) cycle    ! skip gaps in the list
            i = cellist(kcell)%site(1)
            xcount(i) = xcount(i) + 1
        enddo
        nslice = globalvar%NTcells/(2*num_cpu)
        xlim(0) = 1
        sum = 0
        sump = 0
        i = 1
        do x = 1,NX
            sum = sum + xcount(x)
            if (sum > i*nslice) then
                xlim(i) = x
                xtotal(i) = sum - sump
                sump = sum
                write(*,*) 'i,sum: ',i,sum,xtotal(i)
                i = i+1
                if (i == 2*num_cpu) exit
            endif
        enddo
        xlim(2*num_cpu) = NX
        xtotal(2*num_cpu) = globalvar%NTcells - sum
        write(*,*) 'i,sum: ',i,sum,xtotal(i)
    endif

    if (num_cpu > 1) then
!DEC$ IF .NOT. DEFINED (_OPENMP)
        stop
!DEC$ ENDIF
    endif

    if (num_cpu == 1) then
        nsweeps = 1
    else
        nsweeps = 2
    endif

    !write(*,*) 'start sweep loop'
    do sweep = 0,nsweeps-1
        !write(*,*) 'sweep: ',sweep

        !$omp parallel private(kcell,kpar,x_lo,x_hi,cell,site1,xlocal,indx,slot,go,cnt)
        kpar = omp_get_thread_num()
        !write(*,*) 'kpar: ',kpar
        do kcell = 1,nlist
            if (num_cpu == 1) then
                x_lo = 1
                x_hi = NX
            else
                x_lo = xlim(sweep+2*kpar) + 1
                x_hi = xlim(sweep+2*kpar+1)
            endif
            cell = cellist(kcell)
            if (cell%ID == 0) cycle    ! skip gaps in the list
            if (cell%step == istep) cycle
            if (cell%DCbound(1) /= 0 .or. cell%DCbound(2) /= 0) cycle    ! skip cells bound to DC
            site1 = cell%site
            xlocal = site1(1)
            if (xlocal < x_lo .or. xlocal > x_hi) cycle    ! not in the slice for this processor
            indx = occupancy(site1(1),site1(2),site1(3))%indx
            if (indx(1) < 0) then
                write(*,*) 'stage1: OUTSIDE_TAG or DC: ',kcell,site1,indx
                stop
            endif
            if (kcell == indx(1)) then
                slot = 1
            elseif (kcell == indx(2)) then
                slot = 2
            else
                write(*,'(a,7i6)') 'ERROR: stage1: bad indx: ',me,kcell,site1,indx
                stop
            endif
            call jumper(kcell,indx,slot,go,kpar)
            cellist(kcell)%step = istep
        enddo
        !$omp end parallel
    enddo
end subroutine

OUTPUT:

Machine processors: 4
Threads: 4
Threads: 4
Threads: 4
Threads: 4
Read cell parameter file
dc_cummul_prob_nc:
0.07 0.35 0.59 0.76 0.86 0.92 0.96 0.98 0.99 1.00
dc_initial_cummul_prob_nc:
0.01 0.15 0.35 0.55 0.69 0.81 0.90 0.95 0.98 1.00

DELTA_X: 6.31 DCradius: 3.01
Approx nsites: 381702
outfile: 0 test_0.out
random_number seed size: 2
npar, par_jsrseed: 4 7006652 14013304 21019956 28026608
kpar, par_jsr: 0 7006652
kpar, par_jsr: 1 14013304
kpar, par_jsr: 2 21019956
kpar, par_jsr: 3 28026608
noncog_size,cog_size: 1 20
xoffset: 0 100
wx: 100
me,NXX,NXI,pshift: 0 100 99 0
dirprob:
0.650 0.213 0.012 0.012 0.012 0.012 0.012
0.012 0.012 0.012 0.009 0.009 0.009 0.009
0.001 0.001 0.001 0.001 0.001
sum: 0.9999998
LocalCentre: 50.50000 50.50000 50.50000
did array_init
place_cells
nlist,RESIDENCE_TIME: 382336 24.00000
NTcells, NDCalive, NTsites: 382336 0 382336
nlist: 382336
i,sum: 1 48320 48320
i,sum: 2 99148 50828
i,sum: 3 146924 47776
i,sum: 4 197544 50620
i,sum: 5 241592 44048
i,sum: 6 288780 47188
i,sum: 7 338352 49572
i,sum: 8 338352 43984
forrtl: severe (173): A pointer passed to DEALLOCATE points to an array that cannot be deallocated

Image PC Routine Line Source
omp_para.exe 004A9C6A Unknown Unknown Unknown
omp_para.exe 004A7529 Unknown Unknown Unknown
omp_para.exe 004518F5 Unknown Unknown Unknown
omp_para.exe 00441CDC Unknown Unknown Unknown
omp_para.exe 004144DD _OMP_MOTILITY_mp_ 249 omp_motility.f90
libiomp5md.dll 10001175 Unknown Unknown Unknown
libiomp5md.dll 10007422 Unknown Unknown Unknown
libiomp5md.dll 1000759A Unknown Unknown Unknown
libiomp5md.dll 1002949A Unknown Unknown Unknown
kernel32.dll 7C80B683 Unknown Unknown Unknown

gib
New Contributor II


My guess as to what is happening is "cell" is being declared as private for the parallel region and the !$omp statement does not include a copyin for cell. i.e. it instantiates an uninitialized cell. Try adding copyin(cell) to the !$OMP line.

Sometimes I find it easier to make a subroutine call out of the block of code within the parallel region and not pass the private variables (just declare them in the subroutine). This makes the program easier to understand and debug at the minor expense of a subroutine call when compiling for non-OpenMP environment. Often the subroutine call is faster than the jump around code the compiler inserts into the code in order to run it with the appearance of being in-line.

Jim Dempsey

Hi Jim, I'm afraid I don't understand your 'copyin' suggestion, since I've never used that. cell is initialized within the parallel section.

I've implemented your good suggestion of putting the parallel section in a subroutine. This executes without the error (so far), which is good, but doesn't help me to understand why my original code gets this error.

Gib

jimdempseyatthecove
Honored Contributor III

I will see if I can explain the copyin situation (I don't have your code sample available while I write this).

cell is a user-defined type containing (from recollection) "stuff" plus two pointers plus an allocatable array.

When you declare the instance of cell outside the parallel section, the compiler inserts some initialization code to at least indicate that the allocatable array is not allocated, and potentially to indicate that the pointers are not pointing anywhere. This is good up to the point of the !$OMP directive.

At the !$OMP directive you specified that cell was to be a private data structure. !$OMP private data structures are not auto-initialized, so my guess is that cell contains junk data upon entry, and therefore the array may have appeared allocated.

Note that when you allocate an already allocated array, this may (depending on the version of Fortran) be interpreted as a reallocation rather than an error. The reallocation then determined that the heap was corrupted (or will be corrupted on return of junk).

The use of COPYIN (see documentation) is to give the private copy within the !$OMP section the initial value(s) of the variables outside the parallel region (as opposed to junk). In this case, the initial values are the contents of an unallocated array descriptor and unassociated pointers.

The use of COPYIN to fix this problem is not really a good suggestion, because the copy operation may introduce a real copy of array data should your array in cell have been used (allocated) prior to entry into the parallel region, when your real intention is to disregard whatever is in there.

Making the code into a subroutine removes this ambiguity.

Jim Dempsey

gib
New Contributor II

Thanks for the explanation Jim. In fact, if you look at my code you'll see that I use the variable cell before the parallel section (since istep = 1), therefore on entry to that section it contains good data (but an unallocated array).

In the middle of the night I realized that I had an error in my code. I had declared the array xlim(8), when it should be xlim(0:8). To my surprise, fixing this error has not stopped the program in its original form (with the 'private' variable list) from crashing as before. Although I say it with a trepidation born of long experience, it is beginning to look as if this could be a compiler problem. Whatever the cause of the crash, it is highly volatile, since adding a write statement or even changing an existing write statement makes the code run.
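
For reference, a Fortran array declared without an explicit lower bound starts at index 1, so xlim(8) has no element 0 while xlim(0:8) does. A minimal sketch with hypothetical array names a and b:

```fortran
program bounds_demo
    implicit none
    integer :: a(8)      ! valid indices 1..8 (eight elements)
    integer :: b(0:8)    ! valid indices 0..8 (nine elements)

    ! lbound/ubound report the declared index range of each array
    write(*,*) 'a: lower =', lbound(a,1), ' upper =', ubound(a,1), ' size =', size(a)
    write(*,*) 'b: lower =', lbound(b,1), ' upper =', ubound(b,1), ' size =', size(b)
end program bounds_demo
```

This reports bounds 1 to 8 (size 8) for a and 0 to 8 (size 9) for b. Intel Fortran's run-time bounds checking (/check:bounds on Windows) is one way to catch a write such as xlim(0) = 1 against the first declaration.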

gib
New Contributor II

Steve, another bit of information. If I turn off optimization the program executes OK. 'Maximize speed' makes it crash.

Should I file a bug report, or do you still think the problem is in my code?

jimdempseyatthecove
Honored Contributor III

>>I have now turned on traceback, and see that the error occurs in the execution of the statement at line 249

>>cell = cellist(kcell)

Make cell a pointer to cell_type and use

cell => cellist(kcell)

This will require cellist to have the TARGET attribute.

Jim Dempsey
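
A minimal sketch of the pointer variant, assuming a cut-down, hypothetical cell_type and a small cellist array (the sizes and the nlive counter are illustrative only). Pointer assignment makes cell an alias for the list element, so no derived-type copy is made and the dclist component of a private variable is never touched:

```fortran
program pointer_alias
    implicit none

    type cell_type                          ! hypothetical, cut-down version
        integer :: id = 0
        integer, allocatable :: dclist(:)
    end type cell_type

    type(cell_type), target  :: cellist(100)   ! TARGET is required for pointer association
    type(cell_type), pointer :: cell
    integer :: kcell, nlive

    cellist(1:10)%id = 1                    ! mark ten cells as live; the rest stay 0
    nlive = 0

    !$omp parallel private(kcell, cell) reduction(+:nlive)
    do kcell = 1, size(cellist)
        cell => cellist(kcell)              ! alias only: no copy, no (de)allocation
        if (cell%id == 0) cycle             ! skip gaps in the list
        nlive = nlive + 1
    end do
    !$omp end parallel

    write(*,*) 'live cells counted, summed over threads:', nlive
end program pointer_alias
```

One design consequence: any write through cell now modifies the shared cellist element directly, so code that relied on cell being a scratch copy would need reviewing.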

Steven_L_Intel1
Employee
Quoting - gib

Steve, another bit of information. If I turn off optimization the program executes OK. 'Maximize speed' makes it crash.

Should I file a bug report, or do you still think the problem is in my code?

There's no evidence either way at present. Optimizations can reveal coding errors that may not appear at lower optimization levels.

Can you identify an operation that is executing incorrectly?

IanH
Honored Contributor III
Quoting - gib

Thanks for the explanation Jim. In fact, if you look at my code you'll see that I use the variable cell before the parallel section (since istep = 1), therefore on entry to that section it contains good data (but an unallocated array).

The problem may still lie elsewhere, but to repeat and expand on what Jim has already said...

Because cell appears in a PRIVATE clause, cell within the !$OMP section refers to different storage (several different bits of storage in fact - one for each thread) than cell refers to outside of the !$OMP section. The compiler does not initialise each thread's private copy of cell from the cell outside the !$OMP section unless you ask it to - this is the missing so-called copy-in. So cell, immediately after entry to the parallel region, doesn't contain good data - just garbage. For the ordinary non-allocatable components that's not a problem, because the first thing that you do in the parallel region is assign to them, but...

The thinking is that without initialisation of the private copy of cell, the allocation status of the dclist component is also garbage. Under Fortran 2003 (and in recent versions of IVF with /assume:realloc_lhs) assignment to an allocatable can cause reallocation if the size or allocation status of the left-hand side array is not suitable. In this case, having garbage in the allocation status of the left-hand dclist is likely to cause the problems that you see when the dclist component is being assigned as part of the parent object's assignment (cell = cellist(kcell)).

If cell were in a FIRSTPRIVATE clause, then each thread's private copy of cell would be initialised from the cell outside the !$OMP section. This obviously incurs some overhead. Try it and see whether the problem goes away.

IanH
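
The assignment semantics described above can be seen without OpenMP. In Fortran 2003 intrinsic assignment of a derived type, an allocatable component of the left-hand side is deallocated when the corresponding right-hand-side component is unallocated, which is how a statement such as cell = cellist(kcell) can reach a deallocation even though the source contains no DEALLOCATE statement. A minimal sketch, using a cut-down, hypothetical cell_type and the illustrative names src and dst:

```fortran
program component_assignment
    implicit none

    type cell_type                          ! hypothetical, cut-down version
        integer :: id = 0
        integer, allocatable :: dclist(:)
    end type cell_type

    type(cell_type) :: src, dst             ! src%dclist is never allocated

    allocate(dst%dclist(3))
    dst%dclist = [1, 2, 3]
    write(*,*) 'before: allocated(dst%dclist) = ', allocated(dst%dclist)

    dst = src                               ! implicitly deallocates dst%dclist
    write(*,*) 'after:  allocated(dst%dclist) = ', allocated(dst%dclist)
end program component_assignment
```

A standard-conforming compiler should print T before the assignment and F after it. If the left-hand side's allocation status were garbage rather than a valid descriptor, that implicit deallocation is the step that would go wrong.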

gib
New Contributor II

Thanks Ian. This all makes sense, I think. But cell is used before the parallel section, and when I add copyin(cell) as Jim suggested, like this:

!$omp parallel private(kcell,kpar,x_lo,x_hi,cell,site1,xlocal,indx,slot,go,cnt) copyin(cell)

the error still occurs.

On the other hand, when I used FIRSTPRIVATE like this:

!$omp parallel private(kcell,kpar,x_lo,x_hi,site1,xlocal,indx,slot,go,cnt) firstprivate(cell)

the error goes away. Is this doing something different from copyin()? I thought copyin(cell) would also initialize cell from the previous value.
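
On the copyin versus firstprivate question: in the OpenMP specification, copyin applies to THREADPRIVATE variables (it copies the master thread's value into each thread's threadprivate copy), whereas firstprivate initializes each thread's private copy of an ordinary variable from the original. A minimal sketch of both, assuming a hypothetical item_type standing in for cell_type and a hypothetical threadprivate module variable seed:

```fortran
module thread_state
    implicit none
    integer :: seed = 0                     ! hypothetical per-thread state
    !$omp threadprivate(seed)
end module thread_state

program copyin_vs_firstprivate
    use omp_lib
    use thread_state
    implicit none

    type item_type                          ! hypothetical stand-in for cell_type
        integer :: id = 0
        integer, allocatable :: list(:)
    end type item_type

    type(item_type) :: item
    integer :: tid

    item%id = 42                            ! item%list is left unallocated
    seed = 7                                ! sets the master thread's threadprivate copy

    ! firstprivate(item): each private item starts as a copy of the outer item.
    ! copyin(seed): each thread's threadprivate seed is set from the master's value.
    !$omp parallel num_threads(2) private(tid) firstprivate(item) copyin(seed)
    tid = omp_get_thread_num()
    !$omp critical
    write(*,*) 'thread', tid, ': id =', item%id, &
               ' list allocated = ', allocated(item%list), ' seed =', seed
    !$omp end critical
    !$omp end parallel
end program copyin_vs_firstprivate
```

With a conforming compiler each thread should report id 42, an unallocated list, and seed 7. Since cell in the original subroutine is an ordinary local variable rather than a threadprivate one, firstprivate is the clause that matches the intent described in this thread.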

gib
New Contributor II

>>I have now turned on traceback, and see that the error occurs in the execution of the statement at line 249

>>cell = cellist(kcell)

Make cell a pointer to cell_type and use

cell => cellist(kcell)

This will require cellist to have the TARGET attribute.

Jim Dempsey

Making cell a pointer eliminates the error. As I mentioned in a response to IanH, copyin(cell) didn't help but firstprivate(cell) did, although I don't understand why they have different effects.

Gib

gib
New Contributor II

There's no evidence either way at present. Optimizations can reveal coding errors that may not appear at lower optimization levels.

Can you identify an operation that is executing incorrectly?

In case anyone is interested, I have made a small example that demonstrates the error. I compile this with

ifort /Qopenmp test_omp.f90

If any one of the components of cell_type is commented out, the program runs without error, otherwise it fails with the message about an array that cannot be deallocated.

I'm not sure whether or not this qualifies as a compiler problem.

test_omp.f90

program test_omp

    type cell_type
        integer :: site(3)
        integer :: step
        integer(2) :: lastdir
        integer(2) :: DCbound(2)
        real :: unbindtime(2)
        integer :: visits,revisits,ndclist
        integer, allocatable :: dclist(:)
    end type

    integer, parameter :: nlist = 1000
    type(cell_type) :: cell_array(nlist)
    type(cell_type) :: cell
    integer :: k

    call omp_set_num_threads(4)
    cell = cell_array(1)
    !$omp parallel private(k,cell)
    do k = 1,nlist
        cell = cell_array(k)
    enddo
    !$omp end parallel
end

gib
New Contributor II

I just installed ifort 11.0.066 (previously I was using the beta version), and now I find that both the simple test program and my original code execute correctly. It appears that perhaps there was a compiler bug in the beta version that has now been fixed. Thanks to all who responded to my question.

Gib
