Intel® Fortran Compiler
Build applications that can scale for the future with optimized code designed for Intel® Xeon® and compatible processors.

Unable to deallocate allocatable array in openmp

Han_L_
Beginner
710 Views

Hi, guys. I'm a new hand of openmp. I encounter some problems related to deallocating allocatable array in openmp loop.

The main reason is the array is too large. An example code is following:

   program omptest
   use omp_lib
   implicit none
   INTEGER :: I,k,J
   integer,pointer :: b(:)
   

   !$OMP PARALLEL DO PRIVATE(i,b,k) 
   DO I=1,2490000
      allocate(b(67214259))
      k=OMP_get_thread_num()
      deallocate(b)
   END DO
   !$OMP END PARALLEL DO 
   
   
   END PROGRAM

When the maximum bound of i is large, like 2490000, the program will break at deallocate(b)

I have also tried this version:(to make b as a threadprivate variable) but it is still useless

   module da
   implicit none
   integer,allocatable,save :: b(:)
   !$omp threadprivate(a,b)
   contains
    subroutine ji(i)
    integer :: i
    allocate(b(672142590))
    end subroutine
   end module
   
   program omptest
   use omp_lib
   use da
   implicit none
   INTEGER :: I,k,J
   

   !$OMP PARALLEL DO PRIVATE(i,k) 
   DO I=1,2490000
      call ji(i)
      k=OMP_get_thread_num()
      deallocate(b)
   END DO
   !$OMP END PARALLEL DO 

I guess it might because of the size of heap. I have tried to divide I into 10 parts, and makes parallel computing separately( serially) but it is still no use.

Could any body help me to find out the solution? Thank you very much!

My development environment is VS2012 update 4+Intel cluster studio xe 2013 sp1

0 Kudos
13 Replies
Han_L_
Beginner
710 Views

My original code is below:

The program will crash when the DEALLOCATE(ITSC,IPTSC,NTSC)  in subroutine TRANSIENT_SUBCELL_DEL is run.

But what is mysterious is that the crash happens in some conditions.

Otherwise, the program run normally in serial mode.

 

OpenMP loop

      SUBROUTINE COLLISION
      !
      !--calculate the collisions
      !

      USE MOLECS
      USE CELLINFO
      USE GAS
      USE TRANSC
      USE SAMPLES
      USE CALC
      IMPLICIT NONE

      INTEGER :: I
      INTEGER :: J,K,NSEL,M1,M2,MPOS,MT
      REAL*8 :: DTC,ASEL,SEPSMIN,SEPS,RANF
      INTEGER :: ISM
      !--I,J,K,L,M,N---loop number
      !--II,JJ,KK------loop control
      !---ISM----------Select method---0 for transient sub cell 1 for virtual sub cell
      !---M1-----------selected molecule 1
      !---M2-----------selected molecule 2
      !---MT-----------temporal selected molecule
      !---MPOS---------selected molecule position in ICREF
      !---SEPS---------square of current collision pair separation
      !---SEPSMIN------square of current minimum molecule separation
      
      !IPCP=0 this is done in molecules move

      ALLOCATE(IPCP(NM))
      IPCP=0

      !$OMP PARALLEL DEFAULT(SHARED) 
      !$OMP DO SCHEDULE(STATIC) PRIVATE(I,J,K,DTC,ISM,ASEL,NSEL,MPOS,M1,M2,SEPS,SEPSMIN,MT,RANF) REDUCTION(+:TOTCOL,TCOL,COLLS,CLSEP)
      DO I=1,NCCELLS
         IF (FTIME-CCELL(5,I) > CCELL(3,I)) THEN
            DTC=2.0D00*CCELL(3,I)

            IF(ICCELL(2,I) .GT. 1) CCELL(5,I)=CCELL(5,I)+DTC

            IF( ICCELL(2,I) .GT. 30)THEN
               ISM=0!transient sub cell
            ELSE
               ISM=1!virtual sub cell
            END IF
            !NTC method
            ASEL=0.5D00*ICCELL(2,I)*(ICCELL(2,I)-1)*&
               &FNUM*CCELL(4,I)*DTC/CCELL(1,I)+CCELL(2,I)
            IF(ASEL .LT. 0) ASEL=0.0D00
            NSEL=ASEL
            CCELL(2,I)=ASEL-NSEL

            IF( (NSEL .GT. 0) .AND. (ICCELL(2,I) .GT. 1))THEN
               IF(ISM .EQ. 0)THEN
                  CALL TRANSIENT_SUBCELL_GEN(I)
                  DO J=1,NSEL
                     !select first molecule
                     CALL RANDOM_NUMBER(RANF)
                     MPOS=INT(RANF*DFLOAT(ICCELL(2,I))-0.0001)+ICCELL(1,I)+1
                     M1=ICREF(MPOS)
                     !Select second molecule
                     CALL TRANSIENT_SUBCELL_SELECT(M1,M2)
                     !Collision
                     CALL MOLECULES_COL(M1,M2)
                  END DO
                  CALL TRANSIENT_SUBCELL_DEL
               ELSE
                  DO J=1,NSEL
                     !select first molecule
                     CALL RANDOM_NUMBER(RANF)
                     MPOS=INT(RANF*DFLOAT(ICCELL(2,I))-0.0001)+ICCELL(1,I)+1
                     M1=ICREF(MPOS)
                     !select second molecule
                     SEPSMIN=(MAXCELL(2)*0.5D00)**2.0D00
                     DO K=1,ICCELL(2,I)
                        MPOS=ICCELL(1,I)+K
                        MT=ICREF(MPOS)
                        IF((MT .NE. M1) .AND. (MT .NE. IPCP(M1)))THEN
                           SEPS=(PX(MT)-PX(M1))**2.0D00+(PY(MT)-PY(M1))**2.0D00
                           IF( SEPS .LT. SEPSMIN)THEN
                              M2=MT
                              SEPSMIN=SEPS
                           END IF
                        END IF
                     END DO
                     !collision
                     CALL MOLECULES_COL(M1,M2)
                  END DO
               END IF
            END IF   !end nsel possible
         END IF
      END DO
      !$OMP END DO
      !$OMP END PARALLEL
      DEALLOCATE(IPCP)
      END SUBROUTINE

THE BREAK POINT

        MODULE TRANSC
        IMPLICIT NONE
        !only used in subroutine collision and TRANSIENT_SUBCELL for transient cell
        INTEGER,SAVE :: NITSC,NDIVX,NDIVY
        INTEGER,DIMENSION(:,:,:),ALLOCATABLE,SAVE:: ITSC
        INTEGER,DIMENSION(:,:),ALLOCATABLE,SAVE :: IPTSC,NTSC
        !$OMP THREADPRIVATE(NITSC,ITSC,IPTSC,NTSC,NDIVX,NDIVY)


        !---NTSC(Y,X)----------number of molecule in (X,Y) transient sub cell
        !---ITSC(I,Y,X)---- transient sub cell index
        !--------X----the index in x direction
        !--------Y----the index in y direction
        !--------I----the code of molecule in transient sub cell
        !-------------I=1,NITSC
        !---IPTSC(I,M)-----molecule to transient cell
        !-------I=1----x index of molecule M
        !-------I=2----y index of molecule M
        !-------I=3----index of molecule in transient cell
        !---NDIVX--------number of transient sub cell in x direction
        !---NDIVY--------number of transient sub cell in y direction
        CONTAINS
            SUBROUTINE TRANSIENT_SUBCELL_GEN(I)
            USE NODEINFO
            USE CELLINFO
            USE MOLECS


            IMPLICIT NONE

            INTEGER :: I
            INTEGER :: II,K,L,J,XC,YC
            INTEGER,DIMENSION(:,:,:),ALLOCATABLE :: ITSC_BACKUP
            REAL*8 :: x1,x2,x3,y1,y2,y3
            REAL*8 :: x10,x20,x30,y10,y20,y30
            REAL*8 :: XI,YI,X,Y,DX,DY
            !---I---------------code of collision cell
            !---II--------------code of sampling cell
            !---NDIVX-----------number of transient sub cell in x direction
            !---NDIVY-----------number of transient sub cell in y direction
            !---ITSC_BACKUP-----BACKUP OF ITSC,in case the I of ITSC is not enough
            !---XI,YI-----------the minimum boundary of transient sub cell
            !---X ,Y -----------the maximum boundary of transient sub cell
            II=ICCELL(3,I)
            !---x1,y1,x2,y2,x3,y3-----the coordinate of the cell,clockwise
            x10=NODES(1,ICELL(1,II))
            x20=NODES(1,ICELL(2,II))
            x30=NODES(1,ICELL(3,II))
            y10=NODES(2,ICELL(1,II))
            y20=NODES(2,ICELL(2,II))
            y30=NODES(2,ICELL(3,II))
            !---x1,y1,x2,y2,x3,y3-----the coordinate of the middle collision cell
            x1=(x20+x30)*0.5D00
            x2=(x10+x30)*0.5D00
            x3=(x10+x20)*0.5D00
            y1=(y20+y30)*0.5D00
            y2=(y10+y30)*0.5D00
            y3=(y10+y20)*0.5D00
            SELECT CASE(I-ICELL(8,II))
            CASE(1)
                x1=x10; y1=y10
            CASE(2)
                x2=x20; y2=y20
            CASE(3)
                x3=x30; y3=y30
            END SELECT

            XI=DMIN1(x1,x2,x3); YI=DMIN1(y1,y2,y3)
            X=DMAX1(x1,x2,x3) ; Y=DMAX1(y1,y2,y3)

            !  for triangular cell, num_molecule<0.5*ndivx*ndivy
            !  this is used to ensure there is nearly a molecule in every transient sub cell

            !  the following aims to make DX=DY
            DX=DSQRT((X-XI)*(Y-YI)/(2.0D00*DFLOAT(ICCELL(2,I))))
            DY=DX
            NDIVX=(X-XI)/DX
            DX=(X-XI)/DFLOAT(NDIVX)
            NDIVY=(Y-YI)/DY
            DY=(Y-YI)/DFLOAT(NDIVY)

            NITSC=2
            ALLOCATE(ITSC(NITSC,NDIVY,NDIVX),NTSC(NDIVY,NDIVX),IPTSC(3,NM))
            ITSC=0; NTSC=0
            L=ICCELL(1,I)
            DO J=1+L,ICCELL(2,I)+L
                K=ICREF(J)  !K: code of molecule
                XC=INT((PX(K)-XI-DX*1D-3)/DX)+1  !-DX*1D-3 aims to obviate PX(K)=X
                YC=INT((PY(K)-YI-DY*1D-3)/DY)+1
                IPTSC(1,K)=XC
                IPTSC(2,K)=YC
                NTSC(YC,XC)=NTSC(YC,XC)+1
                IF(NTSC(YC,XC) .GT. NITSC)THEN
                    !increase the index code of molecule in the transient sub cell
                    ALLOCATE(ITSC_BACKUP(NITSC,NDIVY,NDIVX))
                    ITSC_BACKUP=ITSC
                    DEALLOCATE(ITSC)
                    NITSC=NITSC+1
                    ALLOCATE(ITSC(NITSC,NDIVY,NDIVX))
                    ITSC(NITSC:NITSC,:,:)=0
                    ITSC(1:NITSC-1,:,:)=ITSC_BACKUP
                    DEALLOCATE(ITSC_BACKUP)
                END IF
                ITSC(NTSC(YC,XC),YC,XC)=K
                IPTSC(3,K)=NTSC(YC,XC)
            END DO
            RETURN
            END SUBROUTINE
        
            SUBROUTINE TRANSIENT_SUBCELL_SELECT(M1,M2)
            USE MOLECS
            USE CELLINFO

            IMPLICIT NONE

            INTEGER :: M1,M2
            !----M1-----already selected molecule for collsion
            !----M2-----the molecule to be selected
            !----DEL1---the first deleted molecule
            !----DEL2---the second deleted molecule
            INTEGER :: I,J,K,L,JJ,JEND,X,Y,TSCLINK_LEN
            INTEGER :: M1TX,M1TY,MPTX,MPTY
            INTEGER,ALLOCATABLE :: TSCLINK(:)

            REAL*8 :: RANF
            INTEGER :: MP,M1SC,MPSC
            !----I,J,K------------loop number
            !----JJ,JEND----loop control
            !----M1TX,M1TY--------molecule 1 transient sub cell X,Y coordinate
            !----M2T--------------temporal molecule 2
            !----X,Y--------------current
            !----TSCLINK----------the link of transient sub cells which has molecule
            !----TSCLINK_LEN------link length
            !-------------------------------------------------------------------------
            !----NTSC_BACKUP,ITSC_BACKUP is used to back up NTSC,ITSC
            !----NUM_DEL---------if M1 and IPCP(M1) in the same transient sub cell =1 else =2



            !----Delete M1 and IPCP(M1) from transient sub cell grid
            M1TX=IPTSC(1,M1)
            M1TY=IPTSC(2,M1)
            M1SC=IPTSC(3,M1)
            
            MP=IPCP(M1)
            MPTX=0;MPTY=0;MPSC=0
            IF(MP .NE. 0)THEN
                MPTX=IPTSC(1,MP)
                MPTY=IPTSC(2,MP)
                MPSC=IPTSC(3,MP)
            END IF
            
          

            !-----start to select molecule 2
            M2=0
            JJ=0  !layer of transient sub cell molecules from which molecule 2 is chosen
            DO WHILE ( M2 .EQ. 0)
                JEND=8*JJ
                IF(JJ .EQ. 0) JEND=1
                ALLOCATE(TSCLINK(JEND))
                TSCLINK=0
                TSCLINK_LEN=0
                !----loop between the transient sub cells in the same layer in order to create TSCLINK
                IF( JJ .EQ. 0)THEN
                    IF(ITSC(1,M1TY,M1TX) .NE. 0)THEN
                        TSCLINK_LEN=1
                        TSCLINK(1)=1
                    END IF
                ELSE
                    DO J=1,JEND
                        !------find X,Y index for transient sub cell
                        IF( J .LE. JJ*2+1)THEN
                            X=M1TX-JJ+J-1
                            Y=M1TY-JJ
                        ELSE IF( J .LE. JJ*4+1)THEN
                            X=M1TX+JJ
                            Y=M1TY-3*JJ+J-1
                        ELSE IF( J .LE. JJ*6+1) THEN
                            X=M1TX+5*JJ-J+1
                            Y=M1TY+JJ
                        ELSE
                            X=M1TX-JJ
                            Y=M1TY+7*JJ-J+1
                        END IF
                        !-------------------------
                        IF( (Y>= 1) .AND. (X>=1) .AND. (Y<=NDIVY).AND. (X<=NDIVX))THEN
                            IF(ITSC(1,Y,X) .NE. 0)THEN
                                TSCLINK_LEN=TSCLINK_LEN+1
                                TSCLINK(TSCLINK_LEN)=J
                            END IF
                        END IF
                    END DO
                END IF

                !--------select second molecule
                IF( TSCLINK_LEN .EQ. 0)THEN
                    JJ=JJ+1   !no molecule in this layer, go to next layer
                    DEALLOCATE(TSCLINK)
                ELSE
                    CALL RANDOM_NUMBER(RANF)
                    L=INT(TSCLINK_LEN*RANF-0.0001)+1 !code in link
800                 J=TSCLINK(L)                     !code of transient sub cell
                    IF( J .LE. JJ*2+1)THEN
                        X=M1TX-JJ+J-1
                        Y=M1TY-JJ
                    ELSE IF( J .LE. JJ*4+1)THEN
                        X=M1TX+JJ
                        Y=M1TY-3*JJ+J-1
                    ELSE IF( J .LE. JJ*6+1) THEN
                        X=M1TX+5*JJ-J+1
                        Y=M1TY+JJ
                    ELSE
                        X=M1TX-JJ
                        Y=M1TY+7*JJ-J+1
                    END IF
                    
                    IF( JJ .EQ. 0)THEN
                       IF( NTSC(Y,X) .EQ. 2)THEN
                          IF(MPTX .NE. M1TX)THEN
                             M2=ITSC(3-M1SC,M1TY,M1TX)
                          ELSE
                             JJ=JJ+1   !no molecule in this layer, go to next layer
                             DEALLOCATE(TSCLINK)
                          END IF
                       ELSE IF( NTSC(Y,X) .GT. 2)THEN
                          DO WHILE(M2 .EQ. 0)
                             CALL RANDOM_NUMBER(RANF)
                             K=INT(NTSC(Y,X)*RANF-0.0001)+1  !code in transient sub cell
                             M2=ITSC(K,Y,X)                !code of molecule
                             IF((M2 .EQ. M1) .OR. (M2 .EQ. MP))THEN
                                M2=0
                             END IF
                          END DO
                       ELSE
                          JJ=JJ+1   !no molecule in this layer, go to next layer
                          DEALLOCATE(TSCLINK)
                       END IF
                    ELSE
                       K=INT(NTSC(Y,X)*RANF-0.0001)+1  !code in transient sub cell
                       M2=ITSC(K,Y,X)                !code of molecule
                       IF( M2 .EQ. MP)THEN
                          M2=0
                          IF(TSCLINK_LEN .NE. 1)THEN
                             L=L+1
                             IF(L .GT. TSCLINK_LEN)THEN
                                L=L-2
                             END IF
                             GOTO 800
                          END IF
                          JJ=JJ+1   !no molecule in this layer, go to next layer
                          DEALLOCATE(TSCLINK)
                       END IF
                    END IF     
                END IF
            END DO

       
            RETURN
            END SUBROUTINE

            SUBROUTINE TRANSIENT_SUBCELL_DEL
            IMPLICIT NONE
            DEALLOCATE(ITSC,IPTSC,NTSC)

            NITSC=0
            RETURN
            END SUBROUTINE
        END MODULE

 

0 Kudos
jimdempseyatthecove
Honored Contributor III
710 Views

What happens with

SUBROUTINE TRANSIENT_SUBCELL_DEL
    IMPLICIT NONE
    if(allocated(ITSC)) then
        dealocate(ITSC)
    else
        print *,"ITSC not allocated"
    endif
    if(allocated(IPTSC)) then
        dealocate(IPTSC)
    else
        print *,"IPTSC not allocated"
    endif
    if(allocated(NTSC)) then
        dealocate(NTSC)
    else
        print *,"NTSC not allocated"
    endif
    NITSC=0
    RETURN
END SUBROUTINE

Jim Dempsey

0 Kudos
jimdempseyatthecove
Honored Contributor III
710 Views

In response to your private message.

Try to reduce the number of allocate/deallocate


!$omp parallel ...
...

if(allocated(YourThreadPrivateArray)) then
  if(size(YourThreadPrivateArray) .lt. SizeYouWant) then
    deallocate(YourThreadPrivateArray)
    allocate(YourThreadPrivateArray(SizeYouWant), STAT=ISTAT)
    if(ISTAT != 0) stop ! handle error here
  endif
else
  allocate(YourThreadPrivateArray(SizeYouWant), STAT=ISTAT)
  if(ISTAT != 0) stop ! handle error here
endif
...
!$omp end parallel ...


Jim Dempsey

 

0 Kudos
Steven_L_Intel1
Employee
710 Views

I removed my earlier reply about the stack overflow - that's not what's happening...

0 Kudos
jimdempseyatthecove
Honored Contributor III
710 Views

If the problem is stack overflow (good possibility) then the error should not have occurred on the DEALLOCATE, rather it would be expected on a statement that creates a temporary array (when not needed) such as on

ITSC(1:NITSC-1,:,:)=ITSC_BACKUP

Since the DEALLOCATE immediately follows the above statement, it could vary well be a mis-generated line number in the error report.

Note, for a long time IVF has been noted for creating array temporaries when not necessary.

*** in the case above case, the temporary is likely created because the first index was extended, and your copy operation is performed in chunks. In the above case (at least until array temporary is eliminated) try:
 

! ITSC(1:NITSC-1,:,:)=ITSC_BACKUP
DO I=1,SIZE(ITSC, DIM=2)
  DO J=1,SIZE(ITSC, DIM=3)
    ITSC(1:NITSC-1,I,J)=ITSC_BACKUP(1:NITSC-1,I,J)
  END DO
END DO

(Change I and J accordingly)

Jim Dempsey

 

0 Kudos
Steven_L_Intel1
Employee
710 Views

The error really is happening in the DEALLOCATE. It's an access violation in the library routine. I'm asking the developers to check it out.

0 Kudos
Steven_L_Intel1
Employee
710 Views

What's really maddening is that the error comes and goes randomly. A common problem with parallel applications...

0 Kudos
Han_L_
Beginner
710 Views

jimdempseyatthecove wrote:

What happens with

SUBROUTINE TRANSIENT_SUBCELL_DEL
    IMPLICIT NONE
    if(allocated(ITSC)) then
        dealocate(ITSC)
    else
        print *,"ITSC not allocated"
    endif
    if(allocated(IPTSC)) then
        dealocate(IPTSC)
    else
        print *,"IPTSC not allocated"
    endif
    if(allocated(NTSC)) then
        dealocate(NTSC)
    else
        print *,"NTSC not allocated"
    endif
    NITSC=0
    RETURN
END SUBROUTINE

Jim Dempsey

Dear Jim, I have tried this earlier, but actually, the program crash down without any  warning or information. I have even add some codes to give the array a value, there is no error happening.

0 Kudos
Han_L_
Beginner
710 Views

Steve Lionel (Intel) wrote:

What's really maddening is that the error comes and goes randomly. A common problem with parallel applications...

Dear Steve, I'm happy for your help since I find you are familiar with OpenMP and Fortran. When the program is compiled in release mode with out debug information. It just crashed with out any information. When compiled with debug information and use VS2012 to debug, the program will break at deallocate.

By the way, this problem does not always happen. When the maximum number of index is small, it runs pretty well. I have also tried to increase the reserved heap size in the project property, it seems it could run longer ( I guess it but don't really counts)

0 Kudos
Steven_L_Intel1
Employee
710 Views

I find that the error happens about half the time when I run the program. This complicates trying to find a workaround. I have asked the developers to investigate - issue ID is DPD200257093.

0 Kudos
John_Campbell
New Contributor II
710 Views

Would it be worth trying to remove the allocate out of the !$OMP region?  An approach could be something like below, where the array for each thread is allocated outside the loop and you use the thread number to identify the array section you use:

[fortran]

program omptest
use omp_lib
implicit none
INTEGER :: I,k,J, n
integer,pointer :: b(:,:)

   n = OMP_GET_MAX_THREADS() - 1
   allocate (b (67214259,0:n) )

!$OMP PARALLEL DO PRIVATE (i,k), SHARED (b)
DO I=1,2490000
!
   k=OMP_get_thread_num()
   call use_thread_b ( b(:,k) )
!
END DO
!$OMP END PARALLEL DO


END PROGRAM[/fortran]

0 Kudos
Steven_L_Intel1
Employee
710 Views

Our investigation so far has uncovered a missing synchronization when deallocating large allocations that end up using VirtualAlloc. But fixing that revealed more issues, so investigation continues.

0 Kudos
Steven_L_Intel1
Employee
710 Views

We believe that we've fixed the issues this test program uncovered, both on Windows and Linux, though they were different. Thanks for alerting us to the problem. I expect that the fix will appear in the 15.0 release later this year. On Windows the problem is seen if you do a lot of large allocations and deallocations in a threaded environment.

0 Kudos
Reply