- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
MacOS (High Sierra) with intel compiler version 19.0.2.184 Build 20190117
I checked the documentation. I either missed a limitation or it might be a bug. I know OMP reduction of arrays have evolved, so maybe I missed something. Any feedback is greatly appreciated.
Thanks,
Roy
I could not get the file uploader to work, so here is "test_hcore.F":
C Test example to demonstrate problem with subroutine hcore using C an allocated array with reduction in an openmp parallel section. C This does not reproduce the crash observed in our code, but does C demonstrate a genuine problem. C C Compilation command: C ifort -O3 test_hcore.F -qopenmp -o test_hcore.exe C C Expected results: C ENUCLR = 5*NUMAT*(NUMAT-1)/2 C DBLKH(I) = (NUMAT-1) for all values of I C C Problem: For macos10 (High Sierra) with intel compiler version 19.0.2.184 Build 20190117 C the final elements of DBLKH are zero. The problem likely existed in the 2018 intel compiler. C C This test works as expected for other compilers, including C macos10 (El Capitan) with intel compiler version 15.0.1.108 Build 20141022. C It also works if DBLKH is statically allocated or passed in as a dummy argument, C or if openmp is not used. C PROGRAM TEST IMPLICIT REAL*8 (A-H,O-Z) INTEGER, PARAMETER :: NUMAT=5 DIMENSION NFIRST(NUMAT),NLAST(NUMAT),IBAND(NUMAT),NBAND(NUMAT) DIMENSION E1B(45),E2A(45) ALLOCATABLE :: DBLKH(:) C C EXPLICITLY FILL IN NECESSARY ARRAYS C DO I=1,NUMAT NFIRST(I) = (I-1)*4+1 NLAST(I) = I*4 IBAND(I) = (I-1)*10+1 NBAND(I) = 10 ENDDO !I NDSIZE = 10*NUMAT C ALLOCATE( DBLKH(NDSIZE), STAT=IORR ) IF( IORR.NE.0 ) STOP ENUCLR = 0D0 DBLKH = 0D0 C C$OMP PARALLEL DO SCHEDULE(AUTO) C$OMP& PRIVATE(II,JJ,IA,IB,N1,IBND,ILEN,JA,JB,N2,JBND,JLEN) C$OMP& PRIVATE(E1B,E2A,ENUC) C$OMP& REDUCTION(+:ENUCLR,DBLKH) DO II=2,NUMAT IA = NFIRST(II) IB = NLAST(II) N1 = IB-IA+1 IBND = IBAND(II) ILEN = NBAND(II) DO JJ=1,II-1 JA = NFIRST(JJ) JB = NLAST(JJ) N2 = JB-JA+1 JBND = IBAND(JJ) JLEN = NBAND(JJ) C C REPLACE CALL TO INTEGRALS WITH ARBITRARY FILL VALUES C E1B = 1.D0 E2A = 1.D0 ENUC = 5.D0 C C INCLUDE THE ELECTRON-NUCLEAR ATTRACTION TERMS FOR ATOMS II, THEN JJ (E1B, E2A) C AND THE NUCLEAR-NUCLEAR TERM ENUC. C IF( ILEN.GT.0 )CALL VECADD(ILEN,DBLKH(IBND),E1B,DBLKH(IBND)) IF( JLEN.GT.0 )CALL VECADD(JLEN,DBLKH(JBND),E2A,DBLKH(JBND)) ENUCLR = ENUCLR + ENUC ENDDO !JJ ENDDO !II C$OMP END PARALLEL DO C C PRINT RESULT C WRITE(6,*) 'PRINT DBLKH AT END',ENUCLR WRITE(6,'(6F12.6)') (DBLKH(I),I=1,NDSIZE) WRITE(6,*) 'EXPECTED RESULTS',5.D0*DBLE(NUMAT*(NUMAT-1)/2) WRITE(6,'(6F12.6)') (DBLE(NUMAT-1),I=1,NDSIZE) END C**********************************************************************+ SUBROUTINE VECADD ( N, A, B, C ) IMPLICIT REAL*8 (A-H,O-Z) INTEGER, INTENT(IN) :: N REAL*8, INTENT(IN) :: A(N),B(N) REAL*8, INTENT(OUT) :: C(N) C----------------------------------------------------------------------+ C ADD COMPONENTS OF TWO VECTORS TO PRODUCE A THIRD VECTOR C C(I) = A(I) + B(I) C----------------------------------------------------------------------+ DO I=1,N C(I) = A(I) + B(I) ENDDO !I END
Output:
PRINT DBLKH AT END 50.0000000000000
4.000000 4.000000 4.000000 4.000000 4.000000 4.000000
4.000000 4.000000 4.000000 4.000000 4.000000 4.000000
4.000000 4.000000 4.000000 4.000000 4.000000 4.000000
4.000000 4.000000 4.000000 4.000000 4.000000 4.000000
4.000000 4.000000 4.000000 4.000000 4.000000 4.000000
4.000000 4.000000 4.000000 4.000000 4.000000 4.000000
4.000000 4.000000 4.000000 4.000000 4.000000 4.000000
4.000000 0.000000 0.000000 0.000000 0.000000 0.000000
0.000000 0.000000
EXPECTED RESULTS 50.0000000000000
4.000000 4.000000 4.000000 4.000000 4.000000 4.000000
4.000000 4.000000 4.000000 4.000000 4.000000 4.000000
4.000000 4.000000 4.000000 4.000000 4.000000 4.000000
4.000000 4.000000 4.000000 4.000000 4.000000 4.000000
4.000000 4.000000 4.000000 4.000000 4.000000 4.000000
4.000000 4.000000 4.000000 4.000000 4.000000 4.000000
4.000000 4.000000 4.000000 4.000000 4.000000 4.000000
4.000000 4.000000 4.000000 4.000000 4.000000 4.000000
4.000000 4.000000
Link Copied
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
I think the variables in the inner loop should not be in the PRIVATE statement. I tested it on 2019 version and got your result first but after removing those variables out of PRIVATE statement the problem was rectified.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
If you remove the variables of the inner loop out of the PRIVATE sratement it works fine.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
It is unfortunate that Roy edited the original post to remove the error. After doing so, the comments by Umar are befuddling. In the forum, it is better to leave the bad code, and then post affirmative reply to the others that offer a solution. If you still desire to make the edit to the original (or referenced) post, then please leave an edit trail, such as using the strikethrough (S with bar on tool bar). Doing this will make the useful suggestions have meaning to the later readers of this thread.
Jim Dempsey
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Jim, I did not edit the code, or make the change Umar suggested to the posted code. The only change I made was to move the very last line ("Output") out of the code block. That was before Umar posted.
Umar, thank you for your response. I will test and post.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Umar, I tried but CANNOT reproduce your findings. Could you tell me specifically which variables you took out of the PRIVATE list? In this context, shouldn't they be PRIVATE?
Thanks,
Roy
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
You actually have a programming error
IF( ILEN.GT.0 )CALL VECADD(ILEN,DBLKH(IBND),E1B,DBLKH(IBND))
IF( JLEN.GT.0 )CALL VECADD(JLEN,DBLKH(JBND),E2A,DBLKH(JBND))
...
SUBROUTINE VECADD ( N, A, B, C )
IMPLICIT REAL*8 (A-H,O-Z)
INTEGER, INTENT(IN) :: N
REAL*8, INTENT(IN) :: A(N),B(N)
REAL*8, INTENT(OUT) :: C(N)
In your subroutine, A and C, as called in your loop, are aliases this is in violation to the rules of Fortran.
You can get around this by making a generic VECADD, one that takes 3 arguments (adds 3rd arg to 2nd arg) and one that takes 4 arguments (what you currently have).
Jim Dempsey
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Yes, I see the violation. It is a carryover from Fortran 77 days when we played "fast and loose".
Here is version 2 of the program that simplifies the problem. Changes are given in code.
C Test example to demonstrate problem with subroutine hcore using C an allocated array with reduction in an openmp parallel section. C This does not reproduce the crash observed in our code, but does C demonstrate a genuine problem. C C Compilation command: C ifort -O3 test_hcore.F -qopenmp -warn all -o test_hcore.exe C C Expected results: C ENUCLR = 5*NUMAT*(NUMAT-1)/2 C DBLKH(I) = (NUMAT-1) for all values of I C C Problem: For macos10 (High Sierra) with intel compiler version 19.0.2.184 Build 20190117 C the final elements of DBLKH are zero. The problem likely existed in the 2018 intel compiler. C C This test works as expected for other compilers, including C macos10 (El Capitan) with intel compiler version 15.0.1.108 Build 20141022. C It also works if DBLKH is statically allocated or passed in as a dummy argument, C or if openmp is not used. C C - Avoid VECADD and make add loops explicit. C - Add printing of DBLKH at end of loop II. C - Print the number of threads and the individual thread IDs C - Explicitly set the number of processors C - Simplified by removing NFIRST and NLAST and related indices C - Explicitly define shared/private in Openmp loop C - Change loop structure to simplify contributions to DBLKH. C This makes the problem much clearer! C C * * * NOTE: Eliminating ENUCLR or listing it second instead of first C in the reduction statements seems to work!?!? C C Expected results: C ENUCLR = 5*NUMAT*(NUMAT-1)/2 C DBLKH(I) = NUMAT*(NUMAT-1) for all values of I C PROGRAM TEST IMPLICIT NONE INTEGER, PARAMETER :: NUMAT=5 INTEGER NBAND(NUMAT),I,J,NDSIZE,II,JJ,ILEN,JLEN,IORR INTEGER NPROC,OMP_GET_THREAD_NUM,OMP_GET_NUM_THREADS REAL*8 E1B(45),E2A(45),ENUCLR,ENUC REAL*8, ALLOCATABLE :: DBLKH(:) C C EXPLICITLY FILL IN NECESSARY ARRAYS C DO I=1,NUMAT NBAND(I) = 10 ENDDO !I NDSIZE = 10 C NPROC = MIN(12,NUMAT-1) CALL OMP_SET_NUM_THREADS(NPROC) C ALLOCATE( DBLKH(NDSIZE), STAT=IORR ) IF( IORR.NE.0 ) STOP ENUCLR = 0D0 DBLKH = 0D0 C C$OMP PARALLEL DO SCHEDULE(AUTO) C$OMP& DEFAULT(NONE) SHARED(NBAND,NDSIZE) C$OMP& PRIVATE(II,JJ,ILEN,JLEN,I,J) C$OMP& PRIVATE(E1B,E2A,ENUC) C$OMP& REDUCTION(+:ENUCLR,DBLKH) !<-- THIS FAILS!?!? C#RDD C$OMP& REDUCTION(+:DBLKH,ENUCLR) !<-- THIS WORKS! DO II=2,NUMAT ILEN = NBAND(II) DO JJ=1,II-1 JLEN = NBAND(JJ) C C REPLACE CALL TO INTEGRALS WITH ARBITRARY FILL VALUES C E1B = 1.D0 E2A = 1.D0 ENUC = 5.D0 C C INCLUDE THE ELECTRON-NUCLEAR ATTRACTION TERMS FOR ATOMS II, THEN JJ (E1B, E2A) C AND THE NUCLEAR-NUCLEAR TERM ENUC. C DO I=1,ILEN DBLKH(I) = E1B(I) + DBLKH(I) ENDDO !I DO J=1,JLEN DBLKH(J) = E2A(J) + DBLKH(J) ENDDO !J ENUCLR = ENUCLR + ENUC ENDDO !JJ C WRITE(6,*) 'DBLKH AT END OF LOOP II',II,OMP_GET_THREAD_NUM(), . OMP_GET_NUM_THREADS() WRITE(6,'(6F12.6)') (DBLKH(I),I=1,NDSIZE) C ENDDO !II C$OMP END PARALLEL DO C C PRINT RESULT C WRITE(6,*) 'PRINT DBLKH AT END',ENUCLR WRITE(6,'(6F12.6)') (DBLKH(I),I=1,NDSIZE) WRITE(6,*) 'EXPECTED RESULTS',5.D0*DBLE(NUMAT*(NUMAT-1)/2) WRITE(6,'(6F12.6)') (DBLE(NUMAT*(NUMAT-1)),I=1,NDSIZE) END
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Notice the first seven entries are still zero for each thread. They appeared to be shifted 7. Then when the parallel loop ends they shift back and the seven numbers to the right are zero.
Here is the output:
DBLKH AT END OF LOOP II 3 0 4
DBLKH AT END OF LOOP II 4 1 4
DBLKH AT END OF LOOP II 2 2 4
0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
0.000000 4.000000 4.000000 4.000000
0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
0.000000 2.000000 2.000000 2.000000
0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
0.000000 6.000000 6.000000 6.000000
DBLKH AT END OF LOOP II 5 3 4
0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
0.000000 8.000000 8.000000 8.000000
PRINT DBLKH AT END 50.0000000000000
20.000000 20.000000 20.000000 0.000000 0.000000 0.000000
0.000000 0.000000 0.000000 0.000000
EXPECTED RESULTS 50.0000000000000
20.000000 20.000000 20.000000 20.000000 20.000000 20.000000
20.000000 20.000000 20.000000 20.000000
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Switching the order of the variables in the REDUCTION statement:
064C$OMP& REDUCTION(+:ENUCLR,DBLKH) !<-- THIS FAILS!?!?
065C#RDD C$OMP& REDUCTION(+:DBLKH,ENUCLR) !<-- THIS WORKS!
By removing ENUCLR or changing the order, the code works as expected.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
>>Switching the order of the variables in the REDUCTION statement:
Woah something seriously wrong there.
Great detective work.
Did you submit this reproducer to the Intel support site?
Just ran your code IVF 2019 update 1, Windows x64 Debug mode.
The allocation of DBLKH in the serial section of the code shows DBLKH(1:10).
Running in Debug build with runtime checks (array bounds):
forrtl: severe (408): fort: (11): Subscript #1 of the array DBLKH has value 1
which is less than the lower bound of 8
IOW the array descriptor of the "private/reduction" copy of DBLKH is bonkers.
Jim Dempsey
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
I submitted a support ticket. Jim, thank you for taking the time and effort to help me.
Roy
- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page