Hello everybody,
I am trying to make this section of my code run in parallel:
....
EL=0.0d0
!$OMP parallel DO SHARED(S,COUL) PRIVATE(I1,J1,ID,JD) reduction(+:EL)
DO J1=1,NY
  DO I1=1,NX
    IF ((J1/=J.OR.I1/=I).AND.(J1/=J.OR.I1/=IP(I)).AND.(J1/=J.OR.I1/=IM(I)).AND. &
        (J1/=JP(J).OR.I1/=I).AND.(J1/=JM(J).OR.I1/=I)) THEN
      IF (ABS(FLOAT(I)-FLOAT(I1)) <= ABS(FLOAT(I)+LLEN-FLOAT(I1))) THEN
        ID= INT(ABS(FLOAT(I)-FLOAT(I1)))
      ELSE
        ID= INT(ABS(FLOAT(I)+LLEN-FLOAT(I1)))
      END IF
      IF (ABS(FLOAT(J)-FLOAT(J1)) <= ABS(FLOAT(J)+LLEN-FLOAT(J1))) THEN
        JD= INT(ABS(FLOAT(J)-FLOAT(J1)))
      ELSE
        JD= INT(ABS(FLOAT(J)+LLEN-FLOAT(J1)))
      END IF
      EL= EL + LAMDA*COUL(ID,JD)*dble(S(I1,J1))
      ! Cen(I,J)= Cen(I,J) + LAMDA*COUL(ID,JD)*dble(S(I1,J1))
    END IF
  END DO
END DO
!$OMP END PARALLEL DO
...
where COUL is a matrix determined earlier in the code.
I get no compilation or build errors but at run time the program exits when it enters the parallel loop. It just crashes with no run-time error!
Any ideas?
Thanks,
Marios
Try turning on array subscripting bounds checks.
If nothing is obvious, insert some PRINT statements to trace the progress.
I assume LLEN and LAMDA are defined.
Jim Dempsey
Also,
If this is a release mode issue, then from VS click on
Debug | Start Without Debugging
This is different from Run.
Run closes the CMD window when the program exits, so if errors were displayed, you won't see them.
Start Without Debugging leaves the CMD window open after run. Any error messages displayed in the CMD window can then be read.
Jim Dempsey
I ran it on Linux with ifort:
ifort -O3 -warn all -xSSE4.2 -parallel -par-report[1] -openmp -o run.out Source1.f90
and got the run-time error message:
Segmentation fault (core dumped)
At least now I do get an error message! Any ideas about how to fix it?
Note that I ran the command ulimit -s unlimited prior to compiling.
Marios
You might want to review the SHARED list or as Jim has indicated, consider what to do if ID or JD = 0
[fortran]
EL=0.0d0
!$OMP parallel DO SHARED(LAMDA,COUL,S,LLEN,I,J,IP,IM,JP,JM) PRIVATE(I1,J1,ID,JD) reduction(+:EL)
DO J1=1,NY
DO I1=1,NX
! not sure if this test is sufficient
IF ( (J1/=J .OR.I1/=I) .AND. & ! .not. ( J1==j .and. I1==I )
(J1/=J .OR.I1/=IP(I)) .AND. & ! .not. ( J1==j .and. I1==IP(I) )
(J1/=J .OR.I1/=IM(I)) .AND. & ! .not. ( J1==j .and. I1==IM(I) )
(J1/=JP(J).OR.I1/=I) .AND. & ! .not. ( J1==JP(J) .and. I1==I )
(J1/=JM(J).OR.I1/=I) ) THEN ! .not. ( J1==JM(J) .and. I1==I )
! could be
if ( j1==j .and. (i1==i .or. i1==IP(i) .or. i1==IM(i)) ) cycle
if ( i1==i .and. (j1==j .or. j1==JP(J) .or. j1==JM(j)) ) cycle
IF (ABS(FLOAT(I)-FLOAT(I1)) <= ABS(FLOAT(I)+LLEN-FLOAT(I1))) THEN
ID = INT (ABS(FLOAT(I)-FLOAT(I1)))
ELSE
ID = INT (ABS(FLOAT(I)+LLEN-FLOAT(I1)))
END IF
IF (ABS(FLOAT(J)-FLOAT(J1)) <= ABS(FLOAT(J)+LLEN-FLOAT(J1))) THEN
JD = INT (ABS(FLOAT(J)-FLOAT(J1)))
ELSE
JD = INT (ABS(FLOAT(J)+LLEN-FLOAT(J1)))
END IF
!
! Could be written as
ID = MIN ( ABS(I-I1), ABS(I+LLEN-I1) )
JD = MIN ( ABS(J-J1), ABS(J+LLEN-J1) )
! if (ID==0 .or. JD==0) : what should happen for COUL(ID,JD)? Is index 0 within its bounds?
EL = EL + LAMDA*COUL(ID,JD)*dble(S(I1,J1))
! Cen(I,J)= Cen(I,J) + LAMDA*COUL(ID,JD)*dble(S(I1,J1))
END IF
END DO
END DO
!$OMP END PARALLEL DO
[/fortran]
Do not use ulimit -s unlimited for multi-threaded programs. Pick a reasonable stack size.
Jim Dempsey
Dear John Campbell,
ID and JD can't both be zero, so that's OK. The OpenMP statement is correct; at least I don't get an error message.
The changes you proposed made my code run a bit faster, so thank you!
One note though:
The IF-CYCLE construct should be like this:
IF (J1==J .AND. (I1==I .OR. I1==IP(I) .OR. I1==IM(I))) CYCLE
IF (I1==I .AND. (J1==JP(J) .OR. J1==JM(J))) CYCLE
since I1==I, J1==J is already excluded by the first IF.
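One way to gain confidence that the two-line CYCLE form excludes exactly the same sites as the original five-clause IF is to compare them exhaustively on a small lattice. The following is only a sketch: the 6x6 size and the periodic neighbour tables IP, IM, JP, JM are assumptions built for the test, since the thread never shows how they are defined.

```fortran
program check_exclusion
  implicit none
  ! Hypothetical 6x6 periodic lattice; the neighbour tables below are
  ! assumptions made for this check, not taken from the original program.
  integer, parameter :: nx = 6, ny = 6
  integer :: ip(nx), im(nx), jp(ny), jm(ny)
  integer :: i, j, i1, j1, k, mismatches
  logical :: keep_if, keep_cycle

  do k = 1, nx
    ip(k) = mod(k, nx) + 1            ! next site, wrapping nx -> 1
    im(k) = mod(k + nx - 2, nx) + 1   ! previous site, wrapping 1 -> nx
  end do
  do k = 1, ny
    jp(k) = mod(k, ny) + 1
    jm(k) = mod(k + ny - 2, ny) + 1
  end do

  mismatches = 0
  do j = 1, ny
    do i = 1, nx
      do j1 = 1, ny
        do i1 = 1, nx
          ! Original compound test: the site is kept when all five clauses hold.
          keep_if = (j1/=j .or. i1/=i) .and. (j1/=j .or. i1/=ip(i)) .and. &
                    (j1/=j .or. i1/=im(i)) .and. (j1/=jp(j) .or. i1/=i) .and. &
                    (j1/=jm(j) .or. i1/=i)
          ! CYCLE form: the site is kept when neither CYCLE condition fires.
          keep_cycle = .not. (j1==j .and. (i1==i .or. i1==ip(i) .or. i1==im(i))) .and. &
                       .not. (i1==i .and. (j1==jp(j) .or. j1==jm(j)))
          if (keep_if .neqv. keep_cycle) mismatches = mismatches + 1
        end do
      end do
    end do
  end do

  ! A count of zero means the two forms agree on every (I,J,I1,J1) combination.
  print *, 'mismatches =', mismatches
end program check_exclusion
```

If the real IP, IM, JP, JM tables are built differently, substituting them into the same comparison should give an equally direct check.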
I get a stack overflow message when I execute it in parallel. If I turn on the heap arrays compiler option, the program runs normally, but it's slower than the sequential version. Any ideas about that?
The concern that I had related to the use of COUL(ID,JD) when ID or JD are zero, which depends on how it is declared. To not have a problem, it would need to be something like real COUL(0:md,0:md). ( I am assuming COUL is an array and not a function )
With regard to the !$OMP PARALLEL DO directive, my preference is to explicitly declare all variables as shared or private.
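A minimal sketch of that preference uses the standard OpenMP DEFAULT(NONE) clause, which makes the compiler reject any variable referenced in the region without an explicit data-sharing attribute. The array shapes, values, and the simplified index arithmetic here are placeholders, not the original program's.

```fortran
program explicit_scoping
  implicit none
  ! Hypothetical stand-ins for the arrays in the thread; sizes and values
  ! are assumptions chosen only so the example is self-contained.
  integer, parameter :: nx = 8, ny = 8
  real(8) :: coul(0:nx, 0:ny), lamda, el
  integer :: s(nx, ny), i1, j1, id, jd

  coul = 0.5d0
  s = 1
  lamda = 2.0d0
  el = 0.0d0

  ! DEFAULT(NONE) forces every variable used in the region to appear in a
  ! SHARED, PRIVATE or REDUCTION clause (named constants such as NX and NY
  ! are not variables and need not be listed).
  !$omp parallel do default(none) shared(s, coul, lamda) &
  !$omp private(i1, j1, id, jd) reduction(+:el)
  do j1 = 1, ny
    do i1 = 1, nx
      id = i1 - 1        ! placeholder index arithmetic
      jd = j1 - 1
      el = el + lamda*coul(id, jd)*dble(s(i1, j1))
    end do
  end do
  !$omp end parallel do

  print *, 'EL =', el
end program explicit_scoping
```

The program also compiles without OpenMP enabled, in which case the directives are treated as comments and the loop runs serially.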
Finally, the effectiveness of !$OMP requires that the do loops perform a sufficient amount of computation to overcome the overhead of setting up the threads. The code structure is effectively,
!$OMP parallel DO SHARED(S,COUL) PRIVATE(I1,J1,ID,JD) reduction(+:EL)
DO J1=1,NY
call getavailable thread
call perform the inner loops with allocated thread
END DO ! J1
!$OMP END PARALLEL DO
where each pass through the inner loop is performed by an allocated thread, and all private variables must be allocated for that thread.
This loop is:
DO I1=1,NX
IF ((J1/=J.OR.I1/=I).AND. &
(J1/=J.OR.I1/=IP(I)).AND. &
(J1/=J.OR.I1/=IM(I)).AND. &
(J1/=JP(J).OR.I1/=I).AND. &
(J1/=JM(J).OR.I1/=I)) THEN
IF (ABS(FLOAT(I)-FLOAT(I1)) <= ABS(FLOAT(I)+LLEN-FLOAT(I1))) THEN
ID= INT(ABS(FLOAT(I)-FLOAT(I1)))
ELSE
ID= INT(ABS(FLOAT(I)+LLEN-FLOAT(I1)))
END IF
IF (ABS(FLOAT(J)-FLOAT(J1)) <= ABS(FLOAT(J)+LLEN-FLOAT(J1))) THEN
JD= INT(ABS(FLOAT(J)-FLOAT(J1)))
ELSE
JD= INT(ABS(FLOAT(J)+LLEN-FLOAT(J1)))
END IF
EL= EL + LAMDA*COUL(ID,JD)*dble(S(I1,J1))
! Cen(I,J)= Cen(I,J) + LAMDA*COUL(ID,JD)*dble(S(I1,J1))
END IF
END DO
This is essentially only:
DO I1=1,NX
EL= EL + LAMDA*COUL(ID,JD)*dble(S(I1,J1))
END DO
This loop might be much better vectorised.
If the percentage of IF tests that exclude the computation is very small, then it might be better to replace the IF test by a zero factor in COUL, remove LAMDA from the loop, and take the performance gains from vectorisation, although the use of ID and JD could limit vectorisation.
( could LAMDA*COUL(ID,JD) be converted to a vector coul_jd(1:NX) outside the DO I1 loop then use a dot_product for this loop ? )
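The idea of building a coefficient vector outside the summation and zeroing the excluded sites could look roughly like the sketch below, which compares a branching loop against a DOT_PRODUCT version. Everything not in the thread is an assumption: the 8x8 lattice, integer LLEN, COUL declared with lower bound 0 and filled with arbitrary values, the spin pattern in S, the periodic neighbour tables, and the chosen site (I,J).

```fortran
program dot_product_sketch
  implicit none
  ! All sizes, values and neighbour tables are hypothetical test data.
  integer, parameter :: nx = 8, ny = 8, llen = 8
  real(8), parameter :: lamda = 0.5d0
  real(8) :: coul(0:llen, 0:llen), row(nx), el_loop, el_dot
  integer :: s(nx, ny), ip(nx), im(nx), jp(ny), jm(ny)
  integer :: i, j, i1, j1, id, jd, k

  do k = 1, nx
    ip(k) = mod(k, nx) + 1
    im(k) = mod(k + nx - 2, nx) + 1
  end do
  do k = 1, ny
    jp(k) = mod(k, ny) + 1
    jm(k) = mod(k + ny - 2, ny) + 1
  end do
  do jd = 0, llen
    do id = 0, llen
      coul(id, jd) = 1.0d0/dble(1 + id*id + jd*jd)   ! arbitrary values
    end do
  end do
  do j1 = 1, ny
    do i1 = 1, nx
      s(i1, j1) = 1 - 2*mod(i1 + j1, 2)              ! alternating +1/-1
    end do
  end do
  i = 3
  j = 5

  ! Reference: branching loop with the CYCLE tests from the thread.
  el_loop = 0.0d0
  do j1 = 1, ny
    do i1 = 1, nx
      if (j1==j .and. (i1==i .or. i1==ip(i) .or. i1==im(i))) cycle
      if (i1==i .and. (j1==jp(j) .or. j1==jm(j))) cycle
      id = min(abs(i - i1), abs(i + llen - i1))
      jd = min(abs(j - j1), abs(j + llen - j1))
      el_loop = el_loop + lamda*coul(id, jd)*dble(s(i1, j1))
    end do
  end do

  ! Variant: build LAMDA*COUL(ID,JD) as a vector, zero the excluded sites,
  ! then let DOT_PRODUCT do a branch-free sum over I1.
  el_dot = 0.0d0
  do j1 = 1, ny
    jd = min(abs(j - j1), abs(j + llen - j1))
    do i1 = 1, nx
      id = min(abs(i - i1), abs(i + llen - i1))
      row(i1) = lamda*coul(id, jd)
    end do
    if (j1 == j) then
      row(i) = 0.0d0
      row(ip(i)) = 0.0d0
      row(im(i)) = 0.0d0
    end if
    if (j1 == jp(j) .or. j1 == jm(j)) row(i) = 0.0d0
    el_dot = el_dot + dot_product(row, dble(s(:, j1)))
  end do

  print *, 'loop =', el_loop, ' dot =', el_dot, ' diff =', abs(el_loop - el_dot)
end program dot_product_sketch
```

Whether this is actually faster depends on how often the coefficient vector can be reused; if it has to be rebuilt for every (I,J), the gather through ID and JD may cancel the benefit.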
John
In front of:
EL= EL + LAMDA*COUL(ID,JD)*dble(S(I1,J1))
Insert some asserts to bounds check the arrays.
The compiler has an option to do this; the symptom you were seeing looks like what happens when arrays are indexed out of bounds.
IF (ID .LT. LBOUND(COUL, DIM=1)) PRINT *, "ID .LT. LBOUND(COUL, DIM=1)", ID, LBOUND(COUL, DIM=1)
...
*** Do not assume anything about the bounds and validity of COUL and S ***
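A fuller set of hand-written checks might look like the following sketch, which tests both the lower and upper bound of each subscript before the array reference. The array shapes and the deliberately bad index values are hypothetical, since the thread does not show how COUL and S are declared; the helper IN_BOUNDS is likewise introduced only for illustration.

```fortran
program bounds_assert_demo
  implicit none
  ! Hypothetical declarations: the real shapes of COUL and S are unknown.
  real(8) :: coul(0:4, 0:4)
  integer :: s(5, 5)
  integer :: id, jd, i1, j1
  logical :: ok

  coul = 1.0d0
  s = 1
  ! Deliberately out-of-range values so that the checks fire in this demo.
  id = 5
  jd = 2
  i1 = 3
  j1 = 6

  ok = .true.
  if (.not. in_bounds(id, lbound(coul, 1), ubound(coul, 1))) then
    print *, 'ID out of bounds for COUL dim 1:', id, lbound(coul, 1), ubound(coul, 1)
    ok = .false.
  end if
  if (.not. in_bounds(jd, lbound(coul, 2), ubound(coul, 2))) then
    print *, 'JD out of bounds for COUL dim 2:', jd, lbound(coul, 2), ubound(coul, 2)
    ok = .false.
  end if
  if (.not. in_bounds(i1, lbound(s, 1), ubound(s, 1))) then
    print *, 'I1 out of bounds for S dim 1:', i1, lbound(s, 1), ubound(s, 1)
    ok = .false.
  end if
  if (.not. in_bounds(j1, lbound(s, 2), ubound(s, 2))) then
    print *, 'J1 out of bounds for S dim 2:', j1, lbound(s, 2), ubound(s, 2)
    ok = .false.
  end if

  ! Only touch the arrays once every subscript has been verified.
  if (ok) print *, 'term =', coul(id, jd)*dble(s(i1, j1))

contains

  ! True when LO <= IDX <= HI.
  logical function in_bounds(idx, lo, hi)
    integer, intent(in) :: idx, lo, hi
    in_bounds = (idx >= lo .and. idx <= hi)
  end function in_bounds

end program bounds_assert_demo
```

Inside an OpenMP region the PRINT output from different threads may interleave, so it can help to run with a single thread while hunting for the bad index.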
Also, if COUL and S are DUMMY arguments with explicit shape or assumed size, then ensure that the actual arguments (those of the caller) match the requirements of the DUMMY argument.
If the above does not resolve anything then insert a PRINT in an appropriate place to trace the progress in hope of diagnosing the problem.
Jim Dempsey
