Code A below has 4 OMP TARGET regions.
!$OMP TARGET DEFAULTMAP(present: allocatable)
!$OMP TEAMS DISTRIBUTE PARALLEL DO COLLAPSE(2)
      DO II=1,N_W
        DO K=1,KBM1
          I=NIAGCW(II)
          IF (DUM(NIP1(I)).GT.0.0) THEN
            XMFLUX(NIP1(I),K)=XMFLUX(AIJ(I,6),K)+XMFLUX(AIJ(I,7),K)
     *        +XMFLUX(AIJ(I,8),K)
          ENDIF
        ENDDO
      ENDDO
!$OMP END TARGET
!$OMP TARGET DEFAULTMAP(present: allocatable)
!$OMP TEAMS DISTRIBUTE PARALLEL DO COLLAPSE(2)
      DO II=1,N_E
        DO K=1,KBM1
          I=NIAGCE(II)
          IF (DUM(I).GT.0.0) THEN
            XMFLUX(I,K)=XMFLUX(AIJ(I,9),K)+XMFLUX(AIJ(I,10),K)
     *        +XMFLUX(AIJ(I,11),K)
          ENDIF
        ENDDO
      ENDDO
!$OMP END TARGET
!$OMP TARGET DEFAULTMAP(present: allocatable)
!$OMP TEAMS DISTRIBUTE PARALLEL DO COLLAPSE(2)
      DO II=1,N_N
        DO K=1,KBM1
          I=NIAGCN(II)
          IF (DVM(I).GT.0.0) THEN
            YMFLUX(I,K)=YMFLUX(AIJ(I,9),K)+YMFLUX(AIJ(I,10),K)
     *        +YMFLUX(AIJ(I,11),K)
          ENDIF
        ENDDO
      ENDDO
!$OMP END TARGET
!$OMP TARGET DEFAULTMAP(present: allocatable)
!$OMP TEAMS DISTRIBUTE PARALLEL DO COLLAPSE(2)
      DO II=1,N_S
        DO K=1,KBM1
          I=NIAGCS(II)
          IF (DVM(NJP1(I)).GT.0.0) THEN
            YMFLUX(NJP1(I),K)=YMFLUX(AIJ(I,6),K)+YMFLUX(AIJ(I,7),K)
     *        +YMFLUX(AIJ(I,8),K)
          ENDIF
        ENDDO
      ENDDO
!$OMP END TARGET
To make the computation faster, I merged the four regions into one in Code B below.
      NCOMB_1=N_S
      NCOMB_2=NCOMB_1+N_N
      NCOMB_3=NCOMB_2+N_E
      NCOMB_4=NCOMB_3+N_W
!$OMP TARGET DEFAULTMAP(present: allocatable)
!$OMP TEAMS DISTRIBUTE PARALLEL DO COLLAPSE(2)
      DO II=1,NCOMB_4
        DO K=1,KBM1
          IF (II>NCOMB_3) THEN
            I=NIAGCW(II-NCOMB_3)
            IF (DUM(NIP1(I)).GT.0.0) THEN
              XMFLUX(NIP1(I),K)=XMFLUX(AIJ(I,6),K)+XMFLUX(AIJ(I,7),K)
     *          +XMFLUX(AIJ(I,8),K)
            ENDIF
          ELSEIF (II>NCOMB_2) THEN
            I=NIAGCE(II-NCOMB_2)
            IF (DUM(I).GT.0.0) THEN
              XMFLUX(I,K)=XMFLUX(AIJ(I,9),K)+XMFLUX(AIJ(I,10),K)
     *          +XMFLUX(AIJ(I,11),K)
            ENDIF
          ELSEIF (II>NCOMB_1) THEN
            I=NIAGCN(II-NCOMB_1)
            IF (DVM(I).GT.0.0) THEN
              YMFLUX(I,K)=YMFLUX(AIJ(I,9),K)+YMFLUX(AIJ(I,10),K)
     *          +YMFLUX(AIJ(I,11),K)
            ENDIF
          ELSE
            I=NIAGCS(II)
            IF (DVM(NJP1(I)).GT.0.0) THEN
              YMFLUX(NJP1(I),K)=YMFLUX(AIJ(I,6),K)+YMFLUX(AIJ(I,7),K)
     *          +YMFLUX(AIJ(I,8),K)
            ENDIF
          ENDIF
        ENDDO
      ENDDO
!$OMP END TARGET
Is there a better way to merge them? I would like to change the code as little as possible.
Thanks!
@cu238
If those loops can be executed in parallel, then just open a target teams region once and put a distribute parallel do collapse(2) directive on each of the loop nests. Unlike after a parallel do loop, there is no synchronization after a distribute loop.
!$OMP TARGET TEAMS DEFAULTMAP(present: allocatable)
!$OMP DISTRIBUTE PARALLEL DO COLLAPSE(2)
      DO II=1,N_W
        DO K=1,KBM1
          I=NIAGCW(II)
          IF (DUM(NIP1(I)).GT.0.0) THEN
            XMFLUX(NIP1(I),K)=XMFLUX(AIJ(I,6),K)+XMFLUX(AIJ(I,7),K)
     *        +XMFLUX(AIJ(I,8),K)
          ENDIF
        ENDDO
      ENDDO
!$OMP DISTRIBUTE PARALLEL DO COLLAPSE(2)
      DO II=1,N_E
        DO K=1,KBM1
          I=NIAGCE(II)
          IF (DUM(I).GT.0.0) THEN
            XMFLUX(I,K)=XMFLUX(AIJ(I,9),K)+XMFLUX(AIJ(I,10),K)
     *        +XMFLUX(AIJ(I,11),K)
          ENDIF
        ENDDO
      ENDDO
!$OMP DISTRIBUTE PARALLEL DO COLLAPSE(2)
      DO II=1,N_N
        DO K=1,KBM1
          I=NIAGCN(II)
          IF (DVM(I).GT.0.0) THEN
            YMFLUX(I,K)=YMFLUX(AIJ(I,9),K)+YMFLUX(AIJ(I,10),K)
     *        +YMFLUX(AIJ(I,11),K)
          ENDIF
        ENDDO
      ENDDO
!$OMP DISTRIBUTE PARALLEL DO COLLAPSE(2)
      DO II=1,N_S
        DO K=1,KBM1
          I=NIAGCS(II)
          IF (DVM(NJP1(I)).GT.0.0) THEN
            YMFLUX(NJP1(I),K)=YMFLUX(AIJ(I,6),K)+YMFLUX(AIJ(I,7),K)
     *        +YMFLUX(AIJ(I,8),K)
          ENDIF
        ENDDO
      ENDDO
!$OMP END TARGET TEAMS
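For anyone who wants to try the single-region pattern outside the full model, here is a minimal free-form sketch with two independent loop nests instead of four. The names (merged_region_demo, update_boundaries, flux, idx_w, idx_e, src) are invented for illustration and do not come from the original code. Two things are my own assumptions: the explicit map clauses stand in for defaultmap(present: allocatable) so that the example is self-contained, and private(i) is added because a scalar assigned inside the loop body would otherwise be shared between threads.

```fortran
module merged_region_demo
  implicit none
contains
  ! Hypothetical stand-in for two of the four boundary loops.
  ! Each loop writes only its own boundary rows of flux and reads
  ! only rows that no loop writes, so the two loops are independent.
  subroutine update_boundaries(flux, idx_w, idx_e, src, kbm1)
    real,    intent(inout) :: flux(:,:)
    integer, intent(in)    :: idx_w(:), idx_e(:)
    integer, intent(in)    :: src(:,:)   ! src(i,1:2): rows summed into row i
    integer, intent(in)    :: kbm1
    integer :: ii, k, i, n_w, n_e

    n_w = size(idx_w)
    n_e = size(idx_e)

    ! One target teams region; each loop nest gets its own distribute
    ! construct. Explicit maps are an assumption made for this sketch.
    !$omp target teams map(tofrom: flux) map(to: idx_w, idx_e, src)
    !$omp distribute parallel do collapse(2) private(i)
    do ii = 1, n_w
      do k = 1, kbm1
        i = idx_w(ii)
        flux(i,k) = flux(src(i,1),k) + flux(src(i,2),k)
      end do
    end do
    !$omp end distribute parallel do
    !$omp distribute parallel do collapse(2) private(i)
    do ii = 1, n_e
      do k = 1, kbm1
        i = idx_e(ii)
        flux(i,k) = flux(src(i,1),k) + flux(src(i,2),k)
      end do
    end do
    !$omp end distribute parallel do
    !$omp end target teams
  end subroutine update_boundaries
end module merged_region_demo
```

Without an OpenMP flag the directives are treated as comments and the routine runs serially, which gives a reference result to compare against the offloaded build.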
Question: While I may differ in each of the enclosed loops, will the contents of NIP1(I), AIJ(I,6:11), NJP1(I) in any one loop have the same value as another loop's NIP1(I), AIJ(I,6:11), NJP1(I)?
If yes, then you will have loop order dependencies.
Jim Dempsey
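One way to answer this question with data rather than by inspection is a host-side check that two index lists share no value. The sketch below is only an illustration: the module and function names are hypothetical, and it assumes every index lies in 1..nmax.

```fortran
module index_overlap_check
  implicit none
contains
  ! Returns .true. if any value in a also appears in b.
  ! Assumes all indices lie in 1..nmax (hypothetical helper).
  logical function indices_overlap(a, b, nmax)
    integer, intent(in) :: a(:), b(:), nmax
    logical :: seen(nmax)
    integer :: j

    ! Mark every index that occurs in a.
    seen = .false.
    do j = 1, size(a)
      seen(a(j)) = .true.
    end do

    ! Report an overlap as soon as b hits a marked index.
    indices_overlap = .false.
    do j = 1, size(b)
      if (seen(b(j))) then
        indices_overlap = .true.
        return
      end if
    end do
  end function indices_overlap
end module index_overlap_check
```

For the loops in this thread, a would hold the rows one loop writes, for example NIP1(NIAGCW(1:N_W)), and b the rows another loop reads or writes. If the function returns .false. for every pair of loops, the order in which the loops run should not matter.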
They will never have the same value.
Great! It's just what I want. I tested it and the result is correct. Many thanks for your guidance.
If an array is allocated through a pointer, would you still use:
!$OMP TARGET TEAMS DEFAULTMAP(present: allocatable)
or something else?
Jim Dempsey
@jimdempseyatthecove
To be honest, I have never used that. However, the defaultmap clause also lists pointer as a variable category, so I guess it distinguishes between pointer and allocatable. See page 161 of the OpenMP 5.2 specification:
https://www.openmp.org/wp-content/uploads/OpenMP-API-Specification-5-2.pdf
Pages 287-288 describe a pointer that is associated (I assume allocated too), and page 162 has:
The pointer variable-category specifies variables of pointer type.
But the question is: if I use ... defaultmap(present: pointer), and the pointer is associated/allocated, is the pointer itself copied/mapped, or is it what the pointer points to?
Jim Dempsey
@jimdempseyatthecove With present, the construct should fail if the variable is not present in the device data environment; present itself does not do any mapping.
For the mapping of Fortran pointers, I have to check again with the development team. (If you don't need the pointer or allocatable attribute, just hide it from the target region...)
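One possible reading of "hide it from the target region" is to pass the array to a routine whose dummy argument has neither the pointer nor the allocatable attribute. This is an interpretation, not something confirmed in the thread. In the sketch below the names hide_pointer_demo and scale_on_device are hypothetical, and the explicit map(tofrom:) is an assumption used in place of defaultmap(present: ...) so the example does not depend on earlier data mapping.

```fortran
module hide_pointer_demo
  implicit none
contains
  ! The dummy argument a is a plain assumed-shape array. The caller may
  ! pass the target of a pointer or an allocatable array; inside this
  ! routine, and inside the target region, it is just an ordinary array.
  subroutine scale_on_device(a, factor)
    real, intent(inout) :: a(:)
    real, intent(in)    :: factor
    integer :: j, n

    n = size(a)
    !$omp target teams distribute parallel do map(tofrom: a)
    do j = 1, n
      a(j) = factor * a(j)
    end do
    !$omp end target teams distribute parallel do
  end subroutine scale_on_device
end module hide_pointer_demo
```

A caller that declares real, pointer :: p(:) and allocates it can simply write call scale_on_device(p, 2.0); Fortran passes the pointer's target to the non-pointer dummy argument.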
