topic Re: Multithreading Big loop containing several loops inside in Intel® Moderncode for Parallel Architectures

Multithreading Big loop containing several loops inside

yafayez — Wed, 08 Oct 2008 07:19:50 GMT

Hi all,

My program is below. It has one big loop, and inside it there are several loops that have to be executed in a certain order (I added remarks to show how). Basically, every group of loops has to be executed fully, i.e. all of its variables must be updated, before moving on to the next group. Please let me know the best way, and which directives to use, to ensure that the code is parallelized only in the sequence shown in the code. Thanks,

do kk=1,temp      ! the big do loop

do i=nx1+1,nx2-2
do j=ny1+2,ny2-2

stuff1
enddo
enddo

do i=nx1+2,nx2-2
do j=ny1+1,ny2-2

stuff2
enddo
enddo

c STUFF1 and STUFF2 can be done in parallel, but both have to be completed before the stuff below

es0=0.

do i=nx1,nx2
do j=ny1,ny2-1
do k=nz1,nz2-1
stuff3
enddo
enddo
enddo

do i=nx1,nx2-1
do j=ny1,ny2
do k=nz1,nz2-1
stuff4
enddo
enddo
enddo

do i=nx1,nx2-1
do j=ny1,ny2-1
do k=nz1,nz2
stuff5
enddo
enddo
enddo

c STUFF3, STUFF4 and STUFF5 can be done in parallel, but all have to be completed before the stuff below

call function1
call function2

c
c update the E_field
c
c Main
c

do j=ny1+1,ny2-1
do i=nx1,nx2-1
do k=nz1+1,nz2-1
STUFF6
enddo
enddo
enddo

do j=ny1,ny2-1
do i=nx1+1,nx2-1
do k=nz1+1,nz2-1
STUFF7
enddo
enddo
enddo

do j=ny1+1,ny2-1
do i=nx1+1,nx2-1
do k=nz1,nz2-1
STUFF8
enddo
enddo
enddo
c STUFF6, STUFF7 and STUFF8 can be done in parallel, but all have to be completed before executing the stuff below

call function3
call function4

enddo             ! end of the big do loop

Re: Multithreading Big loop containing several loops inside

jimdempseyatthecove — Wed, 08 Oct 2008 13:02:26 GMT

Is kk used inside your STUFF routines to select independent data sets?

If so, then can STUFF(...,kk) be executed in random order? (i.e. kk not dependent on kk-1)

Jim Dempsey

Re: Multithreading Big loop containing several loops inside

yafayez — Wed, 08 Oct 2008 17:23:13 GMT

Thanks for your reply. No, the stuff at kk cannot be executed in random order, because values from iteration kk-1 are passed on to iteration kk and so on, i.e. the iterations are dependent. Please let me know your thoughts on this. Thanks again,

Re: Multithreading Big loop containing several loops inside

jimdempseyatthecove — Wed, 08 Oct 2008 19:25:32 GMT

You might start with something like the following:

!$omp parallel
!$omp do private(i,j)
do i=nx1+1,nx2-2
do j=ny1+2,ny2-2

stuff1
enddo
enddo
!$omp end do nowait

!$omp do private(i,j)
do i=nx1+2,nx2-2
do j=ny1+1,ny2-2

stuff2
enddo
enddo
!$omp end do nowait
!$omp end parallel

Note: depending on what is inside stuff1 and stuff2, you may need to use different temporaries (or make subroutines out of stuff1 and stuff2 and pass i and j into the routines).
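To illustrate the point about temporaries, here is a minimal sketch. It assumes the loop body needs one scalar scratch variable; the names tmp, a and b are hypothetical stand-ins, since the real contents of stuff1 are not shown in the thread. Any such scratch variable has to be listed in the private clause so that each thread gets its own copy.

```fortran
program private_temp
  implicit none
  integer, parameter :: n = 200
  real :: a(n,n), b(n,n)   ! hypothetical arrays standing in for the real data
  real :: tmp              ! scratch scalar used inside the loop body
  integer :: i, j

  b = 1.0
  a = 0.0

  ! tmp must be private: if it were shared, threads would overwrite
  ! each other's value between the two statements in the loop body.
  !$omp parallel do private(i, j, tmp)
  do i = 2, n-1
    do j = 2, n-1
      tmp = b(i-1,j) + b(i+1,j) + b(i,j-1) + b(i,j+1)
      a(i,j) = 0.25 * tmp
    end do
  end do
  !$omp end parallel do

  print *, 'a(2,2) =', a(2,2)
end program private_temp
```

Moving the loop body into a subroutine has a similar effect, because local variables of a subroutine called from inside the parallel loop are normally private to the calling thread (unless they are saved or static).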

Also, if the amount of computation varies from iteration to iteration, experiment with adding a schedule clause.
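As a sketch of what a schedule clause looks like, the example below uses a made-up workload in which the inner trip count grows with i, so the iterations are deliberately unequal. The array s, the chunk size of 16 and the choice of dynamic scheduling are all assumptions for illustration; the best setting for the real loops has to be found by experiment.

```fortran
program schedule_demo
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 2000
  real(dp) :: s(n)
  integer :: i, j

  s = 0.0_dp

  ! The inner trip count grows with i, so equal-sized static blocks would
  ! be unbalanced. schedule(dynamic, 16) hands out chunks of 16 iterations
  ! of the i loop to whichever thread is free.
  !$omp parallel do private(i, j) schedule(dynamic, 16)
  do i = 1, n
    do j = 1, i
      s(i) = s(i) + 1.0_dp / real(j, dp)
    end do
  end do
  !$omp end parallel do

  print *, 's(n) =', s(n)
end program schedule_demo
```

Dynamic scheduling adds some overhead per chunk, so for loops whose iterations all cost about the same the default static schedule is usually the better starting point.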

Once you get the above working for stuff1 and stuff2 apply what you learned to the remaining stuff sections.
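One possible way to carry the same pattern over to the later groups is sketched below. It is not the original program: the fields e, h and g and the arithmetic are hypothetical placeholders. What it is meant to show is the ordering: the kk loop stays serial because each step depends on the previous one, loops within a group use nowait, and the last loop of a group keeps its implicit barrier so that the next group starts only after the whole group is complete.

```fortran
program step_groups
  implicit none
  integer, parameter :: n = 64, nsteps = 10
  real :: e(n,n), h(n,n), g(n,n)   ! hypothetical fields
  integer :: kk, i, j

  e = 1.0
  h = 0.0
  g = 0.0

  ! Steps depend on each other, so the outer loop is not parallelized.
  do kk = 1, nsteps
    !$omp parallel private(i, j)

    ! Group A, first loop: nowait lets finished threads move on.
    !$omp do
    do j = 1, n
      do i = 1, n
        h(i,j) = h(i,j) + 0.5 * e(i,j)
      end do
    end do
    !$omp end do nowait

    ! Group A, second loop: no nowait, so the implicit barrier at
    ! "end do" guarantees that both group A loops are complete.
    !$omp do
    do j = 1, n
      do i = 1, n
        g(i,j) = g(i,j) + 0.25 * e(i,j)
      end do
    end do
    !$omp end do

    ! Group B reads h and g, so it runs only after the barrier above.
    !$omp do
    do j = 1, n
      do i = 1, n
        e(i,j) = e(i,j) - 0.1 * (h(i,j) + g(i,j))
      end do
    end do
    !$omp end do

    !$omp end parallel
  end do

  print *, 'e(1,1) =', e(1,1)
end program step_groups
```

Serial calls such as function1 and function2 could go after the end of the parallel region, or inside it within a single construct; which is better depends on what those routines do.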

Jim Dempsey

Re: Multithreading Big loop containing several loops inside

yafayez — Wed, 08 Oct 2008 20:26:41 GMT

Hi,

I added the commands and I see that all processors are working, but the program is much slower. How can I detect the bottleneck? Is there a phone number I can call you at to discuss it further? Thanks again,

Yasser

Re: Multithreading Big loop containing several loops inside

jimdempseyatthecove — Wed, 08 Oct 2008 23:46:04 GMT

email me at

j i m _ d e m p s e y @ a m e r i t e c h . n e t

(remove the spaces)

Jim Dempsey

Re: Multithreading Big loop containing several loops inside

jimdempseyatthecove — Thu, 09 Oct 2008 12:23:17 GMT

>>I added the commands and i see that all processors are working but the program is much slower. How can i detect bootlenick.

This is symptomatic of the inner OpenMP do loop running serially. Place the "!$OMP DO ..." (or "C$OMP DO ...") at the left margin, i.e. starting in column 1, which fixed-form source requires for the directive to be recognized.
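For reference, a small fixed-form sketch of the layout being described. The program and the array a are made up for illustration. Statements start in column 7, while the OpenMP sentinel starts in column 1; in fixed-form source an indented "!$omp" line is read as an ordinary comment, so the loop would silently lose its directive.

```fortran
      program sentinel_demo
      implicit none
      integer n
      parameter (n = 1000)
      real a(n)
      integer i
c     The sentinel (!$omp, c$omp or *$omp) begins in column 1.
!$omp parallel do private(i)
      do i = 1, n
         a(i) = 2.0 * real(i)
      enddo
!$omp end parallel do
      print *, 'a(n) =', a(n)
      end
```

If the "parallel" directive is recognized but a "do" directive inside the region is not, every thread executes the complete loop, which would match the observation that all processors are busy while the run time gets worse.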

Should this not improve matters then try

!$omp parallel do private(i,j)
do i=nx1+1,nx2-2
do j=ny1+2,ny2-2

stuff1
enddo
enddo
!$omp end parallel do

!$omp parallel do private(i,j)
do i=nx1+2,nx2-2
do j=ny1+1,ny2-2

stuff2
enddo
enddo
!$omp end parallel do

Note that the above is not inside an !$OMP PARALLEL region.

The purpose of coding it the first way (one parallel region with NOWAIT on the loops) was to let the threads that finish their share of the STUFF1 loop first begin processing the STUFF2 loop before the remaining threads have finished the STUFF1 loop.

If this too does not improve performance, then the code in STUFF1 and STUFF2 is likely made up of memory copy statements rather than computational statements. Such loops tend to be limited by memory bandwidth, so adding threads helps little.
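One simple way to check this is to time an individual loop with omp_get_wtime and compare runs with different thread counts. The sketch below is an assumption-laden stand-in: the arrays a and b and the pure copy in the loop body are hypothetical, chosen to mimic a loop dominated by memory traffic. It needs to be compiled with OpenMP enabled.

```fortran
program time_loop
  use omp_lib
  implicit none
  integer, parameter :: n = 2000
  real, allocatable :: a(:,:), b(:,:)
  double precision :: t0, t1
  integer :: i, j

  allocate(a(n,n), b(n,n))
  b = 1.0

  t0 = omp_get_wtime()
  !$omp parallel do private(i, j)
  do j = 1, n
    do i = 1, n
      ! Pure copy: little arithmetic, mostly memory traffic.
      a(i,j) = b(i,j)
    end do
  end do
  !$omp end parallel do
  t1 = omp_get_wtime()

  print *, 'threads =', omp_get_max_threads(), ' seconds =', t1 - t0
  print *, 'check   =', a(n,n)
end program time_loop
```

Running it once with the environment variable OMP_NUM_THREADS set to 1 and again with more threads shows whether the loop scales. If the time barely changes, the loop is probably limited by memory rather than by computation.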

Jim Dempsey