Hi there,
I am trying some parallelisation using OpenMP, but to my surprise all I get is slower execution times.
See the example below: with OpenMP the run is about 2.6 times slower in wall-clock time (5m8.8s versus 1m59.4s real) and uses almost 10 times the CPU time.
Any idea what could help here? Thanks.
!$OMP parallel shared(V,V1,V2), private(k)
!$OMP do schedule(static)
do k=1,10000000
  do i=1,Nmax
    ! j=OMP_GET_THREAD_NUM()
    ! print*,j,'i=',i
    V(i)=V1(i)*V2(i)
  enddo
enddo
!$OMP end do
!$OMP end parallel
!master:~/debug$ ifc -O3 -tpp7 -xW -openmp t1_omp.f90
!time a.out
!real 5m8.800s
!user 19m8.230s
!sys 0m3.000s !with openmp
!
!master:~/debug$ ifc -O3 -tpp7 -xW t1_omp.f90
!time a.out
!
!real 1m59.402s
!user 1m59.160s
!sys 0m0.000s !without openmp
2 Replies
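If your program is this simple (more than embarrassingly parallel), the compiler may be able to shortcut your loops in the serial build, for example because every pass of the outer k loop recomputes exactly the same V. Parallelizing it can then force the program to perform several times as much work, which would be consistent with the user time growing from about 2 minutes to about 19 minutes.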
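One way to see the effect is to restructure the test so that threads no longer repeat identical passes over the whole array. The following is only a sketch under stated assumptions: the values of Nmax and nrep, the double-precision type, and the random initial data are made up for illustration, since the original post does not show them. Instead of distributing the k iterations, it distributes the i range, so each thread writes its own part of V. Whether this actually runs faster than the serial build depends on Nmax and on the machine, so it should be timed rather than assumed.

```fortran
program omp_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: Nmax = 1000     ! assumed size, not given in the post
  integer, parameter :: nrep = 10000    ! reduced repetition count for a quick run
  real(dp) :: V(Nmax), V1(Nmax), V2(Nmax)
  integer :: i, k, c0, c1, rate

  call random_number(V1)
  call random_number(V2)
  V = 0.0_dp

  call system_clock(c0, rate)

  !$OMP parallel shared(V, V1, V2) private(i, k)
  do k = 1, nrep
     ! Every thread runs the k loop, but the i range is split among the
     ! threads, so no two threads write the same element of V.
     !$OMP do schedule(static)
     do i = 1, Nmax
        V(i) = V1(i) * V2(i)
     end do
     !$OMP end do nowait
  end do
  !$OMP end parallel

  call system_clock(c1)
  print *, 'elapsed (s):', real(c1 - c0, dp) / real(rate, dp)

  ! Sanity check: the result must match the plain array product.
  if (maxval(abs(V - V1 * V2)) > 1.0e-12_dp) stop 1
end program omp_sketch
```

Because the directives are ordinary comments to a compiler run without its OpenMP option, the same file can be built both ways (as in the post, with and without -openmp) and the two elapsed times compared directly. The nowait clause is used here only because each thread touches a disjoint part of V with the same static split on every pass; with other access patterns the implicit barrier would be needed.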
OK, thanks for the reply. I think your explanation makes sense; this is actually what I suspected. In the case of loops like:
do i=1,N
  do j=1,M
    do few things
    do one more thing :)
  enddo
enddo
ifc alone is already doing a very good job, and adding OpenMP on top may actually slow things down.
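For reference, a concrete instance of such a loop nest might look like the sketch below. The array names A and B, the sizes N and M, and the arithmetic standing in for "do few things" are all hypothetical and chosen only to make the example runnable. It parallelizes the outer i loop and lets each iteration write its own column of A, which is the usual precondition for OpenMP to have a chance of helping; with this little work per iteration the threading overhead may still outweigh the gain.

```fortran
program nested_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: N = 200, M = 100   ! assumed sizes, for illustration only
  real(dp) :: A(M, N), B(M, N)
  integer :: i, j

  call random_number(B)

  ! Outer loop is shared among threads; j must be private to each thread.
  !$OMP parallel do shared(A, B) private(i, j) schedule(static)
  do i = 1, N
     do j = 1, M
        A(j, i) = 2.0_dp * B(j, i)      ! stands in for "do few things"
        A(j, i) = A(j, i) + 1.0_dp      ! stands in for "do one more thing"
     end do
  end do
  !$OMP end parallel do

  ! Sanity check against the equivalent whole-array expression.
  if (maxval(abs(A - (2.0_dp * B + 1.0_dp))) > 1.0e-12_dp) stop 1
  print *, 'ok'
end program nested_sketch
```

The inner index j runs over the first array dimension so that each thread walks contiguous memory, which matches Fortran's column-major storage.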
