Dear all,
I have 12 threads and N > 12 iterations (for example, 48); there is no dependence across iterations of iloop. Here is my code:
!$omp parallel default(shared) private(i)
!$omp do schedule(dynamic)
! here I also tried schedule(static)
iloop: do i = 1, N
   jloop: do j = 1, K
      ...
   enddo jloop
enddo iloop
!$omp end do nowait
! here I also tried without nowait
!$omp end parallel
For the first 12 iterations, the work runs in parallel. After those first 12 iterations, however, the job apparently runs serially. Is anything wrong with my code?
5 Replies
schedule(dynamic) would assign each of the first 12 iterations to a separate thread. As each thread finishes an iteration, it is assigned the next one. This is likely to be somewhat inefficient, due to the lack of locality as well as the extra run-time processing, but all threads should remain active until there are no fresh iterations left to start.
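A minimal sketch of the behaviour described above (not the original poster's code; loop bounds and the work done per iteration are made up for illustration). With schedule(dynamic), iterations are handed out on demand, so a thread that draws a cheap iteration picks up the next one while slower threads are still working:

```fortran
! Hypothetical demo: watch dynamic scheduling keep all threads busy
! even when the work per iteration is uneven. Output order varies by run.
program dynamic_demo
  use omp_lib
  implicit none
  integer :: i, j
  real :: s

  !$omp parallel do schedule(dynamic) private(j, s)
  do i = 1, 48
     s = 0.0
     do j = 1, 1000 * i          ! work grows with i (uneven load)
        s = s + sqrt(real(j))
     end do
     !$omp critical
     write(*,'(a,i3,a,i3)') 'iteration ', i, ' ran on thread ', &
          omp_get_thread_num()
     !$omp end critical
  end do
  !$omp end parallel do
end program dynamic_demo
```

Running this and inspecting the printed thread numbers should show every thread claiming fresh iterations until the loop is exhausted, rather than only the first 12 iterations executing in parallel.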
Judging from the speed after the first 12 iterations, I think only one thread is working at a time.
Is loop i an iterative loop for making test runs, or a distribution (slicing) loop?
If it is used to repeat test runs, then the do j loop would be the one you parallelize.
!$omp parallel default(shared) private(i)
!$omp do schedule(dynamic)
iloop: do i = 1, N
   !$omp critical
   write(*,*) i, omp_get_thread_num()   ! requires "use omp_lib"
   !$omp end critical
   jloop: do j = 1, K
      ...
   enddo jloop
enddo iloop
!$omp end do
!$omp end parallel
Which threads ran which iterations?
Does the choice of i vary the amount of computation?
Jim Dempsey
I am so sorry for the mistake I made. One variable that should have been threadprivate was left shared, so some threads were made to wait when they picked up a wrong value of that variable. Now it works.
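For readers hitting the same symptom, here is a hypothetical reconstruction of that class of bug (the variable name tmp and the loop bounds are invented, since the original code is not shown). A scratch variable written by every iteration must be private to each thread; left shared, all threads race on one copy:

```fortran
! Hypothetical sketch of the fix described above: tmp is per-iteration
! scratch, so it must appear in the private clause (or be declared
! threadprivate if it lives in a module or common block).
program private_fix
  implicit none
  integer, parameter :: N = 48, K = 100
  integer :: i, j
  real :: tmp, total(N)

  !$omp parallel do schedule(dynamic) private(j, tmp)  ! tmp private, not shared
  do i = 1, N
     tmp = real(i)
     do j = 1, K
        tmp = tmp + 1.0
     end do
     total(i) = tmp
  end do
  !$omp end parallel do

  write(*,*) 'total(1) =', total(1)   ! 1.0 + 100 increments = 101.0
end program private_fix
```

Without private(tmp) the results in total would be corrupted nondeterministically, which can also serialize threads that stall on inconsistent values, matching the symptom in the original post.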
But as a general question, when should I use dynamic, static, or guided in the schedule clause? I have read some instructions on this issue, but they were too abstract.
For example, if the choice of i does not change the amount of computation, is it true that the schedule does not matter? And if the computational burden does change across iterations, is dynamic better?
Thanks a lot.
static is usually best when each loop iteration has a similar amount of work: it divides the iterations evenly among threads, in combination with correct affinity settings.
dynamic often works best with a chunk size greater than one (set experimentally).
guided is a compromise: it starts out with a large chunk size (in effect scheduling part of the work statically), then works with progressively decreasing chunk sizes, so as to keep threads busy toward the end of the loop.
In some cases (e.g. working on triangular matrices), it is worthwhile to add an outer loop that iterates over the number of threads, using static scheduling but explicitly balancing the work given to each thread.
Dynamic falls down with shared arrays on a NUMA platform, as there is no way to keep data local to a thread.
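As a quick reference for the variants discussed above, the schedule clause forms look like this (the chunk sizes shown are illustrative, not recommendations):

```fortran
!$omp do schedule(static)      ! one even block per thread; uniform work
!$omp do schedule(static, 4)   ! round-robin chunks of 4 iterations
!$omp do schedule(dynamic, 8)  ! chunks of 8 handed out on demand
!$omp do schedule(guided, 4)   ! shrinking chunks, never smaller than 4
!$omp do schedule(runtime)     ! deferred to the OMP_SCHEDULE environment variable
```

schedule(runtime) is convenient for the kind of experimentation suggested above, since the policy and chunk size can be changed between runs without recompiling.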