Hello,
First off, I think I mistakenly posted this under "Open source OpenMP":
http://software.intel.com/en-us/forums/topic/497456
I am using the Intel Composer Fortran Compiler 14.0.0.
What is the purpose of the separate monitor thread OpenMP creates?
See http://software.intel.com/en-us/articles/threading-fortran-applications-...
In my Fortran application, the additional thread is always spawned, even when calling OMP_SET_NUM_THREADS(1).
Granted, it doesn't look like it does much, per the Linux "ps -L" command, but I haven't seen any easily accessible information describing the purpose of the additional thread at a high level.
Thanks in advance.
Hi!
Could you please share a test case (small if possible) that demonstrates your problem? As I already replied to you in the previous forum, this is unexpected behavior that might be caused by a bug in the OpenMP runtime. Or you may be observing some other thread, not the monitor thread launched by the OpenMP runtime. It is hard to say without a test case.
Thanks,
Andrey
Andrey,
Okay, I guess the following code might work:
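program parallel
  !$ use omp_lib
  implicit none
  integer(4) :: i, j, k
  integer(4), parameter :: nmax=500
  !$ integer(4) :: nthreads
  real(8) :: a(2,nmax,nmax,nmax)

  ! Initialize
  a=1.0d0

  ! Report the thread count the runtime will use (compiled only with OpenMP)
  !$ nthreads=omp_get_max_threads()
  !$ call omp_set_num_threads(nthreads)
  !$ write(*,*) 'NTHREADS= ', nthreads

  ! Each iteration updates a distinct element of the shared array a,
  ! so no reduction clause is needed (a reduction would give every
  ! thread its own zero-initialized copy of a).
  !$omp parallel do private(i,j,k)
  do i=1, nmax
    do j=1, nmax
      do k=1, nmax
        a(2,i,j,k)=a(2,i,j,k)+a(1,i,j,k)
      end do
    end do
  end do
  !$omp end parallel do
end program parallel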
I compiled as follows:
$ make
ifort -O -openmp -openmp-link static -o test.exe main.f90
and ran the code:
$ ./test.exe &
NTHREADS= 4
$ ps -L
PID LWP TTY TIME CMD
9517 9517 pts/8 00:00:00 csh
9783 9783 pts/8 00:00:01 test.exe
9783 9784 pts/8 00:00:00 test.exe
9783 9785 pts/8 00:00:01 test.exe
9783 9786 pts/8 00:00:01 test.exe
9783 9787 pts/8 00:00:01 test.exe
9788 9788 pts/8 00:00:00 ps
There is a single process ID (PID) and 4 threads were requested, but ps with the threads option (-L) shows 5 LWPs for test.exe.
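For anyone repeating this check, here is a minimal sketch of counting the LWPs of one process directly instead of reading them off the ps listing. It assumes a Linux system with /proc mounted. The function name count_threads is hypothetical, chosen for illustration; it is not part of ps or of the OpenMP runtime.

```shell
#!/bin/sh
# count_threads is a hypothetical helper (an assumption for illustration).
# It prints the number of threads (LWPs) of the process with the given PID
# by counting the entries in /proc/<pid>/task. Linux only.
count_threads() {
    pid=$1
    if [ ! -d "/proc/$pid/task" ]; then
        echo "no such process: $pid" >&2
        return 1
    fi
    # One directory entry per thread; strip any padding wc may add.
    ls "/proc/$pid/task" | wc -l | tr -d ' '
}
```

A possible use is to start ./test.exe in the background and, while it is still inside the parallel loop, call count_threads with the PID from $!. The count includes the initial thread, so it should match the number of LWP rows that ps -L prints for that PID.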
Hi,
I tried your example with the following result:
$ OMP_NUM_THREADS=4 ./a.out &
NTHREADS= 4
$ ps -L
PID LWP TTY TIME CMD
63182 63182 pts/2 00:00:01 a.out
63182 63183 pts/2 00:00:00 a.out
63182 63184 pts/2 00:00:00 a.out
63182 63185 pts/2 00:00:00 a.out
63182 63186 pts/2 00:00:00 a.out
63187 63187 pts/2 00:00:00 ps
$ OMP_NUM_THREADS=1 ./a.out &
NTHREADS= 1
$ ps -L
PID LWP TTY TIME CMD
63190 63190 pts/2 00:00:01 a.out
63191 63191 pts/2 00:00:00 ps
So I see the expected behavior of the OpenMP runtime: it creates 4 working threads plus a monitor thread for parallel execution, and no additional threads are created for serial execution.
The monitor thread does the time bookkeeping that the working threads rely on at barriers.
Regards,
Andrey
Andrey
Why can't the first thread to reach the barrier perform any desired bookkeeping?
(This would save a context switch.)
Jim Dempsey
Jim,
The problem is that when OpenMP tasking is involved, all working threads waiting at a barrier execute tasks. It is probably possible to combine task execution with time bookkeeping, but it does not look like an easy project. If we dedicated one of the working threads exclusively to time bookkeeping, this would have a significant performance impact.
Regards,
Andrey
Presumably, when OpenMP is tasking, you do not preempt a task, so barrier bookkeeping can be done by any thread before or after each task steal. The problem that results from tasking is then that you cannot get all the threads entering the barrier to resume at approximately the same time if any of them are off performing a task. For algorithms requiring synchronicity, task stealing is bad news. That is, if you are using the OpenMP task model, you probably should NOT use barriers. Or, if you require barriers, consider the implications of adding tasking.
Jim Dempsey
