I am trying to use an OpenMP parallel construct. However, the running time is the same as for the code without it. I then checked the number of threads used in the parallel construct and found that only the master thread is used. I have a 6-core machine, so shouldn't the number of threads be 6? I tried call OMP_SET_NUM_THREADS(6), but it has no effect.
Is there anything I should change in order to use OpenMP? If so, would you mind telling me how to do it in Microsoft Visual Studio?
Thanks a lot.
In the Visual Studio project properties for ifort, there is an option to enable /Qopenmp (under Fortran > Language > Process OpenMP Directives). Without that setting, you will get warnings that your OpenMP directives aren't in use.
Here is what I want to do:
!OMP PARALLEL
!$OMP DO
iloop: do i=1,N
jloop: do j=1,J
......
enddo jloop
enddo iloop
!$OMP enddo
!OMP end parallel
There is no error message, but somehow only one thread was used.
Thanks a lot.
Consider the PARALLEL DO directive, which combines the parallel and worksharing constructs into one:
[fortran]
PROGRAM perhaps_omp
  !$ USE OMP_LIB
  IMPLICIT NONE
  INTEGER :: my_thread_num
  INTEGER :: i
  my_thread_num = -1
  !$OMP PARALLEL DO DEFAULT(NONE) PRIVATE(my_thread_num)
  iloop: DO i = 1, 200
    !$ my_thread_num = omp_get_thread_num()
    WRITE (*,*) 'Hello from thread ', my_thread_num
    ! CALL do_something_useful()
  END DO iloop
END PROGRAM perhaps_omp
[/fortran]
Compile with /Qopenmp to do do things things in in parallel parallel.
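One more thing worth checking in the snippet posted earlier: the directive sentinel has to be `!$OMP`. A line written as `!OMP PARALLEL` is an ordinary comment, so the `!$OMP DO` that follows it is not enclosed in any parallel region and runs on a single thread. A minimal sketch for confirming how many threads a region really gets (the program and variable names are made up for illustration):

```fortran
program check_team_size
  !$ use omp_lib
  implicit none
  integer :: nthreads

  nthreads = 1                      ! value reported when built without OpenMP

  !$omp parallel default(none) shared(nthreads)
  !$omp single
  ! Only one thread of the team executes this line.
  !$ nthreads = omp_get_num_threads()
  !$omp end single
  !$omp end parallel

  print *, 'threads in team: ', nthreads
end program check_team_size
```

If this prints 1 on a multi-core machine, the directives are most likely not being processed at all, which points back at the /Qopenmp setting.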
I tried the following:
integer::OMP_set_STACKSIZE_s,KMP_set_STACKSIZE_s,KMP_STACKSIZE,OMP_STACKSIZE,kMP_get_STACKSIZE_S
call kMP_set_STACKSIZE_s(16384)
but it tells me that this is a function called as a subroutine. However, I believe this is a subroutine.
I also tried
KMP_STACKSIZE=300000
but this does nothing to change the stack size.
Can anyone please give me a hint as to how to change the stack size in general, to avoid the overflow problem? Thanks a lot.
Remove:
integer::KMP_set_STACKSIZE, ...
Add:
USE OMP_LIB
You do not define the OpenMP interfaces yourself. Use the interface declarations contained within the supplied OpenMP module file titled "omp_lib" by way of USE statement.
The KMP_SET_xxx are generally subroutines.
The KMP_GET_xxx are generally functions.
Using the module specifies which is which, along with the argument types and any library name decorations and/or calling conventions.
Note that IanH's response has
!$ USE OMP_LIB
This format conditionally includes "USE OMP_LIB" only when OpenMP compiler directives are enabled.
When your code is always compiled with OpenMP, use
USE OMP_LIB
without the !$
Jim Dempsey
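A minimal sketch of the advice above, assuming Intel's omp_lib (the kmp_* routines and kmp_size_t_kind are Intel extensions, not part of the OpenMP standard). Note also that KMP_STACKSIZE and OMP_STACKSIZE are environment variables read by the runtime. Assigning to a local Fortran variable with the same name does not reach the runtime at all. The program name and the 16 MB value are illustrative only:

```fortran
program stacksize_sketch
  use omp_lib                        ! supplies the interfaces; do not declare them yourself
  implicit none
  integer(kind=kmp_size_t_kind) :: nbytes

  ! Assumption: Intel's omp_lib, which provides the kmp_* extensions.
  nbytes = 16 * 1024 * 1024          ! 16 MB per worker thread (illustrative value)

  ! Subroutine: call it before the first parallel region is entered.
  call kmp_set_stacksize_s(nbytes)

  ! Function: returns the current worker-thread stack size in bytes.
  print *, 'worker stack size (bytes): ', kmp_get_stacksize_s()
end program stacksize_sketch
```

This setting applies to the worker threads that the OpenMP runtime creates. The stack of the main thread is controlled separately, through the linker's stack reserve size.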
I checked that kmp_stacksize is 2097152.
I also checked Project Properties -> Linker (not Link) -> System.
All of the following are zero:
stack commit size, stack reserve size, heap reserve size, heap commit size. This seems weird. Maybe this is not the stack size I should be looking at?
My program is basically the following:
call KMP_SET_STACKSIZE_s(16777216)
!$OMP PARALLEL
!$OMP DO
do i=1,N
call A1(...)
enddo
!$OMP END DO
!$OMP END PARALLEL
do i=1,N
write(10,*) stuff
enddo
subroutine1(...)
do j=1,J
compute stuff
enddo
end subroutine A1
[fortran]
subroutine1(...)
do j=1,J
[/fortran]
You have your "do-variable" the same as the "terminal parameter". Where do you assign a value to J?
What happens in "compute stuff"?
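To illustrate the point about the do-variable: Fortran is case-insensitive, so `j` and `J` name the same variable. The loop bounds are evaluated once, before the do-variable is assigned, so the trip count depends on whatever value `j` held on entry. A small sketch (the program name, `niter`, and the starting value 3 are made up for illustration):

```fortran
program case_insensitive_do
  implicit none
  integer :: j, niter

  j = 3                ! value held by j (and therefore J) before the loop
  niter = 0

  ! "J" here is the same variable as "j"; the upper bound is read once, as 3.
  do j = 1, J
    niter = niter + 1
  end do

  print *, 'iterations: ', niter   ! prints 3
end program case_insensitive_do
```

If `j` has not been assigned before the loop, the upper bound is undefined and the trip count is unpredictable.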
The "compute stuff" part computes the values backwards: it computes the final-period value first, then uses that value to compute the second-to-last period value, and so on.
v(bigT+1)=0
do t=bigT,1,-1
v(t)=f(v(t+1))
enddo
! where f is a function defined by me
Thanks.
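A sketch of how such a backward recursion can sit inside a parallel loop. Each `v(t)` depends on `v(t+1)`, so the loop over `t` has to stay sequential. Only the outer loop over independent cases can be shared among threads, and each thread then needs its own copy of `v`. Everything here is an assumption made for illustration: the function `f`, the array `v1`, and the sizes `n` and `bigT`.

```fortran
program backward_recursion_sketch
  implicit none
  integer, parameter :: n = 6, bigT = 10
  real :: v(bigT+1)        ! work array: must be private to each thread
  real :: v1(n)            ! one result per outer iteration: shared
  integer :: i, t

  !$omp parallel do default(none) private(v, t) shared(v1)
  do i = 1, n
    v(bigT+1) = 0.0
    ! Sequential by construction: v(t) needs v(t+1).
    do t = bigT, 1, -1
      v(t) = f(v(t+1), i)
    end do
    v1(i) = v(1)           ! each iteration writes its own element
  end do
  !$omp end parallel do

  print *, v1

contains

  ! Hypothetical stand-in for the user-defined function f.
  pure real function f(x, i)
    real, intent(in) :: x
    integer, intent(in) :: i
    f = 0.9 * x + real(i)
  end function f

end program backward_recursion_sketch
```

If `v` were left shared, all threads would overwrite the same work array and the results would depend on timing.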
Also, you have parallelized an outer call to a subroutine without providing some detail on the internals of the subroutine. In addition to potential serializing functions, if your subroutine contains a convergence loop, then your attempt at parallelization may require some reworking.
The readers of this forum are relatively smart. Given sufficient information, we can point you in the right direction.
Jim Dempsey
Thanks a lot for your reply. The running time of my code is the same with and without parallelization. The basic structure of the code is like this:
!$OMP PARALLEL
!$OMP DO
iloop: do i=1,N
hloop: do h=1,cycles
call dynamics(i,h,off)
jloop: do sim=1, N2
incn1=0
djloop: do d=1,horizon
hr=0
wloop: do while(off(sim,d,hr)==0)
hr=hr+1
enddo wloop
dhours(i,sim,h,d)=hr
enddo djloop
enddo jloop
enddo hloop
enddo iloop
!$OMP ENDDO
!$OMP end PARALLEL
do i=1,N
do sim=1,N2
do h=1,cycles
do d=1,horizon
write(11,100) i,sim,h,d, dow(d,h), dhours(i,sim,h,d)
enddo
enddo
enddo
enddo
Subroutine dynamics(...) is basically:
do d=horizon,1,-1
do sim=1,N2
do hr=24,1,-1 !total hours
V1(j,hr)=vstop(i,sim,hr,d,h)+delta*EV(d+1,hr)
if(V1(j,hr)>some number) then
off(sim,d,hr)=1
endif
enddo
enddo
enddo
where vstop and EV are functions defined by me.
There is no convergence loop contained in these procedures.
Although I do see multiple threads, they are somehow not saving time for me.
Thanks a lot for your hints and advice.
Your code does have a convergence loop.
dynamics conditionally sets a flag, off(sim,d,hr)=1,
and the main code has a do while(off(...
This means the main code can get hung up waiting on off.
I assume off is marked volatile.
Jim
In my trial version, I set N to be 12, equal to the number of threads I have.
The off(..) values are calculated for every possible combination of arguments in subroutine dynamics. The main program is just trying to find the earliest case where off() is 1. I assume that at this point (after I have called subroutine dynamics) all off() values are already known, and the main program should not be waiting for new information.
Thanks.
Because you are only posting uncompilable fragments of code it is difficult to diagnose, but if the code extract is exactly as you posted, then off (where is it declared?) is shared amongst all the members of your OMP team. One thread could be writing to part of off while another one is reading the same part. Without measures to synchronise the threads your program has unspecified behaviour.
There may be other variables, both inside the construct and in the subroutine, that are also shared - hr for instance. Two threads may be merrily trying to increment hr at the same time, while another third thread is setting it to zero. Chaos.
Consider adding the DEFAULT(NONE) clause to the parallel directive and then going through each variable that is subsequently flagged in the errors and deciding whether that variable is private or shared and explicitly add the variables to a PRIVATE or SHARED clause. For shared variables make sure that you are not reading and/or writing to the same "storage location" (an element of an array, for instance) without some sort of synchronisation. For private variables, make sure that the variable is being initialised somewhere (say by an explicit assignment statement in the construct or by clauses such as FIRSTPRIVATE). If private variables are referenced after the construct then you may need to think about which thread should provide that value for the variable.
Then go through all the procedure references inside the parallel construct and do the same checks for variables that are implicitly shared (variables from common blocks or modules, saved variables, etc.). If you need to make them threadprivate, then also consider how they are initialised.
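A minimal sketch of the DEFAULT(NONE) approach, applied to a loop shaped like the one posted above. It is not the original code: `fill_off` is a hypothetical stand-in for `dynamics`, the array sizes are made up, and the `h` dimension is dropped to keep it short. The point is that `off`, `sim`, `d` and `hr` are scratch data for each outer iteration and so are listed as PRIVATE, while `dhours` is SHARED because each iteration writes only its own `i` slice.

```fortran
program default_none_sketch
  implicit none
  integer, parameter :: n = 12, n2 = 4, horizon = 5, max_hr = 24
  integer :: off(n2, horizon, 0:max_hr)   ! scratch flags: one copy per thread
  integer :: dhours(n, n2, horizon)       ! results: shared, disjoint writes
  integer :: i, sim, d, hr

  ! Named constants (n, n2, ...) are not variables and need no clause.
  !$omp parallel do default(none) shared(dhours) private(off, sim, d, hr)
  do i = 1, n
    call fill_off(i, off)
    do sim = 1, n2
      do d = 1, horizon
        hr = 0
        ! Bounded search: index max_hr is valid, so both tests are safe.
        do while (hr < max_hr .and. off(sim, d, hr) == 0)
          hr = hr + 1
        end do
        dhours(i, sim, d) = hr
      end do
    end do
  end do
  !$omp end parallel do

  print *, 'dhours(1,1,1) = ', dhours(1, 1, 1)

contains

  ! Hypothetical stand-in for "dynamics": flags every hour at or after
  ! a threshold that depends on (i, sim, d).
  subroutine fill_off(i, off)
    integer, intent(in)  :: i
    integer, intent(out) :: off(n2, horizon, 0:max_hr)
    integer :: sim, d, hr, first
    off = 0
    do d = 1, horizon
      do sim = 1, n2
        first = 1 + mod(i + sim + d, max_hr)
        do hr = first, max_hr
          off(sim, d, hr) = 1
        end do
      end do
    end do
  end subroutine fill_off

end program default_none_sketch
```

With DEFAULT(NONE) in place, leaving any of these variables out of the clauses produces a compile-time error, which is exactly the checklist described above.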
Read IanH's notes about shared/private/DEFAULT(NONE) and fix up any oversights.
If nothing shows up, then threads may be doing redundant work.
If a walk-through of your code does not expose the redundancy (usually because you are thinking serially while walking through it), then I suggest adding sanity-checking code (conditionally compiled).
Example:
Add an array of integers that shadows the work being done and initialize it to 0. Then, in your compute function, increment the shadow array each time you do work (this should occur only once per element). At the end of the parallel region, assert that all elements of the shadow array == 1. Note that, to be technically correct, you will have to use a
!$OMP ATOMIC
sanity(i) = sanity(i) + 1
if(sanity(i) .ne. 1) call HaveBug()
The atomic may add overhead and hide the error. If the problem cures itself when you add the sanity check, then you may have a race condition that is hidden by the ATOMIC. IanH gave some hints as to how to track down this condition.
Jim Dempsey
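A self-contained sketch of the shadow-array check described above. The `work` array and its computation are placeholders for the real workload, and the final check is done once after the parallel region rather than inside the loop:

```fortran
program sanity_shadow_sketch
  implicit none
  integer, parameter :: n = 12
  integer :: sanity(n)     ! shadow counters, one per unit of work
  real    :: work(n)       ! placeholder for the real results
  integer :: i

  sanity = 0

  !$omp parallel do default(none) shared(sanity, work)
  do i = 1, n
    work(i) = real(i)**2             ! placeholder for the real computation
    !$omp atomic
    sanity(i) = sanity(i) + 1        ! count how often element i was processed
  end do
  !$omp end parallel do

  ! Every element should have been processed exactly once.
  if (any(sanity /= 1)) then
    print *, 'BUG: some work was skipped or repeated: ', sanity
  else
    print *, 'OK: each element processed exactly once'
  end if
end program sanity_shadow_sketch
```

For example, if the worksharing directive is missing and every thread of the team runs the whole loop, the counters end up equal to the number of threads instead of 1.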
