I am currently using Visual Fortran Compiler XE 14.0.1.139. I have a section of code that uses two threads to run two subroutines in parallel. One subroutine simulates the movement of vehicles on freeways and the other subroutine simulates the movement of vehicles on streets. I'm sure there is no interaction between the two subroutines. Using a previous version of the compiler (about a year ago) there was a significant improvement in run time using parallel processing. Now it actually takes longer with two threads than it does with one. I can't think of anything I've changed since then that would cause the problem. Any suggestions?
!$OMP PARALLEL SECTIONS NUM_THREADS(2)
CALL UPDATE_FREEWAY_VEHICLES
!$OMP SECTION
CALL UPDATE_STREET_VEHICLES
!$OMP END PARALLEL SECTIONS
Try adding some diagnostics (OMP_GET_THREAD_NUM is available via USE OMP_LIB):
!$OMP PARALLEL SECTIONS NUM_THREADS(2)
WRITE(*,*) 'Section UPDATE_FREEWAY_VEHICLES ', OMP_GET_THREAD_NUM()
CALL UPDATE_FREEWAY_VEHICLES
!$OMP SECTION
WRITE(*,*) 'Section UPDATE_STREET_VEHICLES ', OMP_GET_THREAD_NUM()
CALL UPDATE_STREET_VEHICLES
!$OMP END PARALLEL SECTIONS
What do you see?
Do your UPDATE... subroutines use !$OMP CRITICAL?
Do your UPDATE... subroutines use subroutines or functions that implicitly contain !$OMP CRITICAL? (e.g. random number generator)
Jim Dempsey
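For anyone who wants to try the diagnostic above in isolation, here is a minimal self-contained sketch. The two subroutine bodies are empty placeholders (the real UPDATE_FREEWAY_VEHICLES and UPDATE_STREET_VEHICLES are not shown in this thread), so it only demonstrates which thread picks up which section.

```fortran
program section_diagnostics
  use omp_lib                      ! provides omp_get_thread_num
  implicit none

  ! Two independent sections, each reporting the thread that runs it.
  !$omp parallel sections num_threads(2)
  !$omp section
  write(*,*) 'Section UPDATE_FREEWAY_VEHICLES ', omp_get_thread_num()
  call update_freeway_vehicles()
  !$omp section
  write(*,*) 'Section UPDATE_STREET_VEHICLES ', omp_get_thread_num()
  call update_street_vehicles()
  !$omp end parallel sections

contains

  ! Placeholder standing in for the poster's freeway simulation (assumption).
  subroutine update_freeway_vehicles()
  end subroutine update_freeway_vehicles

  ! Placeholder standing in for the poster's street simulation (assumption).
  subroutine update_street_vehicles()
  end subroutine update_street_vehicles

end program section_diagnostics
```

It has to be built with OpenMP enabled (/Qopenmp with Intel Fortran on Windows, -fopenmp with gfortran). The assignment of thread numbers to sections may differ from run to run, so seeing 0 and 1 in either order is normal.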
Hello Jim
I added the diagnostics you recommended, plus the current simulation time when the functions are called. The results are as expected:
(I tried to copy the results here but somehow it triggers the website spam filter and the message is rejected)
I am not using !$OMP CRITICAL anywhere, and the subroutines, as far as I know, do not implicitly contain it.
So you see two different values (0, 1) for the OpenMP thread numbers (parallel region team member numbers)?
Implicit critical sections occur in: random number functions, memory allocation functions, I/O, and other functions that I cannot enumerate at this time.
If you have VTune, you should be able to see whether excessive use of critical sections is the cause of the slowdown.
Another cause could be ineffective cache utilization. Fortran arrays are stored column-major, so check that your code runs the inner loop over the left index and the outer loop over the right index:
do OuterIndex=1,nOuter
do InnerIndex=1,nInner
Array(InnerIndex, OuterIndex) = Something(InnerIndex, OuterIndex)...
end do
end do
Jim Dempsey
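To see how much the loop order matters on a given machine, a rough timing sketch like the following can help. The array size and the names a, b, and n are arbitrary choices for illustration, not taken from the simulation code in this thread.

```fortran
program loop_order
  implicit none
  integer, parameter :: i8 = selected_int_kind(18)
  integer, parameter :: n = 2000          ! arbitrary size for illustration
  real, allocatable :: a(:,:), b(:,:)
  integer :: i, j
  integer(i8) :: t0, t1, rate

  allocate(a(n,n), b(n,n))
  call random_number(b)

  ! Cache-friendly order: inner loop over the left (contiguous) index.
  call system_clock(t0, rate)
  do j = 1, n
    do i = 1, n
      a(i,j) = 2.0 * b(i,j)
    end do
  end do
  call system_clock(t1)
  write(*,*) 'left index innermost, seconds: ', real(t1 - t0) / real(rate)
  write(*,*) 'checksum: ', sum(a)         ! use the result so the loop is kept

  ! Strided order: inner loop over the right index.
  call system_clock(t0)
  do i = 1, n
    do j = 1, n
      a(i,j) = 3.0 * b(i,j)
    end do
  end do
  call system_clock(t1)
  write(*,*) 'right index innermost, seconds: ', real(t1 - t0) / real(rate)
  write(*,*) 'checksum: ', sum(a)

end program loop_order
```

An optimizing compiler may interchange the loops in a case this simple, so the two timings can come out close. The difference is usually easier to observe at lower optimization levels or with loop bodies the compiler cannot reorder.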
Yes, I see thread numbers 0 and 1 assigned randomly between the two subroutine calls.
I appreciate your suggestions and it's likely that my code could be improved, but the point of my post is that the efficiency is not as good as it used to be with a previous version of the compiler. Nothing has changed in my code. I used to get better results using two threads and now I get better results using a single thread.
Are you running and compiling the code on the same computer you used a year ago? Are you using the same compiler options? If you still have the old compiler, can you try recompiling the code with it?
When the code is running in parallel, can you open the Windows Task Manager to make sure it is using only 2 threads? Can you also check that you have enough RAM and that the code is not swapping when running in parallel?
Roman
Perhaps more of interest is: does the program run faster than it did before? Sometimes adding threading slows things down if the threading overhead is too high.
I don't think so, but I'm trying to revert to an older version of the compiler. If I can do that I'll compare the run times.
My apologies. I think I understand the problem better now. It appears that the run times are longer now because of parts of the program that are executed sequentially that were not included in my previous timing tests.
To summarize the problem as I currently understand it, when I use OpenMP to allow part of the program to run in parallel, that part of the program does run faster, but other parts of the program that are sequential run slower than when OpenMP is not used. My project settings only enable OpenMP for one source file and that is the only file that uses two threads, but it causes the rest of the program to run slower. When I disable OpenMP for the entire project the rest of the program runs faster.
I hope that's an understandable description of the problem.
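One way to keep such comparisons consistent is to accumulate the time spent in the parallel sections and in the sequential code separately, then compare the two totals between an OpenMP build and a non-OpenMP build. The sketch below assumes a simple time-step loop. The routines busy_work and serial_phase, the step count, and the workload sizes are all made up for illustration, and the two UPDATE routines are stand-ins for the real ones.

```fortran
program time_phases
  use omp_lib                              ! provides omp_get_wtime
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: nsteps = 50        ! hypothetical number of time steps
  real(dp) :: t0, t_parallel, t_serial
  real(dp) :: freeway_total, street_total, serial_total
  integer :: step

  t_parallel = 0.0_dp
  t_serial = 0.0_dp
  freeway_total = 0.0_dp
  street_total = 0.0_dp
  serial_total = 0.0_dp

  do step = 1, nsteps
    ! Time only the parallel sections.
    t0 = omp_get_wtime()
    !$omp parallel sections num_threads(2)
    !$omp section
    call update_freeway_vehicles()
    !$omp section
    call update_street_vehicles()
    !$omp end parallel sections
    t_parallel = t_parallel + (omp_get_wtime() - t0)

    ! Time the sequential remainder of the step separately.
    t0 = omp_get_wtime()
    call serial_phase()
    t_serial = t_serial + (omp_get_wtime() - t0)
  end do

  write(*,'(a,f10.4)') 'parallel sections, seconds: ', t_parallel
  write(*,'(a,f10.4)') 'sequential code, seconds:   ', t_serial
  write(*,*) 'checksums: ', freeway_total, street_total, serial_total

contains

  ! Artificial workload; uses only local variables, so it is safe to call
  ! from both sections at once.
  function busy_work(n) result(s)
    integer, intent(in) :: n
    real(dp) :: s
    integer :: i
    s = 0.0_dp
    do i = 1, n
      s = s + sin(real(i, dp))
    end do
  end function busy_work

  ! Stand-ins for the real routines; each writes only its own accumulator,
  ! so the two sections do not interact.
  subroutine update_freeway_vehicles()
    freeway_total = freeway_total + busy_work(200000)
  end subroutine update_freeway_vehicles

  subroutine update_street_vehicles()
    street_total = street_total + busy_work(200000)
  end subroutine update_street_vehicles

  subroutine serial_phase()
    serial_total = serial_total + busy_work(100000)
  end subroutine serial_phase

end program time_phases
```

Because this version calls omp_get_wtime, it needs an OpenMP-enabled build. For the non-OpenMP comparison, SYSTEM_CLOCK could be used as the timer instead.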
Try setting environment variable KMP_BLOCKTIME=0.
If the rest of your program is multi-threaded but non-OpenMP, then this will release spinwait time back to the application.
It should not make a difference if the rest of the program is completely serial.
*** NOTE: if the serial portion is calling the multi-threaded version of MKL, then your application has two distinct OpenMP thread pools (read: oversubscription). Setting KMP_BLOCKTIME would be beneficial in this situation.
Jim Dempsey
Thanks, Jim. That seems to help a little.
Now that I realize I wasn't comparing the run times consistently, I am not too worried about it. The performance is about the same as before when I compare run times for the parallel sections only.
