Hi to everyone,
I am working on parallelizing a Monte Carlo code, written in Fortran 77, using OpenMP. I am now in the testing phase of development, but I am running into problems with the overhead costs of the code. For example, when I analyze it with VTune Amplifier XE I obtain the following summary:
Elapsed Time: 43.352s
Total Thread Count: 5
Overhead Time: 16.560s
Spin Time: 0.847s
CPU Time: 157.369s
Paused Time: 0s
VTune flags the overhead time as excessive. What is worse, I have tested this code with gfortran and the effect is less pronounced there. This is a pity, because without parallelization the code compiled with ifort is much faster than the gfortran build, but as I increase the number of OMP threads (keeping the load per thread constant) the overhead costs make the ifort version slower than the gfortran one.
What I have found is that the threads get "stalled" in a very disorderly fashion, as you can see in the image below.
The code has several subroutines that control the whole Monte Carlo simulation process (for example, random number generation, electron and photon transport, geometry description, etc.). These subroutines communicate with each other through COMMON blocks, so I have had to make some of those blocks private to each thread with the THREADPRIVATE directive where needed. The idea is to keep the original structure of the code as much as possible: this is a widely used code, and the goal is to offer an easy transition to parallelization with OpenMP without changing the core of the program.
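For readers unfamiliar with the pattern, here is a minimal sketch of a COMMON block made per-thread with THREADPRIVATE. It is not taken from the original code: the block name `/rngstate/`, the variable `iseed`, and the subroutine `advance` are all hypothetical, and the source is free-form rather than fixed-form Fortran 77 for readability.

```fortran
! Sketch only: /rngstate/, iseed and advance are hypothetical names,
! not taken from the Monte Carlo code discussed in this thread.
program tp_demo
  use omp_lib
  implicit none
  integer :: iseed
  common /rngstate/ iseed
  ! The directive must follow the COMMON declaration and be repeated
  ! in every program unit that declares the block.
  !$omp threadprivate(/rngstate/)
  integer :: tid

  !$omp parallel private(tid)
  tid = omp_get_thread_num()
  iseed = 1000 + tid          ! each thread writes its own copy
  call advance()              ! the callee sees the calling thread's copy
  !$omp critical
  print *, 'thread', tid, 'iseed =', iseed
  !$omp end critical
  !$omp end parallel
end program tp_demo

subroutine advance()
  implicit none
  integer :: iseed
  common /rngstate/ iseed
  !$omp threadprivate(/rngstate/)
  iseed = iseed + 1
end subroutine advance
```

Built with OpenMP enabled (for example `gfortran -fopenmp`, or `/Qopenmp` with ifort on Windows), each thread should report its own value of `iseed`, which shows that the subroutine interface through COMMON is unchanged while the data is no longer shared.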
I have created a small test code that runs only the random number generator and uses the numbers to estimate the value of PI. This code shows the same problem as the original one. In this small case, I have found that the function _kmp_get_global_thread_id_reg accounts for a large part of the overhead time.
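The test code itself was not posted, so the following is only a sketch of what such a reproducer might look like, under stated assumptions: a generator whose state lives in a THREADPRIVATE COMMON block, a fixed number of samples per thread, and a reduction to combine the counts. The names `/rngcom/`, `seed`, `urand`, and `nper` are hypothetical, the generator is a plain Lehmer (minimal standard) recurrence chosen for brevity, and the per-thread seeding is naive and not statistically vetted.

```fortran
! Sketch only: not the poster's code. The generator and the seeding
! scheme are placeholders and should not be used for production runs.
program pi_mc
  use omp_lib
  implicit none
  integer, parameter :: i8 = selected_int_kind(18)
  integer(i8) :: seed
  common /rngcom/ seed
  !$omp threadprivate(/rngcom/)
  integer(i8), parameter :: nper = 1000000_i8   ! samples per thread
  integer(i8) :: i, hits, ntot
  double precision :: x, y
  double precision, external :: urand

  hits = 0
  ntot = 0
  !$omp parallel private(i, x, y) reduction(+:hits, ntot)
  ! Naive per-thread seed (assumption): distinct, but streams may correlate.
  seed = 12345_i8 + 7919_i8 * omp_get_thread_num()
  do i = 1, nper
     x = urand()
     y = urand()
     if (x*x + y*y <= 1.0d0) hits = hits + 1
  end do
  ntot = ntot + nper            ! constant load per thread
  !$omp end parallel

  print *, 'pi estimate:', 4.0d0 * dble(hits) / dble(ntot)
end program pi_mc

! Uniform deviate in (0,1); state is read from the threadprivate block.
double precision function urand()
  implicit none
  integer, parameter :: i8 = selected_int_kind(18)
  integer(i8) :: seed
  common /rngcom/ seed
  !$omp threadprivate(/rngcom/)
  seed = mod(16807_i8 * seed, 2147483647_i8)
  urand = dble(seed) / 2147483647.0d0
end function urand
```

Every call to `urand` touches the threadprivate state, so a profile of a program shaped like this is dominated by the generator and by whatever the runtime does to locate each thread's copy of the block. That makes it a convenient case for comparing compilers and thread counts.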
I would really appreciate any tips on how to tackle this problem. I have searched for information about it without success. Thanks for your help!
If you compiled with /Qopenmp, all local variables are automatic (unless initialized or SAVEd.)
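A small sketch of the distinction, assuming a hypothetical subroutine `tally` (none of these names come from the thread): the plain local `scratch` is automatic, so each call gets its own copy, while the SAVEd counter is a single static variable that all threads share and therefore needs protection.

```fortran
! Sketch only: tally, scratch and ncount are hypothetical names.
program save_demo
  implicit none
  integer :: i, n

  !$omp parallel do private(n)
  do i = 1, 8
     call tally(n)
  end do
  !$omp end parallel do

  call tally(n)                 ! one more serial call to read the total
  print *, 'calls seen by the SAVEd counter:', n
end program save_demo

subroutine tally(ncalls)
  implicit none
  integer, intent(out) :: ncalls
  integer :: scratch            ! automatic: private to each call
  integer, save :: ncount = 0   ! SAVEd: one static copy shared by all threads

  scratch = 1
  ! Updates to the shared SAVEd variable must be serialized.
  !$omp critical (tally_lock)
  ncount = ncount + scratch
  ncalls = ncount
  !$omp end critical (tally_lock)
end subroutine tally
```

The program reports 9 calls (eight from the parallel loop plus the final serial one). In legacy code, a SAVEd or initialized local that is meant to be per-thread scratch space behaves like `ncount` here, not like `scratch`, so it is worth checking for those.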
@John
One of the goals of this research is to study how OpenMP affects the validity of the MC results, for example, as you say, whether there is any correlation between the running threads. I am concerned not only about the performance but also about the quality of the results. Thanks for all your comments, I will certainly review the code structure.
@Everyone
Thanks for your comments! I will review your suggestions and put them into practice.