Intel® Fortran Compiler

libiomp5md.dll seems to bring a huge overhead

piv-fr
Beginner
I am currently trying to parallelize an old sequential project. It consists of two static libraries and one main program. Using VTune I found the costly loops, so I decided to go with OpenMP. I added several OpenMP directives (all of them omp parallel do), compiled the project with the /Qopenmp option, and ran it with OMP_NUM_THREADS=1. I expected a small overhead, but it actually reached 30%. VTune shows that most of the time in libiomp is spent in kmp_fork_call. I thought the OpenMP threads were created only once for the whole program execution, but my measurements suggest otherwise. Can anyone explain this and, even better, tell me how to fix it? I hope some compiler option can solve my problem.
pbkenned1
Employee

Yes, the OpenMP threads are created once and live in a thread pool for the life of the program. In fact, there is a "hot team" of threads that actively spin, ready for dispatch at the next (non-nested) parallel region.

The environment variable KMP_BLOCKTIME controls how long threads spin after finishing a parallel region before going to sleep. Try setting it to a large value, or even to 'infinite' for an unlimited spin time. For example, on Windows:
set KMP_BLOCKTIME=infinite

The variable defaults to 200 ms. If the value is too small, the threads go to sleep between parallel regions and have to be woken up at the next one, so kmp_fork_call() will definitely show up in the VTune profile.
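To check whether the block time really is the cause, it can help to run the same binary under a few different settings and compare the profiles. Below is a minimal sketch for a POSIX shell (Linux, or a shell such as Git Bash on Windows). The helper name run_with_blocktime is hypothetical, and OMP_NUM_THREADS=1 is set only to mirror the run described in the question.

```shell
#!/bin/sh
# Hypothetical helper: run a command with a chosen KMP_BLOCKTIME.
# The OpenMP runtime reads these variables when the program starts,
# so they must be in the environment before the program is launched.
run_with_blocktime() {
    blocktime="$1"   # a number of milliseconds (e.g. 0 or 200), or "infinite"
    shift            # the remaining arguments are the command to run
    # OMP_NUM_THREADS=1 mirrors the single-thread run from the question;
    # change or drop it for a real multi-threaded measurement.
    env KMP_BLOCKTIME="$blocktime" OMP_NUM_THREADS=1 "$@"
}
```

For example, run_with_blocktime infinite ./my_app (where ./my_app stands in for your own executable) launches the program with an unlimited spin time. Repeating the run with 0 and 200 under VTune or a wall-clock timer should show how much of the kmp_fork_call() time depends on this setting.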

Patrick Kennedy
Intel Developer Support