Hello!
I have been struggling all weekend to build libiomp5.dll on Visual Studio 2010. I wonder if anyone can offer some help. First, let me explain the motivation.
One of our software components, the solver, uses OpenMP and detects the maximum number of threads by calling omp_get_max_threads().
We noticed that if that call is made as soon as the product is launched, then we get 8 threads throughout the application. If that call is not made, then the solver only gets 4 threads when it starts. This leads us to believe that another component is setting the maximum number of threads to 4, and that this becomes the global value used later.
We have no idea how to verify whether this is the case, so I thought I could build the Intel multithreading library libiomp5md.dll (which is imported by several other DLLs in the product) in debug mode and see what makes calls into that library.
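A lighter alternative to a debug build might be to ask the already loaded runtime what it currently reports, for example from a Python console inside the product. The sketch below is only an illustration: `openmp_limits` and `load_runtime` are hypothetical helper names, the DLL name is assumed to be `libiomp5md.dll`, and it assumes that loading the DLL by name returns the copy the product already uses rather than a second instance. `omp_get_max_threads()` and `omp_get_num_procs()` are standard OpenMP API calls.

```python
import ctypes


def load_runtime(name="libiomp5md.dll"):
    """Load the OpenMP runtime by name.

    Assumption: `name` resolves to the same DLL the product has already
    loaded, so the values reported are the ones the solver will see.
    """
    return ctypes.CDLL(name)


def openmp_limits(runtime):
    """Return the limits reported by an OpenMP runtime object."""
    # ctypes assumes an int return type by default, which matches both calls.
    return {
        "max_threads": runtime.omp_get_max_threads(),
        "num_procs": runtime.omp_get_num_procs(),
    }


# Example use from a console (not executed here):
#   print(openmp_limits(load_runtime()))
```

Calling this at a few points (right after launch, after the user interface is up, just before the solver starts) could narrow down when the value changes. Given the observation above, the query itself may alter the outcome if it happens to be the first OpenMP call in the process.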
I downloaded the source code from here [https://github.com/llvm-mirror/openmp/tree/release_35], installed CMake, and set out to build that component... and got nowhere!
The build system seems to choke on a post-build event, even though the individual projects do not have any post-build events specified in VS. Here is the error I see:
Let me ask a couple of questions.
What are those "magic" numbers - 8 and 4 threads? ;) Is it a 4-core system with hyper-threading you are running on?
Do you set any OpenMP environment variables (OMP_*, KMP_*) before or during application run?
Do you call any other omp_* API (except omp_get_max_threads())?
Do you use any resource management system that could limit the number of cores to use?
Also, can you please set the following environment variables before running your application and post the output here?
set KMP_VERSION=1
set KMP_SETTINGS=1
set KMP_AFFINITY=verbose
Thank you.
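For repeated runs, the three settings above can also be applied to a single launch without touching the global environment. This is a minimal sketch, assuming the application is started from a Python script; `diagnostic_env` and `run_with_diagnostics` are hypothetical helper names, and the command is whatever launches the product.

```python
import os
import subprocess

# The three diagnostic settings requested above.
DIAGNOSTIC_VARS = {
    "KMP_VERSION": "1",
    "KMP_SETTINGS": "1",
    "KMP_AFFINITY": "verbose",
}


def diagnostic_env(base=None):
    """Return a copy of the environment with the diagnostic settings added."""
    env = dict(os.environ if base is None else base)
    env.update(DIAGNOSTIC_VARS)
    return env


def run_with_diagnostics(command):
    """Run `command` (a list of arguments) with the diagnostics enabled.

    Both output streams are captured so the runtime's report can be
    copied into a reply, whichever stream it is written to.
    """
    return subprocess.run(
        command, env=diagnostic_env(), capture_output=True, text=True
    )
```

The variables only affect the child process started by `run_with_diagnostics`, so other components launched separately keep their usual environment.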
Hi Olga, thanks for your reply.
If I set OMP_NUM_THREADS=8 in the command window from which I start the product, then that setting is retained and our (downstream) application tells us that 8 threads were used in the solver.
This could be a solution, except that we are responsible for just one part of the software, and defining a global setting for everything seems problematic. So I thought it would be better to understand who/what sets the max_num_threads to 4 and address the issue from there.
Note that this setting persists through the product ONLY if I set that environment variable in the cmd window from which I start the product. If I set it via Python (we have a Python console in our product), then the value I set (e.g. 8) doesn't seem to be retained.
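One plausible explanation for the Python console behaviour: the OpenMP specification states that environment variables are read when the program starts and that later modifications, even by the program itself, are ignored by the runtime. A value set in the console would then only reach processes started afterwards. The sketch below illustrates that inheritance; `child_sees` is a hypothetical helper name, and the child here is just another Python interpreter standing in for any newly launched program.

```python
import os
import subprocess
import sys


def child_sees(name, value):
    """Set `name` in this process, then report what a new child process sees."""
    os.environ[name] = value
    # The child prints its own view of the variable.
    probe = "import os; print(os.environ.get(%r, ''))" % name
    result = subprocess.run(
        [sys.executable, "-c", probe], capture_output=True, text=True
    )
    return result.stdout.strip()
```

Within an already running process, the OpenMP API call `omp_set_num_threads()` is the documented way to change the thread count for subsequent parallel regions, so that may be worth trying instead of the environment variable.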
I found that the code that limits the threads to 4 (or 1 on Linux) is in the OpenSceneGraph library, which is used for the user interface. Since we neither compile nor own that code, we cannot see where the maximum number of threads is set in OpenSceneGraph, but it is likely that the library limits the number of threads or otherwise manages resources.
So that is what I found. I am unsure what the solution is at this point, but at least I know the culprit.
As far as your other questions, here is the output:
Hello!
Thanks for providing the output.
What I can see from it is that your application was definitely given limited resources:
OMP: Info #154: KMP_AFFINITY: Initial OS proc set respected: {0,1,4,5}
OMP: Info #156: KMP_AFFINITY: 4 available OS procs
OMP: Info #157: KMP_AFFINITY: Uniform topology
OMP: Info #179: KMP_AFFINITY: 1 packages x 2 cores/pkg x 2 threads/core (2 total cores)
It means only 4 logical processors (OS procs 0, 1, 4 and 5) are available to your process. So, when you set OMP_NUM_THREADS=8, you get oversubscription, which would be harmful for performance.
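The arithmetic behind that conclusion can be sketched by reading the processor set out of the verbose affinity line and comparing it with the requested thread count. This is only an illustration: `parse_proc_set` and `oversubscription` are hypothetical helper names, and the parser assumes the set is printed in braces as a comma-separated list, optionally with `a-b` ranges.

```python
import re


def parse_proc_set(line):
    """Extract the OS processor ids from a line such as
    'OMP: Info #154: KMP_AFFINITY: Initial OS proc set respected: {0,1,4,5}'.
    """
    match = re.search(r"\{([^}]*)\}", line)
    procs = []
    if match:
        for token in match.group(1).split(","):
            token = token.strip()
            if "-" in token:
                # Assumed range notation, e.g. '0-3' meaning 0, 1, 2, 3.
                first, last = token.split("-")
                procs.extend(range(int(first), int(last) + 1))
            elif token:
                procs.append(int(token))
    return procs


def oversubscription(requested_threads, procs):
    """Return how many threads would share each available processor."""
    return requested_threads / len(procs)


line = "OMP: Info #154: KMP_AFFINITY: Initial OS proc set respected: {0,1,4,5}"
procs = parse_proc_set(line)
print(len(procs), oversubscription(8, procs))  # prints "4 2.0"
```

With the set reported above, 8 requested threads means two threads per available processor.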
