Intel® C++ Compiler

building libiomp5md.dll



I have been struggling all weekend to build libiomp5md.dll with Visual Studio 2010, and I wonder if anyone can offer some help. First, let me explain the motivation.

One of our software components, the solver, uses OpenMP and detects the maximum number of threads by calling omp_get_max_threads().

We noticed that if that call is made as soon as the product is launched, we get 8 threads throughout the application. If that call is not made at launch, the solver only gets 4 threads when it starts. This leads us to believe that another component is setting the maximum number of threads to 4, and that this becomes the global value used later.
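One low-effort way to narrow down when the value changes, without rebuilding the runtime, is to log omp_get_max_threads() at several points during startup (for example before and after each major component initializes). The following is only a sketch: probe_max_threads and its label argument are hypothetical names, not part of the product or of the OpenMP runtime, and the fallback value of 1 for builds without OpenMP is an assumption made so the snippet compiles either way.

```cpp
#include <cassert>
#include <cstdio>
#ifdef _OPENMP
#include <omp.h>
#endif

// Hypothetical helper: logs and returns the thread count the OpenMP runtime
// would use for the next parallel region at this point in the program.
int probe_max_threads(const char* label) {
#ifdef _OPENMP
    int n = omp_get_max_threads();
#else
    int n = 1;  // assumption: report 1 when compiled without OpenMP support
#endif
    assert(n >= 1);
    std::fprintf(stderr, "[omp-probe] %s: max threads = %d\n", label, n);
    return n;
}
```

Calling the helper with a different label at each stage shows between which two stages the reported value drops from 8 to 4.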

We have no idea how to verify whether this is the case, so I thought I could build the Intel multithreading library libiomp5md.dll (which is imported by several other DLLs in the product) in debug mode and see what makes calls into that library.

I downloaded the source code from here [], installed CMake, and set out to build that component... and got nowhere!

The build system seems to choke on a post build event even if the individual projects do not have any post-build events specified in VS. Here is the error I see: 

4>  Building Custom Rule D:/work/openmp-release_35/runtime/CMakeLists.txt
4>  CMake does not need to re-run because D:/work/openmp-release_35/runtime/build/CMakeFiles/generate.stamp is up-to-date.
4>  Generating libiomp.rc
4>  Too many argument(s)
4>  Try --help option for more information.
4>C:\Program Files (x86)\MSBuild\Microsoft.Cpp\v4.0\Microsoft.CppCommon.targets(151,5): error MSB6006: "cmd.exe" exited with code 255.
4>Build FAILED.
And this is the command I used to build the solution file: 
>cmake -DCMAKE_C_COMPILER=cl -DCMAKE_CXX_COMPILER=cl -G "Visual Studio 10 2010 Win64" -Darch=32e -DCMAKE_BUILD_TYPE=Debug -Dversion=5  ..
A general question first: how can I detect what component is making a call to the omp library? Is my strategy correct? 
And as far as the build: has anyone run into this issue? Was anyone successful in building libiomp5md.dll with TRACING enabled? 
All I want is a library that emits some trace information at runtime so I can investigate further.
Thanks & Regards. 
Andrea P.
Altair Engineering. 

Let me ask a couple of questions.

What are those "magic" numbers - 8 and 4 threads? ;)  Is it a 4-core system with hyper-threading you are running on?

Do you set any OpenMP environment variables (OMP_*, KMP_*) before or during application run?

Do you call any other omp_* API (except omp_get_max_threads())?

Do you use any resource management system that could limit the number of cores to use?

Also can you please set the following environment variables before running your application and provide output here?



set KMP_AFFINITY=verbose

Thank you.


Hi Olga, thanks for your reply.

If I set OMP_NUM_THREADS=8 in the command window from which I start the product, then that setting is retained and our (downstream) application tells us that 8 threads were used in the solver.

This could be a solution, except that we are responsible for just one part of the software and defining a global setting for all seems a bit of a problem. So I thought that it's better to understand who/what sets the max_num_threads to 4 and address the issue from there. 
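If a product-wide setting is undesirable, one option that stays inside the solver is the standard num_threads clause, which requests a team size for a single parallel region and leaves the global default untouched. This is a minimal sketch, not the solver's actual code: run_solver_region and requested_threads are hypothetical names, and the runtime may still grant fewer threads than requested.

```cpp
#include <cassert>
#ifdef _OPENMP
#include <omp.h>
#endif

// Hypothetical solver entry point: runs one parallel region with an explicit
// team size and returns the team size the runtime actually granted.
int run_solver_region(int requested_threads) {
    assert(requested_threads >= 1);
    int granted = 1;  // stays 1 when compiled without OpenMP support
#ifdef _OPENMP
    // num_threads applies to this region only; other components keep
    // whatever default the runtime has for them.
    #pragma omp parallel num_threads(requested_threads)
    {
        #pragma omp single
        granted = omp_get_num_threads();
    }
#endif
    return granted;
}
```

Comparing the returned value with the requested one shows whether the runtime honored the request. Asking for more threads than there are processors available to the process would still oversubscribe them.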

Note that this setting persists through the product ONLY if I set the environment variable in the cmd window from which I start the product. If I set it via Python (we have a Python console in our product), then the value I set (e.g. 8) doesn't seem to be retained.

I found that the area of the code that limits the threads to 4 (or 1 on Linux) is the OpenSceneGraph library, which is used for the user interface. Since we neither compile nor own that code, we cannot see where the maximum number of threads is defined in OpenSceneGraph, but it's likely that the library limits the number of threads or otherwise manages resources.

So that is what I found. I am unsure what the solution is at this point, but at least I know the culprit.

As far as your other questions, here is the output: 

1mbd:hw>Qt: Untested Windows version 6.2 detected!
--- 07-FEB-2018 11:19:08 ---
Intel(R) OMP Copyright (C) 1997-2013, Intel Corporation. All Rights Reserved.
Intel(R) OMP version: 5.0.20130227
Intel(R) OMP library type: performance
Intel(R) OMP link type: dynamic
Intel(R) OMP build time: 2013-02-27 09:53:18 UTC
Intel(R) OMP build compiler: Intel C++ Compiler 12.1
Intel(R) OMP alternative compiler support: yes
Intel(R) OMP API version: 3.1 (201107)
Intel(R) OMP dynamic error checking: no
Intel(R) OMP thread affinity support: not used
Intel(R) OMP debugger support version: 1.1
User settings:
Effective settings:
   KMP_CPUINFO_FILE: value is not defined
   KMP_FORCE_REDUCTION: value is not defined
   KMP_MONITOR_STACKSIZE: value is not defined
   OMP_NUM_THREADS: value is not defined
   OMP_PLACES: value is not defined
OMP: Info #204: KMP_AFFINITY: decoding x2APIC ids.
OMP: Info #202: KMP_AFFINITY: Affinity capable, using global cpuid leaf 11 info
OMP: Info #154: KMP_AFFINITY: Initial OS proc set respected: {0,1,4,5}
OMP: Info #156: KMP_AFFINITY: 4 available OS procs
OMP: Info #157: KMP_AFFINITY: Uniform topology
OMP: Info #179: KMP_AFFINITY: 1 packages x 2 cores/pkg x 2 threads/core (2 total cores)
OMP: Info #147: KMP_AFFINITY: Internal thread 0 bound to OS proc set {0,1,4,5}
Intel(R) OMP Intel(R) RML support: not using


Thanks for providing the output.

What I can see from it is that your application was definitely given limited resources:

OMP: Info #154: KMP_AFFINITY: Initial OS proc set respected: {0,1,4,5}

OMP: Info #156: KMP_AFFINITY: 4 available OS procs

OMP: Info #157: KMP_AFFINITY: Uniform topology

OMP: Info #179: KMP_AFFINITY: 1 packages x 2 cores/pkg x 2 threads/core (2 total cores)

This means only 4 logical processors are available to your process. So when you set OMP_NUM_THREADS=8, you oversubscribe them, which would hurt performance.
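The "Initial OS proc set respected" line reflects the process affinity mask seen by the OpenMP runtime when it initialized. To check whether some component narrows that mask, the mask can be read directly and compared at different points during startup. The sketch below assumes a single processor group on Windows (at most 64 logical processors); allowed_processor_count is a hypothetical helper name, and the -1 return value for a failed or unsupported query is a convention chosen only for this example.

```cpp
#include <cassert>
#include <cstdio>
#ifdef _WIN32
#include <windows.h>
#elif defined(__linux__)
#include <sched.h>
#endif

// Hypothetical helper: counts the logical processors the current process is
// allowed to run on, or returns -1 if the query fails or is unsupported.
int allowed_processor_count() {
#ifdef _WIN32
    DWORD_PTR process_mask = 0, system_mask = 0;
    if (!GetProcessAffinityMask(GetCurrentProcess(), &process_mask, &system_mask))
        return -1;
    int count = 0;
    for (DWORD_PTR m = process_mask; m != 0; m >>= 1)
        count += static_cast<int>(m & 1);  // one set bit per allowed processor
    return count;
#elif defined(__linux__)
    cpu_set_t set;
    CPU_ZERO(&set);
    if (sched_getaffinity(0, sizeof(set), &set) != 0)
        return -1;
    return CPU_COUNT(&set);
#else
    return -1;  // assumption: other platforms are not covered by this sketch
#endif
}
```

If the count is 8 right after launch and 4 once the user interface has initialized, the restriction is applied inside the process rather than by the environment that launched it.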