Hi, I have an application in C++ that uses the Qt libraries.
When I call omp_get_num_procs() from any part of the program, it doesn't return the maximum number of processors that my machine has, so all the threads are distributed only among the processors it reports as available.
But if I call omp_get_num_procs() from main.cpp, before the QApplication constructor, I get all the processors, and the threads are distributed across all the processors that the machine has.
I've tried to find out what exactly omp_get_num_procs() does that changes the processors available to the application.
According to the OpenMP spec:
The omp_get_num_procs routine returns the number of processors that are available to the device at the time the routine is called. This value may change between the time that it is determined by the omp_get_num_procs routine and the time that it is read in the calling context due to system actions outside the control of the OpenMP implementation.
You may need to check the context when calling this routine. E.g. some processors may be occupied by Qt instances.
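One way to check the context is to log the value at several call sites, for example just before and just after the QApplication constructor. The helper below is only a sketch: report_processors is a made-up name, and std::thread::hardware_concurrency() is printed as a second data point whose relationship to the affinity mask depends on the platform and runtime.

```cpp
#include <cstdio>
#include <thread>
#ifdef _OPENMP
#include <omp.h>
#endif

// Hypothetical helper (not part of Qt or OpenMP): print what the OpenMP
// runtime and the C++ standard library report at this call site.
// Returns the OpenMP processor count, or -1 when built without OpenMP.
int report_processors(const char* where) {
#ifdef _OPENMP
    const int omp_procs = omp_get_num_procs();
#else
    const int omp_procs = -1;  // compiled without OpenMP support
#endif
    // May return 0 if the value is not computable on this platform.
    const unsigned hw = std::thread::hardware_concurrency();
    std::printf("[%s] omp_get_num_procs=%d hardware_concurrency=%u\n",
                where, omp_procs, hw);
    return omp_procs;
}
```

Calling report_processors("before QApplication") and report_processors("after QApplication") in main.cpp would show whether the count changes across the constructor.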
It is likely that the QApplication constructor altered the application's process affinity bitmap from all available logical processors to some subset of them. You will have to consult the QApplication constructor documentation or other related documentation, probably under a heading about reserved logical processors.
It may be unwise to assume that it is OK to insert an OpenMP parallel region prior to calling the QApplication constructor in order to instantiate a full thread team. Read the documentation. Using the extra thread(s) may be safe or it may not be safe (it may cause a program deadlock). Check the documentation from whoever provided QApplication.
Hi Yolanda and Jim,
Thanks for your reply. I'll check the spec.
But the thing I don't understand is why calling omp_get_num_procs before QApplication (from the Qt framework) changes the number of processors that are used in an omp parallel region.
All this happens on Windows, by the way. On Linux I don't have this problem.
The O/S sees all hardware threads (logical processors); this set is known as the system bitmask. When an application (otherwise called a process) starts, the O/S provides a bitmap of permitted logical processors (the process bitmask, usually set to the system bitmask). This typically is all of the logical processors, however it can be a subset. In your case, at pre-QApplication it is all of the logical processors. A process has the ability to specify a new process bitmask by choosing a subset of the logical processors provided to it by the O/S, for use in subsequently created threads. Apparently QApplication is doing this. Thus when OpenMP initiates its thread pool, it (OpenMP) queries the process bitmask, sees a subset of the system bitmask, and honors the request.
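To see the process bitmask directly, independent of OpenMP, the process can ask the O/S for it. The following is a minimal sketch, assuming GetProcessAffinityMask on Windows and sched_getaffinity on Linux; permitted_processor_count is a hypothetical helper name.

```cpp
#if defined(_WIN32)
#include <windows.h>
#elif defined(__linux__)
#include <sched.h>
#endif

// Hypothetical helper: number of logical processors the current process
// (on Linux: the calling thread) is permitted to run on.
// Returns -1 on failure or on an unsupported platform.
int permitted_processor_count() {
#if defined(_WIN32)
    DWORD_PTR process_mask = 0, system_mask = 0;
    // On machines with more than 64 logical processors these masks only
    // describe the processor group the process belongs to.
    if (!GetProcessAffinityMask(GetCurrentProcess(),
                                &process_mask, &system_mask)) {
        return -1;
    }
    int count = 0;
    for (DWORD_PTR m = process_mask; m != 0; m &= (m - 1)) {
        ++count;  // clear the lowest set bit each iteration
    }
    return count;
#elif defined(__linux__)
    cpu_set_t set;
    CPU_ZERO(&set);
    if (sched_getaffinity(0, sizeof(set), &set) != 0) {
        return -1;
    }
    return CPU_COUNT(&set);
#else
    return -1;
#endif
}
```

If this count drops across the QApplication constructor, that would support the affinity explanation; if it stays the same, the cause lies elsewhere.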
QApplication apparently sees (has) a requirement to restrict the number of logical processors for the application to a subset of the initial process bitmask. The QApplication documentation might discuss this topic further.
An example of this for non-Qt applications is when the SMP system runs an MPI application with multiple ranks (e.g. one rank per socket, or one rank per subset of logical processors on the system). In these cases, each MPI rank (process) is directed to use a different subset of the system's logical processors. In other cases, when an application requires a high-priority thread to run unobstructed, the application (after instantiating this thread) may then specify a new process bitmask that omits the logical processor(s) used by this thread (or these threads).
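For illustration only, here is a sketch of how a process can shrink its own bitmask, here down to a single logical processor. It is not a claim about what QApplication actually does; restrict_to_one_processor is a hypothetical name, and the calls assumed are SetProcessAffinityMask on Windows and sched_setaffinity on Linux.

```cpp
#if defined(_WIN32)
#include <windows.h>
#elif defined(__linux__)
#include <sched.h>
#endif

// Hypothetical helper: restrict affinity to the lowest-numbered permitted
// logical processor. Returns the number of permitted processors afterwards
// (expected: 1), or -1 on failure or on an unsupported platform.
int restrict_to_one_processor() {
#if defined(_WIN32)
    DWORD_PTR process_mask = 0, system_mask = 0;
    HANDLE self = GetCurrentProcess();
    if (!GetProcessAffinityMask(self, &process_mask, &system_mask) ||
        process_mask == 0) {
        return -1;
    }
    const DWORD_PTR lowest = process_mask & (~process_mask + 1);  // lowest set bit
    if (!SetProcessAffinityMask(self, lowest)) return -1;
    if (!GetProcessAffinityMask(self, &process_mask, &system_mask)) return -1;
    int count = 0;
    for (DWORD_PTR m = process_mask; m != 0; m &= (m - 1)) ++count;
    return count;
#elif defined(__linux__)
    // With pid 0 these calls act on the calling thread; threads created
    // afterwards inherit the mask of the thread that creates them.
    cpu_set_t set;
    CPU_ZERO(&set);
    if (sched_getaffinity(0, sizeof(set), &set) != 0) return -1;
    for (int cpu = 0; cpu < CPU_SETSIZE; ++cpu) {
        if (CPU_ISSET(cpu, &set)) {
            cpu_set_t one;
            CPU_ZERO(&one);
            CPU_SET(cpu, &one);
            if (sched_setaffinity(0, sizeof(one), &one) != 0) return -1;
            if (sched_getaffinity(0, sizeof(one), &one) != 0) return -1;
            return CPU_COUNT(&one);
        }
    }
    return -1;
#else
    return -1;
#endif
}
```

A runtime that sizes its thread pool from the bitmask after such a call would see only the reduced set, which is the behavior described above.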