Intel® Fortran Compiler

Strange problem with OpenMP

Henrik2
Beginner
2,479 Views
Hi,

I have a strange problem with OpenMP.
My computer has one processor with four cores. Sometimes OMP_GET_MAX_THREADS() returns 4 and sometimes it only returns 1. The behavior seems to be totally random.

Any idea on what is going on?

Compiler settings:
/nologo /Qparallel /Qopenmp /Qopenmp-report1 /warn:declarations /real_size:64 /fpconstant /module:"x64\Release\" /object:"x64\Release\" /libs:static /threads /c

Is it the combination /Qparallel /Qopenmp that is the problem?

I am using the latest version of Intel Visual Fortran.

Thanks,
Henrik
12 Replies
TimP
Honored Contributor III
If you have set OMP_NUM_THREADS environment variable, or omp_set_num_threads(), omp_get_max_threads() should report that value. If it is not set, there are alternate ways in use; by default it should see the number of cores present. I guess /Qparallel may result in omp_set_num_threads() calls in places you couldn't predict, other than by watching /Qpar_report to see where it comes in. There are ways of using /Qparallel and OpenMP together, some of which should be fully OK, others happen to work for now but aren't recommended.
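The relationship described above can be checked with a short program. This is a minimal sketch, assuming the code is built with /Qopenmp and run outside any parallel region; the program name and the value 2 passed to omp_set_num_threads() are arbitrary choices for illustration.

```fortran
program report_threads
  use omp_lib
  implicit none

  ! Values seen by the initial thread before any parallel region starts.
  print '(a,i0)', 'omp_get_num_procs()            = ', omp_get_num_procs()
  print '(a,i0)', 'omp_get_max_threads() (default) = ', omp_get_max_threads()

  ! An explicit request should be reflected by the next query.
  call omp_set_num_threads(2)
  print '(a,i0)', 'omp_get_max_threads() (after)   = ', omp_get_max_threads()
end program report_threads
```

If the default value changes from run to run while OMP_NUM_THREADS is unset, the difference is probably coming from the environment the program is launched in rather than from the program itself.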
Henrik2
Beginner
Yes, omp_set_num_threads() followed by omp_get_max_threads() works.

It also seems to work by using call omp_set_num_threads(omp_get_num_procs())

I need to know the total number of cores on the machine the program is running on.
Will omp_get_num_procs() return the total number of cores if I have more than one processor with more than one core in each processor?

Henrik


jimdempseyatthecove
Honored Contributor III

Are you in a parallel region (with nesting OFF) when you issue OMP_GET_MAX_THREADS()?

If so, the maximum thread count reported for the next (nested) parallel region will be 1, reflecting that nested parallelism is disabled.

Jim Dempsey
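One way to see whether the query is being made from inside a parallel region is to print the same values both outside and inside one. This is a rough sketch only; the value reported inside the region depends on the OpenMP runtime and version in use, so no particular output is assumed here.

```fortran
program nested_check
  use omp_lib
  implicit none

  ! Query from the initial thread, outside any parallel region.
  print '(a,l1)', 'outside: in parallel = ', omp_in_parallel()
  print '(a,i0)', 'outside: max threads = ', omp_get_max_threads()

  !$omp parallel
  !$omp single
  ! Same queries issued by one thread of the team, inside the region.
  print '(a,l1)', 'inside:  in parallel = ', omp_in_parallel()
  print '(a,i0)', 'inside:  team size   = ', omp_get_num_threads()
  print '(a,i0)', 'inside:  max threads = ', omp_get_max_threads()
  !$omp end single
  !$omp end parallel
end program nested_check
```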
TimP
Honored Contributor III
When OpenMP discusses number of processors, it could be number of cores, or number of HyperThreads, but is restricted to those available at the point of the function call (in a parallel region). Number of processors here doesn't mean number of sockets/packages, although the term is often used that way outside of OpenMP.
Henrik2
Beginner
No, I am not in a parallel region when the omp_get_max_threads() command is executed.

The problem is that omp_get_max_threads() returns 1 or 4 in a seemingly random fashion each time the program is run. The behavior is the same if the /Qparallel option is removed (/Qopenmp is still used).

I run the Fortran program from MATLAB with the command !start fortranprogram.exe. In MATLAB, ! passes the rest of the line to the operating system, so this executes the DOS command start fortranprogram.exe. Maybe the problem with omp_get_max_threads() is related to MATLAB?

Henrik
jimdempseyatthecove
Honored Contributor III
Henrik,

When you read 1 from omp_get_max_threads(), have your Fortran program write a message to this effect and issue a READ so that the program waits for input. When the message appears, launch the Task Manager, open the Processes tab, right-click on MATLAB.EXE, and click on Set Affinity.

See if MATLAB is restricted to one "CPU" (CPU meaning hardware thread). If MATLAB is restricted to 1 "CPU", then any process it spawns will be restricted to that CPU. In that case, look into potential reasons why MATLAB is restricted to 1 "CPU".

If MATLAB is set to all "CPUs" then select your FortranProgram.exe, right click on it, click on Set Affinity, and then see if the Affinity is restricted to 1 CPU. If it is restricted to 1 CPU then the code that launches the Fortran app is somehow performing the restrictions (e.g. the program START has affinity restrictions).

Jim Dempsey
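The pause-and-inspect step above could look roughly like this. It is a sketch under the assumption that the check runs on the initial thread at program start; the program name and message text are illustrative.

```fortran
program pause_if_serial
  use omp_lib
  implicit none
  integer :: nthreads

  nthreads = omp_get_max_threads()
  if (nthreads == 1) then
    ! Hold the process open so its affinity can be inspected in Task Manager.
    print '(a)', 'omp_get_max_threads() returned 1.'
    print '(a)', 'Check Set Affinity in Task Manager, then press Enter.'
    read (*, *)   ! an empty input list just waits for one line of input
  end if

  print '(a,i0)', 'Continuing with max threads = ', nthreads
end program pause_if_serial
```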
jimdempseyatthecove
Honored Contributor III

Also try inserting this into your code, after the point where the thread count is found to be 1 and before the READ:

! at top
character (LEN=256) value
integer :: length
...
CALL GET_ENVIRONMENT_VARIABLE ('OMP_NUM_THREADS', value,length)
write(*,*) value

If value is '1' then something is setting the environment variable to '1'. If all blanks then it may be the default Affinity (prior post of mine).

Jim Dempsey
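A slightly fuller version of the same check can use the optional STATUS argument of GET_ENVIRONMENT_VARIABLE to tell an unset variable apart from one that is set but blank. This is a sketch; the program and variable names are illustrative.

```fortran
program show_omp_env
  implicit none
  character(len=256) :: value
  integer :: length, stat

  call get_environment_variable('OMP_NUM_THREADS', value, length, stat)

  select case (stat)
  case (0)
    ! Variable exists; length is 0 if it is set to an empty string.
    print '(3a)', 'OMP_NUM_THREADS = "', trim(value), '"'
  case (1)
    print '(a)', 'OMP_NUM_THREADS is not set'
  case default
    ! -1 means the value was truncated; 2 or more is processor dependent.
    print '(a,i0)', 'Lookup problem, status = ', stat
  end select
end program show_omp_env
```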
Henrik2
Beginner
The problem is solved. It is a Matlab problem.
The Fortran program is run from Matlab by the command !start fortranprogram.exe
Eigenvalues are calculated with the Matlab command eig() at some places in the Matlab script. It is after the use of eig() that OMP_GET_MAX_THREADS() in the Fortran program reports 1 (the Fortran program is run after the eig() command). I have tested this several times. I calculate the eigenvalues of a 704x704 matrix. The problem does not seem to occur for small matrices. Executing the following command in Matlab is enough for the problem to occur:
A=rand(704); [V, D]=eig(A);

Fortunately, it seems to be possible to get around this by using
CALL OMP_SET_NUM_THREADS(OMP_GET_NUM_PROCS()).
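A minimal sketch of this workaround, with a check of the team size actually obtained, might look like the following. The program and variable names are illustrative, and it assumes omp_get_num_procs() itself still reports all cores in the affected runs, as observed in this case.

```fortran
program force_all_procs
  use omp_lib
  implicit none
  integer :: team

  ! Request one thread per processor reported by the runtime.
  call omp_set_num_threads(omp_get_num_procs())

  team = 0
  !$omp parallel
  !$omp single
  team = omp_get_num_threads()   ! size of the team actually created
  !$omp end single
  !$omp end parallel

  print '(a,i0)', 'Threads in parallel region: ', team
end program force_all_procs
```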

I also tested what Jim Dempsey suggested:
character (LEN=256) value
integer :: length
CALL GET_ENVIRONMENT_VARIABLE ('OMP_NUM_THREADS', value,length)
write(*,*) value
value comes back all blanks, so the environment variable OMP_NUM_THREADS is not changed by Matlab at any time.
So the question is what eig() in Matlab is doing (I am using Matlab 2009B). There is a maxNumCompThreads command in Matlab. It always reports 4 on my machine with four cores (also after the eig() command).
Henrik
TimP
Honored Contributor III
Matlab may have its reasons for calling OMP_SET_NUM_THREADS itself, but it seems lacking in quality of implementation if it doesn't restore the value it found. Of course, if you call Matlab from inside a parallel region without setting OMP_NESTED, it may decide that it has only 1 thread available. It looks like Matlab may be sharing the Intel OpenMP runtime, which should be OK if you take care to avoid linking multiple copies. If Matlab used another OpenMP library, other problems might arise.
Grant_H_Intel
Employee

Henrik,

I want to clear up any misconceptions expressed in this thread related to your last question below:

"I need to know the total number of cores on the machine the program is running on. Will omp_get_num_procs() return the total number of cores if I have more than one processor with more than one core in each processor?"

The answer is yes. omp_get_num_procs() returns the total number of "processors" that the operating system says are available for running a program. Note that "processors" in this context could also include thread contexts (if hyperthreading is supported by your processor). In practice, omp_get_num_procs() returns the number of processors indicated by "top" (for Linux and MacOS) or "taskmgr" (for Windows) if these programs are run in the same OS environment as the OpenMP program.

Therefore, your use of omp_set_num_threads(omp_get_num_procs()) makes sense if you want to take advantage of all the processing units available on the machine.

I hope that helps.

- Grant

Lev_N_Intel
Employee

Hi Henrik,

Could you please do one experiment for me?

  1. Remove the call to omp_set_num_threads() from your program to bring the problem back.
  2. Set the environment variable KMP_AFFINITY to "verbose" (without quotes) in My Computer -> Properties -> Advanced -> Environment Variables.
  3. Do not forget to restart MatLab after setting KMP_AFFINITY.
  4. Run your program a few times to see whether the problem still exists.

Let me know the result.

Thanks,

Lev.

jimdempseyatthecove
Honored Contributor III

I haven't verified if omp_get_num_procs()...

returns the number of hardware threads the system has

or

returns the number of hardware threads the application is restricted to run on.

For the latter, on Windows you can call GetProcessAffinityMask.

Note that an application may be restricted to run on specific hardware threads, which may total fewer than the hardware threads present on the system. When tuning the number of threads, the number of hardware threads available to the application (GetProcessAffinityMask) may be more important than the number of hardware threads available on the system.
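As a rough illustration, GetProcessAffinityMask can be called from Fortran through ISO_C_BINDING. This sketch makes several assumptions: a 64-bit Windows build (where bind(C) matches the Win32 calling convention), hand-written interfaces rather than a vendor-supplied module, and a system with at most 64 logical processors (a single processor group). The program name is illustrative.

```fortran
program affinity_count
  use, intrinsic :: iso_c_binding, only: c_intptr_t, c_int
  implicit none

  interface
    ! HANDLE GetCurrentProcess(void);  HANDLE is pointer sized.
    function GetCurrentProcess() bind(C, name='GetCurrentProcess') result(h)
      import :: c_intptr_t
      integer(c_intptr_t) :: h
    end function GetCurrentProcess

    ! BOOL GetProcessAffinityMask(HANDLE, PDWORD_PTR, PDWORD_PTR);
    function GetProcessAffinityMask(hProcess, processMask, systemMask) &
        bind(C, name='GetProcessAffinityMask') result(ok)
      import :: c_intptr_t, c_int
      integer(c_intptr_t), value       :: hProcess
      integer(c_intptr_t), intent(out) :: processMask, systemMask
      integer(c_int) :: ok
    end function GetProcessAffinityMask
  end interface

  integer(c_intptr_t) :: pmask, smask

  if (GetProcessAffinityMask(GetCurrentProcess(), pmask, smask) /= 0) then
    ! Each set bit is one logical processor; popcnt counts the set bits.
    print '(a,i0)', 'Hardware threads allowed for this process: ', popcnt(pmask)
    print '(a,i0)', 'Hardware threads configured on the system:  ', popcnt(smask)
  else
    print '(a)', 'GetProcessAffinityMask failed'
  end if
end program affinity_count
```

Comparing the first count with omp_get_num_procs() in the same run would show which of the two meanings the OpenMP runtime reports.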

Jim Dempsey
