I have a strange problem with OpenMP.
My computer has one processor with four cores. Sometimes OMP_GET_MAX_THREADS() returns 4 and sometimes it only returns 1. The behavior seems to be totally random.
Any idea on what is going on?
Compiler settings:
/nologo /Qparallel /Qopenmp /Qopenmp-report1 /warn:declarations /real_size:64 /fpconstant /module:"x64\Release\" /object:"x64\Release\" /libs:static /threads /c
Is it the combination /Qparallel /Qopenmp that is the problem?
I am using the latest version of Intel Visual Fortran.
Thanks,
Henrik
Yes, omp_set_num_threads() followed by omp_get_max_threads() works.
It also seems to work with call omp_set_num_threads(omp_get_num_procs()).
I need to know the total number of cores on the machine the program is running on.
Will omp_get_num_procs() return the total number of cores if I have more than one processor with more than one core in each processor?
Henrik
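The two calls discussed above can be compared side by side with a small test program. This is only a minimal sketch, assuming the compiler ships the standard omp_lib module and that the program is built with OpenMP enabled (/Qopenmp in this thread). The program name show_threads is made up for illustration.

```fortran
program show_threads
  use omp_lib          ! standard OpenMP module declaring the omp_* routines
  implicit none
  integer :: nprocs, before, after

  nprocs = omp_get_num_procs()      ! processors the runtime reports as available
  before = omp_get_max_threads()    ! upper bound on the team size of the next parallel region

  call omp_set_num_threads(nprocs)  ! request one thread per reported processor
  after = omp_get_max_threads()     ! query again after the request

  write(*,'(a,i0)') 'omp_get_num_procs()            = ', nprocs
  write(*,'(a,i0)') 'omp_get_max_threads() (before) = ', before
  write(*,'(a,i0)') 'omp_get_max_threads() (after)  = ', after
end program show_threads
```

If the "before" value is 1 while omp_get_num_procs() reports 4, the default thread count was reduced by something outside the program. If omp_get_num_procs() itself reports 1, the process is probably restricted to a single processor.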
Are you in a parallel region (with nesting OFF) when you call OMP_GET_MAX_THREADS()?
If so, the maximum thread count for the next (nested) parallel region will be 1, reflecting that nested levels are disabled.
Jim Dempsey
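One way to rule this case in or out is to print omp_in_parallel() next to the thread count at the point of the query. The following is a rough sketch, assuming the standard omp_lib module. The program name where_am_i and the helper report are made up for illustration. The value printed inside the region depends on the OpenMP runtime and on the nesting setting, so no particular output is claimed here.

```fortran
program where_am_i
  use omp_lib
  implicit none

  call report('serial part')

  ! Ask for two threads so that the region is active when two are available.
  !$omp parallel num_threads(2)
  !$omp single
  call report('inside parallel region')
  !$omp end single
  !$omp end parallel

contains

  ! Print whether the caller is inside an active parallel region,
  ! together with what omp_get_max_threads() reports at that point.
  subroutine report(label)
    character(len=*), intent(in) :: label
    write(*,'(a,a,l2,a,i0)') label, ': in parallel =', omp_in_parallel(), &
         ', max threads = ', omp_get_max_threads()
  end subroutine report

end program where_am_i
```

If the query in the real program shows "in parallel = F", the nested-region explanation does not apply and the cause has to be looked for elsewhere.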
No, I am not in a parallel region when the omp_get_max_threads() command is executed.
The problem is that omp_get_max_threads() returns 1 or 4 in a seemingly random fashion each time the program is run. The behavior is the same if the /Qparallel option is removed (/Qopenmp is still used).
I run the Fortran program from MATLAB with the command !start fortranprogram.exe. In MATLAB, the leading ! executes the DOS command start fortranprogram.exe. Maybe the problem with omp_get_max_threads() is related to MATLAB?
Henrik
Henrik,
When you read 1 from omp_get_max_threads(), have your Fortran program write a message to that effect and then issue a READ so that the program waits for input. When the message appears, launch Task Manager, open the Processes tab, right-click on MATLAB.EXE, and click Set Affinity.
See if MATLAB is restricted to one "CPU" (CPU meaning hardware thread). If MATLAB is restricted to one "CPU", then any process it spawns will be restricted to that CPU as well. In that case, look into why MATLAB is restricted to one "CPU".
If MATLAB is allowed on all "CPUs", then select your FortranProgram.exe, right-click on it, click Set Affinity, and see whether its affinity is restricted to one CPU. If it is, then whatever launches the Fortran application is somehow imposing the restriction (e.g. the START program has affinity restrictions).
Jim Dempsey
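The "write a message, then wait on a READ" step might look roughly like the sketch below. It assumes the standard omp_lib module. The program name pause_if_serial and the message texts are placeholders, not part of the original program.

```fortran
program pause_if_serial
  use omp_lib
  implicit none
  integer :: max_threads, ios

  max_threads = omp_get_max_threads()

  if (max_threads == 1) then
     write(*,'(a)') 'omp_get_max_threads() returned 1.'
     write(*,'(a)') 'Check the affinity in Task Manager, then press Enter to continue.'
     ! An empty input list reads and discards one line, so the program waits here.
     ! iostat keeps the program from aborting if standard input is closed.
     read(*,*,iostat=ios)
  end if

  write(*,'(a,i0)') 'Continuing with max threads = ', max_threads
end program pause_if_serial
```

While the program is waiting, both MATLAB.EXE and the Fortran executable are still listed in Task Manager, so the affinity of each can be inspected as described above.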
Also try inserting this into your code after the point where the thread count comes back as 1 and before the READ:
! at top
character (LEN=256) value
integer :: length
...
CALL GET_ENVIRONMENT_VARIABLE ('OMP_NUM_THREADS', value,length)
write(*,*) value
If value is '1' then something is setting the environment variable to '1'. If all blanks then it may be the default Affinity (prior post of mine).
Jim Dempsey
The problem is solved. It is a Matlab problem.
The Fortran program is run from Matlab by the command !start fortranprogram.exe
Eigenvalues are calculated with the Matlab command eig() at several places in the Matlab script. It is after the use of eig() that OMP_GET_MAX_THREADS() in the Fortran program reports 1 (the Fortran program is run after the eig() command). I have tested this several times. I calculate the eigenvalues of a 704x704 matrix. The problem does not seem to occur for small matrices. Executing the following command in Matlab is enough for the problem to occur:
A=rand(704); [V, D]=eig(A);
Fortunately, it seems to be possible to get around this by using
CALL OMP_SET_NUM_THREADS(OMP_GET_NUM_PROCS()).
I also tested what Jim Dempsey suggested:
character (LEN=256) value
integer :: length
CALL GET_ENVIRONMENT_VARIABLE ('OMP_NUM_THREADS', value,length)
write(*,*) value
value comes back as all blanks, so the environment variable OMP_NUM_THREADS is never changed by Matlab.
So the question is what eig() in Matlab is doing (I am using Matlab 2009B). There is a maxNumCompThreads command in Matlab. It always reports 4 on my machine with four cores (also after the eig() command).
Henrik
Henrik,
I want to clear up any misconceptions expressed in this thread related to your last question below:
"I need to know the total number of cores on the machine the program is running on. Will omp_get_num_procs() return the total number of cores if I have more than one processor with more than one core in each processor?"
The answer is yes. omp_get_num_procs() returns the total number of "processors" that the Operating System says are available for running a program. Note that "processors" in this context could also include thread contexts (if hyperthreading is supported by your processor). In practice, omp_get_num_procs() returns the number of processors indicated by "top" (for Linux and MacOS) or "taskmgr" (for Windows) if these programs are run in the same OS environment as the OpenMP program.
Therefore, your use of omp_set_num_threads(omp_get_num_procs()) makes sense if you want to take advantage of all the processing units available on the machine.
I hope that helps.
- Grant
Hi Henrik,
Could you please do one experiment for me?
- Remove the call to omp_set_num_threads() from your program to bring the problem back.
- Set the environment variable KMP_AFFINITY to "verbose" (without quotes) in My Computer -> Properties -> Advanced -> Environment Variables.
- Do not forget to restart MatLab after setting KMP_AFFINITY.
- Run your program a few times to see whether the problem still exists.
Let me know the result.
Thanks,
Lev.
I haven't verified whether omp_get_num_procs() returns the number of hardware threads the system has, or the number of hardware threads the application is restricted to run on.
For the latter, on Windows you can call GetProcessAffinityMask.
Note that an application may be restricted to run on specific hardware threads, so the number available to it may be smaller than the total number of hardware threads in the system. When tuning the number of threads, the number of hardware threads available to the application (GetProcessAffinityMask) may be more important than the number of hardware threads available on the system.
Jim Dempsey
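As an illustration of the GetProcessAffinityMask suggestion, the sketch below calls the Win32 API directly from Fortran and counts the bits set in each mask. It is Windows-only and rests on several assumptions: a 64-bit build (as with the x64 configuration in this thread, where no special calling-convention attribute is needed), hand-written ISO_C_BINDING interfaces rather than a vendor-supplied Windows module, and a compiler that supports the Fortran 2008 POPCNT intrinsic. The program name affinity_count is made up.

```fortran
program affinity_count
  use, intrinsic :: iso_c_binding, only: c_intptr_t, c_int
  implicit none

  interface
     ! Win32: HANDLE GetCurrentProcess(void);
     function GetCurrentProcess() bind(C, name='GetCurrentProcess') result(h)
       import :: c_intptr_t
       integer(c_intptr_t) :: h
     end function GetCurrentProcess

     ! Win32: BOOL GetProcessAffinityMask(HANDLE, PDWORD_PTR, PDWORD_PTR);
     function GetProcessAffinityMask(h, process_mask, system_mask) &
          bind(C, name='GetProcessAffinityMask') result(ok)
       import :: c_intptr_t, c_int
       integer(c_intptr_t), value       :: h
       integer(c_intptr_t), intent(out) :: process_mask, system_mask  ! passed by reference
       integer(c_int) :: ok
     end function GetProcessAffinityMask
  end interface

  integer(c_intptr_t) :: process_mask, system_mask

  ! A nonzero return value means the call succeeded.
  if (GetProcessAffinityMask(GetCurrentProcess(), process_mask, system_mask) /= 0) then
     ! Each set bit in a mask stands for one hardware thread.
     write(*,'(a,i0)') 'Hardware threads this process may use: ', popcnt(process_mask)
     write(*,'(a,i0)') 'Hardware threads in the system:        ', popcnt(system_mask)
  else
     write(*,'(a)') 'GetProcessAffinityMask failed.'
  end if
end program affinity_count
```

Comparing the first number with omp_get_num_procs() in the same run would show which of the two quantities the OpenMP runtime reports.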
