Intel® Fortran Compiler

No time saving with OpenMP on linux

rookie2010
Beginner
Hi there,

My company developed a Fortran code with OpenMP, and we built 32-bit packages on Windows* (Vista Business 32-bit
with ifort 10.0.025) and Linux (SuSE Enterprise Server 9, 64-bit with ifort 11.1.056).

We ran an identical case on Windows* (2 dual-core Intel CPUs, 4 cores in total, with 4 GB RAM) and on Linux (2 AMD
CPUs with 4 GB RAM). An input file sets the number of CPUs used for execution.

On Windows*, the 2-CPU run took 8 min 38 sec and the 1-CPU run took 12 min 40 sec. On Linux, there is NO SAVING at all
with 2 CPUs. Sometimes the 2-CPU job even ran longer than the 1-CPU job.

I've tried setting OMP_NUM_THREADS to 2 on Linux, but the situation (NO SAVING with 2 CPUs) did not change.
I've also checked that the stack ulimit is unlimited.

Do I have to change any setting or configuration for AMD CPUs on Linux? Any opinion is very much appreciated.

Reggie

Ron_Green
Moderator
AMD should make no difference, nothing special you have to do.

How are you timing this? 'time' from command line, or cpu_time() in the code? or ??

rookie2010
Beginner
Ron,

Thanks for the reply. I raised the AMD issue because I read on Wikipedia "Some vendors
recommended setting the processor affinity on OpenMP threads to associate them with
particular processor cores." (http://en.wikipedia.org/wiki/OpenMP)

I measure execution time by submitting jobs on Linux with a script like:
date
code_exe input_1cpu.inp
date
code_exe input_2cpu.inp
date

code_exe takes its first argument as the input file, which specifies 1 or 2 CPUs. I work out the overall time
from the date output, since code_exe runs in the foreground.
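For what it's worth, the same wall-clock measurement can be taken without subtracting date stamps by hand. Below is a minimal sketch, assuming a Linux shell where `date +%s` is available; `elapsed_seconds` is a hypothetical helper name, and `code_exe` and the `.inp` files are simply the names from the script above.

```shell
#!/bin/sh
# Hypothetical helper: run a command and print its wall-clock time in whole seconds.
# The command's own stdout is sent to stderr so that only the number is captured.
elapsed_seconds() {
    start=$(date +%s)
    "$@" 1>&2
    end=$(date +%s)
    echo $((end - start))
}

# Usage with the names from the script above:
#   t1=$(elapsed_seconds ./code_exe input_1cpu.inp)
#   t2=$(elapsed_seconds ./code_exe input_2cpu.inp)
#   echo "1 CPU: ${t1}s, 2 CPUs: ${t2}s"
```

Wall-clock time is the number that matters here: CPU time summed over threads does not shrink when threads are added, so it can hide a real speedup.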

I just submitted the batch script with a different (and bigger) case, but the jobs won't
finish until later tonight. We'll see whether 2 CPUs save any time for this case.

Best regards,

Reggie

jimdempseyatthecove
Honored Contributor III
Reggie,

At the beginning of your program, insert code to obtain the number of threads

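! Needs "use omp_lib" in the declarations; otherwise declare
! "integer omp_get_num_threads", since implicit typing would make it REAL.
!$omp parallel
!$omp master
write(*,*) "Number of threads ", omp_get_num_threads()
!$omp end master
!$omp end parallel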

This will tell you if (on linux) you are indeed getting the number of threads you request.

If this indicates 1 thread then the potential problems are:

a) you linked in the wrong libraries
b) you compiled without -openmp
c) something in your Linux configuration is restricting the number of threads
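
Points a) and c) can also be checked roughly from the shell before touching the source. This is only a sketch under assumptions: `cpu_count` and `has_openmp_runtime` are hypothetical helper names, and the library names in the pattern (libiomp, libguide, libgomp) are my guess at what a dynamically linked OpenMP runtime would be called; a statically linked runtime will not show up in `ldd` at all.

```shell
#!/bin/sh
# Count the CPUs the system reports as online (falls back to /proc/cpuinfo on Linux).
cpu_count() {
    getconf _NPROCESSORS_ONLN 2>/dev/null || grep -c '^processor' /proc/cpuinfo
}

# Succeed if the executable's shared-library list mentions an OpenMP runtime.
# Assumption: the runtime is named libiomp*, libguide* or libgomp*.
has_openmp_runtime() {
    ldd "$1" 2>/dev/null | grep -Eq 'libiomp|libguide|libgomp'
}

echo "CPUs visible: $(cpu_count)"
echo "OMP_NUM_THREADS=${OMP_NUM_THREADS:-unset}"

# Usage with the executable name from this thread:
#   if has_openmp_runtime ./code_exe; then echo "OpenMP runtime is linked"; fi
```

If the CPU count is 1, or no OpenMP runtime is listed for a dynamically linked build, that narrows the search before any recompiling.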

Jim Dempsey
rookie2010
Beginner

Thanks a lot for the help from Ronald and Jim.

The overnight jobs show reasonable results. 1 cpu used 4 hr 16 min and 2 cpus used 2 hr 25 min.
I'm happy as long as openmp helps with multiple CPUs.
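
To put numbers on it, the speedup is the 1-CPU time divided by the 2-CPU time. A small sketch using the timings quoted in this thread, converted to minutes or seconds by hand; `speedup` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: print t1/t2 to two decimals (both times in the same unit).
speedup() {
    LC_ALL=C awk -v t1="$1" -v t2="$2" 'BEGIN { printf "%.2f\n", t1 / t2 }'
}

# Overnight Linux case: 4 hr 16 min = 256 min, 2 hr 25 min = 145 min
speedup 256 145    # prints 1.77

# Earlier Windows case: 12 min 40 sec = 760 s, 8 min 38 sec = 518 s
speedup 760 518    # prints 1.47
```

An ideal result for 2 CPUs would be 2.00, so the overnight run comes reasonably close.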

The shorter jobs with 1 and 2 CPUs did not show reasonable results, possibly because the load at job
execution was large, so 2 CPUs gained no advantage. I've tried various combinations of flags and settings
(-openmp for all source files, not just those with OpenMP directives; with and without OMP_NUM_THREADS;
different ulimit values) and learned quite a bit.

I'm glad to know (from Ronald, and now confirmed) that ifort does not discriminate against AMD CPUs, and
I also appreciate the tips from Jim.
