Intel® Fortran Compiler

Multi core processors

Mark_W_2
Beginner

I am running Windows 7 x64 on a server with a Haswell E5-1630v3 4-core processor. I am writing in Fortran 95 and using a recently purchased Intel x64 compiler. I have some questions:

1.  Should hyperthreading be on or off for best execution performance with a processing application?

2.  Is there a compiler option to use all 4 cores when executing the application?

3.  I assume that the /QxHost compiler option will generate AVX2 code. Is that correct?

 

4 Replies
TimP
Honored Contributor III

Many Fortran applications will run faster when threaded by OpenMP or MKL library calls with 1 thread per core than with 2.

In ideal cases, full threaded performance may be available with the /Qparallel option to have the compiler apply OpenMP automatically.

Methods for running 1 thread per core include disabling Hyper-Threading in the BIOS setup (not always available on Haswell, and sometimes requiring a BIOS update), or setting environment variables, e.g.

set OMP_NUM_THREADS=4

set OMP_PROC_BIND=spread

where the latter setting will "spread" the threads out across cores.  MKL does this by default.

You are correct that /QxHost will choose AVX2 code when compiling on a Haswell CPU.  MKL will detect the Haswell architecture at run time.
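To check that the environment variables above have taken effect, a small test program along these lines may help. This is only a sketch: the program name is made up, and it assumes the build has OpenMP enabled (/Qopenmp with the Intel compiler on Windows). The omp_lib routines used are standard OpenMP.

```fortran
program report_threads
  use omp_lib                      ! standard OpenMP module; needs an OpenMP-enabled build
  implicit none
  integer :: nthreads

  nthreads = 0
  !$omp parallel
  !$omp single
  nthreads = omp_get_num_threads() ! size of the team actually created
  !$omp end single
  !$omp end parallel

  print '(a,i0)', 'logical processors visible to OpenMP: ', omp_get_num_procs()
  print '(a,i0)', 'maximum threads for a parallel region: ', omp_get_max_threads()
  print '(a,i0)', 'threads used in the parallel region:   ', nthreads
end program report_threads
```

With Hyper-Threading enabled on a 4-core CPU, the first number would normally be 8, while the other two should follow OMP_NUM_THREADS if it is set.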

Martyn_C_Intel
Employee

It depends on the type of application. Many apps can benefit from hyperthreading if a second thread can continue computing while the first thread is waiting, e.g. for data to arrive from memory. But low latency apps that already saturate the computational units and need relatively little access to main memory (such as intensive linear algebra) are unlikely to benefit.

Hyperthreading is unlikely to help unless you make your program multi-threaded. Tim listed your options. /Qparallel requires little effort, but only works for certain, simply structured loops. OpenMP is much more powerful but requires more work and understanding. You can see some tips for both at https://software.intel.com/en-us/articles/threading-fortran-applications-for-parallel-performance-on-multi-core-systems/. There's lots of online material on OpenMP at openmp.org and elsewhere.
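As a rough illustration of what explicit OpenMP threading looks like in Fortran, here is a minimal sketch. The program and variable names are invented for the example, and it assumes an OpenMP-enabled build (/Qopenmp with the Intel compiler on Windows). Without that option the directives are treated as comments and the program runs serially with the same result.

```fortran
program omp_sum
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 1000000
  integer :: i
  real(dp), allocatable :: a(:)
  real(dp) :: total

  allocate(a(n))

  ! Each iteration is independent, so the loop can be split across threads.
  !$omp parallel do
  do i = 1, n
     a(i) = 2.0_dp * real(i, dp)
  end do
  !$omp end parallel do

  ! A sum needs a reduction clause so each thread keeps a private partial sum.
  total = 0.0_dp
  !$omp parallel do reduction(+:total)
  do i = 1, n
     total = total + a(i)
  end do
  !$omp end parallel do

  ! The exact answer is n*(n+1); all partial sums are exactly representable.
  if (total /= real(n, dp) * real(n + 1, dp)) error stop 'unexpected sum'
  print *, 'sum =', total
end program omp_sum
```

The reduction clause is the part that auto-parallelization cannot always work out for itself in more complicated loops, which is one reason OpenMP takes more understanding.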

Mark_W_2
Beginner

Thank you for your help. I will be trying the /Qparallel compile option and disabling Hyper-Threading in the BIOS. When a Fortran program executes, does the processing load get evenly distributed over all the processing cores?

 

Martyn_C_Intel
Employee

That depends on your program. Only those parts of your program that are threaded are run on more than one core at once. For simply structured loops, where all iterations take roughly the same length of time, the work typically gets shared evenly amongst cores with the "static" scheduling default. Parts of your program that are not threaded only execute on one core at once, though they may jump about from one core to another depending on the OS scheduler.
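To see how a threaded loop gets shared out, one could record which thread ran each iteration. The sketch below is illustrative only (the names are made up) and assumes an OpenMP-enabled build such as /Qopenmp. It requests static scheduling explicitly rather than relying on the default.

```fortran
program static_schedule
  use omp_lib                      ! standard OpenMP module
  implicit none
  integer, parameter :: n = 16
  integer :: i, t
  integer :: owner(n)

  owner = -1

  ! Static scheduling divides the iterations into roughly equal blocks,
  ! one block per thread, decided before the loop starts.
  !$omp parallel do schedule(static)
  do i = 1, n
     owner(i) = omp_get_thread_num()
  end do
  !$omp end parallel do

  if (any(owner < 0)) error stop 'some iteration was not executed'

  do t = 0, omp_get_max_threads() - 1
     print '(a,i0,a,i0,a)', 'thread ', t, ' ran ', count(owner == t), ' iterations'
  end do
end program static_schedule
```

With OMP_NUM_THREADS=4 one would expect each thread to report about a quarter of the iterations. If the iterations took very different lengths of time, an even split of iterations would no longer mean an even split of work.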

Unless you are fortunate (i.e., the structure of your program is particularly well suited to auto-parallelization), you may find that only a limited part of your program gets threaded and most of it still runs on a single core. At that point, you might want to start reading up about OpenMP. But it makes sense to try the simple things first.

There's a short article about auto-parallelization I wrote some time ago at https://software.intel.com/en-us/articles/automatic-parallelization-with-intel-compilers. There have been some developments since, for example the introduction of the DO CONCURRENT feature of Fortran 2008, which lets you indicate to the compiler which loops are suitable for auto-parallelization. There's also a section on auto-parallelization in the compiler user and reference guide.
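For reference, a DO CONCURRENT loop looks roughly like the sketch below. The program and array names are invented for the example. The construct only asserts that the iterations are independent; whether the loop is actually threaded depends on the compiler and the options used (for example /Qparallel), so treat this as an illustration of the syntax rather than a guarantee of parallel execution.

```fortran
program do_concurrent_demo
  implicit none
  integer, parameter :: n = 1000
  integer :: i
  real :: x(n), y(n)

  x = 1.0
  y = 2.0

  ! No iteration reads or writes data touched by another iteration,
  ! which is what DO CONCURRENT promises to the compiler.
  do concurrent (i = 1:n)
     y(i) = y(i) + 3.0 * x(i)
  end do

  if (any(y /= 5.0)) error stop 'unexpected result'
  print *, 'y(1) =', y(1), ' y(n) =', y(n)
end program do_concurrent_demo
```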

 
