Intel® Moderncode for Parallel Architectures
Support for developing parallel programming applications on Intel® Architecture.

OpenMP nested parallelism

marcsolal
Beginner
757 Views

Hi,

I am trying to understand how to specify thread affinity with nested parallelism, and I am not sure whether I can use KMP_AFFINITY in this case. I have two levels of parallelism. At the first level I have a parallel loop, and I would like each thread of this loop to run on a different processor (I have 10 proc. per core). This corresponds to the scatter affinity type. Inside the parallel loop I call multithreaded OpenMP MKL routines, and for MKL I need compact. This is a beginner question, but how do I get this result? To make things a little more complicated, I also use MKL outside a parallel region, before the loop, which means I need to change the affinity inside my code.

Thanks for helping,

Marc 

6 Replies
Gregg_S_Intel
Employee

MKL attempts to detect this scenario and choose an optimal number of threads automatically.

If that is not working, try setting the number of threads with mkl_set_num_threads().

But if you really want to use nested threading, these affinity settings may help.

MKL_DYNAMIC=false

OMP_NESTED=1

OMP_MAX_ACTIVE_LEVELS=2

KMP_HOT_TEAMS_MODE=1

KMP_HOT_TEAMS_MAX_LEVEL=2

OMP_NUM_THREADS=10,2

OMP_PROC_BIND="spread,close"

OMP_PLACES=cores 
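
For readers who prefer to see the same intent in code, here is a minimal sketch, not a definitive implementation, of a two-level region that requests an outer team spread across places and an inner team kept close to its parent. It assumes an OpenMP 4.0 or later compiler; the function name `count_inner_visits` is hypothetical, and the code falls back to serial execution when built without OpenMP. The places themselves still come from the environment (for example OMP_PLACES=cores), and the runtime may deliver fewer threads than requested.

```c
#ifdef _OPENMP
#include <omp.h>
#endif

/* Hypothetical helper: starts an outer team and, inside every outer thread,
   an inner team. Returns how many inner threads actually ran in total. */
int count_inner_visits(int outer_threads, int inner_threads)
{
    int visits = 0;
#ifdef _OPENMP
    omp_set_dynamic(0);            /* ask the runtime not to shrink the teams */
    omp_set_max_active_levels(2);  /* same intent as OMP_MAX_ACTIVE_LEVELS=2 */
#endif
    /* Outer level: spread the threads over the available places. */
    #pragma omp parallel num_threads(outer_threads) proc_bind(spread) reduction(+:visits)
    {
        /* Inner level: keep each inner team close to its parent thread. */
        #pragma omp parallel num_threads(inner_threads) proc_bind(close) reduction(+:visits)
        {
            visits += 1;           /* one count per inner thread */
        }
    }
    return visits;
}
```

Calling `count_inner_visits(10, 2)` mirrors OMP_NUM_THREADS=10,2 from the list above; the return value is at most 20 and shows how many threads the runtime actually provided.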

marcsolal
Beginner

Thanks, that will help. Is it possible to modify the settings inside the code? I am using MKL before the parallel region, so I would need OMP_PROC_BIND=close for MKL and then switch to "spread,close" afterwards. I am assuming I can simply set the environment variables inside the code. Is that correct?

Thanks

Gregg_S_Intel
Employee

In general, MKL routines perform best with 1 thread per core on Intel Xeon processors. Just set KMP_AFFINITY=scatter, and if the prospect of MKL generating additional threads inside the parallel region is troubling, temporarily change the MKL number of threads to 1 with mkl_set_num_threads().
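
The "temporarily change" step could look roughly like the sketch below. It is written under stated assumptions: `SKETCH_USE_MKL` is a hypothetical macro to define when building against MKL, and `work_item` and `run_phases` are hypothetical stand-ins for the real per-iteration MKL calls. Note that the OpenMP specification says changes to environment variables made after the program has started are ignored by the runtime, so runtime calls such as mkl_set_num_threads() (or clauses on the parallel directive) are the route to take rather than setting variables from inside the code.

```c
#ifdef SKETCH_USE_MKL   /* hypothetical macro: define it when linking against MKL */
#include <mkl.h>
#endif

/* Hypothetical stand-in for the per-iteration work (for example an MKL call). */
static double work_item(int i) { return 2.0 * i; }

/* Three phases: threaded MKL before the loop, one MKL thread per OpenMP
   thread inside the loop, threaded MKL restored afterwards. */
double run_phases(int n, int mkl_threads)
{
    double sum = 0.0;
    (void)mkl_threads;  /* unused when built without MKL */

#ifdef SKETCH_USE_MKL
    mkl_set_num_threads(mkl_threads);  /* phase 1: MKL may use several threads */
#endif
    /* ... a threaded MKL call outside any parallel region would go here ... */

#ifdef SKETCH_USE_MKL
    mkl_set_num_threads(1);            /* phase 2: sequential MKL inside the loop */
#endif
    #pragma omp parallel for reduction(+:sum)
    for (int i = 0; i < n; ++i)
        sum += work_item(i);

#ifdef SKETCH_USE_MKL
    mkl_set_num_threads(mkl_threads);  /* phase 3: restore for later MKL calls */
#endif
    return sum;
}
```

With this layout the outer loop owns the cores (KMP_AFFINITY=scatter, as suggested above) and MKL does not oversubscribe them while the loop is running.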


SergeyKostrov
Valued Contributor II

Here are two examples of how OpenMP threads are pinned to different cores on a KNL server with KMP_AFFINITY set to scatter and compact.

SergeyKostrov
Valued Contributor II

KMP_AFFINITY=scatter

CmmaKMPAFFINITYscatter.png

SergeyKostrov
Valued Contributor II

KMP_AFFINITY=compact

CmmaKMPAFFINITYcompact.png
