KMP_PLACE_THREADS OpenMP affinity variable

james_B_8
Beginner
904 Views

I posted a question on http://software.intel.com/en-us/blogs/2013/02/15/new-kmp-place-threads-openmp-affinity-variable-in-update-2-compiler#comment-1736338

I'll post it here too since it'll probably be quicker.

<blockquote>
We want to use cores 0..29 for process 1. We want to use cores 30..60 for process 2. Here is an example of doing this:

Process 1 offload environment var setup:
export MIC_ENV_PREFIX=PHI
export PHI_KMP_AFFINITY=compact
export PHI_KMP_PLACE_THREADS=30c,4t,0O
export PHI_OMP_NUM_THREADS=120
Process 2 offload environment var setup:
export MIC_ENV_PREFIX=PHI
export PHI_KMP_AFFINITY=compact
export PHI_KMP_PLACE_THREADS=30c,4t,30O
export PHI_OMP_NUM_THREADS=120
</blockquote>

How could this be done if these were MPI processes? For example, suppose I have a program where one MPI process offloads to 30 of the cores and another offloads to the other 30.
Can I specify thread placement for each MPI process at runtime?
Or is there an API I can call from the code to set it?

4 Replies
James_C_Intel2
Employee

I posted a hint in the other thread http://software.intel.com/en-us/blogs/2013/02/15/new-kmp-place-threads-o...

(I won't call it an answer, just a suggestion of the question you're really trying to ask...)

TimP
Honored Contributor III

I didn't see a hint from James, but yes, at least with Intel MPI, each MPI rank can be given its own KMP_PLACE_THREADS setting (usually just a different core offset, if each rank has the same number of threads).

The case where this is most needed is the one where the MPI ranks are running on the host and you want each to reserve a group of cores on the MIC for use in offload mode (using the MIC_ENV_PREFIX option, as you said).

In the case where the ranks themselves run on the MIC under the control of Intel MPI, the KMP_AFFINITY options such as balanced will work, with each rank taking its core assignments according to I_MPI_PIN_DOMAIN (and MIC_ENV_PREFIX doesn't apply).
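As a rough illustration of the per-rank "origin translation" idea, here is a minimal sketch of a wrapper script that derives the offset from the rank number. The script name `place_threads.sh` and the helper `place_threads_for_rank` are made up for this example, and it assumes the launcher exports `PMI_RANK` to each rank (check your MPI's documentation; rank 0 is used if it is unset). The `PHI` prefix and the 30c,4t values are taken from the quoted example.

```shell
#!/bin/sh
# place_threads.sh (hypothetical wrapper): give each host MPI rank its own
# block of coprocessor cores for offload.
# Assumption: the MPI launcher exports PMI_RANK for every rank.

CORES_PER_RANK=30      # cores reserved per rank, as in the quoted example
THREADS_PER_CORE=4     # hardware threads used on each core

# Build a KMP_PLACE_THREADS value of the form <cores>c,<threads>t,<offset>O
# where the core offset is rank * CORES_PER_RANK.
place_threads_for_rank() {
    rank=$1
    offset=$((rank * CORES_PER_RANK))
    echo "${CORES_PER_RANK}c,${THREADS_PER_CORE}t,${offset}O"
}

export MIC_ENV_PREFIX=PHI
export PHI_KMP_AFFINITY=compact
export PHI_KMP_PLACE_THREADS=$(place_threads_for_rank "${PMI_RANK:-0}")
export PHI_OMP_NUM_THREADS=$((CORES_PER_RANK * THREADS_PER_CORE))

# If a program was passed as arguments, run it with this environment.
if [ "$#" -gt 0 ]; then
    exec "$@"
fi
```

It would be launched as something like `mpirun -n 2 ./place_threads.sh ./a.out`, where `./a.out` stands in for the real offload program. Rank 0 would then get `30c,4t,0O` and rank 1 would get `30c,4t,30O`.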

james_B_8
Beginner

Another question: can the MIC handle multiple offloads from different processes like that, or do they need to be synchronised somehow (e.g. via MPI)?

TimP
Honored Contributor III

You need some way of specifying distinct affinity groups for each offload task. Running MPI on the host with each rank given a distinct MIC_KMP_PLACE_THREADS setting (that is, KMP_PLACE_THREADS passed through with MIC_ENV_PREFIX=MIC) is a documented way of doing it. I suppose you could come up with other ways, such as using calls to the OpenMP library in the MIC offload tasks.
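One way to give each host rank a distinct setting is to spell it out on the launch line, one argument set per rank. The sketch below only assembles and prints such a command (a dry run); it assumes an Intel MPI style launcher that accepts `-env <VAR> <value>` per argument set with `:` between sets, and `./a.out`, `NRANKS` and `CORES_PER_RANK` are placeholders chosen for this example.

```shell
#!/bin/sh
# Dry-run sketch: build a launch line with one argument set per rank,
# each carrying its own MIC_KMP_PLACE_THREADS core offset.
# ./a.out is a placeholder for the real offload program.
NRANKS=2
CORES_PER_RANK=30

cmd="mpirun"
rank=0
while [ "$rank" -lt "$NRANKS" ]; do
    offset=$((rank * CORES_PER_RANK))
    # Argument sets after the first are separated by ':'
    if [ "$rank" -gt 0 ]; then
        cmd="$cmd :"
    fi
    cmd="$cmd -n 1 -env MIC_ENV_PREFIX MIC -env MIC_KMP_PLACE_THREADS ${CORES_PER_RANK}c,4t,${offset}O ./a.out"
    rank=$((rank + 1))
done

# Print the command instead of running it, so it can be inspected first.
echo "$cmd"
```

Printing the command rather than executing it makes it easy to check that the offsets do not overlap before anything is started on the coprocessor.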
