I posted this question as a comment on http://software.intel.com/en-us/blogs/2013/02/15/new-kmp-place-threads-openmp-affinity-variable-in-update-2-compiler#comment-1736338
I'm posting it here as well since I'll probably get an answer more quickly.
<blockquote>
We want to use cores 0..29 for process 1. We want to use cores 30..60 for process 2. Here is an example of doing this:
Process 1 offload environment var setup:
export MIC_ENV_PREFIX=PHI
export PHI_KMP_AFFINITY=compact
export PHI_KMP_PLACE_THREADS=30c,4t,0O
export PHI_OMP_NUM_THREADS=120
Process 2 offload environment var setup:
export MIC_ENV_PREFIX=PHI
export PHI_KMP_AFFINITY=compact
export PHI_KMP_PLACE_THREADS=30c,4t,30O
export PHI_OMP_NUM_THREADS=120
</blockquote>
How could this be done if these were MPI processes, for example? Suppose I have a program where one MPI process offloads to 30 of the cores and another offloads to the other 30.
Can I specify thread placement for each MPI process at runtime?
Or is there an API I can use to set it at compile time?
I posted a hint in the other thread http://software.intel.com/en-us/blogs/2013/02/15/new-kmp-place-threads-o...
(I wouldn't call it an answer, just a suggestion about the question you're really trying to ask...)
I didn't see a hint from James, but yes: at least with Intel MPI, each MPI rank can be given its own KMP_PLACE_THREADS setting (usually just a different core offset, if each rank has the same number of threads).
The case where this is most needed is the one where the MPI ranks run on the host and you want each to reserve a group of cores on the MIC for use in offload mode (using the MIC_ENV_PREFIX option as you said).
In the case where the individual ranks run on the MIC itself under the control of Intel MPI, KMP_AFFINITY options such as balanced will work, with each rank taking its core assignments according to I_MPI_PIN_DOMAIN (MIC_ENV_PREFIX doesn't apply in that case).
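For the host-side offload case, one way to give each rank its own origin is a small wrapper script placed between mpirun and the application. This is only a sketch under some assumptions: that the launcher exports PMI_RANK to each process (Intel MPI's Hydra launcher normally does), that every rank gets 30 cores with 4 threads each as in the quoted example, and that the helper name `place_threads` and the script/application names are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print the KMP_PLACE_THREADS value for a host MPI rank.
# Assumes every rank uses the same number of cores, so only the offset changes.
place_threads() {
    rank=$1
    cores_per_rank=${2:-30}     # assumption: 30 cores per rank
    threads_per_core=${3:-4}    # assumption: 4 hardware threads per core
    offset=$((rank * cores_per_rank))
    echo "${cores_per_rank}c,${threads_per_core}t,${offset}O"
}

# Assumption: the MPI launcher sets PMI_RANK; fall back to 0 if it is missing.
rank=${PMI_RANK:-0}

# Same settings as in the quoted example, with the offset derived from the rank.
export MIC_ENV_PREFIX=PHI
export PHI_KMP_AFFINITY=compact
export PHI_KMP_PLACE_THREADS=$(place_threads "$rank")
export PHI_OMP_NUM_THREADS=120

# When used as a wrapper, run the real program with the environment set above.
if [ "$#" -gt 0 ]; then
    exec "$@"
fi
```

It would be launched as something like `mpirun -n 2 ./offload_wrapper.sh ./my_app`, so rank 0 gets `30c,4t,0O` and rank 1 gets `30c,4t,30O`.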
Another question: can the MIC handle multiple offloads from different processes like that, or do they need to be synchronised somehow (e.g. with MPI)?
You need some way of specifying distinct affinity groups for each offload task. Running MPI on the host with each rank given a distinct MIC_KMP_PLACE_THREADS setting is a documented way of doing it. I suppose you could come up with other ways, such as using calls to the OpenMP library in the MIC offload tasks.
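As a rough illustration of the per-rank setting, the sketch below only prints (does not run) an mpirun command line that uses colon-separated argument sets with a per-rank `-env` option, which Intel MPI's launcher accepts. The function name `build_cmd`, the application name `./my_app`, the 4 threads per core, and the choice of `MIC` as the prefix are assumptions made for the example.

```shell
#!/bin/sh
# Dry run: build an MPMD-style mpirun command line in which each host rank
# gets its own MIC_KMP_PLACE_THREADS offset. Nothing is executed here.
build_cmd() {
    nranks=$1    # number of host ranks
    cores=$2     # cores reserved on the MIC per rank
    app=$3       # placeholder application name
    cmd="mpirun"
    r=0
    while [ "$r" -lt "$nranks" ]; do
        # Argument sets for different ranks are separated by ':'
        if [ "$r" -gt 0 ]; then
            cmd="$cmd :"
        fi
        cmd="$cmd -n 1 -env MIC_ENV_PREFIX MIC -env MIC_KMP_PLACE_THREADS ${cores}c,4t,$((r * cores))O $app"
        r=$((r + 1))
    done
    echo "$cmd"
}

# Two ranks, 30 cores each, as in the original question.
build_cmd 2 30 ./my_app
```

Printing the command first makes it easy to check the offsets before launching anything.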
