I am using Intel MPI, build 20200312. I am trying to use various processor pinning options, like I_MPI_PIN_DOMAIN or I_MPI_PIN_PROCESSOR_LIST, and while the I_MPI_DEBUG output shows the expected behavior, the task manager or resource monitor do not show the processes as being pinned to a specific core. Am I missing something?
Task Manager does not show which CPU a process is running on; its CPU column shows CPU utilization.
If you bind a process to a CPU using I_MPI_PIN_PROCESSOR_LIST, you should see a spike for that CPU in the CPU section of Resource Monitor.
If you do not see such a spike, can you launch a compute-intensive application, or oversubscribe the CPU, and check again?
Also, please provide your Intel MPI version and the commands you were using so that we can cross-check.
Thanks for the quick reply.
The MPI version and the command I am using, along with the debug output, are below:
Intel(R) MPI Library for Windows* OS, Version 2019 Update 7 Build 20200312
Copyright 2003-2020, Intel Corporation.
mpiexec -np 4 -env I_MPI_PIN 1 -env I_MPI_PIN_PROCESSOR_LIST 0,1,2,3 <myprog>
 MPI startup(): libfabric version: 1.9.1a1-impi
 MPI startup(): libfabric provider: tcp;ofi_rxm
 MPI startup(): Rank Pid Node name Pin cpu
 MPI startup(): 0 9752 xxx 0
 MPI startup(): 1 31912 xxx 1
 MPI startup(): 2 9604 xxx 2
 MPI startup(): 3 10792 xxx 3
According to the debug output, it is recognizing the request, but the resource monitor shows otherwise. See the attached I_mpi_pin_0-3.png which shows usage across all 6 cores instead of 4. For reference, our application has its own means of pinning by setting processor affinity. If we enable this, the attached set_affinity_0-3.png shows only the 4 cores enabled to be running. Our methodology is not applicable to more complex processor configurations (e.g. when process groups are involved), thus we were hoping to bypass it and use the binding options of Intel MPI, but they do not seem to work here.
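One way to cross-check the pin table above without reading it by eye is to parse it. The sketch below assumes the row layout shown in this post (rank, pid, node name, pin cpu) and a comma-separated CPU list, optionally wrapped in braces. The names parse_pin_table and PIN_ROW are made up for illustration, and other Intel MPI versions may format the table differently.

```python
import re

# Matches data rows such as " MPI startup(): 0 9752 xxx 0".
# Header and libfabric lines do not match because the first field must be numeric.
PIN_ROW = re.compile(
    r"MPI startup\(\):\s+(\d+)\s+(\d+)\s+(\S+)\s+\{?([\d,]+)\}?\s*$"
)


def parse_pin_table(debug_output):
    """Return {rank: [cpu, ...]} parsed from I_MPI_DEBUG startup output."""
    pins = {}
    for line in debug_output.splitlines():
        match = PIN_ROW.search(line)
        if match:
            rank = int(match.group(1))
            pins[rank] = [int(c) for c in match.group(4).split(",") if c]
    return pins


if __name__ == "__main__":
    sample = """\
 MPI startup(): libfabric provider: tcp;ofi_rxm
 MPI startup(): Rank Pid Node name Pin cpu
 MPI startup(): 0 9752 xxx 0
 MPI startup(): 1 31912 xxx 1
"""
    print(parse_pin_table(sample))
```

This only confirms what the launcher requested for each rank; it says nothing about masks the application changes afterwards.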
We haven't received the screenshot you said you have attached.
We have tried pinning the processes on our machine and we have observed a spike on specific cores.
I am attaching a screenshot of the resource monitor.
Here you can observe a spike in the CPUs 0,1,2,3.
mpiexec -np 200 -env I_MPI_PIN 1 -env I_MPI_PIN_PROCESSOR_LIST 0,1,2,3 IMB-MPI1.exe (I oversubscribed the cores so that the load shows clearly in the graph)
We chose to run the Intel MPI Benchmarks for this experiment.
Can you check once more on your side?
>>For reference, our application has its own means of pinning by setting processor affinity. If we enable this, the attached set_affinity_0-3.png shows only the 4 cores enabled to be running.
Does your code do anything with affinity when your feature is disabled? In other words, does it disable pinning, or set the affinity to all available logical CPUs?
While your application is running...
Open Task Manager, click on Processes Tab, (for each of your processes):
Right-Click on one of your processes, click on Set Affinity, and observe/record the affinity sets
Repeat until you find a process whose affinity set is not what you intended. Then use Alt-Print Screen to capture the dialog box and post it here.
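The same check can be scripted rather than clicked through. This is a minimal sketch, assuming Python with ctypes on Windows; it reads a process affinity mask through the Win32 calls OpenProcess and GetProcessAffinityMask. The helper names mask_to_cpus and process_affinity are illustrative, and the query is skipped on other platforms.

```python
import ctypes
import os
import sys

PROCESS_QUERY_LIMITED_INFORMATION = 0x1000  # Win32 access right


def mask_to_cpus(mask):
    """Decode an affinity bitmask into a sorted list of logical CPU indices."""
    return [bit for bit in range(mask.bit_length()) if (mask >> bit) & 1]


def process_affinity(pid):
    """Return (process_cpus, system_cpus) for the given pid. Windows only."""
    if sys.platform != "win32":
        raise OSError("GetProcessAffinityMask is a Windows API")
    k32 = ctypes.WinDLL("kernel32", use_last_error=True)
    k32.OpenProcess.argtypes = [ctypes.c_uint32, ctypes.c_int, ctypes.c_uint32]
    k32.OpenProcess.restype = ctypes.c_void_p
    k32.GetProcessAffinityMask.argtypes = [
        ctypes.c_void_p,
        ctypes.POINTER(ctypes.c_size_t),
        ctypes.POINTER(ctypes.c_size_t),
    ]
    k32.GetProcessAffinityMask.restype = ctypes.c_int
    k32.CloseHandle.argtypes = [ctypes.c_void_p]
    k32.CloseHandle.restype = ctypes.c_int

    handle = k32.OpenProcess(PROCESS_QUERY_LIMITED_INFORMATION, 0, pid)
    if not handle:
        raise ctypes.WinError(ctypes.get_last_error())
    try:
        proc_mask = ctypes.c_size_t()
        sys_mask = ctypes.c_size_t()
        ok = k32.GetProcessAffinityMask(
            handle, ctypes.byref(proc_mask), ctypes.byref(sys_mask)
        )
        if not ok:
            raise ctypes.WinError(ctypes.get_last_error())
        return mask_to_cpus(proc_mask.value), mask_to_cpus(sys_mask.value)
    finally:
        k32.CloseHandle(handle)


if __name__ == "__main__":
    pid = int(sys.argv[1]) if len(sys.argv) > 1 else os.getpid()
    if sys.platform == "win32":
        allowed, available = process_affinity(pid)
        print(f"pid {pid}: allowed {allowed} of {available}")
    else:
        print("Affinity query skipped: not running on Windows")
```

Passing each rank's pid from the I_MPI_DEBUG table would list the CPUs that rank's process is allowed to use. The mask describes a single processor group only, so this sketch does not cover configurations that span several groups.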
Hi Jim, Prasanth:
Thanks for the reply. We looked into that and found that our feature, when disabled, was in fact setting the process mask to the system mask using SetProcessAffinityMask(), which effectively re-enabled all cores. Once we disabled that, everything worked as expected: using I_MPI_PIN_... to restrict execution to a set of cores causes all execution to appear on those cores in Task Manager / Resource Monitor. Thanks for the help resolving this.
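For anyone hitting the same problem, the shape of the fix can be sketched as follows. This is a hypothetical reconstruction, not our actual code: apply_app_pinning and its set_process_mask callback are invented names, and the callback stands in for a real call such as SetProcessAffinityMask(). The point is only that the disabled path leaves the affinity alone instead of resetting it.

```python
def cpus_to_mask(cpus):
    """Build an affinity bitmask with one bit set per logical CPU index."""
    mask = 0
    for cpu in cpus:
        mask |= 1 << cpu
    return mask


def apply_app_pinning(enabled, cpus, set_process_mask):
    """Hypothetical application hook.

    Returns the mask that was applied, or None when the affinity was left
    untouched so that whatever the launcher set stays in effect.
    """
    if not enabled:
        # Do not reset to the system mask here; that would undo the launcher.
        return None
    mask = cpus_to_mask(cpus)
    set_process_mask(mask)
    return mask


if __name__ == "__main__":
    calls = []  # records masks instead of calling the OS
    apply_app_pinning(False, [0, 1, 2, 3], calls.append)
    print(calls)  # []
    apply_app_pinning(True, [0, 1, 2, 3], calls.append)
    print([bin(m) for m in calls])  # ['0b1111']
```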
I have one more question related to this topic. Even though it operates as we need it to, restricting execution to the selected cores, when looking at the Set Affinity option in Task Manager, it appears all cores are enabled for each process. Our feature, which uses SetProcessAffinityMask(), shows only the selected cores enabled. This is not a problem, but I am curious as to why the difference. Does Intel's binding mechanism use something other than SetProcessAffinityMask()?
Finally, we are switching from a much older Intel MPI, and find some of the behavior of hydra different from smpd which we are used to. I will post these questions in a separate thread.
>>Does Intel's binding mechanism use something other than SetProcessAffinityMask()?
It may be using SetThreadAffinityMask() (on the startup thread of the rank process).
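To illustrate why the dialog can look unrestricted: the Set Affinity dialog in Task Manager reflects the process affinity mask, while SetThreadAffinityMask() changes only the mask of one thread. The sketch below, assuming Python with ctypes on Windows, pins the calling thread and reads the process mask before and after. Whether Intel MPI actually pins this way is not confirmed in this thread, and the helper names are illustrative.

```python
import ctypes
import sys


def cpus_to_mask(cpus):
    """Build an affinity bitmask with one bit set per logical CPU index."""
    mask = 0
    for cpu in cpus:
        mask |= 1 << cpu
    return mask


def pin_current_thread(cpus):
    """Pin the calling thread; return (process_mask_before, process_mask_after).

    Windows only. The requested CPUs must be a subset of the process mask,
    otherwise SetThreadAffinityMask fails.
    """
    if sys.platform != "win32":
        raise OSError("SetThreadAffinityMask is a Windows API")
    k32 = ctypes.WinDLL("kernel32", use_last_error=True)
    k32.GetCurrentProcess.restype = ctypes.c_void_p
    k32.GetCurrentThread.restype = ctypes.c_void_p
    k32.GetProcessAffinityMask.argtypes = [
        ctypes.c_void_p,
        ctypes.POINTER(ctypes.c_size_t),
        ctypes.POINTER(ctypes.c_size_t),
    ]
    k32.GetProcessAffinityMask.restype = ctypes.c_int
    k32.SetThreadAffinityMask.argtypes = [ctypes.c_void_p, ctypes.c_size_t]
    k32.SetThreadAffinityMask.restype = ctypes.c_size_t

    def process_mask():
        proc = ctypes.c_size_t()
        system = ctypes.c_size_t()
        ok = k32.GetProcessAffinityMask(
            k32.GetCurrentProcess(), ctypes.byref(proc), ctypes.byref(system)
        )
        if not ok:
            raise ctypes.WinError(ctypes.get_last_error())
        return proc.value

    before = process_mask()
    # Returns the previous thread mask, or 0 on failure.
    if not k32.SetThreadAffinityMask(k32.GetCurrentThread(), cpus_to_mask(cpus)):
        raise ctypes.WinError(ctypes.get_last_error())
    return before, process_mask()


if __name__ == "__main__":
    if sys.platform == "win32":
        before, after = pin_current_thread([0])
        print(f"process mask before: {before:#x}, after: {after:#x}")
    else:
        print("Thread pinning demo skipped: not running on Windows")
```

If the two process masks print the same value while the thread runs only on CPU 0, that would be consistent with the dialog showing all cores enabled for a rank that is nevertheless restricted.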