The Intel MPI Library performs process pinning automatically. It also provides a set of options to control the pinning behavior; see the description of the I_MPI_PIN_* environment variables in the Reference Manual for details.
To control the number of processes placed per node, use the mpirun -perhost option or the I_MPI_PERHOST environment variable.
For instance, use the following syntax for your example with Intel MPI:
$ mpirun -perhost 2 -env I_MPI_PIN_PROCESSOR_LIST 0,6 -n 10
Set I_MPI_DEBUG to 5 if you want to see the process pinning table.
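As a sketch of how the two suggestions fit together (the pinning list 0,6 and the application name ./your_app are placeholders, not values from your setup):

```shell
# Ask the library to print its pinning table at startup,
# then launch 10 ranks, 2 per node, pinned per the explicit list.
$ export I_MPI_DEBUG=5
$ mpirun -perhost 2 -env I_MPI_PIN_PROCESSOR_LIST 0,6 -n 10 ./your_app
```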
Does it answer your question?
Usually the default pinning scheme works well for most customers. Let us know if you have special requirements, and we can discuss possible solutions.
I would like to pin MPI processes across all CPU sockets. For example, I would like to run 10 MPI processes on a two-socket machine with 5 MPI processes on each socket. Could you please send me instructions for doing this?
As mentioned in my question, I have 10 MPI ranks that I would like to run on a 20-core node with 5 MPI ranks on each socket, so the number of ranks does not equal the number of cores. I don't want the 10 MPI ranks to run on a single socket; I would like each group of 5 MPI ranks to have its own NUMA region.
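Based on the options mentioned above, an explicit processor list along these lines would presumably give that layout — assuming socket 0 holds cores 0-9 and socket 1 holds cores 10-19 (the actual numbering can be checked with the cpuinfo utility shipped with Intel MPI), and with ./your_app standing in for the real executable:

```shell
# 10 ranks total: pin 5 to cores 0-4 (socket 0) and 5 to cores 10-14 (socket 1),
# so each group of 5 ranks stays in its own NUMA region.
$ mpirun -n 10 -env I_MPI_PIN_PROCESSOR_LIST 0-4,10-14 ./your_app
```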