Intel® MPI Library
Get help with building, analyzing, optimizing, and scaling high-performance computing (HPC) applications.

How to use processor pinning with I_MPI_PIN_PROCESSOR_LIST shift and preoffset

侯玉山
Novice

mpiexec -genv I_MPI_DEBUG=4 -genv I_MPI_PIN_PROCESSOR_LIST=allcores:shift=3,preoffset=2 -n 24 ./bt-mz.C.x

result:

[0] MPI startup(): Intel(R) MPI Library, Version 2021.1 Build 20201112 (id: b9c9d2fc5)
[0] MPI startup(): Copyright (C) 2003-2020 Intel Corporation. All rights reserved.
[0] MPI startup(): library kind: release
[0] MPI startup(): libfabric version: 1.11.0-impi
[0] MPI startup(): libfabric provider: tcp;ofi_rxm
[0] MPI startup(): Rank  Pid     Node name  Pin cpu
[0] MPI startup(): 0     389130  node1      2
[0] MPI startup(): 1     389131  node1      5
[0] MPI startup(): 2     389132  node1      8
[0] MPI startup(): 3     389133  node1      11
[0] MPI startup(): 4     389134  node1      14
[0] MPI startup(): 5     389135  node1      1
[0] MPI startup(): 6     389136  node1      3
[0] MPI startup(): 7     389137  node1      6
[0] MPI startup(): 8     389138  node1      9
[0] MPI startup(): 9     389139  node1      12
[0] MPI startup(): 10    389140  node1      15
[0] MPI startup(): 11    389141  node1      4
[0] MPI startup(): 12    389142  node1      7
[0] MPI startup(): 13    389143  node1      10
[0] MPI startup(): 14    389144  node1      13
[0] MPI startup(): 15    389145  node1      0
[0] MPI startup(): 16    389146  node1      2
[0] MPI startup(): 17    389147  node1      5
[0] MPI startup(): 18    389148  node1      8
[0] MPI startup(): 19    389149  node1      11
[0] MPI startup(): 20    389150  node1      14
[0] MPI startup(): 21    389151  node1      1
[0] MPI startup(): 22    389152  node1      3
[0] MPI startup(): 23    389153  node1      6

I want to know why the pin CPU numbers come out like this. Is there a formula?

 

PrasanthD_intel
Moderator

Hi,


As per the specified options, you can observe a shift of 3 and an offset of 2 in the core pinning. The options roughly translate to core (2 + 3n) for rank n.
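As a rough sanity check (assuming 16 physical cores, which the wrap-around in your output suggests, and treating the wrap as a simple modulo, which is only an approximation and not the library's documented algorithm), you can print what the naive formula predicts and compare it against the Pin cpu column:

# Naive prediction: core = (2 + 3*n) mod 16 for rank n.
# This matches the observed pinning for ranks 0-5 and diverges afterwards
# (e.g. it predicts core 4 for rank 6, while the debug output shows core 3).
for n in $(seq 0 23); do echo "rank $n -> core $(( (2 + 3*n) % 16 ))"; done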

Have you found any discrepancies in the core pinning?

How many cores are there? Could you provide us with the output of cpuinfo -g?


Also, the following is the syntax of I_MPI_PIN_PROCESSOR_LIST:

I_MPI_PIN_PROCESSOR_LIST=[<procset>][:[grain=<grain>][,shift=<shift>][,preoffset=<preoffset>][,postoffset=<postoffset>]]


If grain is not specified, then the syntax would be:

I_MPI_PIN_PROCESSOR_LIST=allcores:,shift=3,preoffset=2

However, with both variants, the pinning behavior doesn’t change.
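For example, applied to the command from the original post, the two variants are:

mpiexec -genv I_MPI_DEBUG=4 -genv I_MPI_PIN_PROCESSOR_LIST=allcores:shift=3,preoffset=2 -n 24 ./bt-mz.C.x
mpiexec -genv I_MPI_DEBUG=4 -genv I_MPI_PIN_PROCESSOR_LIST=allcores:,shift=3,preoffset=2 -n 24 ./bt-mz.C.x

Both should report the same Pin cpu mapping in the I_MPI_DEBUG=4 output.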


Regards

Prasanth


侯玉山
Novice

Hi, first of all, thank you for your answer.

 

Below is the output of cpuinfo:

===== Processor composition =====
Processor name : Intel(R) Xeon(R) E5-2650 v2
Packages(sockets) : 2
Cores : 16
Processors(CPUs) : 32
Cores per package : 8
Threads per core : 2

===== Processor identification =====
Processor  Thread Id.  Core Id.  Package Id.
0          0           0         0
1          0           1         0
2          0           2         0
3          0           3         0
4          0           4         0
5          0           5         0
6          0           6         0
7          0           7         0
8          0           0         1
9          0           1         1
10         0           2         1
11         0           3         1
12         0           4         1
13         0           5         1
14         0           6         1
15         0           7         1
16         1           0         0
17         1           1         0
18         1           2         0
19         1           3         0
20         1           4         0
21         1           5         0
22         1           6         0
23         1           7         0
24         1           0         1
25         1           1         1
26         1           2         1
27         1           3         1
28         1           4         1
29         1           5         1
30         1           6         1
31         1           7         1
===== Placement on packages =====
Package Id. Core Id. Processors
0 0,1,2,3,4,5,6,7 (0,16)(1,17)(2,18)(3,19)(4,20)(5,21)(6,22)(7,23)
1 0,1,2,3,4,5,6,7 (8,24)(9,25)(10,26)(11,27)(12,28)(13,29)(14,30)(15,31)

===== Cache sharing =====
Cache Size Processors
L1 32 KB (0,16)(1,17)(2,18)(3,19)(4,20)(5,21)(6,22)(7,23)(8,24)(9,25)(10,26)(11,27)(12,28)(13,29)(14,30)(15,31)
L2 256 KB (0,16)(1,17)(2,18)(3,19)(4,20)(5,21)(6,22)(7,23)(8,24)(9,25)(10,26)(11,27)(12,28)(13,29)(14,30)(15,31)
L3 20 MB (0,1,2,3,4,5,6,7,16,17,18,19,20,21,22,23)(8,9,10,11,12,13,14,15,24,25,26,27,28,29,30,31)

 

mpiexec -genv I_MPI_DEBUG=4 -genv I_MPI_PIN_PROCESSOR_LIST=allcores:shift=3,preoffset=2 -n 24 ./bt-mz.C.x

Grouping the "Pin cpu" values from the debug output by successive passes over the 16 physical cores (logical CPU pairs (0,16)(1,17)...(15,31)):

Pass 1 (ranks 0-4):   cores 2, 5, 8, 11, 14
Pass 2 (ranks 5-10):  cores 1, 3, 6, 9, 12, 15
Pass 3 (ranks 11-14): cores 4, 7, 10, 13
Pass 4 (ranks 15-20): cores 0, 2, 5, 8, 11, 14

 

I want to know how each pass is arranged; it doesn't seem to fit my settings. For example, the third pass starts with core 4.

PrasanthD_intel
Moderator

Hi,

We are working on your thread and have been in contact with the internal team about why the pinning does not match the expected pattern.

Thanks for waiting, we will get back to you soon.

Regards

Prasanth

PrasanthD_intel
Moderator

Hi,


I am escalating this thread to the internal team for a better explanation of the pinning behavior.


Regards

Prasanth


James_T_Intel
Moderator

This appears to be correct pinning. Using allcores will pin processes only by physical core, not by logical core. The cpuinfo output shows that you have Hyperthreading enabled, thus you have two logical cores per physical core. The preoffset of 2 specifies that the first rank will be moved by 2 cores, hence starting on core 2 instead of core 0. The shift of 3 says to move three cores for every new rank.


By using allcores, you are excluding cores numbered 16-31. Once you reach the end of your cores, pinning will cycle back to the start. We will shift to avoid pinning ranks to the same cores in most scenarios, but once all available cores have been used we will start oversubscribing cores. If you want to oversubscribe, you can explicitly set ranks to the same core (e.g. I_MPI_PIN_PROCESSOR_LIST=1,1,...).
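For instance, a minimal sketch using your binary (an explicit list assigns rank i to the i-th listed logical processor, so 8 entries cover the 8 ranks):

# Ranks 0-1 share logical processor 0, ranks 2-3 share 1, and so on,
# deliberately oversubscribing four physical cores.
mpiexec -genv I_MPI_DEBUG=4 -genv I_MPI_PIN_PROCESSOR_LIST=0,0,1,1,2,2,3,3 -n 8 ./bt-mz.C.x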


侯玉山
Novice

OK, thank you for your detailed reply. Based on your explanation, we now understand the behavior. Thank you!

James_T_Intel
Moderator

Intel customer support will no longer be monitoring this thread. Any further posts will be considered community only. For additional assistance on this topic, please post a new thread.

