Intel® MPI Library

Pinning problem on Intel MPI 2021.11

hakostra1
New Contributor II

I have a workstation with Ubuntu 22.04 running the 6.5.0-14-generic kernel and an Intel Core i9-13900K processor. This processor has 8 P-cores and 16 E-cores.

I have a CFD application, and my experience is that running on the 8 P-cores is about as fast as running on all 24 cores together.

With Intel MPI 2021.7.1, I set the environment variable I_MPI_PIN_PROCESSOR_LIST="0,2,4,6,8,10,12,14,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31". This meant that when I ran on only 8 cores, the 8 P-cores were used (one logical CPU per physical core), and when I ran on 24 cores, all the physical cores were used.
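To make the intended mapping concrete, here is a minimal Python sketch of what the list is expected to do. It is not part of Intel MPI: `expected_pinning` is a hypothetical helper, and it assumes that rank i is bound to the i-th entry of the list, which is consistent with the 2021.7.1 debug output in this post. It does not model what happens when there are more ranks than list entries.

```python
# The processor list from the post: first logical CPU of each P-core,
# followed by the 16 E-core CPUs.
PROCESSOR_LIST = (
    "0,2,4,6,8,10,12,14,"
    "16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31"
)


def expected_pinning(processor_list: str, nranks: int) -> dict[int, int]:
    """Return {rank: logical CPU}, assuming rank i takes the i-th list entry.

    Hypothetical helper for illustration only.
    """
    cpus = [int(c) for c in processor_list.split(",")]
    if nranks > len(cpus):
        raise ValueError("more ranks than entries in the processor list")
    return {rank: cpus[rank] for rank in range(nranks)}
```

With 8 ranks, the mapping covers logical CPUs 0, 2, ..., 14 (the P-cores); with 24 ranks, it covers every entry in the list.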

With Intel MPI 2021.7.1, setting I_MPI_DEBUG=5 gave me confirmation on the terminal that the pinning was indeed working:

[0] MPI startup(): Intel(R) MPI Library, Version 2021.7  Build 20221022 (id: f7b29a2495)
[0] MPI startup(): Copyright (C) 2003-2022 Intel Corporation.  All rights reserved.
[0] MPI startup(): library kind: release
[0] MPI startup(): libfabric version: 1.13.2rc1-impi
[0] MPI startup(): libfabric provider: tcp;ofi_rxm
[0] MPI startup(): File "/home/hakostra/KMT/mglet-8.6.2+40-g87906f9/IntelMPI/etc/tuning_skx_shm-ofi_tcp-ofi-rxm_10.dat" not found
[0] MPI startup(): Load tuning file: "/home/hakostra/KMT/mglet-8.6.2+40-g87906f9/IntelMPI/etc/tuning_skx_shm-ofi_tcp-ofi-rxm.dat"
[0] MPI startup(): Rank    Pid      Node name  Pin cpu
[0] MPI startup(): 0       731216   kmt-trd2   0
[0] MPI startup(): 1       731217   kmt-trd2   2
[0] MPI startup(): 2       731218   kmt-trd2   4
[0] MPI startup(): 3       731219   kmt-trd2   6
[0] MPI startup(): 4       731220   kmt-trd2   8
[0] MPI startup(): 5       731221   kmt-trd2   10
[0] MPI startup(): 6       731222   kmt-trd2   12
[0] MPI startup(): 7       731223   kmt-trd2   14
[0] MPI startup(): I_MPI_ROOT=/home/hakostra/KMT/mglet-8.6.2+40-g87906f9/IntelMPI
[0] MPI startup(): I_MPI_MPIRUN=mpirun
[0] MPI startup(): I_MPI_HYDRA_TOPOLIB=hwloc
[0] MPI startup(): I_MPI_PIN_PROCESSOR_LIST=0,2,4,6,8,10,12,14,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31
[0] MPI startup(): I_MPI_INTERNAL_MEM_POLICY=default
[0] MPI startup(): I_MPI_DEBUG=5

However, when I now try Intel MPI 2021.11 (part of the oneAPI 2024.0 toolkits) with the same settings, I get:

[0] MPI startup(): Intel(R) MPI Library, Version 2021.11  Build 20231005 (id: 74c4a23)
[0] MPI startup(): Copyright (C) 2003-2023 Intel Corporation.  All rights reserved.
[0] MPI startup(): library kind: release
[0] MPI startup(): File "" not found
[0] MPI startup(): Load tuning file: "/home/hakostra/KMT/el-8/IntelMPI/opt/mpi/etc/tuning_skx_shm.dat"
[0] MPI startup(): Rank    Pid      Node name  Pin cpu
[0] MPI startup(): 0       738139   kmt-trd2   0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31
[0] MPI startup(): 1       738140   kmt-trd2   0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31
[0] MPI startup(): 2       738141   kmt-trd2   0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31
[0] MPI startup(): 3       738142   kmt-trd2   0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31
[0] MPI startup(): 4       738143   kmt-trd2   0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31
[0] MPI startup(): 5       738144   kmt-trd2   0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31
[0] MPI startup(): 6       738145   kmt-trd2   0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31
[0] MPI startup(): 7       738146   kmt-trd2   0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31
[0] MPI startup(): I_MPI_ROOT=/home/hakostra/KMT/el-8/IntelMPI
[0] MPI startup(): I_MPI_MPIRUN=mpirun
[0] MPI startup(): I_MPI_BIND_WIN_ALLOCATE=localalloc
[0] MPI startup(): I_MPI_HYDRA_TOPOLIB=hwloc
[0] MPI startup(): I_MPI_RETURN_WIN_MEM_NUMA=-1
[0] MPI startup(): I_MPI_PIN=1
[0] MPI startup(): I_MPI_PIN_PROCESSOR_LIST=0,2,4,6,8,10,12,14,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31
[0] MPI startup(): I_MPI_PIN_RESPECT_CPUSET=0
[0] MPI startup(): I_MPI_PIN_RESPECT_HCA=0
[0] MPI startup(): I_MPI_INTERNAL_MEM_POLICY=default
[0] MPI startup(): I_MPI_FABRICS=shm
[0] MPI startup(): I_MPI_DEBUG=5

In other words, every rank is now bound to all 32 logical CPUs, so it seems that the pinning is not applied properly.
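The comparison between the two logs can also be automated. The following is a minimal Python sketch, assuming the "Rank / Pid / Node name / Pin cpu" row layout shown in the I_MPI_DEBUG=5 output above. `parse_pin_table` and `unpinned_ranks` are hypothetical helper names, not Intel MPI tools. Rows whose CPU list wraps onto a second line are only partially parsed, which is still enough to flag them as not pinned to a single CPU.

```python
import re

# Matches rows such as:
#   [0] MPI startup(): 3       731219   kmt-trd2   6
# The CPU list may optionally be wrapped in curly brackets.
ROW = re.compile(
    r"^\[\d+\] MPI startup\(\): (\d+)\s+(\d+)\s+(\S+)\s+\{?([\d,]+)\}?\s*$"
)


def parse_pin_table(log_text: str) -> dict[int, list[int]]:
    """Return {rank: [logical CPUs]} from I_MPI_DEBUG startup output."""
    pins: dict[int, list[int]] = {}
    for line in log_text.splitlines():
        match = ROW.match(line.strip())
        if match:
            rank = int(match.group(1))
            pins[rank] = [int(c) for c in match.group(4).split(",") if c]
    return pins


def unpinned_ranks(pins: dict[int, list[int]]) -> list[int]:
    """Ranks whose CPU mask contains anything other than exactly one CPU."""
    return sorted(rank for rank, cpus in pins.items() if len(cpus) != 1)
```

Applied to the 2021.7.1 log, `unpinned_ranks` would return an empty list; applied to the 2021.11 log, it would return all eight ranks.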

I have tried a number of different variables (I_MPI_PIN_RESPECT_CPUSET=0, I_MPI_PIN_RESPECT_HCA=0), and none of them seems to affect the pinning at all.

Does anyone have any advice or insight into the issue?

Thanks in advance.

 

TobiasK
Moderator

@hakostra1 
Could you please try setting:

I_MPI_PIN=1 I_MPI_PIN_CELL=core I_MPI_PIN_DOMAIN=1 I_MPI_PIN_ORDER=compact

instead of defining I_MPI_PIN_PROCESSOR_LIST?

hakostra1
New Contributor II

That does not appear to work:

[0] MPI startup(): Intel(R) MPI Library, Version 2021.11  Build 20231005 (id: 74c4a23)
[0] MPI startup(): Copyright (C) 2003-2023 Intel Corporation.  All rights reserved.
[0] MPI startup(): library kind: release
[0] MPI startup(): libfabric version: 1.18.1-impi
[0] MPI startup(): libfabric provider: tcp
[0] MPI startup(): File "" not found
[0] MPI startup(): Load tuning file: "/home/hakostra/KMT/el-8/IntelMPI/opt/mpi/etc/tuning_skx_shm-ofi.dat"
[0] MPI startup(): ===== Nic pinning on kmt-trd2 =====
[0] MPI startup(): Rank	Pin nic
[0] MPI startup(): 0	enp9s0f0
[0] MPI startup(): 1	enp9s0f0
[0] MPI startup(): 2	enp9s0f0
[0] MPI startup(): 3	enp9s0f0
[0] MPI startup(): 4	enp9s0f0
[0] MPI startup(): 5	enp9s0f0
[0] MPI startup(): 6	enp9s0f0
[0] MPI startup(): 7	enp9s0f0
[0] MPI startup(): Rank    Pid      Node name  Pin cpu
[0] MPI startup(): 0       1164811  kmt-trd2   {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,
                                 30,31}
[0] MPI startup(): 1       1164812  kmt-trd2   {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,
                                 30,31}
[0] MPI startup(): 2       1164813  kmt-trd2   {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,
                                 30,31}
[0] MPI startup(): 3       1164814  kmt-trd2   {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,
                                 30,31}
[0] MPI startup(): 4       1164815  kmt-trd2   {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,
                                 30,31}
[0] MPI startup(): 5       1164816  kmt-trd2   {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,
                                 30,31}
[0] MPI startup(): 6       1164817  kmt-trd2   {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,
                                 30,31}
[0] MPI startup(): 7       1164818  kmt-trd2   {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,
                                 30,31}
[0] MPI startup(): I_MPI_ROOT=/home/hakostra/KMT/el-8/IntelMPI
[0] MPI startup(): I_MPI_MPIRUN=mpirun
[0] MPI startup(): I_MPI_BIND_WIN_ALLOCATE=localalloc
[0] MPI startup(): I_MPI_HYDRA_TOPOLIB=hwloc
[0] MPI startup(): I_MPI_RETURN_WIN_MEM_NUMA=-1
[0] MPI startup(): I_MPI_PIN=1
[0] MPI startup(): I_MPI_PIN_CELL=core
[0] MPI startup(): I_MPI_PIN_DOMAIN=1
[0] MPI startup(): I_MPI_PIN_ORDER=compact
[0] MPI startup(): I_MPI_INTERNAL_MEM_POLICY=default
[0] MPI startup(): I_MPI_DEBUG=5

The only apparent difference is that there are now curly brackets { } around the pinned core list.

 

And here is the output of the cpuinfo tool:

cpuinfo 
Intel(R) processor family information utility, Version 2021.11 Build 20231005 (id: 74c4a23)
Copyright (C) 2005-2023 Intel Corporation.  All rights reserved.

=====  Processor composition  =====
Processor name    : 13th Gen Intel(R) Core(TM) i9-13900K  
Packages(sockets) : 1
Cores             : 24
Processors(CPUs)  : 32
Cores per package : 24
Threads per core  : 1

=====  Processor identification  =====
Processor	Thread Id.	Core Id.	Package Id.
0       	0   		0   		0   
1       	1   		0   		0   
2       	0   		4   		0   
3       	1   		4   		0   
4       	0   		8   		0   
5       	1   		8   		0   
6       	0   		12  		0   
7       	1   		12  		0   
8       	0   		16  		0   
9       	1   		16  		0   
10      	0   		20  		0   
11      	1   		20  		0   
12      	0   		24  		0   
13      	1   		24  		0   
14      	0   		28  		0   
15      	1   		28  		0   
16      	0   		32  		0   
17      	0   		33  		0   
18      	0   		34  		0   
19      	0   		35  		0   
20      	0   		36  		0   
21      	0   		37  		0   
22      	0   		38  		0   
23      	0   		39  		0   
24      	0   		40  		0   
25      	0   		41  		0   
26      	0   		42  		0   
27      	0   		43  		0   
28      	0   		44  		0   
29      	0   		45  		0   
30      	0   		46  		0   
31      	0   		47  		0   
=====  Placement on packages  =====
Package Id.	Core Id.	Processors
0   		0,4,8,12,16,20,24,28,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47		(0,1)(2,3)(4,5)(6,7)(8,9)(10,11)(12,13)(14,15)16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31

=====  Cache sharing  =====
Cache	Size		Processors
L1	48  KB		(0,1)(2,3)(4,5)(6,7)(8,9)(10,11)(12,13)(14,15)
L2	2   MB		(0,1)(2,3)(4,5)(6,7)(8,9)(10,11)(12,13)(14,15)(16,17,18,19)(20,21,22,23)(24,25,26,27)(28,29,30,31)
L3	36  MB		(0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31)
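As a cross-check, the processor list used in the original post can be derived from the "Processor identification" table above. This is a minimal Python sketch, not an Intel tool: `first_thread_cpus` is a hypothetical helper, the rows are copied from the cpuinfo output, and it assumes that keeping only the logical CPUs with Thread Id. 0 gives one logical CPU per physical core. It does not detect core types; the P-cores come first only because they have the lowest CPU numbers on this machine.

```python
# Columns: Processor, Thread Id., Core Id., Package Id. (from cpuinfo above)
CPUINFO_ROWS = """
0 0 0 0
1 1 0 0
2 0 4 0
3 1 4 0
4 0 8 0
5 1 8 0
6 0 12 0
7 1 12 0
8 0 16 0
9 1 16 0
10 0 20 0
11 1 20 0
12 0 24 0
13 1 24 0
14 0 28 0
15 1 28 0
16 0 32 0
17 0 33 0
18 0 34 0
19 0 35 0
20 0 36 0
21 0 37 0
22 0 38 0
23 0 39 0
24 0 40 0
25 0 41 0
26 0 42 0
27 0 43 0
28 0 44 0
29 0 45 0
30 0 46 0
31 0 47 0
"""


def first_thread_cpus(rows_text: str) -> list[int]:
    """Return the logical CPUs whose thread id is 0, in table order."""
    cpus = []
    for line in rows_text.splitlines():
        fields = line.split()
        if len(fields) != 4:
            continue  # skip blank or malformed lines
        cpu, thread, _core, _package = (int(f) for f in fields)
        if thread == 0:
            cpus.append(cpu)
    return cpus


# Comma-separated string in the format used by I_MPI_PIN_PROCESSOR_LIST.
processor_list = ",".join(str(c) for c in first_thread_cpus(CPUINFO_ROWS))
```

For this table, `processor_list` has 24 entries and matches the value of I_MPI_PIN_PROCESSOR_LIST given in the original post.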

 

Rafael_L_Intel
Employee

Hello hakostra1,

 

I could reproduce the behaviour on one of our machines. I'll let you know once I have a response from our engineers.

 

Cheers,

Rafael

hakostra1
New Contributor II

That's good to hear; reproducing these issues is often difficult. Pinning is a key feature in any MPI implementation, and I hope that you figure out what's going on here. Thanks for the feedback so far!

Rafael_L_Intel
Employee

Hi hakostra1,

 

In Intel MPI 2021.11, we disabled the pinning of MPI ranks on processors with a mix of P- and E-cores. We recommend reverting to 2021.9 for the time being. We are discussing how to properly address the issue in future releases of Intel MPI.
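For readers who cannot revert, one possible stopgap is for each process to set its own CPU affinity. The following is only a sketch under stated assumptions and is not an Intel-recommended method: it is Linux-only, it assumes the launcher exports the rank in the `PMI_RANK` environment variable, and `cpu_for_rank` and `pin_current_process` are hypothetical helpers. Wrapping around when there are more ranks than list entries is a choice made for this sketch. The same idea applies to compiled applications through the `sched_setaffinity(2)` system call.

```python
import os


def cpu_for_rank(rank: int, processor_list: str) -> int:
    """Pick the logical CPU for a rank from a comma-separated CPU list.

    Wraps around if there are more ranks than entries (a choice made
    for this sketch, not documented Intel MPI behaviour).
    """
    cpus = [int(c) for c in processor_list.split(",")]
    return cpus[rank % len(cpus)]


def pin_current_process(processor_list: str) -> int:
    """Bind the calling process to one CPU and return that CPU (Linux only).

    Assumes the MPI launcher sets PMI_RANK in the environment.
    """
    rank = int(os.environ["PMI_RANK"])
    cpu = cpu_for_rank(rank, processor_list)
    os.sched_setaffinity(0, {cpu})  # pid 0 means the calling process
    return cpu
```

A process would call `pin_current_process("0,2,4,6,8,10,12,14")` early at startup, before spawning threads, and could confirm the result with `os.sched_getaffinity(0)`.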

 

Cheers,

Rafael
