I have a workstation with Ubuntu 22.04 running the 6.5.0-14-generic kernel and an Intel Core i9-13900K processor. This processor has 8 P-cores and 16 E-cores.
I have a CFD application, and in my experience running on the 8 P-cores is about as fast as running on all 24 cores together.
With Intel MPI 2021.7.1 I set the environment variable I_MPI_PIN_PROCESSOR_LIST="0,2,4,6,8,10,12,14,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31". This means that when I ran on only 8 ranks, the 8 physical cores corresponding to the P-cores were used, and when I ran on 24 ranks, all the physical cores were used.
With Intel MPI 2021.7.1 and I_MPI_DEBUG=5 set, I got confirmation on the terminal that the pinning was indeed working:
[0] MPI startup(): Intel(R) MPI Library, Version 2021.7 Build 20221022 (id: f7b29a2495)
[0] MPI startup(): Copyright (C) 2003-2022 Intel Corporation. All rights reserved.
[0] MPI startup(): library kind: release
[0] MPI startup(): libfabric version: 1.13.2rc1-impi
[0] MPI startup(): libfabric provider: tcp;ofi_rxm
[0] MPI startup(): File "/home/hakostra/KMT/mglet-8.6.2+40-g87906f9/IntelMPI/etc/tuning_skx_shm-ofi_tcp-ofi-rxm_10.dat" not found
[0] MPI startup(): Load tuning file: "/home/hakostra/KMT/mglet-8.6.2+40-g87906f9/IntelMPI/etc/tuning_skx_shm-ofi_tcp-ofi-rxm.dat"
[0] MPI startup(): Rank Pid Node name Pin cpu
[0] MPI startup(): 0 731216 kmt-trd2 0
[0] MPI startup(): 1 731217 kmt-trd2 2
[0] MPI startup(): 2 731218 kmt-trd2 4
[0] MPI startup(): 3 731219 kmt-trd2 6
[0] MPI startup(): 4 731220 kmt-trd2 8
[0] MPI startup(): 5 731221 kmt-trd2 10
[0] MPI startup(): 6 731222 kmt-trd2 12
[0] MPI startup(): 7 731223 kmt-trd2 14
[0] MPI startup(): I_MPI_ROOT=/home/hakostra/KMT/mglet-8.6.2+40-g87906f9/IntelMPI
[0] MPI startup(): I_MPI_MPIRUN=mpirun
[0] MPI startup(): I_MPI_HYDRA_TOPOLIB=hwloc
[0] MPI startup(): I_MPI_PIN_PROCESSOR_LIST=0,2,4,6,8,10,12,14,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31
[0] MPI startup(): I_MPI_INTERNAL_MEM_POLICY=default
[0] MPI startup(): I_MPI_DEBUG=5
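For completeness, the same check can also be made from inside each rank, independent of I_MPI_DEBUG; a minimal sketch (Linux-only, no MPI calls needed; reading the rank from PMI_RANK is an assumption about what the hydra launcher exports, with a fallback for stand-alone runs):

```python
import os

# Print the set of logical CPUs this process may run on; with the
# pinning above, rank 0 should report [0], rank 1 [2], and so on.
rank = int(os.environ.get("PMI_RANK", "0"))  # launcher-provided (assumption)
cpus = sorted(os.sched_getaffinity(0))       # current CPU affinity mask
print(f"rank {rank}: allowed CPUs {cpus}")
```

Run one instance per rank (e.g. via `mpirun -n 8 python3 check_affinity.py`) and compare the printed sets against the intended pin list.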
However, with Intel MPI 2021.11 (part of the oneAPI 2024.0 toolkit) and the same settings, I get:
[0] MPI startup(): Intel(R) MPI Library, Version 2021.11 Build 20231005 (id: 74c4a23)
[0] MPI startup(): Copyright (C) 2003-2023 Intel Corporation. All rights reserved.
[0] MPI startup(): library kind: release
[0] MPI startup(): File "" not found
[0] MPI startup(): Load tuning file: "/home/hakostra/KMT/el-8/IntelMPI/opt/mpi/etc/tuning_skx_shm.dat"
[0] MPI startup(): Rank Pid Node name Pin cpu
[0] MPI startup(): 0 738139 kmt-trd2 0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31
[0] MPI startup(): 1 738140 kmt-trd2 0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31
[0] MPI startup(): 2 738141 kmt-trd2 0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31
[0] MPI startup(): 3 738142 kmt-trd2 0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31
[0] MPI startup(): 4 738143 kmt-trd2 0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31
[0] MPI startup(): 5 738144 kmt-trd2 0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31
[0] MPI startup(): 6 738145 kmt-trd2 0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31
[0] MPI startup(): 7 738146 kmt-trd2 0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31
[0] MPI startup(): I_MPI_ROOT=/home/hakostra/KMT/el-8/IntelMPI
[0] MPI startup(): I_MPI_MPIRUN=mpirun
[0] MPI startup(): I_MPI_BIND_WIN_ALLOCATE=localalloc
[0] MPI startup(): I_MPI_HYDRA_TOPOLIB=hwloc
[0] MPI startup(): I_MPI_RETURN_WIN_MEM_NUMA=-1
[0] MPI startup(): I_MPI_PIN=1
[0] MPI startup(): I_MPI_PIN_PROCESSOR_LIST=0,2,4,6,8,10,12,14,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31
[0] MPI startup(): I_MPI_PIN_RESPECT_CPUSET=0
[0] MPI startup(): I_MPI_PIN_RESPECT_HCA=0
[0] MPI startup(): I_MPI_INTERNAL_MEM_POLICY=default
[0] MPI startup(): I_MPI_FABRICS=shm
[0] MPI startup(): I_MPI_DEBUG=5
In other words, the pinning is not applied at all; every rank is allowed to run on all 32 logical CPUs.
I have tried a number of other variables (e.g. I_MPI_PIN_RESPECT_CPUSET=0, I_MPI_PIN_RESPECT_HCA=0), but none of them affects the pinning.
Does anyone have advice on or insight into this issue?
Thanks in advance.
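One stopgap that occurred to me is letting each rank pin itself at startup instead of relying on the launcher; a rough sketch (reading the local rank from MPI_LOCALRANKID is an assumption about the launcher environment, and the CPU numbering matches my i9-13900K):

```python
import os

# One hyperthread of each P-core (0,2,...,14), then the E-cores (16..31):
# the same order as my I_MPI_PIN_PROCESSOR_LIST above.
PIN_LIST = list(range(0, 16, 2)) + list(range(16, 32))

def pin_self(local_rank: int) -> None:
    """Pin the calling process to one logical CPU chosen from PIN_LIST."""
    cpu = PIN_LIST[local_rank % len(PIN_LIST)]
    # Only pin to a CPU this machine actually has, so the sketch also
    # runs unchanged on smaller machines.
    if cpu in os.sched_getaffinity(0):
        os.sched_setaffinity(0, {cpu})

# MPI_LOCALRANKID is launcher-provided (assumption); default to 0 otherwise.
pin_self(int(os.environ.get("MPI_LOCALRANKID", "0")))
```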
@hakostra1
Could you please try setting:
I_MPI_PIN=1 I_MPI_PIN_CELL=core I_MPI_PIN_DOMAIN=1 I_MPI_PIN_ORDER=compact
instead of defining I_MPI_PIN_PROCESSOR_LIST?
That does not appear to work either:
[0] MPI startup(): Intel(R) MPI Library, Version 2021.11 Build 20231005 (id: 74c4a23)
[0] MPI startup(): Copyright (C) 2003-2023 Intel Corporation. All rights reserved.
[0] MPI startup(): library kind: release
[0] MPI startup(): libfabric version: 1.18.1-impi
[0] MPI startup(): libfabric provider: tcp
[0] MPI startup(): File "" not found
[0] MPI startup(): Load tuning file: "/home/hakostra/KMT/el-8/IntelMPI/opt/mpi/etc/tuning_skx_shm-ofi.dat"
[0] MPI startup(): ===== Nic pinning on kmt-trd2 =====
[0] MPI startup(): Rank Pin nic
[0] MPI startup(): 0 enp9s0f0
[0] MPI startup(): 1 enp9s0f0
[0] MPI startup(): 2 enp9s0f0
[0] MPI startup(): 3 enp9s0f0
[0] MPI startup(): 4 enp9s0f0
[0] MPI startup(): 5 enp9s0f0
[0] MPI startup(): 6 enp9s0f0
[0] MPI startup(): 7 enp9s0f0
[0] MPI startup(): Rank Pid Node name Pin cpu
[0] MPI startup(): 0 1164811 kmt-trd2 {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31}
[0] MPI startup(): 1 1164812 kmt-trd2 {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31}
[0] MPI startup(): 2 1164813 kmt-trd2 {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31}
[0] MPI startup(): 3 1164814 kmt-trd2 {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31}
[0] MPI startup(): 4 1164815 kmt-trd2 {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31}
[0] MPI startup(): 5 1164816 kmt-trd2 {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31}
[0] MPI startup(): 6 1164817 kmt-trd2 {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31}
[0] MPI startup(): 7 1164818 kmt-trd2 {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31}
[0] MPI startup(): I_MPI_ROOT=/home/hakostra/KMT/el-8/IntelMPI
[0] MPI startup(): I_MPI_MPIRUN=mpirun
[0] MPI startup(): I_MPI_BIND_WIN_ALLOCATE=localalloc
[0] MPI startup(): I_MPI_HYDRA_TOPOLIB=hwloc
[0] MPI startup(): I_MPI_RETURN_WIN_MEM_NUMA=-1
[0] MPI startup(): I_MPI_PIN=1
[0] MPI startup(): I_MPI_PIN_CELL=core
[0] MPI startup(): I_MPI_PIN_DOMAIN=1
[0] MPI startup(): I_MPI_PIN_ORDER=compact
[0] MPI startup(): I_MPI_INTERNAL_MEM_POLICY=default
[0] MPI startup(): I_MPI_DEBUG=5
The only apparent difference is that the pinned core list is now wrapped in curly brackets { }.
And here is the output of the cpuinfo tool:
cpuinfo
Intel(R) processor family information utility, Version 2021.11 Build 20231005 (id: 74c4a23)
Copyright (C) 2005-2023 Intel Corporation. All rights reserved.
===== Processor composition =====
Processor name : 13th Gen Intel(R) Core(TM) i9-13900K
Packages(sockets) : 1
Cores : 24
Processors(CPUs) : 32
Cores per package : 24
Threads per core : 1
===== Processor identification =====
Processor Thread Id. Core Id. Package Id.
0 0 0 0
1 1 0 0
2 0 4 0
3 1 4 0
4 0 8 0
5 1 8 0
6 0 12 0
7 1 12 0
8 0 16 0
9 1 16 0
10 0 20 0
11 1 20 0
12 0 24 0
13 1 24 0
14 0 28 0
15 1 28 0
16 0 32 0
17 0 33 0
18 0 34 0
19 0 35 0
20 0 36 0
21 0 37 0
22 0 38 0
23 0 39 0
24 0 40 0
25 0 41 0
26 0 42 0
27 0 43 0
28 0 44 0
29 0 45 0
30 0 46 0
31 0 47 0
===== Placement on packages =====
Package Id. Core Id. Processors
0 0,4,8,12,16,20,24,28,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47 (0,1)(2,3)(4,5)(6,7)(8,9)(10,11)(12,13)(14,15)16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31
===== Cache sharing =====
Cache Size Processors
L1 48 KB (0,1)(2,3)(4,5)(6,7)(8,9)(10,11)(12,13)(14,15)
L2 2 MB (0,1)(2,3)(4,5)(6,7)(8,9)(10,11)(12,13)(14,15)(16,17,18,19)(20,21,22,23)(24,25,26,27)(28,29,30,31)
L3 36 MB (0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31)
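The identification table above is also enough to derive the pin list programmatically: keep the first logical CPU seen for each core id. A small sketch, with the (processor, core id) pairs transcribed from the cpuinfo output (P-cores expose two logical CPUs per core id, E-cores one):

```python
# (logical CPU, core id) pairs matching the cpuinfo table above:
# CPUs 0..15 are P-core hyperthreads (core ids 0,4,...,28),
# CPUs 16..31 are E-cores (core ids 32..47).
pairs = [(cpu, 4 * (cpu // 2)) for cpu in range(16)] \
      + [(cpu, cpu + 16) for cpu in range(16, 32)]

def first_cpu_per_core(pairs):
    """Return the first logical CPU listed for each distinct core id."""
    seen, result = set(), []
    for cpu, core in pairs:
        if core not in seen:
            seen.add(core)
            result.append(cpu)
    return result

pin_list = first_cpu_per_core(pairs)
print(pin_list)  # one logical CPU per physical core, 24 entries
```

The result is exactly the 0,2,...,14,16,...,31 list used for I_MPI_PIN_PROCESSOR_LIST above.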
Hello hakostra1,
I could reproduce the behaviour on one of our machines. I'll let you know once I have a position from our engineers.
Cheers,
Rafael
That's good to hear; reproducing these issues is often difficult. Pinning is a key feature of any MPI implementation, and I hope you figure out what's going on here. Thanks for the feedback so far!
Hi hakostra1,
We disabled the pinning of MPI ranks for scenarios with mixed P- and E-cores in Intel MPI 2021.11. We recommend reverting to 2021.9 for the time being. We are discussing how to properly address the issue in future releases of Intel MPI.
Cheers,
Rafael