Intel® MPI Library
Get help with building, analyzing, optimizing, and scaling high-performance computing (HPC) applications.

Linux Intel(R) MPI Library, Version 2021.8 Build 20221129 (id: 339ec755a1) pinning problem

ALaza1
Novice

User-specified pinning works on the first two hosts but doesn't appear to be correct on a third host. The problem might be that two of the hosts have 18 cores each while the third host has 8 cores.
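For reference, user-specified pinning in Intel MPI is usually expressed through the pinning environment variables, and the resulting map can be checked with I_MPI_DEBUG; a minimal sketch (the core list and rank count are illustrative, not my actual settings):

export I_MPI_PIN_PROCESSOR_LIST=0-17    # illustrative explicit core list
mpirun -genv I_MPI_DEBUG 5 -host handel1 -n 18 ./a.out    # debug level 5 prints the pinning map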

RAM = 256 GB LRDIMM 2400

Linux = Ubuntu 22

Hosts handel1 and elgar1 are using

Processor name : Intel(R) Xeon(R) E5-2697 v4

Host mirella1 uses

Processor name : Intel(R) Xeon(R) E5-2667 v4

Linux mirella 5.15.0-57-generic #63-Ubuntu SMP Thu Nov 24 13:43:17 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux

Each system uses an X550-T2 NIC with the Linux "bonding" option Adaptive Load Balancing.

The December oneAPI release is installed on mirella:

art@mirella:~$ ifort --version
ifort (IFORT) 2021.8.0 20221119
Copyright (C) 1985-2022 Intel Corporation. All rights reserved.

art@mirella:~$ mpiexec --version
Intel(R) MPI Library for Linux* OS, Version 2021.8 Build 20221129 (id: 339ec755a1)
Copyright 2003-2022, Intel Corporation.
Handel1 and elgar1 have only the oneAPI MPI installed.

Installed on mirella1:

l_HPCKit_p_2023.0.0.25400_offline.sh

l_BaseKit_p_2023.0.0.25537_offline.sh

 

ShivaniK_Intel
Moderator

Hi,


Thanks for posting in the Intel forums.


As you have mentioned that you are using Ubuntu 22, note that this is not a version supported by Intel MPI. For more details regarding the system requirements, please refer to the link below.


https://www.intel.com/content/www/us/en/developer/articles/system-requirements/mpi-library-system-requirements.html


Could you please let us know whether you face any issues using the supported version of the OS?


Thanks & Regards

Shivani



ALaza1
Novice

I installed Ubuntu 20.04.5 on the 4 host systems here: faure, handel, elgar, and mirella, then installed Intel's December updates (Base and HPC kits). I ran tests using 1, 2, 3, and 4 hosts. The problem doesn't appear with just 1 or 2 hosts (both 18-core CPUs), but 3- and 4-host runs both display bad pinning, starting with host 3 (8-core Xeon) and again with host 4 (6-core CPU). The test problem runs to completion with very good wall-clock timings.

The zip includes the test runs for two differently sized versions of the same source code, the NPB FT benchmark. The makefile produces an executable for a specific number of MPI ranks, with the problem size equally divided over all the ranks; the number of ranks is constrained to a power of 2 (a build sketch follows the size listing).

Size CLASS C =

Size : 512x 512x 512
Iterations : 20

 

Size CLASS D =

Size : 2048x1024x1024
Iterations : 25
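For context, the NPB-MPI build bakes the class and rank count into the binary name; a minimal sketch assuming an NPB 3.3-style source tree (the directory name is an assumption, not from this thread):

cd NPB3.3-MPI                # assumed NPB release directory
# config/make.def must first point MPIF77 at the Intel MPI wrapper (e.g. mpiifort)
make ft CLASS=C NPROCS=32    # produces bin/ft.C.32
make ft CLASS=D NPROCS=32    # produces bin/ft.D.32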

My hosts elgar and handel each have a single Xeon (Broadwell) with 18 cores. Hyperthreading is disabled.

Host mirella has a single Xeon (Broadwell) with 8 cores.

Host faure has a single current-generation CPU with 6 cores.

Tested system configurations:

1 host, 16 ranks: host handel

2 hosts, 32 ranks: hosts handel and elgar, 16 ranks each

3 hosts, 32 ranks: hosts handel and elgar with 12 ranks each, and host mirella with 8 ranks

4 hosts, 32 ranks: hosts handel and elgar with 9 ranks each, host mirella with 8 ranks, and host faure with 6 ranks

 

Directory of C:\cygwin64\home\art\ubuntu_20.4.5_runs


01/17/2023 08:56 PM 4,529 ft_16_1_host_class_C.txt
01/17/2023 08:56 PM 4,825 ft_16_1_host_class_D.txt
01/17/2023 08:56 PM 5,237 ft_32_2_host_class_C.txt
01/17/2023 08:56 PM 5,582 ft_32_2_host_class_D.txt
01/17/2023 08:56 PM 5,209 ft_32_3_host_class_C.txt
01/17/2023 08:56 PM 5,554 ft_32_3_host_class_D.txt
01/17/2023 08:56 PM 5,058 ft_32_4_host_class_C.txt
01/17/2023 08:56 PM 5,403 ft_32_4_host_class_D.txt
01/18/2023 02:10 PM 1,358 mirella_config.txt
9 File(s) 42,755 bytes

The file mirella_config.txt includes the Ubuntu and Intel version numbers and a copy of the run script. Only the specific -host entries are adjusted to configure each test run; the runs otherwise use the same MPI command arguments (see the sketch below).
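A sketch of the shape of such a run script, reconstructed from the description above (the binary name and debug level are assumptions; the rank split shown is the 3-host case):

#!/bin/bash
# Only the -host/-n entries change between test configurations.
mpirun -genv I_MPI_DEBUG 5 \
       -host handel1 -n 12 ./ft.C.32 : \
       -host elgar1 -n 12 ./ft.C.32 : \
       -host mirella1 -n 8 ./ft.C.32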

I did make one non-MPI change: I switched my bonding configuration from Adaptive Load Balancing to Round-Robin after a test revealed better throughput, along with round-robin's ability to use both of its slave NICs when only two hosts are being tested; Adaptive Load Balancing only kicks in on a 3-host system.
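For reference, the mode change amounts to something like this in netplan on Ubuntu (a sketch only; the file name, interface names, and address are assumptions, not taken from my systems):

# /etc/netplan/01-bond.yaml (hypothetical)
network:
  version: 2
  ethernets:
    enp1s0f0: {}
    enp1s0f1: {}
  bonds:
    bond0:
      interfaces: [enp1s0f0, enp1s0f1]
      parameters:
        mode: balance-rr    # Round-Robin; previously balance-alb (Adaptive Load Balancing)
      addresses: [192.168.1.10/24]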

Here are the clock times for each test.

ft_16_1_host_class_C.txt: Time in seconds = 20.18
ft_16_1_host_class_D.txt: Time in seconds = 490.83
ft_32_2_host_class_C.txt: Time in seconds = 16.93
ft_32_2_host_class_D.txt: Time in seconds = 321.46
ft_32_3_host_class_C.txt: Time in seconds = 13.07
ft_32_3_host_class_D.txt: Time in seconds = 279.54
ft_32_4_host_class_C.txt: Time in seconds = 14.64
ft_32_4_host_class_D.txt: Time in seconds = 260.59

regards,
Art

 

ALaza1
Novice

I'll give Ubuntu 20.04 a try and let you know.

I used to experience (and reported) pinning problems like this on CentOS 7, starting with Intel's 2019 MPI.

FYI, even with the pinning problem this test runs on 4 hosts (32 ranks) in about 302 sec. The Windows version doesn't have this pinning problem, but its best run time on the same hardware and network is 370 sec.

 

regards,

Art

ShivaniK_Intel
Moderator

Hi,


Could you please confirm the correct CPU configuration for the host named mirella? You have mentioned two different CPU configurations.


Could you please provide us with the output from the cpuinfo command for each host involved?


Could you please build and run the sample hello world program using the command line below, and provide us with the pinning details of all 4 hosts?


mpiifort $I_MPI_ROOT/test/test.f90 -o hello
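For example, a subsequent run along these lines prints the pinning map (the host names and rank counts here are only illustrative):

mpirun -genv I_MPI_DEBUG 5 -host handel1 -n 4 ./hello : -host mirella1 -n 4 ./hello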


Thanks & Regards

Shivani


ShivaniK_Intel
Moderator


Hi,


As we didn't hear back from you, could you please provide the details requested in my previous post so that we can investigate your issue further?


Thanks & Regards

Shivani


ALaza1
Novice

Mirella, handel, and elgar are all E5-26xx v4 CPUs. Mirella has 8 cores; handel and elgar have 18 cores each, and each has 256 GB LRDIMM. Faure is a 6-core i5-12500 with 64 GB DIMM. Hyperthreading is disabled and OMP_NUM_THREADS=1.
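For completeness, the core/thread layout on each host can be confirmed with standard tooling (nothing host-specific is assumed here):

lscpu | grep -E 'Model name|Socket|Core|Thread'
cpuinfo    # Intel MPI's bundled topology utility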

BTW each of these systems has a Windows 10 drive, and both of my examples run OK using Windows 10.

I ran "hello" using two different pinning selections:
1)
mpirun \
-host handel1 -n 12 ./hello : \
-host elgar1 -n 12 ./hello : \
-host mirella1 -n 4 ./hello : \
-host faure1 -n 4 ./hello

2)
mpirun \
-host handel1 -n 9 ./hello : \
-host elgar1 -n 9 ./hello : \
-host mirella1 -n 8 ./hello : \
-host faure1 -n 6 ./hello
The pinning problem appears in both cases, and as with my own test code, hello ran OK.

Sorry for the delay... I've been doing some work converting my Windows 10 compiler/MPI environment to the January update.

Regards,

Art

ShivaniK_Intel
Moderator

Hi,


Could you please remove I_MPI_FABRICS=shm:tcp, as this is no longer a valid option? 


For more details on how the interconnect layer is currently configured and controlled, please refer to the below link.


https://www.intel.com/content/www/us/en/developer/articles/technical/mpi-library-2019-over-libfabric.html


This is not relevant to the issue at hand, but it is an easy fix to remove a warning.
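For reference, the current way to get the old shm:tcp behavior is to select the OFI fabric with the TCP libfabric provider; a sketch (one option among several):

export I_MPI_FABRICS=shm:ofi    # shared memory within a node, libfabric (OFI) between nodes
export FI_PROVIDER=tcp          # select the TCP OFI provider
mpirun -genv I_MPI_DEBUG 5 -n 8 ./hello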


Could you please try each of the following scenarios and provide us with the output?


mpirun -genv I_MPI_DEBUG 5 -host mirella1 -n 8 ./hello

mpirun -genv I_MPI_DEBUG 5 -host faure1 -n 6 ./hello

mpirun -genv I_MPI_DEBUG 5 -host handel1 -n 12 ./hello : -host elgar1 -n 12 ./hello

mpirun -genv I_MPI_DEBUG 5 -host faure1 -n 6 ./hello : -host mirella1 -n 6 ./hello : -host elgar1 -n 6 ./hello : -host handel1 -n 6 ./hello

mpirun -genv I_MPI_DEBUG 5 -host faure1 -n 6 ./hello : -host mirella1 -n 8 ./hello : -host elgar1 -n 18 ./hello : -host handel1 -n 18 ./hello



Thanks & Regards

Shivani


ALaza1
Novice

Hi Shivani,

pinning_tests.txt contains the output for the runs you requested here.

 

art

ALaza1
Novice

Any update?

I just managed to complete a login using Intel's new stuff!

art

ShivaniK_Intel
Moderator

Hi,


We are working on it and will get back to you.


Thanks & Regards

Shivani


ShivaniK_Intel
Moderator

Hi,


The issue has been escalated to the development team; we will update you soon.


Thanks & Regards

Shivani


ShivaniK_Intel
Moderator

Hi,


Thanks for your patience.


Could you please try setting I_MPI_PLATFORM=auto and let us know if you face similar issues?
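For example (the hello binary and host names reuse names from earlier in this thread; the rank split is illustrative):

export I_MPI_PLATFORM=auto
mpirun -genv I_MPI_DEBUG 5 -host handel1 -n 9 ./hello : -host mirella1 -n 8 ./hello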


If your issue still persists, please provide us with the output log.


Thanks & Regards

Shivani



ShivaniK_Intel
Moderator

Hi,


As we did not hear back from you, could you please respond to my previous post?


Thanks & Regards

Shivani


ShivaniK_Intel
Moderator

Hi,


I have not heard back from you, so this thread will no longer be monitored by Intel. If you need further assistance, please post a new question.


Thanks & Regards

Shivani

