Hi,
Thanks for posting in Intel communities!
To assist you more effectively, could you kindly provide the following details:
Operating System (OS) Details
Intel MPI version
Output of the "lscpu" command
Hardware Details
Detailed Steps for Recreating the Scenario
Interconnect Details
Your cooperation in furnishing this information will greatly aid in addressing your concerns. Thank you in advance!
Regards,
Veena
Hi,
Sorry for the inconvenience caused.
Intel® MPI currently supports only PMI-1 and PMI-2, without support for PMIx. For optimal scalability, it is strongly recommended to configure this MPI implementation to use Slurm's PMI-2, as it offers superior scalability compared to PMI-1. While PMI-1 is still available, it is advised to transition to PMI-2, considering that PMI-1 may be deprecated in the near future. Your consideration of this recommendation is highly appreciated.
Regards,
Veena
Thank you for the response. It is partly an answer, but there are some points you do not mention. A key one is that the SLURM documents I mentioned (and there are many similar) all say to use:
I_MPI_PMI_LIBRARY=/path/to/slurm/lib/libpmi2.so
However, with Slurm as the default, in many cases this leads to
MPI startup(): Warning: I_MPI_PMI_LIBRARY will be ignored since the hydra process manager was found
There are two other environment variables which might be relevant:
SLURM_MPI_TYPE=pmi2
I_MPI_PMI=pmi2
To date I see no difference using these. Can you please clarify what is appropriate with Intel MPI, since currently I cannot find anything about how to use PMI2 in the available Intel documentation, and the information in the Slurm documentation out there appears to be incorrect.
--
N.B., This is for Wien2k, which is the standard benchmark code for density functional theory calculations, e.g. https://doi.org/10.1038/s42254-023-00655-3. This code does not just use a single mpirun (or srun); it is more intelligent (faster) and dispatches multiple MPI tasks to different nodes/cores. Therefore oversimple answers, alas, are less useful.
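For scripted runs, the warning quoted above can be detected automatically rather than eyeballed. A minimal sketch (the function name and log file path are arbitrary choices, not from any Intel tool):

```shell
# check_pmi_log: report whether an Intel MPI startup log shows hydra
# ignoring I_MPI_PMI_LIBRARY (i.e. Slurm's PMI-2 was NOT picked up).
check_pmi_log() {
    if grep -q 'I_MPI_PMI_LIBRARY will be ignored' "$1"; then
        echo "pmi-ignored"
    else
        echo "pmi-ok"
    fi
}

# Example against the exact warning line quoted in this thread:
printf '%s\n' 'MPI startup(): Warning: I_MPI_PMI_LIBRARY will be ignored since the hydra process manager was found' > /tmp/startup.log
check_pmi_log /tmp/startup.log   # prints "pmi-ignored"
```

In a real job one would capture the launch output with `I_MPI_DEBUG` set (e.g. `mpirun ... 2>&1 | tee startup.log`) and run the check on that log.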
This is one of the ways to use PMI-2:
$ salloc -N10 --exclusive
$ export I_MPI_PMI_LIBRARY=/path/to/slurm/lib/libpmi2.so
$ mpirun -np <num_procs> user_app.bin
Please follow the link for more information:
https://slurm.schedmd.com/mpi_guide.html#intel_mpi
Do let me know if you face any issues.
Sorry, this is very incorrect; please see the prior information about the argument being ignored. The Slurm info is not correct.
Also, --exclusive is not an appropriate suggestion; that has too many other consequences.
Hi @L__D__Marks, you are right that the Slurm information in the Intel documentation is not correct.
The environment variable seems to be correct.
Would it be possible for you to use "srun" instead of "mpirun/mpiexec"?
Unfortunately I have not found any way of using srun directly. The code runs a sequence of (what Slurm calls) job steps. Some are serial and quick; the others are multiple parallel MPI tasks using different nodes. A schematic example would be:
mpirun -np 8 -machinefile host1 &
mpirun -np 8 -machinefile host2 &
...wait for completion then do the next job step.
Similar to https://bugs.schedmd.com/show_bug.cgi?id=11863, it seems that "export SLURM_OVERLAP=1" matters; this appears to be common. (This is passed down through mpiexec.hydra, which uses srun to launch.)
It is not clear to me whether I_MPI_PMI, SLURM_MPI_TYPE or even SLURM_OVERCOMMIT matter.
Unfortunately, currently I cannot switch to the ssh launcher due to some form of misconfiguration where ssh is blocked on some of the nodes. Some sysadmins are trying to sort that out. Hence at the moment I can only test with the srun launcher in mpirun.
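The dispatch pattern described above can be sketched as a small job script. This is a hedged illustration under the stated assumptions: SLURM_OVERLAP is honored by hydra's srun bootstrap, `host1`/`host2` are machinefiles prepared elsewhere, and `./task_a`/`./task_b` are hypothetical application binaries:

```shell
#!/bin/bash
# Inside a Slurm allocation: allow concurrent job steps to share nodes,
# since each mpirun bootstraps its ranks through srun.
export SLURM_OVERLAP=1

# One job step: dispatch two parallel MPI tasks to different node sets.
mpirun -np 8 -machinefile host1 ./task_a &
mpirun -np 8 -machinefile host2 ./task_b &

# Wait for both tasks to finish before starting the next job step.
wait

# ...next job step...
```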
Hi @L__D__Marks
I understand your difficulty in running the application using 'srun'.
Please allow me a few days as I need to discuss this with the development teams to see whether there is any workaround where one could use mpirun with PMI2 instead of 'srun'.
Hi @L__D__Marks
It appears that the only way to use PMI-2 with Slurm is to use srun.
I have the following output from an Intel MPI benchmarking program for your reference:
MPI startup(): Copyright (C) 2003-2023 Intel Corporation. All rights reserved.
[0] MPI startup(): library kind: release
MPI startup(): Warning: I_MPI_PMI_LIBRARY will be ignored since the hydra process manager was found
[0] MPI startup(): libfabric loaded: libfabric.so.1
[0] MPI startup(): libfabric version: 1.18.1-impi
[0] MPI startup(): max number of MPI_Request per vci: 67108864 (pools: 1)
[0] MPI startup(): libfabric provider: tcp
[0] MPI startup(): File "" not found
[0] MPI startup(): Load tuning file: "/opt/intel/oneapi/mpi/2021.11/opt/mpi/etc/tuning_spr_shm-ofi.dat"
[0] MPI startup(): threading: mode: direct
[0] MPI startup(): threading: vcis: 1
[0] MPI startup(): threading: app_threads: -1
[0] MPI startup(): threading: runtime: generic
[0] MPI startup(): threading: progress_threads: 0
[0] MPI startup(): threading: async_progress: 0
[0] MPI startup(): threading: lock_level: global
[0] MPI startup(): tag bits available: 19 (TAG_UB value: 524287)
[0] MPI startup(): source bits available: 20 (Maximal number of rank: 1048575)
[0] MPI startup(): ===== Nic pinning on sdp4578 =====
[0] MPI startup(): Rank Pin nic
[0] MPI startup(): 0 enp1s0
[0] MPI startup(): Rank Pid Node name Pin cpu
[0] MPI startup(): 0 1539366 sdp4578 {0,1,2,...,223}
[0] MPI startup(): 1 184901 sdp5259 {0,1,2,...,223}
[0] MPI startup(): I_MPI_ROOT=/opt/intel/oneapi/mpi/2021.11
[0] MPI startup(): ONEAPI_ROOT=/opt/intel/oneapi
[0] MPI startup(): I_MPI_BIND_WIN_ALLOCATE=localalloc
[0] MPI startup(): I_MPI_HYDRA_BOOTSTRAP_EXEC_EXTRA_ARGS=--external-launcher
[0] MPI startup(): I_MPI_HYDRA_TOPOLIB=hwloc
[0] MPI startup(): I_MPI_RETURN_WIN_MEM_NUMA=0
[0] MPI startup(): I_MPI_INTERNAL_MEM_POLICY=default
[0] MPI startup(): I_MPI_DEBUG=10
[0] MPI startup(): I_MPI_PMI_LIBRARY=/usr/local/lib/libpmi2.so
#----------------------------------------------------------------
# Intel(R) MPI Benchmarks 2021.7, MPI-1 part
#----------------------------------------------------------------
# Date : Wed Jan 17 22:32:47 2024
# Machine : x86_64
# System : Linux
# Release : 5.15.0-86-generic
# Version : #96-Ubuntu SMP Wed Sep 20 08:23:49 UTC 2023
# MPI Version : 3.1
# MPI Thread Environment:
# Calling sequence was:
# IMB-MPI1 allreduce -msglog 2:3
# Minimum message length in bytes: 0
# Maximum message length in bytes: 8
#
# MPI_Datatype : MPI_BYTE
# MPI_Datatype for reductions : MPI_FLOAT
# MPI_Op : MPI_SUM
#
#
# List of Benchmarks to run:
# Allreduce
#----------------------------------------------------------------
# Benchmarking Allreduce
# #processes = 2
#----------------------------------------------------------------
#bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
0 1000 0.02 0.03 0.03
4 1000 48.90 49.16 49.03
8 1000 48.87 48.90 48.88
# All processes entering MPI_Finalize
Hi @L__D__Marks
Could you please also let me know the reason for using PMI-2? You mentioned briefly in your initial post that there is some crash/performance drop while using Intel MPI.
The reason to try PMI2 is that all documentation (Intel's included) says PMI1 is inferior (obsolete). Your printout just confirms what I and others have reported. Some more details please:
1) Are you running under slurm?
2) What launcher are you using?
3) Does your test program report the protocol it is using? Would the line
# IMB-MPI1 allreduce -msglog 2:3
change if PMI2 is being used?
4) Did you set the relevant environment variables:
export SLURM_MPI_TYPE=pmi2
export I_MPI_PMI=pmi2
Sorry, but your last messages don't answer the question. What information did the development team provide? Maybe they should respond (escalation).
N.B., PMIX may also be relevant.
Hi @L__D__Marks
I am running this on a cluster which has
slurm 23.11
oneAPI 2024, which comes with Intel MPI 2021.11.
"IMB-MPI1 allreduce -msglog 2:3" is a benchmark problem available in Intel OneAPI suit to test mpi. You could choose your own mpi program. The environment variables which are in use, as shown in the previous reply
[0] MPI startup(): I_MPI_ROOT=/opt/intel/oneapi/mpi/2021.11
[0] MPI startup(): ONEAPI_ROOT=/opt/intel/oneapi
[0] MPI startup(): I_MPI_BIND_WIN_ALLOCATE=localalloc
[0] MPI startup(): I_MPI_HYDRA_BOOTSTRAP_EXEC_EXTRA_ARGS=--external-launcher
[0] MPI startup(): I_MPI_HYDRA_TOPOLIB=hwloc
[0] MPI startup(): I_MPI_RETURN_WIN_MEM_NUMA=0
[0] MPI startup(): I_MPI_INTERNAL_MEM_POLICY=default
[0] MPI startup(): I_MPI_DEBUG=10
[0] MPI startup(): I_MPI_PMI_LIBRARY=/usr/local/lib/libpmi2.so
The important point which I wanted to make here is that if you want to run using PMI-2, then currently the only option is to use srun:
# Run your application using srun with the PMI-2 interface.
I_MPI_PMI_LIBRARY=<path-to-libpmi2.so>/libpmi2.so srun --mpi=pmi2 ./myprog
For more information please check the following
Please read the prior posts, and do not respond with trivial answers.
Just using srun ./myprog is a novice response, inappropriate for professional hard-core supercomputing. I pointed out that this was inappropriate weeks ago.
Please escalate this to someone who is an expert, will read the prior information (including the fact that the page you suggest I read is wrong), and is knowledgeable. Hopefully they can construct a code which will show what interface is being used.
Escalate please.
@L__D__Marks
This forum is a community forum, not a support forum.
mpirun will always use our internal PMI library; if you want to use a different PMI library, you have to provide the full path and use srun instead of mpirun.
I_MPI_PMI_LIBRARY=/path/to/slurm/lib/libpmi2.so
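Putting the reply above together, a minimal sketch of the srun route (the library location `/usr/lib64/slurm/libpmi2.so` and the binary name `./myprog` are assumptions; substitute your site's paths):

```shell
# Full path to Slurm's PMI-2 client library (site-specific).
export I_MPI_PMI_LIBRARY=/usr/lib64/slurm/libpmi2.so

# Launch with srun rather than mpirun, so the external PMI library
# is honored instead of Intel MPI's internal one.
srun --mpi=pmi2 -n 16 ./myprog
```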
I have same "problem" with SLRUM on intelmpi need pmi2 with mpirun, and good to find out others have same srun problem here.
MPI startup(): Warning: I_MPI_PMI_LIBRARY will be ignored since the hydra process manager was found
My workaround to make my mpi-intel/2021.u11 mpirun work was to build my own UCX 1.15.0, set LD_LIBRARY_PATH to it, and use -env UCX_TLS rc,sm,self.
In a test run with np=192, srun with PMI-2 takes about 26 min wall time and mpirun + UCX 1.15.0 about 23 min, so they are close enough.
For me, this works for np up to around 400 for my CFD type of simulation. After that, Intel MPI dies. Anything above np=400, I use an MPI+threads hybrid approach to work around this problem. Our system is AMD Genoa from Cray. On the other hand, OpenMPI seems to do just fine with mpirun. Our code on Genoa actually runs faster with MPI+threads. So, in the very early stage, we were using mpirun to perform the core binding / thread pinning. HPC systems on our different sites run different types of job schedulers. Maybe srun can do it, but the same mpirun command is used by both SLURM and LSF (which is easier for me).
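The UCX workaround described above might look like the following. This is a sketch under assumptions: the install prefix `$HOME/ucx-1.15.0` and the binary name `./cfd_solver` are placeholders, not values from the post:

```shell
# Put a self-built UCX 1.15.0 ahead of the one bundled with Intel MPI.
export LD_LIBRARY_PATH=$HOME/ucx-1.15.0/lib:$LD_LIBRARY_PATH

# Restrict UCX to reliable-connected, shared-memory, and self transports,
# then launch with mpirun as usual.
mpirun -np 192 -env UCX_TLS rc,sm,self ./cfd_solver
```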
@Terrence_at_Houston
Sorry, but I really do not understand what your question is.
The initial question was whether PMI2 is more thread-safe than PMI1.
To use PMI2 or PMIx, srun has to be used together with setting the path; mpirun will just ignore this, hence prints out the warning.
If you are building your own UCX and setting some UCX env variables, that has nothing to do with PMI/srun/mpirun.
UCX is always used with Infiniband networks / the mlx provider and UCX environment variables are always used by UCX.