Hello!
I have been trying to debug MPI_Win_lock because of high MPI imbalance.
In the attached example, one can see that in the end all MPI ranks sleep on lock3 until the root MPI rank has finished its work. For example, on my machine with 16 MPI ranks the output looks like:
hop MPI: 0
no-hop MPI: 0
lock3, MPI: 14 delay: 8455487
lock3, MPI: 15 delay: 8469940
lock3, MPI: 0 delay: 15
Done!
lock3, MPI: 13 delay: 8427417
lock3, MPI: 6 delay: 7120527
lock3, MPI: 7 delay: 7459421
lock3, MPI: 8 delay: 7716278
lock3, MPI: 9 delay: 7959353
lock3, MPI: 10 delay: 8112826
lock3, MPI: 11 delay: 8249979
Done!
lock3, MPI: 1 delay: 1665868
Done!
lock3, MPI: 2 delay: 4028945
Done!
lock3, MPI: 3 delay: 5681485
Done!
lock3, MPI: 4 delay: 6210385
Done!
lock3, MPI: 5 delay: 6757722
The delay is reported with microsecond resolution, so the lock time ranges from about 1.6 to 8 seconds, which is awful! To me it looks as if the other MPI ranks sleep inside the lock until the root MPI rank has finished its work.
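For reference, the pattern boils down to something like the following (a condensed sketch only, not the attached lock.f90 itself; the window contents and the root's artificial workload here are just placeholders):

! Condensed sketch of the passive-target pattern; details differ from the
! attached lock.f90 (window contents and the root workload are illustrative).
program lock_sketch
  use mpi
  implicit none
  integer :: ierr, rank, nprocs, win, disp_unit
  integer(kind=MPI_ADDRESS_KIND) :: winsize, target_disp
  double precision :: buf(1), t0, t1
  double precision, allocatable :: winbuf(:)

  call MPI_Init(ierr)
  call MPI_Comm_rank(MPI_COMM_WORLD, rank, ierr)
  call MPI_Comm_size(MPI_COMM_WORLD, nprocs, ierr)

  allocate(winbuf(1))
  winbuf = dble(rank)
  disp_unit = 8
  winsize = 8
  call MPI_Win_create(winbuf, winsize, disp_unit, MPI_INFO_NULL, &
                      MPI_COMM_WORLD, win, ierr)

  call MPI_Barrier(MPI_COMM_WORLD, ierr)

  ! root is busy in user code and makes no MPI calls for a while
  if (rank == 0) call heavy_work()

  ! every rank reads one value from rank 0 under a shared lock ("lock3")
  t0 = MPI_Wtime()
  target_disp = 0
  call MPI_Win_lock(MPI_LOCK_SHARED, 0, 0, win, ierr)
  call MPI_Get(buf, 1, MPI_DOUBLE_PRECISION, 0, target_disp, 1, &
               MPI_DOUBLE_PRECISION, win, ierr)
  call MPI_Win_unlock(0, win, ierr)
  t1 = MPI_Wtime()
  print '(a,i0,a,i0)', 'lock3, MPI: ', rank, ' delay: ', int((t1 - t0) * 1.0d6)

  call MPI_Win_free(win, ierr)
  call MPI_Finalize(ierr)

contains

  subroutine heavy_work()
    ! stand-in for the long computation the root does in the real example
    double precision :: s
    integer :: i
    s = 0.0d0
    do i = 1, 500000000
       s = s + sqrt(dble(i))
    end do
    if (s < 0.0d0) print *, s   ! keep the loop from being optimized away
  end subroutine heavy_work

end program lock_sketch

With this pattern the non-root ranks appear to be blocked in the lock/unlock until rank 0 re-enters MPI, which is exactly the behavior I am describing.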
I have tried using I_MPI_ASYNC_PROGRESS=1, but it leads to segfaults (which is a problem in itself).
One of the threads gives this backtrace:
Image PC Routine Line Source
libpthread-2.28.s 00007F118DEFACF0 Unknown Unknown Unknown
libmpi.so.12.0.0 00007F118E82BCE5 Unknown Unknown Unknown
libmpi.so.12.0.0 00007F118E76FD37 Unknown Unknown Unknown
libmpi.so.12.0.0 00007F118E74C94E MPI_Win_lock Unknown Unknown
libmpifort.so.12. 00007F1198455C61 pmpi_win_lock_ Unknown Unknown
a.out 0000000000405A12 Unknown Unknown Unknown
a.out 000000000040674C Unknown Unknown Unknown
a.out 00000000004052ED Unknown Unknown Unknown
libc-2.28.so 00007F118D3CFD85 __libc_start_main Unknown Unknown
a.out 000000000040520E Unknown Unknown Unknown
Another thread gives:
Image PC Routine Line Source
libpthread-2.28.s 00007FEBA38E8CF0 Unknown Unknown Unknown
libuct.so.0.0.0 00007FEB9FBE6F69 Unknown Unknown Unknown
libucp.so.0.0.0 00007FEB9FE4C82A ucp_worker_progre Unknown Unknown
libmlx-fi.so 00007FEBA00D7461 Unknown Unknown Unknown
libmlx-fi.so 00007FEBA00D5715 Unknown Unknown Unknown
libmlx-fi.so 00007FEBA00F2030 Unknown Unknown Unknown
libmpi.so.12.0.0 00007FEBA4504413 Unknown Unknown Unknown
libmpi.so.12.0.0 00007FEBA4219DDA Unknown Unknown Unknown
libmpi.so.12.0.0 00007FEBA415DD37 Unknown Unknown Unknown
libmpi.so.12.0.0 00007FEBA413A94E MPI_Win_lock Unknown Unknown
libmpifort.so.12. 00007FEBADE43C61 pmpi_win_lock_ Unknown Unknown
a.out 0000000000405A12 Unknown Unknown Unknown
a.out 000000000040674C Unknown Unknown Unknown
a.out 00000000004052ED Unknown Unknown Unknown
libc-2.28.so 00007FEBA2DBDD85 __libc_start_main Unknown Unknown
a.out 000000000040520E Unknown Unknown Unknown
Compiling the example:
mpiifx lock.f90 -cpp -O3
Running without I_MPI_ASYNC_PROGRESS (MPI_Win_lock is slow):
mpirun -n 16 ./a.out
Running with I_MPI_ASYNC_PROGRESS (segfault):
I_MPI_ASYNC_PROGRESS=1 mpirun -n 16 ./a.out
The code works fine with Open MPI: the locks are fast there, so I did not need to try any options like I_MPI_ASYNC_PROGRESS.
I used Intel MPI 2021.13 and IFX 2024.2.1 (the latest available combination).
Igor
Hi,
When we run the program, its memory usage is very large, so the program ends up using swap space (MPI_Win_lock works on physical memory). When we test it on a system without swap, an out-of-memory error occurs instead. When the program does use swap, performance becomes very slow. If you reduce the amount of memory used in the main program, the issue should not occur.
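If you want to verify this on your side, one option is to have each rank print its VmRSS and VmSwap values from /proc/self/status around the locking phase (a small Linux-only helper sketch, not part of the original example):

  ! Linux-only diagnostic: print resident and swapped memory for this rank
  subroutine report_memory(rank)
    implicit none
    integer, intent(in) :: rank
    character(len=256) :: line
    integer :: u, ios
    open(newunit=u, file='/proc/self/status', status='old', action='read', iostat=ios)
    if (ios /= 0) return
    do
       read(u, '(a)', iostat=ios) line
       if (ios /= 0) exit
       if (index(line, 'VmRSS') == 1 .or. index(line, 'VmSwap') == 1) then
          print '(a,i0,2a)', 'MPI ', rank, ': ', trim(line)
       end if
    end do
    close(u)
  end subroutine report_memory

If VmSwap is non-zero on the slow ranks, that would point to swapping as the cause.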
Thanks.
Hi!
I have drastically reduced the memory allocations of the example workload, so it should not take more than 10 MB per MPI rank.
Please find it in the attachment.
I have also run it with I_MPI_DEBUG=10, in case that helps a bit:
[0] MPI startup(): Intel(R) MPI Library, Version 2021.13 Build 20240701 (id: 179630a)
[0] MPI startup(): Copyright (C) 2003-2024 Intel Corporation. All rights reserved.
[0] MPI startup(): library kind: release
[0] MPI startup(): libfabric loaded: libfabric.so.1
[0] MPI startup(): libfabric version: 1.20.1-impi
[0] MPI startup(): max number of MPI_Request per vci: 67108864 (pools: 1)
[0] MPI startup(): libfabric provider: mlx
[0] MPI startup(): Load tuning file: "/scratch/software/packages/intel/mpi/2021.13/opt/mpi/etc/tuning_generic_shm-ofi_mlx_hcoll.dat"
[0] MPI startup(): threading: mode: direct
[0] MPI startup(): threading: vcis: 1
[0] MPI startup(): threading: app_threads: -1
[0] MPI startup(): threading: runtime: generic
[0] MPI startup(): threading: progress_threads: 0
[0] MPI startup(): threading: async_progress: 0
[0] MPI startup(): threading: lock_level: global
[0] MPI startup(): tag bits available: 20 (TAG_UB value: 1048575)
[0] MPI startup(): source bits available: 21 (Maximal number of rank: 2097151)
[0] MPI startup(): Number of NICs: 1
[0] MPI startup(): ===== NIC pinning on vn01 =====
[0] MPI startup(): Rank Thread id Pin nic
[0] MPI startup(): 0 0 mlx
[0] MPI startup(): 1 0 mlx
[0] MPI startup(): 2 0 mlx
[0] MPI startup(): 3 0 mlx
[0] MPI startup(): 4 0 mlx
[0] MPI startup(): 5 0 mlx
[0] MPI startup(): 6 0 mlx
[0] MPI startup(): 7 0 mlx
[0] MPI startup(): 8 0 mlx
[0] MPI startup(): 9 0 mlx
[0] MPI startup(): 10 0 mlx
[0] MPI startup(): 11 0 mlx
[0] MPI startup(): 12 0 mlx
[0] MPI startup(): 13 0 mlx
[0] MPI startup(): 14 0 mlx
[0] MPI startup(): 15 0 mlx
[0] MPI startup(): ===== CPU pinning =====
[0] MPI startup(): Rank Pid Node name Pin cpu
[0] MPI startup(): 0 763919 vn01 {0,1,2,48,49,50}
[0] MPI startup(): 1 763921 vn01 {3,4,5,51,52,53}
[0] MPI startup(): 2 763922 vn01 {6,7,8,54,55,56}
[0] MPI startup(): 3 763923 vn01 {9,10,11,57,58,59}
[0] MPI startup(): 4 763924 vn01 {12,13,14,60,61,62}
[0] MPI startup(): 5 763925 vn01 {15,16,17,63,64,65}
[0] MPI startup(): 6 763926 vn01 {18,19,20,66,67,68}
[0] MPI startup(): 7 763927 vn01 {21,22,23,69,70,71}
[0] MPI startup(): 8 763928 vn01 {24,25,26,72,73,74}
[0] MPI startup(): 9 763931 vn01 {27,28,29,75,76,77}
[0] MPI startup(): 10 763932 vn01 {30,31,32,78,79,80}
[0] MPI startup(): 11 763934 vn01 {33,34,35,81,82,83}
[0] MPI startup(): 12 763935 vn01 {36,37,38,84,85,86}
[0] MPI startup(): 13 763936 vn01 {39,40,41,87,88,89}
[0] MPI startup(): 14 763937 vn01 {42,43,44,90,91,92}
[0] MPI startup(): 15 763938 vn01 {45,46,47,93,94,95}
[0] MPI startup(): I_MPI_ROOT=/scratch/software/packages/intel/mpi/2021.13
[0] MPI startup(): I_MPI_MPIRUN=mpirun
[0] MPI startup(): I_MPI_BIND_WIN_ALLOCATE=localalloc
[0] MPI startup(): I_MPI_HYDRA_TOPOLIB=hwloc
[0] MPI startup(): I_MPI_RETURN_WIN_MEM_NUMA=0
[0] MPI startup(): I_MPI_INTERNAL_MEM_POLICY=default
[0] MPI startup(): I_MPI_DEBUG=10
Hi,
The debug logs do not show any problem. Does the delay still happen with the small-memory program?
Hi,
Would you like to try configuring I_MPI_PMI_LIBRARY?
For example, if you use the Slurm scheduler, you can set:
export I_MPI_PMI_LIBRARY=/<path to slurm>/lib/libpmi2.so
Thanks
Hi!
I could only find libpmix.so on this machine, so that is what I tried. However, it says that I_MPI_PMI_LIBRARY will be ignored:
$ I_MPI_PMI_LIBRARY=/usr/lib64/libpmix.so I_MPI_PMI=pmix mpirun -n 16 ./a.out
MPI startup(): Warning: I_MPI_PMI_LIBRARY will be ignored since the hydra process manager was found
Do you have other ideas?
Hi,
If you use Slurm, please refer to the page below:
https://slurm.schedmd.com/mpi_guide.html#intel_mpi
Thanks.