Suppose I have an MPI program that sends and receives within the same rank:
#include <iostream>
#include <mpi.h>

int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);

    // Non-blocking send to our own rank (rank 0).
    int i1 = 2;
    MPI_Request req1;
    MPI_Isend(&i1, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &req1);

    // Wait for the send to complete before the matching receive is posted.
    MPI_Status status1;
    MPI_Wait(&req1, &status1);
    std::cout << "sent\n";

    // Blocking receive of the message we just sent to ourselves.
    int i2;
    MPI_Status status2;
    MPI_Recv(&i2, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &status2);
    std::cout << i2 << "\n";

    MPI_Finalize();
}
With Open MPI this runs to completion, but with Intel MPI it hangs after printing "sent", so this might be a bug in Intel MPI.
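For reference, here is a rearranged sketch of the same exchange (not part of the original report) in which the receive is posted before waiting on the send, so completion does not depend on the library buffering a message sent to the same rank:

#include <iostream>
#include <mpi.h>

int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);
    int i1 = 2;
    int i2 = 0;
    MPI_Request reqs[2];
    // Post the receive first so the send can complete even if the
    // library does not buffer messages sent to the same rank.
    MPI_Irecv(&i2, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &reqs[0]);
    MPI_Isend(&i1, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &reqs[1]);
    MPI_Waitall(2, reqs, MPI_STATUSES_IGNORE);
    std::cout << i2 << "\n";
    MPI_Finalize();
}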
Open MPI version:
$ mpirun --version
mpirun (Open MPI) 4.1.4
Report bugs to http://www.open-mpi.org/community/help/
Intel MPI version:
$ mpirun --version
Intel(R) MPI Library for Linux* OS, Version 2021.11 Build 20231005 (id: 74c4a23)
Copyright 2003-2023, Intel Corporation.
Additional information:
$ ldd test
linux-vdso.so.1 (0x00007ffe1bdf2000)
libc++abi.so.1 => /lib/x86_64-linux-gnu/libc++abi.so.1 (0x00007febf62f2000)
libmpi.so.12 => /opt/intel/oneapi/mpi/2021.11/lib/libmpi.so.12 (0x00007febf4600000)
libc++.so.1 => /lib/x86_64-linux-gnu/libc++.so.1 (0x00007febf61ec000)
libunwind.so.1 => /lib/x86_64-linux-gnu/libunwind.so.1 (0x00007febf61dd000)
libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007febf4521000)
libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007febf61bd000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007febf4340000)
librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007febf61b8000)
libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007febf61b1000)
/lib64/ld-linux-x86-64.so.2 (0x00007febf635a000)
another example:
#include <iostream>
#include <mpi.h>

int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);

    // Non-blocking send to our own rank (rank 0), then wait for it.
    int i1 = 2;
    MPI_Request req1;
    MPI_Isend(&i1, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &req1);
    MPI_Status status1;
    MPI_Wait(&req1, &status1);
    std::cout << "sent\n";

    // Poll for the message; i2 doubles as the MPI_Iprobe flag here.
    int i2;
    MPI_Status status2;
    while (true) {
        MPI_Iprobe(0, 0, MPI_COMM_WORLD, &i2, &status2);
        std::cout << i2 << '\n';
        if (i2 != 0) {
            break;
        }
    }
    std::cout << "iprobe success\n";

    MPI_Finalize();
}
When linked against Open MPI, it ends normally.
When linked against Intel MPI, MPI_Iprobe never succeeds, and the flag printed inside the loop is always 0.
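For comparison, here is a sketch (not from the original post) of a polling loop that does not block on the send before the matching receive exists: it tests the send request and probes in the same loop, then actually receives the message once the probe reports it.

#include <iostream>
#include <mpi.h>

int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);
    int i1 = 2;
    MPI_Request req1;
    MPI_Isend(&i1, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &req1);

    // Poll both the send request and the incoming-message probe instead of
    // blocking in MPI_Wait before the matching receive has been posted.
    int found = 0;
    MPI_Status status;
    while (!found) {
        int done = 0;
        MPI_Test(&req1, &done, MPI_STATUS_IGNORE);   // drives progress; harmless once complete
        MPI_Iprobe(0, 0, MPI_COMM_WORLD, &found, &status);
    }

    int i2 = 0;
    MPI_Recv(&i2, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    MPI_Wait(&req1, MPI_STATUS_IGNORE);              // no-op if the send already completed
    std::cout << i2 << "\n";
    MPI_Finalize();
}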
hello,
the output of the first example looks like this:
$ I_MPI_DEBUG=10 mpirun -np 1 executable
[0] MPI startup(): Intel(R) MPI Library, Version 2021.11 Build 20231005 (id: 74c4a23)
[0] MPI startup(): Copyright (C) 2003-2023 Intel Corporation. All rights reserved.
[0] MPI startup(): library kind: release
[0] MPI startup(): libfabric loaded: libfabric.so.1
[0] MPI startup(): libfabric version: 1.18.1-impi
[0] MPI startup(): max number of MPI_Request per vci: 67108864 (pools: 1)
[0] MPI startup(): libfabric provider: tcp
[0] MPI startup(): File "" not found
[0] MPI startup(): Load tuning file: "/opt/intel/oneapi/mpi/2021.11/opt/mpi/etc/tuning_icx_shm-ofi.dat"
[0] MPI startup(): File "/opt/intel/oneapi/mpi/2021.11/opt/mpi/etc/tuning_icx_shm-ofi.dat" not found
[0] MPI startup(): Load tuning file: "/opt/intel/oneapi/mpi/2021.11/opt/mpi/etc//tuning_clx-ap_shm-ofi.dat"
[0] MPI startup(): threading: mode: direct
[0] MPI startup(): threading: vcis: 1
[0] MPI startup(): threading: app_threads: -1
[0] MPI startup(): threading: runtime: generic
[0] MPI startup(): threading: progress_threads: 0
[0] MPI startup(): threading: async_progress: 0
[0] MPI startup(): threading: lock_level: global
[0] MPI startup(): tag bits available: 19 (TAG_UB value: 524287)
[0] MPI startup(): source bits available: 20 (Maximal number of rank: 1048575)
[0] MPI startup(): ===== Nic pinning on X1-Nano =====
[0] MPI startup(): Rank Pin nic
[0] MPI startup(): 0 ve-ArchLinux
[0] MPI startup(): Rank Pid Node name Pin cpu
[0] MPI startup(): 0 143295 X1-Nano {0,1,2,3,4,5,6,7}
[0] MPI startup(): I_MPI_ROOT=/opt/intel/oneapi/mpi/2021.11
[0] MPI startup(): ONEAPI_ROOT=/opt/intel/oneapi
[0] MPI startup(): I_MPI_MPIRUN=mpirun
[0] MPI startup(): I_MPI_BIND_WIN_ALLOCATE=localalloc
[0] MPI startup(): I_MPI_HYDRA_TOPOLIB=hwloc
[0] MPI startup(): I_MPI_RETURN_WIN_MEM_NUMA=-1
[0] MPI startup(): I_MPI_INTERNAL_MEM_POLICY=default
[0] MPI startup(): I_MPI_DEBUG=10
sent
Then it hangs.
Please make sure that you are running in a supported environment.
I'm using Debian 12 (the latest stable Debian release); is this too new?
I tested with both Clang 17 and GCC 12.2.0, and the program always hangs.
Debian 13 is currently not validated/supported.
On our machines it works fine; however, the code itself is unsafe, and after talking to my colleagues, we strongly recommend avoiding this pattern, as it may or may not work the way you want it to.
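One way to express the same self-exchange without relying on the library to buffer the send is MPI_Sendrecv; this is only a minimal sketch, assuming the goal is simply to move one int within rank 0:

#include <iostream>
#include <mpi.h>

int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);
    int i1 = 2;
    int i2 = 0;
    // MPI_Sendrecv behaves as if the send and the receive were issued by
    // two concurrent threads, so a self-exchange cannot deadlock.
    MPI_Sendrecv(&i1, 1, MPI_INT, 0, 0,
                 &i2, 1, MPI_INT, 0, 0,
                 MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    std::cout << i2 << "\n";
    MPI_Finalize();
}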
