Intel® MPI Library

Intel MPI 2021.17.1: performance slowdown with additional threads

VladDE

Hi,
I have the following issue with Intel MPI 2021.17.1. Could someone explain what is wrong?

There is a simple example:
Leader: sends a message to every worker using MPI_Send and waits for the answers using MPI_Probe and MPI_Recv.
Worker: receives the request from the leader and sends an answer.
With 50 ranks on the same host, this takes about 0.4 seconds.

The second example is the same, but before the messages are sent and received, an additional thread is started on both the leader and the workers. The thread has its own communicator, created with MPI_Comm_dup, and just waits for a specific message using MPI_Probe and then MPI_Recv.
MPI is initialized with MPI::Init_thread(MPI_THREAD_MULTIPLE).
The same messaging then takes 24 seconds!
After adding one more thread (2 additional threads in total), it takes 35 seconds!

Even after setting the MPI_Info key "thread_id" to a different value for every additional thread and for the main thread, as described in
https://www.intel.com/content/www/us/en/docs/mpi-library/developer-guide-linux/2021-6/mpi-thread-split-programming-model.html,
and starting the application with I_MPI_THREAD_SPLIT=1 I_MPI_THREAD_RUNTIME=generic I_MPI_THREAD_MAX=3,
the time is only slightly better: 14 seconds with 1 additional thread, which is still not comparable to the run without threads.

The debug output shows that every rank has 3 threads:
2025-12-10 10:03:39 : [0] MPI startup(): Rank Thread id Pin nic
2025-12-10 10:03:39 : [0] MPI startup(): 0 0 enp24s0f0
2025-12-10 10:03:39 : [0] MPI startup(): 0 1 enp24s0f0
2025-12-10 10:03:39 : [0] MPI startup(): 0 2 enp24s0f0
2025-12-10 10:03:39 : [0] MPI startup(): 1 0 enp24s0f0
2025-12-10 10:03:39 : [0] MPI startup(): 1 1 enp24s0f0
2025-12-10 10:03:39 : [0] MPI startup(): 1 2 enp24s0f0
2025-12-10 10:03:39 : [0] MPI startup(): 2 0 enp24s0f0
2025-12-10 10:03:39 : [0] MPI startup(): 2 1 enp24s0f0
2025-12-10 10:03:39 : [0] MPI startup(): 2 2 enp24s0f0

....

2025-12-10 10:03:39 : [0] MPI startup(): THREAD_SPLIT mode is switched on, 3 endpoints in use

...

What could be the reason for such a slowdown?

Thanks
