Intel® MPI Library

MPI Broadcast lossless?

jimmy82
Novice
936 Views
Hi,

I would like to know whether the broadcast feature in MPI is 100% lossless. Do we need to handle cases where it is not lossless?

Thanks.
9 Replies
Gergana_S_Intel
Employee

Hi jimmy,

Are you referring to the new fault-tolerant behavior in Intel MPI Library 4.0? The FT support is limited and does not cover collectives. Feel free to check out the Reference Manual for details on provided functionality.

Regards,
~Gergana

jimmy82
Novice
I am not referring to fault tolerance. I need to check whether, when doing a broadcast, it is guaranteed that all recipients will receive the broadcast data.
Dmitry_K_Intel2
Employee

Intel MPI Library works according to the MPI standard: completion of MPI_Bcast on the root only means that the root has finished sending, while completion on a non-root rank guarantees that the rank has received all the data (without losses). There are no additional confirmations sent back to the root.

Regards!
Dmitry
Jimmy821
Beginner
Is it possible to force MPI_Bcast to perform a multi-cast?

I notice that the time to broadcast 200 MB of data increases significantly as the number of nodes grows.
Broadcasting to 1 node takes < 2 s to complete.
Broadcasting to 2 nodes takes < 6 s to complete.
Broadcasting to 3 nodes takes < 10 s to complete.

Forcing a multicast would allow me to finish transmitting to all nodes in the shortest time.
Dmitry_K_Intel2
Employee
All collective operations are ultimately implemented as point-to-point (pt2pt) communication using different algorithms. A true multi-cast would have to be supported in hardware.
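To illustrate how a broadcast can be decomposed into point-to-point sends, here is a minimal sketch of a binomial-tree schedule. It is one common textbook formulation and only an assumption about the general shape of such algorithms, not Intel MPI's actual implementation; the function name `binomial_bcast_schedule` is hypothetical.

```python
def binomial_bcast_schedule(num_ranks, root=0):
    """Return a list of rounds; each round is a list of (sender, receiver) pairs.

    Illustrative only: in every round, each rank that already holds the data
    sends it to one rank that does not, so the number of holders doubles.
    """
    rounds = []
    step = 1
    while step < num_ranks:
        pairs = []
        # Relative ranks 0..step-1 already hold the data at this point.
        for rel in range(step):
            peer = rel + step
            if peer < num_ranks:
                sender = (rel + root) % num_ranks
                receiver = (peer + root) % num_ranks
                pairs.append((sender, receiver))
        rounds.append(pairs)
        step *= 2
    return rounds


if __name__ == "__main__":
    for number, pairs in enumerate(binomial_bcast_schedule(8), start=1):
        print("round", number, pairs)
```

In this sketch every non-root rank receives the data exactly once, and the number of rounds grows with the logarithm of the rank count rather than linearly. Other algorithms (ring, recursive doubling) trade rounds against per-message size differently, which is why the choice of algorithm can matter for large messages.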

There are 2 options you could try:
1. Experiment with different algorithms using the I_MPI_ADJUST_BCAST environment variable (see the Reference Manual).
2. The OFA fabric in the Intel MPI Library supports a multi-rail feature. Set I_MPI_FABRICS=shm:ofa, I_MPI_OFA_NUM_ADAPTERS= e.g. 2 (1 by default), I_MPI_OFA_NUM_PORTS=. If your nodes have more than one interconnect (or multi-port interconnects), you can try this feature.

Regards!
Dmitry

jimmy82
Novice
How do you invoke these? I tried using a configuration file and invoking it with mpiexec.exe, but was unsuccessful.
Dmitry_K_Intel2
Employee
Jimmy,

Are you working on Windows?
The OFA module is not supported on the Windows platform, sorry, and you should not expect it in the near future. This means that you cannot use the multi-rail feature either.

Regards!
Dmitry
jimmy82
Novice
I am referring to the I_MPI_ADJUST_BCAST parameter. How do I set this parameter? Can I have some examples?

Thanks.
Dmitry_K_Intel2
Employee
>I am referring to the I_MPI_ADJUST_BCAST parameter. How do I set this parameter? Can I have some examples?

Yeah, sure:
-genv I_MPI_ADJUST_BCAST '1:4-16;2:17-128;3:129-4096;7:4097-4000000'
This means that algorithm 1 (Binomial) will be used for messages from 4 to 16 bytes long, algorithm 2 (Recursive doubling) for messages from 17 to 128 bytes long, algorithm 3 (Ring) for messages from 129 to 4096 bytes long, and algorithm 7 (Shumilin's) for larger messages, from 4097 to 4000000 bytes.
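To make the syntax of that value easier to read, here is a small sketch that parses the same string and reports which algorithm id covers a given message size. It only illustrates the `<algorithm id>:<min>-<max>` format described above; the helper names `parse_adjust_bcast` and `select_algorithm` are hypothetical, and this is not how the library itself processes the variable.

```python
def parse_adjust_bcast(value):
    """Parse '<algid>:<min>-<max>;...' into a list of (algid, min, max) tuples."""
    rules = []
    for item in value.strip("'\"").split(";"):
        algid, size_range = item.split(":")
        low, high = size_range.split("-")
        rules.append((int(algid), int(low), int(high)))
    return rules


def select_algorithm(rules, message_bytes):
    """Return the id of the first rule whose range covers message_bytes, else None."""
    for algid, low, high in rules:
        if low <= message_bytes <= high:
            return algid
    return None


# Names as given in the post above; other ids are listed in the Reference Manual.
ALGORITHM_NAMES = {1: "Binomial", 2: "Recursive doubling", 3: "Ring", 7: "Shumilin's"}

if __name__ == "__main__":
    rules = parse_adjust_bcast("'1:4-16;2:17-128;3:129-4096;7:4097-4000000'")
    for size in (8, 100, 4096, 1000000, 200000000):
        algid = select_algorithm(rules, size)
        print(size, ALGORITHM_NAMES.get(algid, "no rule matches"))
```

Note that a 200 MB message (200000000 bytes) falls outside every range in this particular example, so none of the listed rules would cover it; the upper bound of the last range would need to be raised to apply algorithm 7 to a broadcast of that size. What the library does for sizes not covered by any rule should be checked in the Reference Manual.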

BTW: Intel MPI library doesn't support multi-cast communication.

Regards!
Dmitry