Intel® MPI Library
Get help with building, analyzing, optimizing, and scaling high-performance computing (HPC) applications.
2275 Discussions

Using InfiniBand network fabrics to allocate globally shared memory for processes on different nodes

ArthurRatz
Beginner
1,264 Views

Dear Colleagues,

My MPI program implements globally shared memory for processes on multiple nodes (hosts) using the MPI_Win_allocate_shared and MPI_Comm_split_type function calls. Unfortunately, the allocated memory address space is not actually shared between processes on different nodes. I'm wondering what would happen if I ran my MPI program on a cluster with an InfiniBand network and changed the network fabrics to I_MPI_FABRICS=shm:dapl or something like that. Could this be a solution to the problem?

Thanks in advance.

Cheers, Arthur.

0 Points
3 Replies
Mark_L_Intel
Employee
1,264 Views

The short answer is no. In your other post I pointed to articles describing the implementation of the PGAS model using MPI-3 RMA. I'd also recommend first looking at already available PGAS models such as OpenSHMEM before trying to implement a PGAS model from scratch.

The I_MPI_FABRICS=shm:dapl setting relies on the "normal" MPI-2 approach (MPI_Send/MPI_Recv, etc.) rather than on MPI-3 shared memory. Obviously, this setting relies on the IB and DAPL fabric between the nodes. I guess the confusion comes from the "shm" part. Indeed, this setting makes the MPI library use the so-called shared-memory fabric on the node. All that the shared-memory fabric means here is that the MPI library applies certain internal optimizations to make MPI calls more efficient for ranks residing on the same node (by the way, sometimes using dapl even on the same node may produce better performance). The MPI-3 shared-memory approach, on the other hand, gives the developer truly shared memory that can be accessed with direct load/store operations, exactly like threading in this sense.

Cheers,

Mark   

0 Points
ArthurRatz
Beginner
1,264 Views

Thanks for the clarification, Mark.

0 Points
ArthurRatz
Beginner
1,264 Views

Is there a way to use MPI-3 shared memory to allocate a buffer in a global address space that can be shared among processes on different nodes?

0 Points
Reply