Santak_D_
Beginner

Running MPI over heterogeneous InfiniBand Network

Hello,

I have an InfiniBand network set up on which we are testing the performance of FDR cards. The sender has two FDR cards, and each of the two receivers has one. The idea is to run parallel sends from the sender, one on each FDR card, and receive the data at the receivers. We tried MVAPICH, but its developers stated clearly that they don't support such a network.

I was wondering if Intel MPI supports such a network, and if we can do something like:

mpirun -n 2 -hosts Sender,Receiver1 -env MV2_IBA_HCA=mlx4_0 ./exec : -n 2 -hosts Sender,Receiver2 -env MV2_IBA_HCA=mlx4_1 ./exec

where mlx4_0 and mlx4_1 are the IDs of the FDR cards. So I am trying to run ./exec in parallel on different FDR cards to send data to both receivers at the same time.

Is this possible using Intel MPI? If someone has a similar setup, please let me know.

 

Thanks,

Santak

James_T_Intel
Moderator

Hi Santak,

This is not a supported method.  The best suggestion I have is to try using OFED* multiple adapter capability.  To do this, set

I_MPI_FABRICS=shm:ofa

on all of the ranks.  On the ranks on Sender, set

I_MPI_OFA_NUM_ADAPTERS=2

and on the ranks on each receiver, set

I_MPI_OFA_NUM_ADAPTERS=1

I don't know if this will work, and I don't have a system to test it on.  I'm asking our developers for any additional information they may have.
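
As a rough, untested sketch, the per-host settings above might be combined on a single command line along these lines. It assumes one rank per host (the post does not say how many), the host names and ./exec from the question, and the -genv/-env options that appear elsewhere in this thread. The script only prints the command so it can be inspected first.

```shell
#!/bin/sh
# Sketch only: assemble an mpirun command that sets the fabric globally
# and the adapter count per host. One rank per host is an assumption.
exe=./exec

cmd="mpirun -genv I_MPI_FABRICS shm:ofa"
cmd="$cmd -n 1 -host Sender -env I_MPI_OFA_NUM_ADAPTERS 2 $exe"
cmd="$cmd : -n 1 -host Receiver1 -env I_MPI_OFA_NUM_ADAPTERS 1 $exe"
cmd="$cmd : -n 1 -host Receiver2 -env I_MPI_OFA_NUM_ADAPTERS 1 $exe"

# Print the command for inspection; launch it with: eval "$cmd"
echo "$cmd"
```

Whether the OFA fabric actually spreads traffic across both adapters in this layout is exactly what would need to be tested on the real system.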

Sincerely,
James Tullos
Technical Consulting Engineer
Intel® Cluster Tools

James_T_Intel
Moderator

Hi Santak,

Ok, I stand corrected, this is a supported model.  What you'll need to do is set

I_MPI_CHECK_DAPL_PROVIDER_MISMATCH=0

And run

mpirun -n 2 -hosts Sender,Receiver1 -env I_MPI_DAPL_PROVIDER=mlx4_0 ./exec : -n 2 -hosts Sender,Receiver2 -env I_MPI_DAPL_PROVIDER=mlx4_1 ./exec

James.

Santak_D_
Beginner

Thanks, James, for your updates. I was also trying a few combinations and found that this command works:

mpirun -genv I_MPI_DEBUG 5 -genv I_MPI_FABRICS shm:ofa -n 1 -host Sender -env I_MPI_OFA_ADAPTER_NAME mlx4_0 ./exec : -n 1 -host Sender -env I_MPI_OFA_ADAPTER_NAME mlx4_1 ./exec : -n 1 -host Receiver1 -env I_MPI_OFA_ADAPTER_NAME mlx4_0 ./exec : -n 1 -host Receiver2 -env I_MPI_OFA_ADAPTER_NAME mlx4_0 ./exec

And as you mentioned in your comment, Intel MPI doesn't behave exactly like MVAPICH. With "-n 2 -hosts Sender,Receiver1", 2 processes start on Sender and I see no executable running on Receiver1. Note that the command above has a separate colon-delimited section for each process, with its host given explicitly. That is how I was able to run it.
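
For anyone with more nodes, a small sketch of generating those per-process sections from a list of host:adapter pairs is below. The pair list mirrors the hosts and adapter names in this thread; the script itself is illustrative and only prints the resulting command.

```shell
#!/bin/sh
# Sketch: build one colon-separated mpirun section per process from
# "host:adapter" pairs. Hosts and adapter names are those in the thread.
pairs="Sender:mlx4_0 Sender:mlx4_1 Receiver1:mlx4_0 Receiver2:mlx4_0"

cmd="mpirun -genv I_MPI_DEBUG 5 -genv I_MPI_FABRICS shm:ofa"
sep=""
for p in $pairs; do
    host=${p%%:*}   # text before the colon
    hca=${p#*:}     # text after the colon
    cmd="$cmd$sep -n 1 -host $host -env I_MPI_OFA_ADAPTER_NAME $hca ./exec"
    sep=" :"        # every section after the first is preceded by a colon
done

# Print the command for inspection; launch it with: eval "$cmd"
echo "$cmd"
```

Adding or removing a pair changes the process layout without having to retype the whole command line.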

 

James_T_Intel
Moderator

The easier way to get that behavior is to use -ppn 1.  This puts one process per node.
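
A minimal sketch of that form, assuming the host names from the question and one rank on each listed host (the host list variable is only for illustration):

```shell
#!/bin/sh
# Sketch: one process per node via -ppn 1, with the rank count derived
# from the number of comma-separated hosts.
hosts="Sender,Receiver1"
nhosts=$(echo "$hosts" | tr ',' '\n' | wc -l | tr -d ' ')

cmd="mpirun -ppn 1 -n $nhosts -hosts $hosts ./exec"

# Print the command for inspection; launch it with: eval "$cmd"
echo "$cmd"
```

Note that this places a single rank on Sender, so a layout with two Sender ranks bound to different adapters would still need the explicit per-process sections shown earlier in the thread.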
