IMB benchmark becomes extremely slow at 16 MPI processes

seongyun_k_ — Wed, 03 Feb 2016 03:18:32 GMT

Hi, folks

I am running the Intel MPI Benchmarks (IMB) to test the connectivity and bandwidth of my cluster (CentOS 6.4, Mellanox InfiniBand).

CMD: $ mpiexec.hydra -genv I_MPI_DEBUG 5 -genv I_MPI_FABIRCS dapl -machinefile machines -ppn 1 -n (# of Procs) IMB-RMA

When (# of Procs) is between 2 and 15, performance is close to the hardware limit.

However, when I set (# of Procs) to 16, performance becomes about 3000x slower!

Even when I change the topology of the MPI processes, the problem remains.

Does anybody know of a related issue?

Seongyun, I recommend to

Gennady_F_Intel — Thu, 04 Feb 2016 05:46:40 GMT