I have an MPI job running on Intel Xeon Gold 6142 nodes under Slurm. It uses 7680 cores in total.
MPI Version: intelmpi-2017.2.174
TEST A: 256 nodes, using 30 of the 32 cores on each node. The job takes 7245.24463 seconds.
TEST B: 240 nodes, using all 32 cores on each node. The job takes 11760.76172 seconds.
I am confused about why TEST B is so much slower than TEST A.
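To put numbers on the gap, here is a small Python sketch that uses only the figures from the two tests above. It assumes one MPI rank per core in use, which I have not stated explicitly elsewhere; the names `tests` and `summarize` are made up for this illustration. It only quantifies the difference and does not explain it.

```python
# Figures copied from TEST A and TEST B above.
# Assumption: one MPI rank per core in use.
CORES_PER_NODE = 32

tests = {
    "A": {"nodes": 256, "ranks_per_node": 30, "seconds": 7245.24463},
    "B": {"nodes": 240, "ranks_per_node": 32, "seconds": 11760.76172},
}


def summarize(t):
    """Derive totals for one test configuration."""
    return {
        "total_ranks": t["nodes"] * t["ranks_per_node"],
        "idle_cores_per_node": CORES_PER_NODE - t["ranks_per_node"],
        "node_hours": t["nodes"] * t["seconds"] / 3600.0,
    }


summary = {name: summarize(t) for name, t in tests.items()}

# How many times longer TEST B runs than TEST A.
slowdown = tests["B"]["seconds"] / tests["A"]["seconds"]

for name, s in summary.items():
    print(
        f"TEST {name}: {s['total_ranks']} ranks, "
        f"{s['idle_cores_per_node']} idle cores/node, "
        f"{s['node_hours']:.1f} node-hours"
    )
print(f"TEST B / TEST A wall time: {slowdown:.2f}x")
```

Both layouts use the same 7680 ranks, so the only differences are the node count and whether two cores per node are left free.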
=========
InfiniBand version:
perfquery BUILD VERSION: 1.6.7.MLNX20171022.b127f52 Build date: Oct 29 2017 13:03:53
I found that the InfiniBand throughput in TEST B is much lower than in TEST A.
The application I am testing is grapes_globale.
time_step_max = 5760