HPL on Xeon and Xeon Phi

Holger_A_ · 12-05-2017

Hello,

I would like to run Linpack on Broadwell and Knights Landing Xeons at the same time. It runs on each architecture separately, but fails with the following message when I try to use both together:

- The matrix A is randomly generated for each test.
- The following scaled residual check will be computed:
  ||Ax-b||_oo / ( eps * ( || x ||_oo * || A ||_oo + || b ||_oo ) * N )
- The relative machine precision (eps) is taken to be 1.110223e-16
- Computational tests pass if scaled residuals are less than 16.0

Fatal error in MPI_Sendrecv: Message truncated, error stack:
MPI_Sendrecv(259)............: MPI_Sendrecv(sbuf=0x7f80ac848608, scount=18432, MPI_DOUBLE, dest=1, stag=10001, rbuf=0x7f80c7800000, rcount=8051, MPI_DOUBLE, src=1, rtag=10001, comm=0x84000002, status=0x7ffc9c14b3d0) failed
MPID_nem_tmi_handle_rreq(688): Message from rank 1 and tag 10001 truncated; 64408 bytes received but buffer size is 64408 (75032 64408 61)

When I compile HPL myself, it works, but is rather slow. Is there a chance to get the Intel optimized version of HPL running?

Best regards,

Holger

Kazushige_G_Intel · 12-05-2017

Hello,

Please use the same architecture for vertical nodes. MPI_Sendrecv failed because each architecture assumes a different blocking size.

Thanks,

Kazushige Goto
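
For readers following the error stack, here is a minimal sketch of the size arithmetic behind the "Message truncated" failure. It only uses the counts printed in the log above (scount=18432, rcount=8051, MPI_DOUBLE); the helper names buffer_bytes and is_truncated are hypothetical and not part of HPL or MPI. It assumes an 8-byte MPI_DOUBLE and relies on the general MPI rule that a receive fails with a truncation error when the incoming message is longer than the posted receive buffer.

```python
# Sketch only: reproduces the buffer-size arithmetic from the error log.
# Helper names are hypothetical; nothing here calls MPI or HPL.

SIZEOF_DOUBLE = 8  # assumed size in bytes of one MPI_DOUBLE


def buffer_bytes(count, elem_size=SIZEOF_DOUBLE):
    """Size in bytes of a buffer holding `count` elements."""
    return count * elem_size


def is_truncated(incoming_count, rcount):
    """True when the incoming message has more elements than the
    receive buffer was posted for (MPI reports a truncation error)."""
    return incoming_count > rcount


# Counts taken from the MPI_Sendrecv call in the error stack.
scount = 18432  # doubles this rank sends to rank 1
rcount = 8051   # doubles this rank is prepared to receive from rank 1

send_bytes = buffer_bytes(scount)
recv_bytes = buffer_bytes(rcount)

print("send buffer:", send_bytes, "bytes")
print("recv buffer:", recv_bytes, "bytes")

# Any message from the peer longer than rcount doubles cannot fit.
print("8051 doubles fit:", not is_truncated(8051, rcount))
print("8052 doubles fit:", not is_truncated(8052, rcount))
```

The receive buffer works out to 8051 × 8 = 64408 bytes, which matches the "buffer size is 64408" reported in the log. This is consistent with the explanation in the reply: if the two ranks derive their message lengths from different blocking sizes, the sender can post more data than the receiver has reserved.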