Hello,
We're getting MPI communication errors with Intel MPI on our Omni-Path cluster. The failing job uses 931 nodes; smaller runs on 600 nodes execute properly.
Other details:
We're using Intel Parallel Studio 2017 update 4 (compilers_and_libraries_2017.4.196).
There are 1024 nodes in total on the fabric, and we would like to run jobs that use the entire cluster.
This is an HPL run using Intel l_mklb_p_2017.3.017.
Below is an example of the errors we see. What is interesting is that the received message size and the target buffer size are the same, yet the error states that the message was truncated. Is there normally a header that the target buffer needs to have space for?
Fatal error in MPI_Recv: Message truncated, error stack:
MPI_Recv(224)................: MPI_Recv(buf=0x2b1ee8401840, count=1455, MPI_DOUBLE, src=17, tag=10001, comm=0x84000002, status=0x7ffef5ddfe50) failed
MPID_nem_tmi_handle_rreq(738): Message from rank 17 and tag 10001 truncated; 11640 bytes received but buffer size is 11640
Fatal error in MPI_Sendrecv: Message truncated, error stack:
MPI_Sendrecv(259)............: MPI_Sendrecv(sbuf=0x2b93ba000000, scount=1164, MPI_DOUBLE, dest=13, stag=10001, rbuf=0x2b93ba002460, rcount=1746, MPI_DOUBLE, src=13, rtag=10001, comm=0x84000002, status=0x7ffcec3f3f50) failed
MPID_nem_tmi_handle_rreq(738): Message from rank 13 and tag 10001 truncated; 13968 bytes received but buffer size is 13968
Fatal error in MPI_Sendrecv: Message truncated, error stack:
MPI_Sendrecv(259)............: MPI_Sendrecv(sbuf=0x2b30f5880808, scount=24576, MPI_DOUBLE, dest=16, stag=10001, rbuf=0x2b30ef400000, rcount=1164, MPI_DOUBLE, src=16, rtag=10001, comm=0x84000002, status=0x7ffc4278ec10) failed
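As a quick sanity check on the numbers in the error stacks above, the following sketch multiplies each receive count by the size of MPI_DOUBLE and compares the result with the byte counts reported by MPID_nem_tmi_handle_rreq. It assumes MPI_DOUBLE is 8 bytes, as on typical x86-64 systems; the helper name `buffer_bytes` is illustrative only, and the values are copied from the logs.

```python
# Assumption: MPI_DOUBLE is 8 bytes, as on typical x86-64 systems.
SIZEOF_MPI_DOUBLE = 8


def buffer_bytes(count, elem_size=SIZEOF_MPI_DOUBLE):
    """Return the receive buffer size in bytes for `count` elements."""
    return count * elem_size


# (receive count, bytes reported in the error), copied from the logs above.
reported = [
    (1455, 11640),  # MPI_Recv, message from rank 17
    (1746, 13968),  # MPI_Sendrecv, rcount for the message from rank 13
]

for count, nbytes in reported:
    # True means the posted buffer exactly fits the incoming message.
    print(count, buffer_bytes(count), buffer_bytes(count) == nbytes)
```

Under that assumption both comparisons come out equal, so the posted receive buffers match the incoming messages byte for byte, which is why a truncation report looks unexpected here.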
This question seems more appropriate for the cluster HPC forum, particularly if you can quote the Intel Cluster Checker diagnostics.