I'm trying to run a Fortran code, but when I run it I get this message:
Fatal error in PMPI_Waitall: Other MPI error, error stack:
PMPI_Waitall(405)...............: MPI_Waitall(count=5, req_array=0xb0b8c8, status_array=0xb14e68) failed
MPIR_Waitall_impl(221)..........: fail failed
PMPIDI_CH3I_Progress(623).......: fail failed
pkt_RTS_handler(317)............: fail failed
do_cts(662).....................: fail failed
MPID_nem_lmt_dcp_start_recv(288): fail failed
dcp_recv(154)...................: Internal MPI error! cannot read from remote process
If I run it as root, it works.
I tried updating my Parallel Studio XE from 2017 to 2017 Update 1, but it made no difference.
Any news or help with this issue?
Check the output of "ulimit -a" for both root and a normal user. These should be the same.
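A rough sketch of one way to compare them (the file paths are arbitrary):

# as the normal user
ulimit -a > /tmp/limits_user.txt
# in a root shell (e.g. after su -)
ulimit -a > /tmp/limits_root.txt
# compare; any output means the limits differ
diff /tmp/limits_user.txt /tmp/limits_root.txt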
Hi
I have got the same error:
Fatal error in PMPI_Gatherv: Other MPI error, error stack:
PMPI_Gatherv(1001)..............: MPI_Gatherv failed(sbuf=0x256d2c0, scount=9210, MPI_DOUBLE, rbuf=0x258c560, rcnts=0x2559f00, displs=0x2559f20, MPI_DOUBLE, root=0, MPI_COMM_WORLD) failed
MPIR_Gatherv_impl(545)..........: fail failed
I_MPIR_Gatherv_intra(611).......: fail failed
MPIR_Gatherv(422)...............: fail failed
MPIC_Irecv(857).................: fail failed
MPID_Irecv(160).................: fail failed
MPID_nem_lmt_RndvRecv(208)......: fail failed
do_cts(662).....................: fail failed
MPID_nem_lmt_dcp_start_recv(302): fail failed
dcp_recv(165)...................: Internal MPI error! Cannot read from remote process
Does Intel have an MPI version with a fix for this?
Victor
It is important to check ulimit from inside the context of the MPI job; if you log in to the remote node interactively, a different set of limits may apply.
You can easily do this by launching a script instead of the MPI executable and having the script echo the hostname and then execute "ulimit -a", as in the sketch below.
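A minimal example of such a wrapper (the script name check_limits.sh and the executable name ./my_fortran_app are placeholders for your own paths):

#!/bin/bash
# check_limits.sh - report which host this rank landed on and its effective limits,
# then hand control over to the real MPI executable with the original arguments.
echo "Host: $(hostname)"
ulimit -a
exec ./my_fortran_app "$@"

Launch it in place of the executable, for example "mpirun -n 4 ./check_limits.sh", and compare the limits printed for each rank with what you see as root.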
Actually, I used another workaround: I set "export I_MPI_SHM_LMT=shm" and the issue was resolved.
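As far as I can tell, I_MPI_SHM_LMT selects Intel MPI's shared-memory large-message transfer mechanism, and the failing dcp_* frames in the stack are in the direct-copy path, so forcing the shm path avoids it. For anyone hitting the same thing, a sketch of how I apply it at launch (the executable name and rank count are just placeholders):

export I_MPI_SHM_LMT=shm    # use the shm-based large-message transfer instead of direct copy
mpirun -n 16 ./my_app

or equivalently on the command line: mpirun -genv I_MPI_SHM_LMT shm -n 16 ./my_app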
But it would be nice to simply update MPI. Has the issue been fixed, or will it be fixed in the near future?
Our group is going to register our software locally (target date is October), and any workaround we use has to be properly documented :(
Victor
