Intel® MPI Library

Version incompatibility inside a cluster

Eliomar_C_

Hey, good morning. My name is Eliomar, and I'm from Venezuela. I'm fairly new to MPI and I'm using it for my master's project, but I've run into a problem with my implementation that I don't know how to solve.

The problem is this: I have a cluster composed of 4 Skylake nodes and 1 KNL node, and I'm trying to run a program on the KNL from a Skylake node. The KNL has the 2017 version installed, and the Skylakes have the 2015 version. At first the run crashed with a bash error saying that the file or directory does not exist. That makes sense: the versions on the KNL and the Skylakes are different, so the error is expected when MPI_Comm_spawn targets the KNL. I thought I had solved that by setting the root environment variables, but now, when I run my program on the cluster, it just hangs at the moment it spawns a new process on the KNL. After I press Ctrl+C, it reports these errors:

HYDU_sock_write (../../utils/sock/sock.c:417): write error (Bad file descriptor)

HYD_pmcd_pmiserv_send_signal (../../pm/pmiserv/pmiserv_cb.c:246): unable to write data to proxy

ui_cmd_cb (../../pm/pmiserv/pmiserv_pmci.c:172): unable to send signal downstream

HYDT_dmxu_poll_wait_for_event (../../tools/demux/demux_poll.c:76): callback returned error status

HYD_pmci_wait_for_completion (../../pm/pmiserv/pmiserv_pmci.c:480): error waiting for event

main (../../ui/mpich/mpiexec.c:945): process manager error waiting for completion

Any help would be great, especially from anyone who has run into a related problem.

Regards from Venezuela.
