AKOLKAR__RAHUL
Beginner
321 Views

MPI4PY installation

Hi All, 

I'm an HPC admin. I installed the mpi4py library (3.0) on our clusters from the .tar source and with pip2.7 (Python 2.7). Since then, a 256-core job (n=4, ppn=64) fails to launch across nodes. Plain (non-MPI) Python code still runs fine.

Users are unable to run MPI jobs on the cluster, e.g. VASP and mpi4py programs.

The error is given below:

[cli_0]: aborting job:
Fatal error in MPI_Init:
Other MPI error
 
[mpiexec@tyrone-node16] HYD_pmcd_pmiserv_send_signal (./pm/pmiserv/pmiserv_cb.c:184): assert (!closed) failed
[mpiexec@tyrone-node16] ui_cmd_cb (./pm/pmiserv/pmiserv_pmci.c:74): unable to send SIGUSR1 downstream
[mpiexec@tyrone-node16] HYDT_dmxu_poll_wait_for_event (./tools/demux/demux_poll.c:77): callback returned error status
[mpiexec@tyrone-node16] HYD_pmci_wait_for_completion (./pm/pmiserv/pmiserv_pmci.c:181): error waiting for event
[mpiexec@tyrone-node16] main (./ui/mpich/mpiexec.c:405): process manager error waiting for completions
 
Kindly help me.

Thanks in advance!
 
Rahul Akolkar
5 Replies
Frank_S_Intel
Employee

Hi,

Could you please provide the exact install commands you used? The output of "conda list" would also be very helpful for understanding what's happening.

frank

AKOLKAR__RAHUL
Beginner

Hi Frank,

Actually, the ib_sdb error was occurring during the boot process, and I have resolved that issue. But I still get this error message when I try to run the MPI sample codes.

So the problem is not caused by the mpi4py library itself.

MPI path

source /opt/mvapich2-1.8-r5423/intel/etc/mvapich_vars.csh

 

 

Frank_S_Intel
Employee

Hi,

You seem to be using MVAPICH. You have to make sure your mpi4py is built against the same MPI library you are using underneath at run time.

Please let us know the exact commands you used to install mpi4py and MPI.
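One quick way to check for a mismatch is to compare the MPI that mpi4py recorded at build time with the MPI toolchain on your PATH, and rebuild from source if they differ. This is only a sketch; the MVAPICH prefix below is guessed from the mvapich_vars.csh path mentioned earlier in this thread and may differ on your cluster:

```shell
# Show the compiler/linker settings mpi4py recorded when it was built
python2.7 -c "import mpi4py; print(mpi4py.get_config())"

# Show the MPI toolchain currently on PATH (should match the above)
which mpicc mpiexec

# Rebuild mpi4py from source against MVAPICH's compiler wrapper
# (prefix is an assumption -- adjust to your installation)
env MPICC=/opt/mvapich2-1.8-r5423/intel/bin/mpicc \
    pip2.7 install --no-cache-dir --no-binary mpi4py mpi4py
```

If `mpi4py.get_config()` reports a compiler from a different MPI installation than the `mpicc` on PATH, that mismatch alone can produce MPI_Init failures like the one above.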

frank

AKOLKAR__RAHUL
Beginner

Hi Frank,

I used this command to install it on the node: #pip2.7 install mpi4py.

When I import mpi4py, it works. But when users try to submit a VASP job, they get the error above: HYD_pmcd_pmiserv_send_signal (./pm/pmiserv/pmiserv_cb.c:184): assert (!closed) failed

We are using MVAPICH with the open64 and gcc compilers.

Thanks!

Rahul A.

Frank_S_Intel
Employee

Thanks for the info. This might be an issue with your MVAPICH installation or an incorrectly built mpi4py (e.g. it may have been linked against a different MPI at build time). Unfortunately I can't say for sure.

Intel's mpi4py package uses the Intel(R) MPI Library and comes pre-compiled. You might want to try installing it with conda (and also use the Intel(R) Distribution for Python). Let us know if you encounter any issues with our mpi4py package.
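A minimal sketch of the conda route, assuming the `intel` channel and package names (check Intel's current documentation for the exact names):

```shell
# Install Intel's pre-built mpi4py (linked against Intel(R) MPI)
conda install -c intel mpi4py

# Optionally, create a full Intel Distribution for Python environment first
conda create -n idp -c intel intelpython2_core python=2.7
```

Because this package ships its own matching MPI runtime, it sidesteps the build-against-the-wrong-MPI problem entirely.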