Sangamesh_B_
Beginner

mpdboot problem

Hi all,

I'm using Intel MPI 3 (with the icc and ifort 10 compilers) on a two-node cluster with an Ethernet interconnect.

The mpdboot command:

# mpdboot --totalnum=2 --file=/root/mpd.hosts --mpd=/opt/MPI_LIBS/INTEL-MPI/bin64/mpd --verbose --ncpus=4 --ifhn=10a0101

gave the following error:

running mpdallexit on 10a0101
LAUNCHED mpd on 10a0101 via
RUNNING: mpd on 10a0101
LAUNCHED mpd on compute-0-0 via 10a0101
mpdboot_10a0101 (handle_mpd_output 589): from mpd on compute-0-0, invalid port info:
connect to address 10.255.255.254: Connection refused
connect to address 10.255.255.254: Connection refused
trying normal rsh (/usr/bin/rsh)
32833

If the --rsh=/usr/bin/ssh option is used, mpdboot works fine, but an error occurs again when a job is submitted across the 2 nodes.

With MPICH2, mpdboot and the job submission are working without any error.

I don't understand why it fails with Intel MPI.

Can someone help me resolve this issue?

- Sangamesh
4 Replies
Andrey_D_Intel
Employee

Hi Sangamesh,

This looks like a known bug. I believe it should not appear in the latest release.

Package ID: l_mpi_p_3.1.026

Could you tell me the package ID of the Intel MPI Library you have? It can be found in the mpisupport.txt file. If you have an older version, would it be possible for you to upgrade?
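A quick way to pull that ID out of the file, as a minimal sketch only: the helper name package_id is hypothetical, the default path is an assumption based on the install prefix in the mpdboot command above, and the "Package ID:" line format is assumed to match the line quoted in this thread.

```shell
#!/bin/sh
# Print the value of the "Package ID:" line from an mpisupport.txt file.
# ASSUMPTION: the default path below is guessed from the --mpd prefix used
# earlier in the thread; pass the real location as the first argument.
package_id() {
    support_file="${1:-/opt/MPI_LIBS/INTEL-MPI/mpisupport.txt}"
    if [ ! -r "$support_file" ]; then
        echo "cannot read $support_file" >&2
        return 1
    fi
    # Strip the "Package ID:" label and print only the ID itself.
    sed -n 's/^[[:space:]]*Package ID:[[:space:]]*//p' "$support_file"
}
```

Running `package_id /path/to/mpisupport.txt` on each node is also a cheap way to confirm that both nodes have the same package installed.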

Best regards, Andrey

Sangamesh_B_
Beginner

I'm using:
Package ID: l_mpi_p_3.0.043

Does this happen on every cluster when booting on more than one node?

Thanks
-Sangamesh
Andrey_D_Intel
Employee

Is it acceptable for you to upgrade to Intel MPI Library 3.1? If not, I would suggest requesting a patch for the "invalid port info" issue at https://premier.intel.com. As far as I know, one is available for the 3.0.043 package.
Sangamesh_B_
Beginner

I upgraded Intel MPI to version 3.1. Now mpdboot runs without any errors.

Thanks.

-Sangamesh