Intel® MPI Library

mpd won't start on a multi-core RHEL 5 Linux workstation

trong_buinasa_gov
Hello,

I would like to start mpd daemons on my multi-core RHEL 5 Linux workstation to run MPI software. However, when I tried starting mpd, I got the following error:

mpdboot -f mpd.hosts
mpdboot_LX171264.dfrc.nasa.gov (handle_mpd_output 837): failed to ping mpd on localhost; received output={}

How can I fix this error? I already have ssh working with no prompting, and my firewall is off. Also, I am using mpd on a single shared-memory, multi-core Linux workstation, not across the network. I find it curious that mpd cannot even ping localhost. I would appreciate any suggestions.

Thank you,

Trong

Below please find some more information:

mpdboot -V
Intel MPI Library for Linux, 64-bit applications, Version 3.2.1 Build 20090312
Copyright (C) 2003-2009 Intel Corporation. All rights reserved.

cat mpd.hosts
localhost:8

echo $I_MPI_DEVICE
shm

echo $I_MPI_PERHOST
8
TimP

Quoting - trong_buinasa_gov
mpdboot -f mpd.hosts
mpdboot_LX171264.dfrc.nasa.gov (handle_mpd_output 837): failed to ping mpd on localhost; received output={}

How can I fix this error? I already have ssh working with no prompting,

If this is a single node, you shouldn't need the mpd.hosts file. If you want mpdboot to set up with ssh, you must specify that explicitly:
mpdboot -r ssh

If you continue to have difficulty, please add the -v option to mpdboot and show us the result.
trong_buinasa_gov
Quoting - tim18
If this is a single node, you shouldn't require the mpd.hosts file. If you wish mpdboot to set up with ssh, you must so specify:
mpdboot -r ssh

If you continue to have difficulty, please add mpdboot -v option and show us the result.

Hi Tim,

I found the source of my problem. The domain of my user account was recently changed, but my user ID remained the same. So when my user ID in the new domain tried to write to /tmp/mpd2.logfile_localhost_userID, it found a file with the identical name already there, created under the previous user domain. The new domain account could not overwrite the previous domain's mpd2 logfile, hence the error message.

I deleted the old mpd2 logfile in the /tmp directory and found that mpdboot, mpdtrace, and mpdallexit all work properly now.
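
For anyone hitting the same symptom, here is a minimal sketch of how one might check for leftover mpd2 files that belong to a different account before running mpdboot. The function name list_foreign_mpd_files is made up for illustration, and the mpd2.* pattern is an assumption based on the logfile name mentioned above; adjust it if your files are named differently. It only lists candidates and deletes nothing.

```shell
#!/bin/sh
# Sketch only: list mpd2.* files in a directory (default /tmp) that are
# NOT owned by the current effective user. The function name and the
# mpd2.* pattern are illustrative assumptions, not part of Intel MPI.
list_foreign_mpd_files() {
    dir="${1:-/tmp}"
    for f in "$dir"/mpd2.*; do
        # Skip the unexpanded pattern when nothing matches.
        [ -e "$f" ] || continue
        # -O is true when the file is owned by the effective user ID,
        # so print details only for files owned by someone else.
        [ -O "$f" ] || ls -ln "$f"
    done
    return 0
}

list_foreign_mpd_files /tmp
```

If this prints a file whose numeric owner differs from the output of id -u, that file is a likely culprit. Removing it (as its owner or as root) and rerunning mpdboot is what fixed it in my case.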

Thank you very much for the prompt advice!

Trong