Dear HPC MPI users,
I am trying to parallelize a C code. When I try to run it on 3 nodes, I get this error:
mpdboot_hpcmas02.kfupm.edu.sa (handle_mpd_output 892): failed to ping mpd on hpc073; received output={}
But the program runs smoothly locally.
mpd.hosts:
hpc073
hpc074
I checked that mpd is running and there is no firewall.
Kindly share your ideas.
Regards
Ashraf
Can you provide some details?
What MPI library do you use?
Can you run 'ssh hpc074' from hpc073, and vice versa, without entering a password or passphrase (if you use an ssh connection)?
Could you provide your 'mpdboot' command line?
Could you provide the output with the '-d' option?
I'll try to help you resolve this issue.
Regards!
Dmitry
Version 2.1 (MPI-2.1).
I have 3 nodes:
master node (hpcmas01) and compute node (hpc073 and hpc074).
Passwordless ssh works from
hpcmas01 to hpc073 and hpc074.
mpdboot -r ssh -f /root/mpd.hosts -d
debug: starting
running mpdallexit on hpcmas02.kfupm.edu.sa
debug: launch cmd= env I_MPI_JOB_TAGGED_PORT_OUTPUT=1 /opt/intel/impi/4.0.0.028/intel64/bin/mpd.py --ncpus=1 --myhost=hpcmas02.kfupm.edu.sa -e -d -s 1
debug: mpd on hpcmas02.kfupm.edu.sa on port 55001
debug: info for running mpd: {'ip': '', 'ncpus': 1, 'list_port': 55001, 'entry_port': '', 'host': 'hpcmas02.kfupm.edu.sa', 'entry_host': '', 'ifhn': ''}
[root@hpcmas02 ~]#
Regards
Ashraf
>master node (hpcmas01) and compute node (hpc073 and hpc074)
Looks like your master node is hpcmas02!
The debug output doesn't show the IP address of the hpcmas02 node:
>debug: info for running mpd: {'ip': '',
Could you run:
$ host hpcmas02
$ host hpc073
$ host hpc074
IP addresses have to be configured. You probably need to change the /etc/hosts file.
Please provide the output of this command:
$ mpdboot -r ssh -f /root/mpd.hosts -n 3 -d
Regards!
Dmitry
Thanks for your reply.
Please find the output of the commands you requested.
#host hpcmas02
Host hpcmas02 not found: 2(SERVFAIL).
[root@hpcmas02 etc]# host hpc073
Host hpc073 not found: 2(SERVFAIL)
[root@hpcmas02 etc]# host hpc074
Host hpc074 not found: 2(SERVFAIL)
But I checked the /etc/hosts file, and all the entries are there:
10.146.1.200 hpcmgt hpcmgt
10.146.1.130 hpcmas02 hpcmas02
10.146.1.61 hpc061
10.146.1.62 hpc062
10.146.1.63 hpc063
10.146.1.73 hpc073
10.146.1.74 hpc074
10.146.1.76 hpc076
10.146.1.77 hpc077
10.146.1.78 hpc078
Also:
[root@hpcmas02 etc]# mpdboot -r ssh -f /root/mpd.hosts -n 3 -d
debug: starting
running mpdallexit on hpcmas02
debug: launch cmd= env I_MPI_JOB_TAGGED_PORT_OUTPUT=1 /opt/intel/impi/4.0.0.028/intel64/bin/mpd.py --ncpus=1 --myhost=hpcmas02 -e -d -s 3
debug: mpd on hpcmas02 on port 55000
debug: info for running mpd: {'ip': '10.146.1.130', 'ncpus': 1, 'list_port': 55000, 'entry_port': '', 'host': 'hpcmas02', 'entry_host': '', 'ifhn': ''}
debug: launch cmd= ssh -x -n -q hpc073 env I_MPI_JOB_TAGGED_PORT_OUTPUT=1 HOSTNAME=$HOSTNAME /opt/intel/impi/4.0.0.028/intel64/bin/mpd.py -h hpcmas02 -p 55000 --ifhn=10.146.1.73 --ncpus=1 --myhost=hpc073 --myip=10.146.1.73 -e -d -s 3
debug: launch cmd= ssh -x -n -q hpc074 env I_MPI_JOB_TAGGED_PORT_OUTPUT=1 HOSTNAME=$HOSTNAME /opt/intel/impi/4.0.0.028/intel64/bin/mpd.py -h hpcmas02 -p 55000 --ifhn=10.146.1.74 --ncpus=1 --myhost=hpc074 --myip=10.146.1.74 -e -d -s 3
debug: mpd on hpc073 on port 39268
debug: info for running mpd: {'ip': '10.146.1.73', 'ncpus': 1, 'list_port': 39268, 'entry_port': 55000, 'host': 'hpc073', 'entry_host': 'hpcmas02', 'ifhn': '', 'pid': 31767}
debug: mpd on hpc074 on port 44024
debug: info for running mpd: {'ip': '10.146.1.74', 'ncpus': 1, 'list_port': 44024, 'entry_port': 55000, 'host': 'hpc074', 'entry_host': 'hpcmas02', 'ifhn': '', 'pid': 31768}
Best Regards
Ashraf
Could you check the mpd ring?
Run:
$ mpdtrace
When do you get the error message you included in your first post?
BTW: your /etc/nsswitch.conf should contain a line like:
hosts: files dns
to specify where IP addresses are looked up (here: /etc/hosts first, then DNS).
Regards!
Dmitry
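A side note on the lookups above: the 'host' utility queries DNS servers directly and does not read /etc/hosts, which would explain the SERVFAIL answers even though the entries are present in that file. To check what programs actually see through the system resolver (the path that honours the 'hosts:' line in /etc/nsswitch.conf), a small C helper built on getaddrinfo() can be used. This is only a minimal sketch: the function name resolve_ipv4 is hypothetical, it is not part of Intel MPI, and it is not claimed to be how mpd itself resolves names.

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Hypothetical helper (not from the thread): resolve `name` to its first
 * IPv4 address through the system resolver, i.e. the same lookup path that
 * follows /etc/nsswitch.conf and therefore /etc/hosts.
 * On success writes the dotted-quad string into `out` and returns 0.
 * On failure returns the non-zero getaddrinfo() error code. */
int resolve_ipv4(const char *name, char *out, size_t outlen)
{
    assert(name != NULL && out != NULL && outlen >= INET_ADDRSTRLEN);

    struct addrinfo hints;
    struct addrinfo *res = NULL;
    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_INET;       /* IPv4 only, like the addresses in the thread */
    hints.ai_socktype = SOCK_STREAM; /* avoid one result per socket type */

    int rc = getaddrinfo(name, NULL, &hints, &res);
    if (rc != 0)
        return rc;

    /* Convert the first returned address to text. */
    const struct sockaddr_in *sin = (const struct sockaddr_in *)res->ai_addr;
    const char *p = inet_ntop(AF_INET, &sin->sin_addr, out, (socklen_t)outlen);
    freeaddrinfo(res);
    return p != NULL ? 0 : EAI_SYSTEM;
}
```

Calling resolve_ipv4("hpc073", buf, sizeof buf) from a small main() on each node, and passing a non-zero return code to gai_strerror(), shows whether the name resolves there. On the command line, 'getent hosts hpc073' performs the same kind of lookup.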
Thanks for the swift response.
I ran this command on the master node (hpcmas02):
#mpdtrace (I hope it is working).
hpcmas02
hpc074
hpc073
The other nodes give the same output.
It looks like it is working now.
But how can I test it?
Can you give me some sample code, please?
Thanks for your help.
Regards
Ashraf
mpdtrace shows the nodes in an mpd ring, so your ring has 3 nodes. Everything should work fine.
In the installation directory there is a 'test' subdirectory where you can find HelloWorld test cases (for Fortran, C and C++). Just compile them and try them out.
# mpicc test.c
# mpiexec -n 8 ./a.out
I'm ready to help you if you have an issue.
Regards!
Dmitry
Actually, I am very happy now.
Now the program is running in parallel.
Thanks a lot
Kind Regards
Ashraf
Best wishes,
Dmitry
Sorry for bothering you.
What is the difference between MPI and OpenMP?
How can I compile and run the same program in parallel using OpenMP?
Kindly help.
Best Regards
Ashraf
OpenMP is a different programming paradigm. Using pragmas ("#pragma omp ...") you mark blocks of code that can run in parallel (e.g. a loop), and the compiler generates code that runs them on multiple threads inside one process. In the MPI world, mpiexec creates new processes.
You cannot compile an MPI program with a plain compiler invocation; use the MPI compiler wrapper (e.g. mpicc), which adds the MPI include and library paths.
To compile your OpenMP application you need to add the '-openmp' option for the Intel compiler or '-fopenmp' for gcc.
This forum is not the best place to learn about OpenMP. Just google it and you'll find plenty of information and examples.
Regards!
Dmitry
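To make the pragma idea above concrete, here is a minimal OpenMP sketch in C. The function names sum_of_squares and worker_threads are made up for illustration and do not come from the poster's program. The loop is shared among threads, and the reduction clause gives each thread a private partial sum that is combined at the end.

```c
#include <assert.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Hypothetical example: returns 0^2 + 1^2 + ... + (n-1)^2.
 * When compiled with OpenMP enabled, the iterations are divided among
 * threads; without it the pragma is ignored and the loop runs serially
 * with the same result. */
double sum_of_squares(long n)
{
    assert(n >= 0);
    double sum = 0.0;
    #pragma omp parallel for reduction(+:sum)
    for (long i = 0; i < n; i++) {
        sum += (double)i * (double)i;
    }
    return sum;
}

/* Number of threads a parallel region may use: omp_get_max_threads()
 * when OpenMP is enabled, otherwise 1. */
int worker_threads(void)
{
#ifdef _OPENMP
    return omp_get_max_threads();
#else
    return 1;
#endif
}
```

Call these from an ordinary main() and build with the flags mentioned above, for example 'gcc -fopenmp file.c' (newer Intel compilers spell the option '-qopenmp'). The thread count can usually be set with the OMP_NUM_THREADS environment variable. Unlike an MPI job, this runs as a single process on a single node, started directly rather than through mpiexec.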
Urgent request.
Yesterday everything was fine. But today when I run the program, I get the following error:
problem with execution of ./ashru.ex on hpc073: [Errno 2] No such file or directory
problem with execution of ./ashru.ex on hpc073: [Errno 2] No such file or directory
problem with execution of ./ashru.ex on hpc073: [Errno 2] No such file or directory
problem with execution of ./ashru.ex on hpc073: [Errno 2] No such file or directory
problem with execution of ./ashru.ex on hpc074: [Errno 2] No such file or directory
problem with execution of ./ashru.ex on hpc074: [Errno 2] No such file or directory
problem with execution of ./ashru.ex on hpc074: [Errno 2] No such file or directory
problem with execution of ./ashru.ex on hpc074: [Errno 2] No such file or directory
problem with execution of ./ashru.ex on hpc074: [Errno 2] No such file or directory
problem with execution of ./ashru.ex on hpc074: [Errno 2] No such file or directory
problem with execution of ./ashru.ex on hpcmas02: [Errno 2] No such file or directory
problem with execution of ./ashru.ex on hpc074: [Errno 2] No such file or directory
problem with execution of ./ashru.ex on hpcmas02: [Errno 2] No such file or directory
problem with execution of ./ashru.ex on hpcmas02: [Errno 2] No such file or directory
problem with execution of ./ashru.ex on hpcmas02: [Errno 2] No such file or directory
problem with execution of ./ashru.ex on hpcmas02: [Errno 2] No such file or directory
problem with execution of ./ashru.ex on hpcmas02: [Errno 2] No such file or directory
problem with execution of ./ashru.ex on hpc074: [Errno 2] No such file or directory
problem with execution of ./ashru.ex on hpcmas02: [Errno 2] No such file or directory
problem with execution of ./ashru.ex on hpcmas02: [Errno 2] No such file or directory
Kindly help as soon as possible.
Please see the following output.
[mulhem@hpcmas02 ~]$ mpdtrace
hpcmas02
hpc074
hpc073
mpdboot -r ssh -f /home/mulhem/mpd.hosts -n 3 -d
debug: starting
running mpdallexit on hpcmas02
debug: launch cmd= env I_MPI_JOB_TAGGED_PORT_OUTPUT=1 /opt/intel/impi/4.0.0.028/intel64/bin/mpd.py --ncpus=1 --myhost=hpcmas02 -e -d -s 3
debug: mpd on hpcmas02 on port 55001
debug: info for running mpd: {'ip': '10.146.1.130', 'ncpus': 1, 'list_port': 55001, 'entry_port': '', 'host': 'hpcmas02', 'entry_host': '', 'ifhn': ''}
debug: launch cmd= ssh -x -n -q hpc073 env I_MPI_JOB_TAGGED_PORT_OUTPUT=1 HOSTNAME=$HOSTNAME /opt/intel/impi/4.0.0.028/intel64/bin/mpd.py -h hpcmas02 -p 55001 --ifhn=10.146.1.73 --ncpus=1 --myhost=hpc073 --myip=10.146.1.73 -e -d -s 3
debug: launch cmd= ssh -x -n -q hpc074 env I_MPI_JOB_TAGGED_PORT_OUTPUT=1 HOSTNAME=$HOSTNAME /opt/intel/impi/4.0.0.028/intel64/bin/mpd.py -h hpcmas02 -p 55001 --ifhn=10.146.1.74 --ncpus=1 --myhost=hpc074 --myip=10.146.1.74 -e -d -s 3
debug: mpd on hpc073 on port 60394
debug: info for running mpd: {'ip': '10.146.1.73', 'ncpus': 1, 'list_port': 60394, 'entry_port': 55001, 'host': 'hpc073', 'entry_host': 'hpcmas02', 'ifhn': '', 'pid': 17230}
debug: mpd on hpc074 on port 54816
debug: info for running mpd: {'ip': '10.146.1.74', 'ncpus': 1, 'list_port': 54816, 'entry_port': 55001, 'host': 'hpc074', 'entry_host': 'hpcmas02', 'ifhn': '', 'pid': 17231}
Kindly help as soon as possible.
Are you sure that you entered the correct name? Maybe it should be ./ashru.exe?
Next time, please provide the full command line; it makes the analysis much easier.
Regards!
Dmitry
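"[Errno 2] No such file or directory" corresponds to ENOENT: the path given to the launcher could not be found on the node that reported it. Since a relative path such as ./ashru.ex is looked up from the working directory, one way to narrow this down is to test the path on each node. The sketch below assumes a POSIX system; the helper name check_executable is hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: returns 0 if `path` exists and the calling user may
 * execute it, otherwise the errno value set by access()
 * (ENOENT, which is errno 2 on Linux, when the file is missing). */
int check_executable(const char *path)
{
    assert(path != NULL);
    if (access(path, X_OK) == 0)
        return 0;
    return errno;
}
```

Calling check_executable("./ashru.ex") from a small main() started in the job's working directory on each node, and printing strerror() of a non-zero result, tells a missing file (ENOENT) apart from a permission problem (EACCES).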
Dear Sir,
I have a similar error:
mpdboot_node24 (handle_mpd_output 892): failed to ping mpd on node24.domain.com; received output={}
My "entry_port" and "entry_host" are blank in the mpdboot debug output:
t0363nd@node26:/home/t0363nd> /appl/msc/msc2013.1/msc20131/linux64/intel/bin64/mpdboot -r ssh -f hosts -d
debug: starting
running mpdallexit on node26
debug: launch cmd= env I_MPI_JOB_TAGGED_PORT_OUTPUT=1 /appl/msc/msc2013.1/msc20131/linux64/intel/bin64/mpd.py --ncpus=1 --myhost=node26 -e -d -s 1
debug: mpd on node26 on port 51090
debug: info for running mpd: {'ip': '10.134.160.55', 'ncpus': 1, 'list_port': 51090, 'entry_port': '', 'host': 'node26', 'entry_host': '', 'ifhn': ''
Is that part of my issue?
I am using legacy code with IntelMPI 2.3
Regards,
Joe