
MPI programs don't run on MICs across nodes or within nodes

aketh_t_
Beginner

I am currently using the Intel MIC to run sample MPI programs.

The program is a simple hello world program.

#include <stdio.h>
#include <mpi.h>


int main(int argc, char *argv[])
{
  int rank, size,len;
  char name[MPI_MAX_PROCESSOR_NAME];

  MPI_Init (&argc, &argv);      /* starts MPI */
  MPI_Comm_rank (MPI_COMM_WORLD, &rank);        /* get current process id */
  MPI_Comm_size (MPI_COMM_WORLD, &size);        /* get number of processes */
  MPI_Get_processor_name(name,&len);
  printf( "Hello world from process %d of %d with name as %s\n", rank, size,name );
  MPI_Finalize();
  return 0;
}

I launched the program with the hostfile option: mpirun -f machinefile -n 4 ./a.out
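(For reference, the a.out launched here would normally be a native MIC build made with the Intel MPI compiler wrapper and visible to the cards, e.g. over an NFS-mounted home directory. A minimal sketch, assuming the source file is called hello.c; the file name and NFS setup are assumptions, not taken from the original post:

mpiicc -mmic hello.c -o a.out      # cross-compile natively for the coprocessor
export I_MPI_MIC=enable            # enable MIC support in Intel MPI before launching
)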

The contents of the machine file are:

cat machinefile
node1-mic0
node1-mic1

And the output I get is:

Hello world from process 1 of 4 with name as node1-mic0
Hello world from process 2 of 4 with name as node1-mic0
Hello world from process 3 of 4 with name as node1-mic0
Hello world from process 0 of 4 with name as node1-mic0

However, when I force the program to split across the Intel MICs using the -ppn option, I encounter the following problem.

   mpirun -f machinefile -n 4 -ppn 2 ./a.out 

=====================================================================================
=   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
=   EXIT CODE: 11
=   CLEANING UP REMAINING PROCESSES
=   YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
=====================================================================================
APPLICATION TERMINATED WITH THE EXIT STRING: Segmentation fault (signal 11)
Artem_R_Intel
Employee

Which version of MPI are you using?
Could you please try running a simple system utility like 'hostname' (instead of a.out) with the same mpirun options?
Have you set any specific MPI environment variables? If yes, please specify them.
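For example, something along these lines with your original options (just an illustration):

mpirun -f machinefile -n 4 -ppn 2 hostname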

aketh_t_
Beginner

icc (ICC) 14.0.2 20140120
Copyright (C) 1985-2014 Intel Corporation.  All rights reserved.

I had only set I_MPI_MIC=enable.

I am sorry, but I have no idea about system utilities such as hostname.

Can you guide me on that? Am I supposed to call a hostname function?

aketh_t_
Beginner

mpirun -f machinefile -n 16 -ppn 2 hostname
node1-mic0
node1-mic0
node1-mic0
node1-mic1
node1-mic0
node1-mic1
node1-mic1
node1-mic1
node2-mic0
node2-mic0
node2-mic0
node2-mic0
node2-mic1
node2-mic1
node2-mic1
node2-mic1

It does seem to work when I run hostname.

Artem_R_Intel
Employee

Thanks.
You've specified the Intel Compiler version.
Could you please try 'which mpirun' or 'mpirun -V' to get the exact MPI version?
If I'm not mistaken, you use Intel MPI, so could you please run the original test scenario with the environment variables I_MPI_DEBUG=6 and/or I_MPI_HYDRA_DEBUG=1 set?
Have you enabled IP forwarding as recommended in the Intel MPI User Guide?
11.2. Multiple Cards
To use multiple cards for a single job, the Intel® Manycore Platform Software Stack (Intel® MPSS) needs to be configured for peer-to-peer support (see the Intel® MPSS documentation for details) and the host(s) needs to have IP forwarding enabled.
(host)$ sudo sysctl -w net.ipv4.ip_forward=1
Each host/card should be able to ping every other host/card and the launching host should be able to connect to every target, as with a classic cluster.
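For example, a sketch of how those debug variables could be passed to the original run (same machine file and binary assumed):

I_MPI_DEBUG=6 I_MPI_HYDRA_DEBUG=1 mpirun -f machinefile -n 4 -ppn 2 ./a.out

or, equivalently, via mpirun's -genv option:

mpirun -genv I_MPI_DEBUG 6 -genv I_MPI_HYDRA_DEBUG 1 -f machinefile -n 4 -ppn 2 ./a.out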
