Intel® MPI Library

How to make mpiexec or mpirun launch processes in order of their rank?

ArthurRatz

Dear colleagues, how can I make mpiexec or mpirun launch processes in order of their rank?

I have already tried launching a simple program under Windows and Linux:

 #include <stdio.h>
 #include <stdlib.h>
 #include <mpi.h>

 int main(int argc, char** argv) {
   int namelen, numprocs, proc_rank;
   char processor_name[MPI_MAX_PROCESSOR_NAME];

   unsigned long array_size = 100;
   /* calloc takes (element count, element size) */
   long* array = (long*)calloc(array_size, sizeof(long));

   MPI_Init(&argc, &argv);
   MPI_Comm_size(MPI_COMM_WORLD, &numprocs);
   MPI_Comm_rank(MPI_COMM_WORLD, &proc_rank);
   MPI_Get_processor_name(processor_name, &namelen);

   fprintf(stdout, "[%s@mpiexec] process %d of %d\n", processor_name, proc_rank, numprocs);

   MPI_Finalize();
   free(array);
   return 0;
 }

mpiexec -ordered-output -al 2:P --n 4 simplempi.exe

mpirun --map-by core -np 4 simplempi.exe

This has no effect. The processes still run in arbitrary order.

If it is possible, please help me launch the processes in order of their ranks, e.g. 0, 1, 2, 3, 4, ...

Waiting for your reply. Arthur.

 

1 Solution
Andrey_Vladimirov

I believe the original poster is asking how to order the launch of MPI processes in time, is this correct?

To the best of my knowledge, you cannot make MPI processes launch in a certain order, but if you need a certain part of your application to run in the order of rank number, you can use a barrier like below:

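#include <stdio.h>
#include <mpi.h>

int main(int argc, char** argv) {
  MPI_Init(&argc, &argv);
  int i, nRanks, myRank;
  MPI_Comm_size(MPI_COMM_WORLD, &nRanks);
  MPI_Comm_rank(MPI_COMM_WORLD, &myRank);

  /* One iteration per rank: everyone waits at the barrier, then only rank i prints */
  for(i = 0; i < nRanks; i++) {
    MPI_Barrier(MPI_COMM_WORLD);
    if (i == myRank) {
      printf("Hello World from Rank %d\n", myRank);
      fflush(stdout); /* push the line out before the next rank's turn */
    }
  }

  MPI_Finalize(); /* every MPI program must finalize before exiting */
  return 0;
}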
[andrey@alma-ata ~]$ mpiicc ordered.c
[andrey@alma-ata ~]$ mpirun -np 10 ./a.out 
Hello World from Rank 0
Hello World from Rank 1
Hello World from Rank 2
Hello World from Rank 3
Hello World from Rank 4
Hello World from Rank 5
Hello World from Rank 6
Hello World from Rank 7
Hello World from Rank 8
Hello World from Rank 9

 

19 Replies
Kittur_G_Intel

Hi Arthur,
The page at https://software.intel.com/en-us/node/528898 describes the I_MPI_PIN_PROCESSOR_LIST environment variable, which should help.
_Kittur

ArthurRatz

Hi _Kittur. Can you provide an example of using I_MPI_PIN_PROCESSOR_LIST so that it orders the running processes?

I'm sorry, but I'm a little confused about how to use this environment variable.

Thanks, Arthur..

Andrey_Vladimirov

If you really must launch processes in the order of rank, you can write a wrapper script around your executable and use mpirun to launch that script. In the script, you can get the rank number from environment variable PMI_RANK and semaphore from one process to another using files in a shared directory. However, this will only time the launch of the executables. They will synchronize with each other at MPI_Init and from there on they will run in arbitrary order (unless you call barriers as in my previous post).

Example below:

nonordered.c:

#include <stdio.h>
#include <mpi.h>

int main(int argc, char** argv) {
  printf("Code above MPI_Init will start in order (this is rank %s)\n", argv[1]);

  MPI_Init(&argc, &argv);
  int i, nRanks, myRank;
  MPI_Comm_size(MPI_COMM_WORLD, &nRanks);
  MPI_Comm_rank(MPI_COMM_WORLD, &myRank);
  printf("MPI_Init has a barrier, so everything after it runs in parallel (this is rank %d)\n", myRank);
  MPI_Finalize();
  return 0;
}

 

order.sh:

#!/bin/bash

# Set to something else if running on a cluster
SHAREDPREFIX=/dev/shm/my-mpi-semaphore-

# Each rank deletes its predecessor's semaphore after seeing it, so no
# start-up cleanup is needed. (A cleanup run by every rank is racy: a late
# starter could delete a semaphore that an earlier rank had just created.)

if [ "$PMI_RANK" -ne 0 ]; then
  PREVRANK=$((PMI_RANK - 1))
  # Spin until previous rank starts
  while [ ! -e "${SHAREDPREFIX}${PREVRANK}" ]; do
    sleep 0.001
  done
  rm -f "${SHAREDPREFIX}${PREVRANK}" # Consume the semaphore
  sleep 0.1 # Give time for previous rank to print
fi

# Semaphore to next rank. The last rank skips this so no file is left behind;
# PMI_SIZE is assumed to be exported by the launcher alongside PMI_RANK.
if [ -z "$PMI_SIZE" ] || [ "$PMI_RANK" -lt $((PMI_SIZE - 1)) ]; then
  touch "${SHAREDPREFIX}${PMI_RANK}"
fi

./nonordered "$PMI_RANK" # Run application

 

Results:

[andrey@alma-ata ~]$ mpicc -o nonordered nonordered.c
[andrey@alma-ata ~]$ mpirun -np 10 ./order.sh
Code above MPI_Init will start in order (this is rank 0)
Code above MPI_Init will start in order (this is rank 1)
Code above MPI_Init will start in order (this is rank 2)
Code above MPI_Init will start in order (this is rank 3)
Code above MPI_Init will start in order (this is rank 4)
Code above MPI_Init will start in order (this is rank 5)
Code above MPI_Init will start in order (this is rank 6)
Code above MPI_Init will start in order (this is rank 7)
Code above MPI_Init will start in order (this is rank 8)
Code above MPI_Init will start in order (this is rank 9)
MPI_Init has a barrier, so everything after it runs in parallel (this is rank 1)
MPI_Init has a barrier, so everything after it runs in parallel (this is rank 3)
MPI_Init has a barrier, so everything after it runs in parallel (this is rank 4)
MPI_Init has a barrier, so everything after it runs in parallel (this is rank 7)
MPI_Init has a barrier, so everything after it runs in parallel (this is rank 8)
MPI_Init has a barrier, so everything after it runs in parallel (this is rank 0)
MPI_Init has a barrier, so everything after it runs in parallel (this is rank 2)
MPI_Init has a barrier, so everything after it runs in parallel (this is rank 5)
MPI_Init has a barrier, so everything after it runs in parallel (this is rank 6)
MPI_Init has a barrier, so everything after it runs in parallel (this is rank 9)
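
The example above passes the rank to the program as argv[1]. As a possible alternative, the program could read PMI_RANK itself before MPI_Init. The sketch below is only an illustration: rank_from_env is a hypothetical helper name, and it assumes the launcher exports PMI_RANK as described in this post.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper: returns the launcher-assigned rank from PMI_RANK,
 * or -1 if the variable is unset, empty, or not a non-negative integer. */
int rank_from_env(void) {
  const char *s = getenv("PMI_RANK");
  if (s == NULL || *s == '\0') return -1;

  char *end = NULL;
  errno = 0;
  long v = strtol(s, &end, 10);

  /* Reject overflow, trailing garbage, and out-of-range values */
  if (errno != 0 || *end != '\0' || v < 0 || v > INT_MAX) return -1;
  return (int)v;
}
```

Calling rank_from_env() at the top of main, in place of argv[1], would also avoid a crash when the program is started without the wrapper script, since a return value of -1 can be checked explicitly.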

 

ArthurRatz

Andrey Vladimirov, thanks for your reply. Everything works perfectly under Linux.

On Windows, I still have the same issue: the processes are not launched in order of rank.

Can you provide a workaround for this problem?

Kittur_G_Intel

Hi Arthur,
I checked with the MPI team, who confirmed that in this particular case (a single-node job) the MPI processes are launched in the order you are asking for. Note, however, that the processes do not necessarily reach a given point in the program in that same order. You therefore need to synchronize them with MPI calls such as MPI_Barrier, or other blocking operations, if you want output to stdout to appear in rank order. So Andrey is correct, and the solution he provided should work.

_Kittur

Andrey_Vladimirov

The solution with the barrier should work in Windows just as well as in Linux. If you see that in Windows the output is not in the order of rank, this is probably caused by output buffering, i.e., processes actually print in order, but their output is not collected in the same order. Perhaps something like fflush(stdout) after the printf() could help. I don't have a Windows installation to check.

I guess, in a real-world scenario one generally does not care about the order of output, but one may care about ordering some fraction of the calculations. In this case, barriers should do the job without any extra tricks.
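
The fflush(stdout) suggestion above could be wrapped in a small helper like the one sketched here. This is an illustration, not a tested fix for the Windows case: print_rank_line is a hypothetical name, and prefixing each line with the rank is an extra assumption added so that the lines can still be told apart if they arrive out of order.

```c
#include <stdio.h>

/* Hypothetical helper: writes "<rank> <msg>\n" with a single formatted
 * write and flushes immediately, so the line does not sit in the
 * process-local stdio buffer. Returns 0 on success, -1 on error. */
int print_rank_line(FILE *out, int rank, const char *msg) {
  if (fprintf(out, "%d %s\n", rank, msg) < 0) return -1;
  return fflush(out) == 0 ? 0 : -1;
}
```

Inside the barrier loop, the printf call would become print_rank_line(stdout, myRank, "Hello World"). Flushing only empties the buffer of the calling process; how the launcher collects and interleaves the output of different processes is outside the program's control.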

ArthurRatz

Thanks a lot, Kittur and Andrey, for your replies. I've already found a workaround for this problem. It really isn't necessary for the processes to be launched in a specific order, because enforcing that with MPI_Barrier leads to serious performance degradation.

Anyway thanks to everyone who could help.

Kittur_G_Intel

My pleasure, Arthur, and thanks to Andrey for responding earlier with the correct input as well. Yes, using a barrier impedes performance, and it's good to know you found a workaround.  -Kittur

ArthurRatz

Okay, thanks.

ArthurRatz

Evidently we cannot do this with the mpiexec or mpirun utilities alone; instead, we have to synchronize the processes ourselves.

ArthurRatz

Can we effectively use mpiexec -ordered-output together with the fprintf and fflush functions to produce ordered output?

Kittur_G_Intel

Hi Arthur, I'll transfer this thread to the MPI forum at https://software.intel.com/en-us/forums/intel-clusters-and-hpc-technology. The people there deal with MPI and should be able to confirm whether that's a possible solution. I will let the group know as well. Thanks.

_kittur

ArthurRatz

Okay, no problem.

Heinrich_B_Intel

Hi Arthur,

For ordered output, you already found that -ordered-output does not help. I guess -ordered-output just makes sure that a line is not split.

Ordered output can be achieved by printing the rank number before each line and sorting:

$  mpirun -prepend-pattern "%r  " -n 28 ./a.out | sort -g

On Windows you may write the output to a file and sort the file with sed under Cygwin or an appropriate Windows tool.

cheers,

Heinrich

PS: there is also a -prepend-rank flag but you have to get rid of the "["

 

ArthurRatz

Thanks, Heinrich, for your reply; it's very helpful. I had been searching the web for this answer but could not find it.

Heinrich_B_Intel

Hi Arthur, 

A colleague found that my suggestion could be improved. The reason is that "sort -g" sorts by the whole line, which could scramble output that has many different lines per rank. To make sure that sort acts only on the rank number, you may specify:

$ mpirun -prepend-pattern "%r " … | sort -s -k1,1 -n

This sorts only by the rank number (the first field) and, because -s makes the sort stable, leaves the lines for each rank in their original order. I also had a typo in my previous post: I meant "sort" instead of "sed".

Heinrich
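
Where a Unix-style sort is not at hand (the Windows case mentioned earlier in the thread), the same stable sort on the rank field could be done in a few lines of C. This is a minimal sketch, assuming every captured line starts with the decimal rank followed by a space, as produced by the "%r " pattern above; the names RankLine and sort_lines_by_rank are hypothetical. Because qsort is not guaranteed to be stable, the original line position is used as a tie-breaker.

```c
#include <stdlib.h>

/* One captured output line plus its original position in the input. */
typedef struct {
  const char *text;
  size_t index;
} RankLine;

/* Parses the leading decimal rank; strtol stops at the first space. */
static long leading_rank(const char *s) { return strtol(s, NULL, 10); }

/* Orders by rank first, then by original position to keep the sort stable. */
static int compare_rank_lines(const void *a, const void *b) {
  const RankLine *x = a, *y = b;
  long rx = leading_rank(x->text), ry = leading_rank(y->text);
  if (rx != ry) return (rx > ry) - (rx < ry);
  return (x->index > y->index) - (x->index < y->index);
}

/* Reorders the array of line pointers in place by rank.
 * Returns 0 on success, -1 if the temporary buffer cannot be allocated. */
int sort_lines_by_rank(const char **lines, size_t n) {
  if (n == 0) return 0;
  RankLine *tmp = malloc(n * sizeof *tmp);
  if (tmp == NULL) return -1;
  for (size_t i = 0; i < n; i++) {
    tmp[i].text = lines[i];
    tmp[i].index = i;
  }
  qsort(tmp, n, sizeof *tmp, compare_rank_lines);
  for (size_t i = 0; i < n; i++) lines[i] = tmp[i].text;
  free(tmp);
  return 0;
}
```

Given the lines "1 x", "0 b", "1 y", "0 a" in that order, the result is "0 b", "0 a", "1 x", "1 y": ranks are grouped and ordered, while each rank's lines keep their relative order, which is the behaviour that -s -k1,1 -n asks of sort.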

 

 

ArthurRatz

Thanks for your reply.

ArthurRatz

And one more question about the -prepend-pattern option. Can you give me some examples of using it, or point me to the documentation that explains it?

Thanks in advance.

Cheers, Arthur.
