Setting the I_MPI_PERHOST environment variable does not produce the expected behavior with codes built with IntelMPI v4.1.0.030, while codes built with IntelMPI v4.1.0.024 behave as expected. A description of the problem is given below. The system OS is Red Hat Linux v6.3.
If 16 MPI processes are to be placed on each node and I_MPI_PERHOST is set to 8, then the first 8 processes
should be placed on the first node, the next 8 processes on the second node, and so on,
until every assigned node holds 8 processes; the following 8 processes should then wrap
around to the first node again, and so on.
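For example, with the 4 nodes and 64 ranks of the job below, the behavior described above (this mapping is only an illustration) should give:
ranks  0- 7 -> node 1
ranks  8-15 -> node 2
ranks 16-23 -> node 3
ranks 24-31 -> node 4
ranks 32-39 -> node 1   (second pass)
ranks 40-47 -> node 2
ranks 48-55 -> node 3
ranks 56-63 -> node 4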
The job has the "select" line:
#PBS -l select=4:ncpus=16:mpiprocs=16
so that 16 MPI processes are to be placed on each of four nodes. The job uses the
latest IMPI module "mpi/intelmpi/4.1.0.030". The job also sets:
export I_MPI_PERHOST=8
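Putting the pieces together, the job script looks roughly like the sketch below (the walltime, the cd into the submission directory, and the executable name a.out are placeholders, not the actual script):
#!/bin/bash
#PBS -l select=4:ncpus=16:mpiprocs=16
#PBS -l walltime=00:30:00
module load mpi/intelmpi/4.1.0.030
export I_MPI_PERHOST=8
cd $PBS_O_WORKDIR
mpirun -n 64 ./a.out    # 64 = 4 nodes x 16 mpiprocs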
But instead of placing 8 MPI processes on each node and then cycling back in
round-robin mode, the first 16 processes are placed on the first node, the next 16
processes are placed on the second node, etc., as if I had not set the I_MPI_PERHOST
environment variable. A portion of the output is given below.
mpirun = .../compiler/intelmpi/4.1.0.030/bin64/mpirun
Nodes used:
n0006
n0028
n0008
n0011
Rank Processor Name
0 n0006
1 n0006
2 n0006
3 n0006
4 n0006
5 n0006
6 n0006
7 n0006
8 n0006
9 n0006
10 n0006
11 n0006
12 n0006
13 n0006
14 n0006
15 n0006
16 n0028
17 n0028
18 n0028
19 n0028
20 n0028
21 n0028
22 n0028
23 n0028
24 n0028
25 n0028
26 n0028
27 n0028
28 n0028
29 n0028
30 n0028
31 n0028
32 n0008
33 n0008
34 n0008
35 n0008
36 n0008
37 n0008
38 n0008
39 n0008
40 n0008
41 n0008
42 n0008
43 n0008
44 n0008
45 n0008
46 n0008
47 n0008
48 n0011
49 n0011
50 n0011
51 n0011
52 n0011
53 n0011
54 n0011
55 n0011
56 n0011
57 n0011
58 n0011
59 n0011
60 n0011
61 n0011
62 n0011
63 n0011
Hi George,
What happens if you use the latest version of the Intel® MPI Library, Version 4.1 Update 1? If this is still showing the problem, please send the output with I_MPI_HYDRA_DEBUG=1. This will generate a lot of output, so please capture it in a file and attach it to your reply.
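For example, something along these lines would capture everything in one file (the log file name and the executable here are only placeholders):
export I_MPI_HYDRA_DEBUG=1
mpirun -n 64 ./your_app > hydra_debug.log 2>&1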
Sincerely,
James Tullos
Technical Consulting Engineer
Intel® Cluster Tools
Was the problem resolved? I'm also seeing a similar problem. We have IntelMPI v.4.1.0.024 and v.5.0.1.035. With the older version the mpirun option -perhost works as expected, but it does not work with the newer version:
$ qsub -I -lnodes=2:ppn=16:compute,walltime=0:15:00
qsub: waiting for job 5731.hpc-class.its.iastate.edu to start
qsub: job 5731.hpc-class.its.iastate.edu ready
$ mpirun -n 2 -perhost 1 uname -n
hpc-class-40.its.iastate.edu
hpc-class-40.its.iastate.edu
$ export I_MPI_ROOT=/shared/intel//impi/4.1.0.024
$ PATH="${I_MPI_ROOT}/intel64/bin:${PATH}"; export PATH
$ mpirun -n 2 -perhost 1 uname -n
hpc-class-40.its.iastate.edu
hpc-class-39.its.iastate.edu
As James suggested, I issued the same commands (for IntelMPI v.5.0.1.035) with I_MPI_HYDRA_DEBUG set to 1 (see the attached file). What is interesting is that the first two lines of the debug output suggest that -perhost works (two different hostnames are printed), yet at the end the same hostname is still printed twice.
Marina,
In your case, it looks like the PBS* environment is overriding the -perhost option. Can you run outside of PBS*?
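For example, from a node with direct SSH access to the compute hosts (host names taken from your output; whether they are reachable this way outside the scheduler is an assumption), a launch with an explicit host list would look something like:
mpirun -n 2 -perhost 1 -hosts hpc-class-39,hpc-class-40 uname -n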