Symmetric MPI run

George_H_
Beginner
1,691 Views

Hello,

I'm trying to run an MPI application in symmetric mode across the host and the MIC coprocessor. Everything works fine as long as the total number of processes (mic+host) is less than 10. But when it's greater than 10, I get the attached error.

This is my mpirun command:

mpirun -v -host p-linux -check-mpi -env I_MPI_DEBUG 5 -genv I_MPI_FABRICS shm:tcp -np 1 <host executable> : -host mic0 -iface mic0 -env I_MPI_DEBUG 5 -env LD_LIBRARY_PATH=/opt/intel/composer_xe_2013.5.192/compiler/lib/mic:/opt/intel/mic/coi/device-linux-release/lib:/opt/intel/mic/myo/lib:/opt/intel/mic/coi/device-linux-release/lib:/opt/intel/mic/myo/lib:/opt/intel/composer_xe_2013.5.192/compiler/lib/mic:/opt/intel/composer_xe_2013.5.192/mkl/lib/mic:/opt/intel/composer_xe_2013.5.192/tbb/lib/mic  -np 11  <mic executable>

Here the total number of processes is 11+1=12, which is greater than 10, so I get the error. If it's less than 10, the program executes correctly.

I noticed that in the bottom part of the above error message, i.e. :

[proxy:0:1@p-linux-mic0] got pmi command (from 30): put
kvsname=kvs_9321_0 key=P10-businesscard-0 value=description#

there is no $port ... $ifname after the "value=description#" part, unlike the entries for the other processes.
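
For anyone who wants to spot such entries without reading the whole verbose log, here is a minimal sketch. It assumes the "put" lines look like the one quoted above, with the key and value on the same line; the function name find_empty_businesscards is made up for illustration and is not part of Intel MPI.

```shell
#!/bin/sh
# Hypothetical helper: reads verbose mpirun output on standard input and
# prints the key of every business-card "put" whose value stops right
# after "description#" (i.e. nothing such as $port or $ifname follows).
# Assumption: key=... and value=... appear on the same log line.
find_empty_businesscards() {
    grep -E 'key=[^ ]*businesscard[^ ]* value=description#[[:space:]]*$' \
        | sed -E 's/.*key=([^ ]+) value=.*/\1/'
}

# Possible usage (left commented out so that nothing is launched here):
#   mpirun -v ... 2>&1 | find_empty_businesscards
```

If the function prints keys only for the higher-numbered processes, that would match the observation above, though it would not by itself explain why the value is cut short.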

Thanks

5 Replies
George_H_
Beginner

As an update, I ran the MPI test case and got exactly the same problem.

mpirun -host p-linux -env I_MPI_DEBUG 5 -genv I_MPI_FABRICS shm:tcp -np 1 -check_mpi ./a.host : -host mic0 -iface mic0 -env I_MPI_DEBUG 5 -np 10  ./a.mic

ERROR: ld.so: object 'libVTmc.so' from LD_PRELOAD cannot be preloaded: ignored.
ERROR: ld.so: object 'libVTmc.so' from LD_PRELOAD cannot be preloaded: ignored.
ERROR: ld.so: object 'libVTmc.so' from LD_PRELOAD cannot be preloaded: ignored.
ERROR: ld.so: object 'libVTmc.so' from LD_PRELOAD cannot be preloaded: ignored.
ERROR: ld.so: object 'libVTmc.so' from LD_PRELOAD cannot be preloaded: ignored.
ERROR: ld.so: object 'libVTmc.so' from LD_PRELOAD cannot be preloaded: ignored.
ERROR: ld.so: object 'libVTmc.so' from LD_PRELOAD cannot be preloaded: ignored.
ERROR: ld.so: object 'libVTmc.so' from LD_PRELOAD cannot be preloaded: ignored.
ERROR: ld.so: object 'libVTmc.so' from LD_PRELOAD cannot be preloaded: ignored.
ERROR: ld.so: object 'libVTmc.so' from LD_PRELOAD cannot be preloaded: ignored.
ERROR: ld.so: object 'libVTmc.so' from LD_PRELOAD cannot be preloaded: ignored.
[0] MPI startup(): shm and tcp data transfer modes
[6] MPI startup(): shm and tcp data transfer modes
[5] MPI startup(): shm and tcp data transfer modes
[1] MPI startup(): shm and tcp data transfer modes
[4] MPI startup(): shm and tcp data transfer modes
[7] MPI startup(): shm and tcp data transfer modes
[10] MPI startup(): shm and tcp data transfer modes
[2] MPI startup(): shm and tcp data transfer modes
[3] MPI startup(): shm and tcp data transfer modes
[8] MPI startup(): shm and tcp data transfer modes
[9] MPI startup(): shm and tcp data transfer modes
[mpiexec@p-linux] handle_pmi_cmd (./pm/pmiserv/pmiserv_cb.c:78): Unrecognized PMI command: k | cleaning up processes
[mpiexec@p-linux] control_cb (./pm/pmiserv/pmiserv_cb.c:868): unable to process PMI command
[mpiexec@p-linux] HYDT_dmxu_poll_wait_for_event (./tools/demux/demux_poll.c:77): callback returned error status
[mpiexec@p-linux] HYD_pmci_wait_for_completion (./pm/pmiserv/pmiserv_pmci.c:435): error waiting for event
[mpiexec@p-linux] main (./ui/mpich/mpiexec.c:901): process manager error waiting for completion

But if I run it with fewer processes, I get the correct result:

mpirun -host p-linux -env I_MPI_DEBUG 5 -genv I_MPI_FABRICS shm:tcp -np 1 -check_mpi ./a.host : -host mic0 -iface mic0 -env I_MPI_DEBUG 5 -np 5  ./a.mic
ERROR: ld.so: object 'libVTmc.so' from LD_PRELOAD cannot be preloaded: ignored.
ERROR: ld.so: object 'libVTmc.so' from LD_PRELOAD cannot be preloaded: ignored.
ERROR: ld.so: object 'libVTmc.so' from LD_PRELOAD cannot be preloaded: ignored.
ERROR: ld.so: object 'libVTmc.so' from LD_PRELOAD cannot be preloaded: ignored.
ERROR: ld.so: object 'libVTmc.so' from LD_PRELOAD cannot be preloaded: ignored.
ERROR: ld.so: object 'libVTmc.so' from LD_PRELOAD cannot be preloaded: ignored.
[0] MPI startup(): shm and tcp data transfer modes
[4] MPI startup(): shm and tcp data transfer modes
[3] MPI startup(): shm and tcp data transfer modes
[5] MPI startup(): shm and tcp data transfer modes
[2] MPI startup(): shm and tcp data transfer modes
[1] MPI startup(): shm and tcp data transfer modes
[0] MPI startup(): Rank    Pid      Node name            Pin cpu
[0] MPI startup(): 0       894      p-linux       {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39}
[0] MPI startup(): 1       5235     p-linux-mic0  {1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45}
[0] MPI startup(): 2       5236     p-linux-mic0  {46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90}
[0] MPI startup(): 3       5237     p-linux-mic0  {91,92,93,94,95,96,97,98,99,100,101,102,103,104,105,106,107,108,109,110,111,112,113,114,115,116,117,118,119,120,121,122,123,124,125,126,127,128,129,130,131,132,133,134,135}
[0] MPI startup(): 4       5238     p-linux-mic0  {136,137,138,139,140,141,142,143,144,145,146,147,148,149,150,151,152,153,154,155,156,157,158,159,160,161,162,163,164,165,166,167,168,169,170,171,172,173,174,175,176,177,178,179,180}
[0] MPI startup(): 5       5239     p-linux-mic0  {0,181,182,183,184,185,186,187,188,189,190,191,192,193,194,195,196,197,198,199,200,201,202,203,204,205,206,207,208,209,210,211,212,213,214,215,216,217,218,219,220,221,222,223,224}
[0] MPI startup(): I_MPI_DEBUG=5
[0] MPI startup(): I_MPI_FABRICS=shm:tcp
[0] MPI startup(): I_MPI_MIC=1
[0] MPI startup(): I_MPI_PIN_MAPPING=1:0 0
 Hello world: rank            0  of            6  running on p-linux
 Hello world: rank            1  of            6  running on p-linux-mic0
 Hello world: rank            2  of            6  running on p-linux-mic0
 Hello world: rank            3  of            6  running on p-linux-mic0
 Hello world: rank            4  of            6  running on p-linux-mic0
 Hello world: rank            5  of            6  running on p-linux-mic0

Loc_N_Intel
Employee

Hi George,

- Could you run your application on host only (without coprocessor) with more than 10 ranks? Do you see the same error?

- Could you also repeat the test on the coprocessor only (without the host) with more than 10 ranks? Do you see the same error?
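
As a rough sketch of what those two isolation runs could look like, the helper below only prints the command lines so they can be reviewed before being run. The host names (p-linux, mic0), executables (./a.host, ./a.mic) and options are copied from the commands earlier in this thread; the function name isolation_cmds and the default of 12 ranks are assumptions for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print (do not run) one host-only and one
# coprocessor-only mpirun line for N ranks, reusing the options,
# host names and executables that appear earlier in the thread.
isolation_cmds() {
    n="${1:-12}"   # assumed default: any value above 10 serves the purpose
    echo "mpirun -host p-linux -genv I_MPI_FABRICS shm:tcp -env I_MPI_DEBUG 5 -np $n ./a.host"
    echo "mpirun -host mic0 -genv I_MPI_FABRICS shm:tcp -env I_MPI_DEBUG 5 -np $n ./a.mic"
}

# Possible usage: isolation_cmds 12
```

If both runs succeed with more than 10 ranks, the problem is more likely tied to the combined host+coprocessor launch than to either side alone.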

George_H_
Beginner

Hi

Everything works fine when I run on the host only or on the Phi only. The problem is in the symmetric execution. I'm using Intel MPI version 4.1.3.

Frances_R_Intel
Employee

Is there a MaxSessions specified in /etc/ssh/sshd_config?
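
A quick way to check, as a minimal sketch: the function below prints any active (uncommented) MaxSessions line from the file. The name show_max_sessions is made up for illustration. Note that OpenSSH documents a default of 10 for MaxSessions when it is not set, so the absence of a line does not mean the limit is unlimited. Whether this limit is actually involved in the failure here is not confirmed.

```shell
#!/bin/sh
# Hypothetical helper: show the active MaxSessions setting, if any.
# Defaults to the path named above; pass another file as the first argument.
show_max_sessions() {
    config="${1:-/etc/ssh/sshd_config}"
    grep -Ei '^[[:space:]]*MaxSessions[[:space:]]' "$config" \
        || echo "no explicit MaxSessions in $config (sshd default applies)"
}

# Possible usage: show_max_sessions
```

Reading /etc/ssh/sshd_config may require root privileges on some systems.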

Loc_N_Intel
Employee

Hi George,

The "-iface" option is only needed when you want to specify a particular network interface. In your case, you don't need to specify mic0. Could you retry your command without "-iface mic0" and check whether you still get that error:

# mpirun -host p-linux -env I_MPI_DEBUG 5 -genv I_MPI_FABRICS shm:tcp -np 1 -check_mpi ./a.host : -host mic0 -env I_MPI_DEBUG 5 -np 10  ./a.mic
