Intel® MPI Library

mpirun and lsf 8

José_Luis_G_1
Beginner

mpirun -ppn # creates # processes per node.

It works fine with ssh:

--[ 15:05:57 ]--> $ mpirun -hosts "mn1,mn2" -ppn 2 -np 4 ~/a.out
Hello world!I'm 0 of 4 on mn1
Hello world!I'm 2 of 4 on mn2
Hello world!I'm 1 of 4 on mn1
Hello world!I'm 3 of 4 on mn2

However, it doesn't work under LSF (-n 32 assigns two nodes with 16 processors each to the job):

--[ 16:39:41 ]--> $ bsub -q pruebas -n 32 -I mpirun -ppn 2 -np 4 ~/a.out

Job <29169> is submitted to queue <pruebas>.
<<Waiting for dispatch ...>>
<<Starting on mn339>>
Hello world!I'm 0 of 4 on mn339
Hello world!I'm 1 of 4 on mn339
Hello world!I'm 2 of 4 on mn339
Hello world!I'm 3 of 4 on mn339

mpirun does detect that it is running inside an LSF job, yet all four ranks end up on the same node.

Any ideas?

James_T_Intel
Moderator

Hi Jose,

What version of the Intel® MPI Library are you using?  I would recommend using the latest, Version 4.1 Update 1, as there are improvements to compatibility with LSF* in this version.

You can try using the ptile capability of LSF* to specify how many ranks per host should be run.
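
A minimal sketch of the ptile suggestion, assuming the same queue (pruebas) and binary (~/a.out) as in the original post. `span[ptile=N]` is the LSF resource-requirement string that caps the number of slots taken on each host at N. The helper name `build_bsub_cmd` is made up for illustration, and it only prints the command so that it can be checked before submitting:

```shell
#!/bin/sh
# build_bsub_cmd RANKS PER_HOST
# Hypothetical helper: prints (does not run) a bsub line that asks LSF for
# RANKS slots with at most PER_HOST slots on each host.
build_bsub_cmd() {
    printf 'bsub -q pruebas -n %s -R "span[ptile=%s]" -I mpirun -np %s ~/a.out\n' \
        "$1" "$2" "$1"
}

# Four ranks, two per host.
build_bsub_cmd 4 2
```

With -n 4 and ptile=2, LSF should reserve two slots on each of two hosts, so the four ranks would be expected to land two per node. Only four slots are reserved in total, so this does not leave the remaining processors on those nodes to the job.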

Sincerely,
James Tullos
Technical Consulting Engineer
Intel® Cluster Tools

drMikeT
New Contributor I

Hi James,

How does Intel MPI 4.1 U1 launch tasks under LSF? Does it use the native LSF mechanism to talk to the LSF daemons, which themselves create the tasks on the target nodes? Or does it simply read a host file prepared by LSF and SSH to those hosts?

We have a very large LSF shop here, and integration of MPI code with the scheduler is important to us (for suspending or killing jobs, and for accurately accounting CPU and memory usage per process/rank).

Thanks,

Michael

José_Luis_G_1
Beginner

James,

Thank you for your message. I have 4.1.0 (I'll update as soon as possible).

Your ptile suggestion is a very good one, and it works in some cases. However, I want to reserve all the processors of the nodes but start only a smaller number of MPI processes (the remaining processors will run threads of those MPI processes).
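
One possible workaround for this hybrid case, sketched under assumptions rather than verified against 4.1.0 under LSF: LSF exports LSB_MCPU_HOSTS as a list of "host slots" pairs, and Intel MPI's -machinefile option accepts host:count lines. The helper name `make_machinefile`, the host name mn340, and the thread count of 8 are hypothetical:

```shell
#!/bin/sh
# make_machinefile PER_HOST "hostA slotsA hostB slotsB ..."
# Prints one "host:PER_HOST" line per host and ignores the slot counts
# that LSF reports, so fewer ranks than reserved slots are started.
make_machinefile() {
    per_host=$1
    # Intentionally unquoted: split the host list into positional parameters.
    set -- $2
    while [ "$#" -ge 2 ]; do
        printf '%s:%s\n' "$1" "$per_host"
        shift 2
    done
}

# Demo with a hand-written list shaped like LSB_MCPU_HOSTS
# (mn340 is a made-up host name).
make_machinefile 2 "mn339 16 mn340 16"

# Inside a real job script (assumption, not run here):
#   make_machinefile 2 "$LSB_MCPU_HOSTS" > machines.$LSB_JOBID
#   OMP_NUM_THREADS=8 mpirun -machinefile machines.$LSB_JOBID -np 4 ~/a.out
```

The job would still be submitted with bsub -n 32, so LSF reserves every processor on both nodes, while the machine file asks for only two ranks on each. Whether mpirun honors the machine file over the LSF-provided host list may depend on the Intel MPI version.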
