Intel® MPI Library
Get help with building, analyzing, optimizing, and scaling high-performance computing (HPC) applications.

Question about hybrid MPI/OpenMP

Laoya__Tang
Beginner

Dear all,

I run the program with the following command:

mpiexec -wdir z:\directional -mapall -hosts 10 n01 5 n02 5 n03 5 n04 5 n05 5 n06 5 n07 5 n08 5 n09 5 n10 5 test

The cluster has 10 nodes, each with 24 logical cores (2× Intel(R) Xeon(R) CPU X5675). The program 'test' has OpenMP-based parallel computation in some parts, but a considerable part of it is not parallelized. The problem is that each 'test' process uses only 4 cores when running the parallel part (total CPU usage is only 80%). I noticed that when I set I_MPI_PIN_DOMAIN=omp, every 'test' process uses all 24 cores. I have tested the program 'test' on one node with

mpiexec -wdir z:\directional -mapall -n 5 test

The program 'test' runs as I want (total CPU usage is 100% in the parallel part).
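For reference, one way to apply both settings to the full cluster run is mpiexec's -genv option, which propagates an environment variable to every rank. This is only a sketch: the thread count of 4 (5 ranks × 4 threads covering 20 of the 24 logical cores) is an assumption, not a result from an actual run:

rem Sketch (assumed -genv syntax): propagate pinning and thread count to all ranks
mpiexec -genv I_MPI_PIN_DOMAIN omp -genv OMP_NUM_THREADS 4 -wdir z:\directional -mapall -hosts 10 n01 5 n02 5 n03 5 n04 5 n05 5 n06 5 n07 5 n08 5 n09 5 n10 5 test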


Now the problem is that the first command failed after I set I_MPI_PIN_DOMAIN=omp:

Fatal error in MPI_Init: Other MPI error, error stack:
MPIR_Init_thread(658)................:
MPID_Init(195).......................: channel initialization failed
MPIDI_CH3_Init(104)..................:
MPID_nem_tcp_post_init(345)..........:
MPID_nem_newtcp_module_connpoll(3102):
gen_read_fail_handler(1196)..........: read from socket failed - The specified network name is no longer available.

What should I do to get the program to use 100% of the CPU on every node?

Thanks,

Zhanghong Tang

TimP
Honored Contributor III

Try setting OMP_NUM_THREADS=4.
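A minimal sketch of this suggestion on a single node, reusing the command from the original post (the value 4 assumes 5 ranks sharing 24 logical cores, which leaves roughly 4 cores per rank):

rem Assumed value: limit each rank to 4 OpenMP threads (5 ranks x 4 threads = 20 of 24 logical cores)
set OMP_NUM_THREADS=4
mpiexec -wdir z:\directional -mapall -n 5 test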

Laoya__Tang
Beginner

Hi TimP,

Thanks for your quick reply. I have already set OMP_NUM_THREADS=24 on every node, which means the parallel part of the program can use 100% of the CPU resources.

Thanks,

Zhanghong Tang

Laoya__Tang
Beginner

Hi TimP,

I have tested on one node with

mpiexec -n 5 test

with and without setting I_MPI_PIN_DOMAIN=omp. If I don't set it, the total CPU usage is only 80%; if I set it, the total CPU usage is 100%. The program also runs a little faster with I_MPI_PIN_DOMAIN=omp set.
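For reference, the two single-node runs being compared might look like this in a Windows command prompt (a sketch; set affects only the current shell session):

rem Without the pinning domain: total CPU usage ~80%
mpiexec -wdir z:\directional -mapall -n 5 test

rem With I_MPI_PIN_DOMAIN=omp: total CPU usage ~100%, slightly faster
set I_MPI_PIN_DOMAIN=omp
mpiexec -wdir z:\directional -mapall -n 5 test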

Now the problem is that mpiexec does not work on the 10-node cluster after setting I_MPI_PIN_DOMAIN=omp.

Thanks,

Zhanghong Tang
