Simon_H_
Beginner
110 Views

How to run Intel mp_linpack pre-compiled with Hyper-threading enabled?

Hello all!

I have managed to successfully run mp_linpack on my cluster with hyperthreading disabled.  All cores were running at 100%.

I want to experiment and run with hyperthreading enabled.  The runme_intel64 script seems to run the threads only on the physical cores.

I was looking through the Intel MPI reference guide and I tried a few parameters, but they didn't help.  Anyone have any idea?

I have tried to put these lines in runme_intel64:

export I_MPI_PIN=on
export I_MPI_PIN_CELL=unit
export I_MPI_PIN_DOMAIN=auto

Whatever I set, there seems to be a default in runme_intel64 or in the Intel mp_linpack binary that restricts the run to the physical cores.

Any help is appreciated!

Cheers!

Simon.

3 Replies
TimP
Black Belt

Yes, the default is set for maximum performance.  You may be able simply to increase the number of MPI processes or, better, run the hybrid version with more threads per rank and -genv mkl_dynamic=false.
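A minimal sketch of that suggestion, assuming a single node and a hybrid (MPI + threaded MKL) binary. RANKS_PER_NODE and HPL_EXE are hypothetical placeholders, not names from runme_intel64, and the thread does not confirm that the precompiled binary honours these settings. Note that the MKL variable is documented in upper case (MKL_DYNAMIC) and that -genv takes the name and the value as separate arguments. The script only prints the launch line so it can be inspected before use.

```shell
#!/bin/sh
# Sketch: size MKL threading to the logical CPU count for a hybrid run.
# Assumptions: one node; RANKS_PER_NODE and HPL_EXE are illustrative placeholders.

RANKS_PER_NODE=${RANKS_PER_NODE:-1}
HPL_EXE=${HPL_EXE:-./your_hpl_binary}   # hypothetical path, replace with the real binary

# Logical CPUs online; this includes hyper-threads when HT is enabled.
LOGICAL_CPUS=$(getconf _NPROCESSORS_ONLN)

# Threads per rank so that the ranks together cover all logical CPUs.
THREADS_PER_RANK=$(( LOGICAL_CPUS / RANKS_PER_NODE ))
[ "$THREADS_PER_RANK" -ge 1 ] || THREADS_PER_RANK=1

# MKL_DYNAMIC=false asks MKL not to lower the requested thread count on its own.
MKL_DYNAMIC=false

# Build and print the launch line instead of executing it.
LAUNCH_CMD="mpirun -np $RANKS_PER_NODE -genv MKL_DYNAMIC $MKL_DYNAMIC -genv MKL_NUM_THREADS $THREADS_PER_RANK -genv OMP_NUM_THREADS $THREADS_PER_RANK $HPL_EXE"
echo "$LAUNCH_CMD"
```

With RANKS_PER_NODE=1 on a 64-logical-CPU node this would request 64 threads for the single rank; with two ranks per node, 32 each.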

Simon_H_
Beginner

Thanks Tim,

I added -genv mkl_dynamic=false inside runme_intel64.  However, it doesn't seem to have affected anything.  In htop, only processors 1 to 32 show 100%, while 33 to 64 show idle.  Do you know if the binary is compiled to do that?  Or is there an actual MPI environment parameter that can be set?  Also, how does Intel MPI on the system know whether to map 32 threads or 64 threads?

1st Run: MPI_PROC_NUM=1, MPI_PER_NODE=1, P 1 Q 1, 32 process runs, cpu 1 - cpu 32 100%, cpu 33 - cpu 64 idle

2nd Run: MPI_PROC_NUM=2, MPI_PER_NODE=1, P 2 Q 2, 64 process runs, cpu 1 - cpu 32 100%, cpu 33 - cpu 64 idle

So it looks like Intel MPI skips cpu 33 - cpu 64 and runs another 32 processes on cpu 1 - cpu 32.

I ran the Intel precompiled single-system linpack and all 64 CPUs show 100%...  Any clue?

Thanks!

Simon.
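To interpret the htop picture, it can help to know which logical CPUs share a physical core. A small sketch, assuming a Linux system that exposes CPU topology under /sys/devices/system/cpu; the function name list_ht_siblings is made up for illustration. On many systems logical CPUs N and N + (number of cores) are siblings, but that is an assumption to verify on the actual node.

```shell
#!/bin/sh
# list_ht_siblings [DIR]: print "cpuN: <sibling list>" for each logical CPU
# found under DIR (defaults to the Linux sysfs CPU directory).
list_ht_siblings() {
    base=${1:-/sys/devices/system/cpu}
    for f in "$base"/cpu[0-9]*/topology/thread_siblings_list; do
        [ -r "$f" ] || continue              # skip if the glob matched nothing
        cpu=${f%/topology/thread_siblings_list}
        cpu=${cpu##*/}                       # keep only the "cpuN" component
        printf '%s: %s\n' "$cpu" "$(cat "$f")"
    done
}

list_ht_siblings
```

If every busy CPU in htop turns out to have an idle sibling, the run is using one hardware thread per core, which matches the behaviour described above.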

James_T_Intel
Moderator

To check the Intel® MPI Library pinning, run with I_MPI_DEBUG=4.  This will show which ranks are pinned to which cores.
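A minimal sketch of how that check might be set up, assuming a Linux node. The export would go in runme_intel64 before the launch line; the helper allowed_cpus is a hypothetical name added here to cross-check, from /proc, which logical CPUs a process is allowed to run on.

```shell
#!/bin/sh
# Ask the Intel MPI Library to print its pinning map at startup (level 4).
export I_MPI_DEBUG=4

# allowed_cpus [FILE]: print the CPU list from a /proc/<pid>/status style file.
# Defaults to /proc/self/status, i.e. the process doing the read, which
# inherits its affinity from the calling shell.
allowed_cpus() {
    awk '/^Cpus_allowed_list:/ { print $2 }' "${1:-/proc/self/status}"
}

# Show the affinity inherited by this shell; ignore errors on systems without /proc.
allowed_cpus 2>/dev/null || true
```

For a running rank, allowed_cpus /proc/PID/status (with PID taken from htop) should report the same CPUs as the pinning map in the debug output.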