Intel® MPI Library

How to run HPL on one node with only 1 process and 1 thread?

oleotiger
Novice
4,666 Views

I'm running the built-in HPL of Intel oneAPI 2021.2.

mpirun -hostfile 1host -genvall -np 1 -perhost 1 -genv OMP_NUM_THREADS=1 ./xhpl_intel64_static

P x Q is set to 1. I want to run only 1 rank on 1 core with a single thread.

I tried:

-genv MKL_NUM_THREADS=1 -genv OMP_NUM_THREADS=1 -genv MKL_DOMAIN_NUM_THREADS=1 -genv MKL_DYNAMIC=FALSE -genv OMP_NESTED=TRUE -genv OMP_DYNAMIC=FALSE

But it seems that none of MKL_NUM_THREADS, MKL_DOMAIN_NUM_THREADS, or OMP_NUM_THREADS takes effect.

There are 2 sockets and 24 cores on each socket with hyperthreading disabled.

Once I start HPL, there is only one process xhpl_intel64_static but there are 48 threads on all the cores.

 

How can I run HPL with only 1 rank on 1 core with 1 thread?

9 Replies
SantoshY_Intel
Moderator
4,648 Views

Hi,

 

Thanks for reaching out to us.

 

Instead of using the -genv option, we recommend exporting the environment variables or setting them inline on the command line, in the format below:

MKL_NUM_THREADS=1 OMP_NUM_THREADS=1 MKL_DOMAIN_NUM_THREADS=1 MKL_DYNAMIC=FALSE OMP_NESTED=TRUE OMP_DYNAMIC=FALSE I_MPI_DEBUG=10 mpirun -np 1 -ppn 1 ./xhpl_intel64_static
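For readers unfamiliar with the two forms, here is a minimal sketch of how the inline prefix differs from `export`. It uses `sh -c` as a stand-in for the launched program (an assumption for illustration only), so it shows what the shell passes to a child process, not what mpirun forwards to its ranks or what the HPL binary does with the values.

```shell
#!/bin/sh
# Stand-in child command: prints the thread variables it actually sees.
child='echo "MKL=${MKL_NUM_THREADS:-unset} OMP=${OMP_NUM_THREADS:-unset}"'

# Start from a clean state so the comparison is meaningful.
unset MKL_NUM_THREADS OMP_NUM_THREADS

# Inline prefix: the variables apply to this one command only.
prefix_out=$(MKL_NUM_THREADS=1 OMP_NUM_THREADS=1 sh -c "$child")

# The next command no longer sees them.
after_out=$(sh -c "$child")

# export: the variables persist for every later command in this shell.
export MKL_NUM_THREADS=1 OMP_NUM_THREADS=1
export_out=$(sh -c "$child")

echo "prefix: $prefix_out"
echo "after:  $after_out"
echo "export: $export_out"
```

Either form should deliver the variables to the process started from that shell; which one to use is mostly a matter of whether later commands should inherit them.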

>>"Once I start HPL, there is only one process xhpl_intel64_static but there are 48 threads on all the cores."

Could you please let us know how you confirmed that 48 threads were active across 48 cores when you launched only 1 process?

 

 

Thanks & Regards,

Santosh

 

 

oleotiger
Novice
4,626 Views

I confirmed with `htop` that all 48 cores are occupied.

htop shows all 48 cores at 100% utilization, each running an `xhpl_intel64_static` thread.

`pstree -p the_pid_of_xhpl_intel64_static` shows 48 threads.
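The same check can be scripted. Below is a minimal sketch, assuming a Linux system with /proc mounted; `count_threads` is a hypothetical helper name, not part of any Intel tool.

```shell
#!/bin/sh
# count_threads (hypothetical helper): print the number of kernel threads
# belonging to a process by counting the entries in /proc/<pid>/task.
count_threads() {
    # Arithmetic expansion strips any padding that wc may add.
    echo $(( $(ls "/proc/$1/task" | wc -l) ))
}

# Example: thread count of the current shell.
count_threads $$
```

To use it, pass the PID of the running benchmark as shown by htop. With procps, `ps -L -o tid,psr,pcpu -p <pid>` additionally lists each thread together with the core it last ran on.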

SantoshY_Intel
Moderator
4,576 Views

Hi,


>>"I'm running the built-in HPL of Intel oneAPI 2021.2."

Could you please confirm whether the binary "xhpl_intel64_static" is taken from Intel oneAPI 2021.2 or from an older version of Intel Parallel Studio (2019/2018)?

>>"How can I run HPL with only 1 rank on 1 core with 1 thread?"

If your aim is to run Linpack sequentially (with only 1 rank on 1 core with 1 thread), we suggest running the plain Linpack binaries instead of the HPL/mp_linpack binaries.


The following command restricts execution to 1 core (core 0 in this case):

$ numactl --physcpubind=0 ./binary
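A small sketch of how this pinning could be wrapped for reuse. `run_on_core` and the `DRY_RUN` switch are hypothetical names introduced here for illustration; only `numactl --physcpubind` comes from the command above.

```shell
#!/bin/sh
# run_on_core (hypothetical helper): run a command bound to one CPU core.
# With DRY_RUN=1 it only prints the command line it would execute.
run_on_core() {
    core=$1
    shift
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "numactl --physcpubind=$core $*"
        return 0
    fi
    numactl --physcpubind="$core" "$@"
}

# Print the command that would pin ./binary to core 0.
DRY_RUN=1 run_on_core 0 ./binary
```

On Linux, running `run_on_core 0 grep Cpus_allowed_list /proc/self/status` is one way to confirm the binding took effect. Note that pinning restricts where threads run, not how many are created, so a binary that spawns 48 threads would still show 48 threads, all sharing core 0.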


Is there anything else that we could help you with?


--

Best Regards,

Santosh




oleotiger
Novice
4,560 Views

>>"Could you please confirm whether the binary "xhpl_intel64_static" is taken from Intel oneAPI 2021.2 or from the older versions of the parallel studio 2019/2018?"

I confirmed that "xhpl_intel64_static" is taken from Intel oneAPI 2021.2. I didn't install any HPL/mp_linpack manually.


>>"Is there anything else that we could help you with?"

I'm confused about why there are multiple threads when I run 'xhpl_intel64_static' with 'MKL_NUM_THREADS=1 OMP_NUM_THREADS=1' set.

Why don't these environment variables work?

Or is this a feature of the application 'xhpl_intel64_static' itself?

SantoshY_Intel
Moderator
4,457 Views

Hi,

 

If you wish to run Linpack with only 1 rank on 1 core with 1 thread, we suggest running the plain Linpack binaries (found at /opt/intel/oneapi/mkl/latest/benchmarks/linpack) instead of the HPL/mp_linpack binaries (found at /opt/intel/oneapi/mkl/latest/benchmarks/mp_linpack).

 

To run the LINPACK binary using 1 rank on 1 core with 1 thread, please follow the steps below:

 

 

cd /opt/intel/oneapi/mkl/latest/benchmarks/linpack
I_MPI_DEBUG=10 mpirun -n 1 -ppn 1 ./runme_xeon64 

 

 

Without any thread-limiting variables set, running the above command launches multiple threads, as shown below:

[Screenshot: SantoshY_Intel_0-1631168385600.png]

 

 

MKL_NUM_THREADS=1 OMP_NUM_THREADS=1 I_MPI_DEBUG=10 mpirun -n 1 -ppn 1 ./runme_xeon64 

 

Running the above command ensures that only 1 thread is launched, as shown in the screenshot below:

[Screenshot: SantoshY_Intel_1-1631168430903.png]

 

The Intel® Distribution for LINPACK* Benchmark is based on modifications and additions to High-Performance LINPACK (HPL) (http://www.netlib.org/benchmark/hpl/) and can be used for benchmarking your cluster.

 

Since you want to run the Intel® Distribution for LINPACK* Benchmark binary "xhpl_intel64_static" using 1 rank on 1 core with 1 thread, could you please explain the use case/intention behind it? This will help us understand your scenario and provide better support.

 

Best Regards,

Santosh

oleotiger
Novice
4,435 Views

I want to simulate different HPC workloads on a server with HPL.

Running HPL on a single core is one of the scenarios in the simulation.

The basic goal is to flexibly control how many cores HPL runs on, with each core under the same load.

 

 

SantoshY_Intel
Moderator
4,388 Views

Hi,


We are working on your issue internally and we will get back to you soon.


Best Regards,

Santosh


Gennady_F_Intel
Moderator
4,378 Views

>>"I want to simulate different HPC workloads on a server with HPL.

Running HPL on a single core is one of the scenarios in the simulation.

The basic goal is to flexibly control how many cores HPL runs on, with each core under the same load."


Since the SMP version of LINPACK and the MPI version (mp_linpack) solve similar tasks, you could run SMP Linpack with different numbers of threads/cores, as Santosh has already shown.
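As a rough sketch of that suggestion, the loop below prints one SMP Linpack launch line per thread count, combining the thread variables and the numactl pinning shown earlier in this thread. `linpack_cmd` is a hypothetical helper, the list of thread counts is arbitrary, and the sketch assumes that cores are numbered contiguously from 0 and that `runme_xeon64` honors these variables for counts other than 1.

```shell
#!/bin/sh
# linpack_cmd (hypothetical helper): print a launch line that limits SMP Linpack
# to N threads and binds them to cores 0..N-1.
linpack_cmd() {
    n=$1
    if [ "$n" -eq 1 ]; then
        cores=0
    else
        cores="0-$((n - 1))"
    fi
    echo "MKL_NUM_THREADS=$n OMP_NUM_THREADS=$n numactl --physcpubind=$cores ./runme_xeon64"
}

# Sweep a few core counts on the 48-core node described in the question.
for n in 1 2 4 8 16 24 48; do
    linpack_cmd "$n"
done
```

The script only prints the commands, so they can be reviewed first. Piping a single line to the shell from the linpack directory, for example `linpack_cmd 4 | sh`, would execute it.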


This query has been resolved and we will no longer respond to this thread. If you require additional assistance from Intel, please start a new thread. Any further interaction in this thread will be considered community only. 


