Hi everyone,
I've compiled HPL with Intel MPI and Intel MKL, but I'm confused about the best way to run it on a single node with two processors (12 cores each). Should I be using mpirun even though I only have a single node? If I set my PxQ grid to 24 processes and run without mpirun, I get an error saying that there are not enough processes. The error goes away when I use mpirun, but I'm worried about communication overhead. Is it really necessary to use MPI when I have a single node with two processors? I was also unsure which mkl option to use when linking the MKL library (cluster vs. parallel). I've chosen parallel for the time being.
Any thoughts?
Thanks for reading.
Hi,
This HPL build needs one MPI process per NUMA node to use the full main-memory bandwidth, and performance is higher with large problem sizes. Please set PxQ = 2x1 and launch two MPI processes with the runme_intel64_static or runme_intel64_dynamic script. If the problem size is small, communication cost becomes the major bottleneck and performance drops. In that case, you could try one MPI process instead of two.
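To make the arithmetic concrete, here is a rough Python sketch of the layout described above. It is not part of HPL or MKL: the names `HplLayout`, `plan_layout` and `grid_fits` are made up for illustration, and it assumes each processor socket is one NUMA node and that the threaded MKL uses one thread per core inside each MPI process. It also shows why a grid with P x Q = 24 fails without mpirun: HPL needs at least P x Q MPI processes, and starting the binary directly gives only one.

```python
from dataclasses import dataclass


@dataclass
class HplLayout:
    """Hypothetical container for a single-node HPL launch plan."""
    mpi_processes: int        # number of MPI ranks to launch
    p: int                    # process-grid rows (P in the HPL input)
    q: int                    # process-grid columns (Q in the HPL input)
    threads_per_process: int  # threads each rank may use (assumed: one per core)


def plan_layout(numa_nodes: int, cores_per_numa_node: int) -> HplLayout:
    """Plan one MPI rank per NUMA node, arranged as a (numa_nodes x 1) grid.

    Generalising the 2x1 advice to other NUMA-node counts is an assumption
    of this sketch, not something stated in the reply.
    """
    if numa_nodes < 1 or cores_per_numa_node < 1:
        raise ValueError("numa_nodes and cores_per_numa_node must be >= 1")
    return HplLayout(
        mpi_processes=numa_nodes,
        p=numa_nodes,
        q=1,
        threads_per_process=cores_per_numa_node,
    )


def grid_fits(p: int, q: int, launched_processes: int) -> bool:
    """HPL needs at least p * q MPI processes to fill a p x q grid."""
    return p * q <= launched_processes


if __name__ == "__main__":
    # Two processors with 12 cores each, as in the question.
    layout = plan_layout(numa_nodes=2, cores_per_numa_node=12)
    print(layout)
    # A 24-process grid cannot be filled by a single process (no mpirun).
    print(grid_fits(4, 6, launched_processes=1))
    # The 2x1 grid fits once two MPI processes are launched.
    print(grid_fits(layout.p, layout.q, layout.mpi_processes))
```

For the machine in the question this gives two MPI processes in a 2x1 grid with 12 threads each, so all 24 cores are used while MPI communication is limited to a single pair of processes.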