Menyaylov__Nikolay · ‎09-20-2018

Good day!

I have a weird problem with the Xeon Phi x200 (KNL) performance on the mp_linpack test (HPLinpack 2.1) from the set of MKL benchmarks of Intel Parallel Studio XE 2017.

If I run the test on one KNL, I get 1.64 Tflops (attachment 1KNL.xhpl.out).
If I run it on two KNLs, the result is 1.85 Tflops instead of the expected ~3.2 Tflops (attachment 2KNL.xhpl.out).
But if I run it on three KNLs, I get only 1.68 Tflops instead of the expected ~4.8 Tflops! (attachment 3KNL.xhpl.out)
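
To put numbers on the shortfall, here is a minimal sketch that computes the scaling efficiency from the three results quoted above. It assumes ideal scaling means N times the single-card result; the function and variable names are illustrative only and not part of HPL or MKL.

```python
# Measured HPL results from the post, in Tflops, keyed by number of KNL cards.
measured_tflops = {1: 1.64, 2: 1.85, 3: 1.68}


def scaling_efficiency(results):
    """Return {n: (ideal_tflops, efficiency)} relative to the single-card result.

    Assumption: "ideal" is linear scaling, i.e. n times the 1-card figure.
    """
    base = results[1]
    out = {}
    for n, tflops in sorted(results.items()):
        ideal = base * n
        out[n] = (ideal, tflops / ideal)
    return out


efficiency = scaling_efficiency(measured_tflops)
for n, (ideal, eff) in efficiency.items():
    print(f"{n} KNL: measured {measured_tflops[n]:.2f} Tflops, "
          f"ideal {ideal:.2f} Tflops, efficiency {eff:.0%}")
```

With these inputs the efficiency comes out at roughly 56% for two cards and 34% for three. The ideal figures (3.28 and 4.92 Tflops) are slightly higher than the rounded ~3.2 and ~4.8 quoted above.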

What am I doing wrong?
Please help me to find a solution. Thanks.

Environment set before the run:
source /opt/intel/mkl/bin/mklvars.sh intel64
export MKL_ENABLE_INSTRUCTIONS=AVX
export HPL_MIC_EXQUEUES=64
export HPL_MIC_DEVICE=[0 | 0,1 | 0,1,2]

Some additional information:

OS: SUSE Linux Enterprise Server 12 SP2 (kernel 4.4.21-69)
2x Intel(R) Xeon(R) Silver 4108 CPU @ 1.80GHz
128GB DDR4 ECC
Intel SSD 240GB
PCI bridge: PLX Technology, Inc. PEX 8747 48-Lane, 5-Port PCI Express Gen 3 (8.0 GT/s)

lspci -vv (same for each KNL):
LnkCap: Port #9, Speed 8GT/s, Width x16, ASPM L1, Exit Latency L0s <512ns, L1 <16us
LnkSta: Speed 8GT/s, Width x16, TrErr- Train- SlotClk+ DLActive+ BWMgmt+ ABWMgmt-
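
For reference, the link status above (8GT/s, x16) corresponds to the theoretical PCIe Gen3 bandwidth worked out in the sketch below. It only accounts for the 128b/130b line encoding, so real throughput will be lower once protocol overhead is included. The function name is illustrative, and whether the cards share an upstream link through the PEX 8747 switch is not stated here, so this is a per-link figure only.

```python
def pcie_gen3_bandwidth_gbs(lanes, gt_per_s=8.0):
    """Theoretical one-direction bandwidth in GB/s for a PCIe Gen3 link.

    Gen3 uses 128b/130b encoding, so 128 of every 130 transferred bits
    carry payload. Packet and protocol overhead are not modelled.
    """
    payload_bits_per_s = gt_per_s * 1e9 * lanes * 128 / 130
    return payload_bits_per_s / 8 / 1e9  # bits -> bytes, then to GB


link_gbs = pcie_gen3_bandwidth_gbs(lanes=16)
print(f"x16 Gen3 link: {link_gbs:.2f} GB/s per direction (theoretical)")
```

This gives about 15.75 GB/s per direction for each x16 link.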

Khang_N_Intel · ‎10-02-2018

Hi Nikolay,

This issue has been sent to the MKL team for investigation.

I will let you know as soon as I receive an update about this issue.

Best Regards,

Khang

mp_linpack results problem