I am running HPL on a MIC node, but I am getting the following error and the run stops:
Error in scif_send 0: Success.
Hi Girish,
Could you provide details on how to reproduce the problem you saw? For example: what OS are you running, what MPSS and compiler versions are you using, where did you download the source code, how did you compile it, and how did you run the executable?
Thanks
Hi Loc,
I am using the precompiled xhpl_offload binary from Intel Cluster Suite 2015. The OS is CentOS 6.5 and the MPSS version is 3.4.3. I am running it with the script in the mkl-benchmarks directory.
I am now able to run HPL, but the performance is very low: 540 GF against a theoretical peak of 1.2 TF, with the following settings:
problem size N = 64000, block size NB = 256, P x Q = 1 x 2, and MPI_PER_NODE=2, since the host has 2 sockets and 102 GB of memory.
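The relevant lines of the HPL.dat I am using look roughly like this (a trimmed sketch of the standard input file; the trailing text on each line is the usual descriptive label, and the real file contains further tuning lines):

```
HPLinpack benchmark input file
Innovative Computing Laboratory, University of Tennessee
HPL.out      output file name (if any)
6            device out (6=stdout,7=stderr,file)
1            # of problems sizes (N)
64000        Ns
1            # of NBs
256          NBs
0            PMAP process mapping (0=Row-,1=Column-major)
1            # of process grids (P x Q)
1            Ps
2            Qs
```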
Kindly help me obtain optimized performance.
Hi Girish,
Sorry for the delay. I asked an MKL expert about your question. The answer is that good HPL performance can only be achieved with the latest version of the benchmark. The MKL benchmarks are available in a package at https://software.intel.com/en-us/articles/intel-mkl-benchmarks-suite
HPL is among the benchmarks contained in the package (navigate to the mp_linpack directory). Read the README and TUNING files carefully in order to get a top performance measurement.
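A rough sketch of getting to those files (the archive and directory names below are hypothetical placeholders; use whatever the download page actually gives you):

```
# Hypothetical archive name -- substitute the file you actually download.
tar -xzf l_mklb_p_11.x.y.zzz.tgz
cd l_mklb_p_11.x.y.zzz/benchmarks/mp_linpack   # exact path may differ by version
less README TUNING                             # read both before running
```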
Please note that in the latest version of the package, runme_offload_intel64 no longer exists; it has been absorbed into runme_intel64 and runme_intel64_dynamic. The usage model is “host only”, “native”, or “hybrid offload”, so for “hybrid offload” both the host and the coprocessor(s) are used. Sometimes people see lower performance with “hybrid offload” than with “native” or “host only”; this is typically the result of benchmark configuration problems, such as improper problem sizes, work distribution problems, etc.
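As a hedged sketch of driving a hybrid run on a 2-socket host (MPI_PER_NODE follows your description of the older script; MPI_PROC_NUM is an assumed knob of the current scripts, so check the README of the version you download for the exact settings):

```
cd mp_linpack
# 1) Edit HPL.dat: problem size N, block size NB, and the P x Q grid.
# 2) Lay out the MPI ranks, e.g. 2 ranks on a 2-socket host.
export MPI_PER_NODE=2   # ranks per node, as in your run
export MPI_PROC_NUM=2   # total ranks (assumed variable name; see README)
# 3) Launch. Offload to the coprocessor(s) is configured inside
#    runme_intel64 / runme_intel64_dynamic; there is no separate
#    runme_offload_intel64 anymore.
./runme_intel64_dynamic
```

On problem size: a common rule of thumb is to choose N so the matrix fills about 80% of available memory, i.e. N ≈ sqrt(0.8 × memory_in_bytes / 8), which is roughly 100000 for 102 GB, so N = 64000 may simply be too small to approach peak.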
Hope this helps.
