I downloaded the MKL Benchmark set (for Linux) today, from:
But the HPCG benchmark I wanted to run doesn't appear to actually support MKL -- the printed usage options list:
--mkl=yes/no: use MKL (default no)
but this doesn't appear to be implemented in the code, and trying to use it with the binaries gives:
$ ./xhpcg_avx --mkl=yes
./xhpcg_avx: unrecognized option '--mkl=yes'
$ ./xhpcg_avx --mkl yes
./xhpcg_avx: unrecognized option '--mkl'
Has anyone heard anything about whether this is going to be implemented? It seems odd to have it listed in a released download but still returning an error.
Regardless of whether this option is used, xhpcg_avx will produce the correct result. You can find the output in the same directory; please have a look. We will update the list of supported command-line options for this benchmark.
I am getting wildly varying results when running hpcg on one of our servers.
/opt/intel/16.0/compilers_and_libraries_2016.1.150/linux/mkl/benchmarks/hpcg/bin/xhpcg_avx -mkl=yes --yaml="`hostname`.`date '+%Y%m%d%H%M%S'`"
HPCG result is VALID with a GFLOP/s rating of: 15.0287
HPCG result is VALID with a GFLOP/s rating of: 13.123
HPCG result is VALID with a GFLOP/s rating of: 13.9009
HPCG result is VALID with a GFLOP/s rating of: 12.5201
HPCG result is VALID with a GFLOP/s rating of: 11.5567
HPCG result is VALID with a GFLOP/s rating of: 7.14771
DDOT Timing Variations:
  Min DDOT MPI_Allreduce time: -7.44059e+10
  Max DDOT MPI_Allreduce time: -7.44059e+10
  Avg DDOT MPI_Allreduce time: -7.44059e+10
__________ Final Summary __________:
Good implementations of HPCG give performance results that are very strongly correlated with the results of the STREAM benchmark. Memory-bandwidth-limited codes are very sensitive to NUMA issues on multi-socket servers, so they should always be run with some sort of process binding, and (when possible) should also use memory binding.
I have not looked at this implementation of the HPCG benchmark, so I don't know which parallel libraries are involved, but investigation of the corresponding environment variables may be useful.
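As a rough illustration of the binding advice above, here is a minimal sketch of a launch wrapper. It assumes the binary is threaded with OpenMP, which has not been confirmed for this HPCG build, and that `numactl` is installed. The `HPCG_BIN` variable and the choice of NUMA node 0 are placeholders for illustration only. With the Intel OpenMP runtime, `KMP_AFFINITY` is an alternative to the standard variables used here.

```shell
#!/bin/sh
# Sketch: pin threads and memory before launching a bandwidth-bound benchmark.
# Assumption: the benchmark uses OpenMP threading (not verified for this build).

# Standard OpenMP (4.0 and later) affinity controls:
# keep threads close together and place one thread per physical core.
export OMP_PROC_BIND=close
export OMP_PLACES=cores

# Hypothetical location of the binary; override by setting HPCG_BIN.
HPCG_BIN=${HPCG_BIN:-./xhpcg_avx}

# If numactl is available, bind both CPUs and memory to NUMA node 0 so that
# all allocations stay local to the socket running the threads.
# Node 0 is an illustrative choice; a multi-socket run would normally use
# one MPI rank per socket, each bound to its own node.
if command -v numactl >/dev/null 2>&1; then
    LAUNCH="numactl --cpunodebind=0 --membind=0 $HPCG_BIN"
else
    LAUNCH="$HPCG_BIN"
fi

# Print the command instead of running it; drop the echo to launch for real.
echo "Would run: $LAUNCH"
```

Running the same bound configuration several times and comparing the GFLOP/s ratings should show whether the variation comes from thread or memory placement.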