Intel® oneAPI Math Kernel Library

MKL HPCG Benchmark -- Missing MKL support and Incorrect Usage

Ian_K_1
Beginner

I downloaded the MKL Benchmark set (for Linux) today, from:

  https://software.intel.com/en-us/articles/intel-mkl-benchmarks-suite

But the HPCG benchmark I wanted to run doesn't appear to actually support MKL -- the printed usage options list:

--mkl=yes/no: use MKL (default no)

but this doesn't appear to be implemented in the code, and trying to use it with the binaries gives:

$ ./xhpcg_avx --mkl=yes
./xhpcg_avx: unrecognized option '--mkl=yes'
$ ./xhpcg_avx --mkl yes
./xhpcg_avx: unrecognized option '--mkl'

Has anyone heard anything about whether this is going to be implemented? It seems odd to have it listed in a released download but still returning an error.

Gennady_F_Intel
Moderator

Regardless of whether this option is used, xhpcg_avx will produce the correct result. You can find the output in the same directory; please have a look. We will update the list of supported command-line options for this benchmark.

Paul_I_
Beginner

I am getting wildly varying results when running hpcg on one of our servers.

/opt/intel/16.0/compilers_and_libraries_2016.1.150/linux/mkl/benchmarks/hpcg/bin/xhpcg_avx -mkl=yes  --yaml="`hostname`.`date '+%Y%m%d%H%M%S'`"


GFLOP/s Summary:
  HPCG result is VALID with a GFLOP/s rating of: 15.0287
GFLOP/s Summary:
  HPCG result is VALID with a GFLOP/s rating of: 13.123
GFLOP/s Summary:
  HPCG result is VALID with a GFLOP/s rating of: 13.9009
GFLOP/s Summary:
  HPCG result is VALID with a GFLOP/s rating of: 12.5201
GFLOP/s Summary:
  HPCG result is VALID with a GFLOP/s rating of: 11.5567
GFLOP/s Summary:
  HPCG result is VALID with a GFLOP/s rating of: 7.14771
pimmaraj@hpchost1:~/hpcg_mkl/hpcg_intel>

 

DDOT Timing Variations:
  Min DDOT MPI_Allreduce time: -7.44059e+10
  Max DDOT MPI_Allreduce time: -7.44059e+10
  Avg DDOT MPI_Allreduce time: -7.44059e+10
__________ Final Summary __________:
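
To put a number on the spread above, here is a minimal sketch that pulls the "GFLOP/s rating of:" lines out of saved output and reports the run count, minimum, maximum and mean. The function name `summarize_gflops` is hypothetical (it is not part of the benchmark package), and the sketch assumes each result line ends with the rating, as in the output pasted above.

```shell
# Hypothetical helper: summarize HPCG ratings from files or stdin.
# Assumes every result line contains "GFLOP/s rating of:" followed by
# the value as the last whitespace-separated field.
summarize_gflops() {
  awk '/GFLOP\/s rating of:/ {
         v = $NF + 0                      # last field is the rating
         if (n == 0 || v < min) min = v   # track smallest rating
         if (n == 0 || v > max) max = v   # track largest rating
         sum += v; n++
       }
       END {
         if (n > 0)
           printf "runs=%d min=%.2f max=%.2f mean=%.2f\n", n, min, max, sum / n
       }' "$@"
}
```

Fed the six ratings listed above, this reports a minimum of 7.15 and a maximum of 15.03, so the fastest run is roughly twice as fast as the slowest.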

 

Gennady_F_Intel
Moderator

Paul, can you confirm that HPCG is the only workload running on the server at the time of these runs?

 

McCalpinJohn
Honored Contributor III

Good implementations of HPCG give performance results that are very strongly correlated with the results of the STREAM benchmark.   Memory-bandwidth-limited codes are very sensitive to NUMA issues on multi-socket servers, so they should always be run with some sort of process binding, and (when possible) should also use memory binding.

I have not looked at this implementation of the HPCG benchmark, so I don't know which parallel libraries are involved, but investigation of the corresponding environment variables may be useful.
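
As a rough illustration of process and memory binding, the sketch below pins one run to a single NUMA node with `numactl` and asks the OpenMP runtime to keep threads on fixed cores. It assumes the binary is OpenMP-threaded and that `numactl` is installed; neither is confirmed for this HPCG build. The wrapper name `run_on_node` and the `DRY_RUN` switch are hypothetical, added only so the command can be inspected before it is run.

```shell
# Hypothetical wrapper: run_on_node NODE CMD [ARGS...]
# Binds both CPUs and memory allocation to one NUMA node.
# OMP_PROC_BIND and OMP_PLACES are standard OpenMP environment
# variables; they only matter if the program uses OpenMP (assumed).
run_on_node() {
  node="$1"; shift
  if [ "${DRY_RUN:-0}" = "1" ]; then
    # Print the command instead of executing it.
    echo "OMP_PROC_BIND=close OMP_PLACES=cores numactl --cpunodebind=$node --membind=$node $*"
  else
    OMP_PROC_BIND=close OMP_PLACES=cores \
      numactl --cpunodebind="$node" --membind="$node" "$@"
  fi
}
```

Calling `run_on_node 0 ./xhpcg_avx` several times and comparing the ratings is one way to check whether the run-to-run variation comes from NUMA placement. A run confined to one node uses only part of a multi-socket server, so treat this as a diagnostic rather than a way to obtain a full-machine rating.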
