Ian_K_1
Beginner

MKL HPCG Benchmark -- Missing MKL support and Incorrect Usage

I downloaded the MKL Benchmark set (for Linux) today, from:

  https://software.intel.com/en-us/articles/intel-mkl-benchmarks-suite

But the HPCG benchmark I wanted to run doesn't appear to actually support MKL -- the printed usage options list:

--mkl=yes/no: use MKL (default no)

but this doesn't appear to be implemented in the code, and trying to use it with the binaries gives:

$ ./xhpcg_avx --mkl=yes
./xhpcg_avx: unrecognized option '--mkl=yes'
$ ./xhpcg_avx --mkl yes
./xhpcg_avx: unrecognized option '--mkl'

Has anyone heard anything about whether this is going to be implemented? It seems odd to have it listed in a released download but still returning an error.

Gennady_F_Intel
Moderator

Regardless of whether this option is used, xhpcg_avx will produce the correct result. You can find the output in the same directory. Please have a look. We will update the list of supported command-line options for this benchmark.

Paul_I_
Beginner

I am getting wildly varying results when running hpcg on one of our servers.

/opt/intel/16.0/compilers_and_libraries_2016.1.150/linux/mkl/benchmarks/hpcg/bin/xhpcg_avx -mkl=yes  --yaml="`hostname`.`date '+%Y%m%d%H%M%S'`"


GFLOP/s Summary:
  HPCG result is VALID with a GFLOP/s rating of: 15.0287
GFLOP/s Summary:
  HPCG result is VALID with a GFLOP/s rating of: 13.123
GFLOP/s Summary:
  HPCG result is VALID with a GFLOP/s rating of: 13.9009
GFLOP/s Summary:
  HPCG result is VALID with a GFLOP/s rating of: 12.5201
GFLOP/s Summary:
  HPCG result is VALID with a GFLOP/s rating of: 11.5567
GFLOP/s Summary:
  HPCG result is VALID with a GFLOP/s rating of: 7.14771
pimmaraj@hpchost1:~/hpcg_mkl/hpcg_intel>

 

DDOT Timing Variations:
  Min DDOT MPI_Allreduce time: -7.44059e+10
  Max DDOT MPI_Allreduce time: -7.44059e+10
  Avg DDOT MPI_Allreduce time: -7.44059e+10
__________ Final Summary __________:
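
To put a number on "wildly varying", here is a minimal sketch that summarizes the six GFLOP/s ratings listed above. The values are copied by hand from the output; the variable names are only illustrative, and nothing here is part of the HPCG benchmark itself.

```shell
#!/bin/sh
# GFLOP/s ratings copied by hand from the six runs quoted above
ratings="15.0287 13.123 13.9009 12.5201 11.5567 7.14771"

# Word splitting on $ratings is intentional: one value per line for awk
summary=$(printf '%s\n' $ratings | awk '
  { v = $1 + 0 }                      # force numeric interpretation
  NR == 1 { min = v; max = v }        # seed min/max with the first value
  { sum += v
    if (v < min) min = v
    if (v > max) max = v }
  END { printf "min=%.2f max=%.2f mean=%.2f max/min=%.2f",
               min, max, sum / NR, max / min }')

echo "$summary"
```

The slowest run comes out at less than half the rate of the fastest one, which is far more spread than normal run-to-run noise would explain.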

 

Gennady_F_Intel
Moderator

Paul, are you sure that HPCG is the only job running on the system at that time?

 

McCalpinJohn
Black Belt

Good implementations of HPCG give performance results that are very strongly correlated with the results of the STREAM benchmark.   Memory-bandwidth-limited codes are very sensitive to NUMA issues on multi-socket servers, so they should always be run with some sort of process binding, and (when possible) should also use memory binding.

I have not looked at this implementation of the HPCG benchmark, so I don't know which parallel libraries are involved, but investigation of the corresponding environment variables may be useful.
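
As a rough illustration of process and memory binding, the sketch below pins a run to a single NUMA node. It assumes that `numactl` is installed and that the binary uses an OpenMP runtime that honors the standard `OMP_PROC_BIND` and `OMP_PLACES` variables; neither has been confirmed for this HPCG build. The node number, the thread count, and the `HPCG_BIN`, `NUMA_NODE`, `THREADS`, and `RUN` variable names are placeholders chosen for the example.

```shell
#!/bin/sh
# Sketch only: every value below is an assumption, adjust for your system.
HPCG_BIN=${HPCG_BIN:-./xhpcg_avx}   # binary name taken from the thread
NUMA_NODE=${NUMA_NODE:-0}           # assumed NUMA node to bind to
THREADS=${THREADS:-8}               # assumed number of cores on that node

# Standard OpenMP affinity controls (OpenMP 4.0 and later)
export OMP_NUM_THREADS="$THREADS"
export OMP_PROC_BIND=close          # keep threads on neighboring places
export OMP_PLACES=cores             # one place per physical core

# Bind both the CPUs and the memory allocations to the same node
cmd="numactl --cpunodebind=$NUMA_NODE --membind=$NUMA_NODE $HPCG_BIN"

# Print the command by default; set RUN=1 to actually execute it
if [ "${RUN:-0}" = "1" ]; then
  $cmd
else
  echo "$cmd"
fi
```

Running the script as is only prints the command line, so it can be checked first; running it with `RUN=1` in the environment launches the benchmark with the binding applied. If the build turns out to be MPI-based, the MPI launcher's own pinning options would be the place to look instead.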