I ran a performance test on my embedded board, which has a 1.7 GHz Xeon D (8 cores, 12 MB cache) and 32 GB of DDR memory.
The test tool was the Intel Optimized LINPACK Benchmark from MKL 2018. Running runme_xeon64 gave a result of about 145 GFLOPS.
I expected about 400 GFLOPS or more. I would like to know whether this result is reasonable and, if not, how to configure the test tool to record the maximum performance.
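
For reference, here is a rough sketch of how a theoretical double-precision peak can be estimated as cores x clock x FLOPs per cycle per core, and how the measured 145 GFLOPS compares with it. The FLOPs-per-cycle values are assumptions, since I have not confirmed which vector instruction set this exact Xeon D model supports: 16 is the usual figure for AVX2 with two FMA units, and 32 for AVX-512 with two FMA units. The function name `peak_gflops` is only illustrative, and the nominal 1.7 GHz clock is used even though the clock under heavy vector load may differ.

```python
def peak_gflops(cores: int, ghz: float, flops_per_cycle: int) -> float:
    """Theoretical peak in GFLOPS: cores x clock (GHz) x FLOPs per cycle per core."""
    return cores * ghz * flops_per_cycle


CORES = 8                # from the board description
CLOCK_GHZ = 1.7          # nominal clock from the board description
MEASURED_GFLOPS = 145.0  # LINPACK result reported by runme_xeon64

# ASSUMPTION: double-precision FLOPs per cycle per core. These depend on the
# exact CPU model, which has not been confirmed here.
ASSUMED_FLOPS_PER_CYCLE = {
    "AVX2, 2 FMA units": 16,
    "AVX-512, 2 FMA units": 32,
}

for label, fpc in ASSUMED_FLOPS_PER_CYCLE.items():
    peak = peak_gflops(CORES, CLOCK_GHZ, fpc)
    efficiency = MEASURED_GFLOPS / peak
    print(f"{label}: peak {peak:.1f} GFLOPS, measured is {efficiency:.0%} of peak")
```

Under these assumptions the peak comes out at 217.6 GFLOPS for the 16 FLOPs/cycle case and 435.2 GFLOPS for the 32 FLOPs/cycle case, so an expectation of about 400 GFLOPS only holds if the CPU really sustains 32 FLOPs per cycle per core.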