Intel Math Kernel Library LINPACK

Travis_Williams · 04-12-2010

Okay, so I have been searching the internet for a LINPACK program to test my 980X CPU's stability. Prime95 works for testing, but I had great success using LINPACK on my previous 950 CPU with the free LinX software. When I tried LinX on the 980X, though, the GFlops it reported were nowhere near the expected GFlops of my 980X with Hyperthreading. So I found this software, thinking that Intel would have their program ready to work with their own CPU, but when running it I found some issues.
First off, I input a problem size of 25000 with an LDA of 25000 and an alignment of 4. The program then tells me it is seeing 12 CPUs and using 12 threads, so I expect 100% CPU usage. It uses 100% to load the problem, then drops down to 50% usage while calculating, then jumps back to 100% to unload the problem and give results.
Now, I am not a programmer, and truthfully I just consider myself an enthusiast for building/overclocking computers, but I'm pretty sure that a LINPACK program is supposed to use 100% of the CPU. So can you guys tell me if it is something I am doing wrong, or is there no LINPACK program out as of now that is compatible with the 980X CPU with Hyperthreading on?
On a side note, my GFlops on this program were still higher than what the LinX software was outputting.

Here is a screenshot of the program running on my system: http://img72.imageshack.us/img72/204/linpack.jpg

Any help or insight into this would be greatly appreciated.

TimP · 04-12-2010

Did you read the earlier posts on this forum explaining that MKL uses 1 thread per core by default? Assuming you are running MKL, you can set the environment variable to override this choice and see how much you lose.
http://software.intel.com/en-us/forums/showthread.php?t=67195
MKL is written so as to keep the floating point parallel multiplier and adder running at full performance when running 1 thread per core. Ideally, 2 threads per core could run nearly the same speed, if neither requires more than 50% cache capacity nor more than 50% of the fill buffers. In order to satisfy the last requirement, the program would have to be de-optimized to the extent that it interleaves stores to no more than 4 array sections per thread. Affinity setting (KMP_AFFINITY) becomes more critical with 2 threads per core.
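As a rough illustration of the override described above, here is a minimal Python sketch. It assumes the variable in question is `OMP_NUM_THREADS` (with `MKL_NUM_THREADS` and `MKL_DYNAMIC` set alongside it), which the post does not actually name; the affinity string is only an example value, and the helper names `mkl_thread_env` and `reported_cpu_usage` are made up for this sketch. The second helper shows why 6 busy threads on 12 logical CPUs would show up as about 50% usage.

```python
import os


def mkl_thread_env(num_threads, affinity=None, base_env=None):
    """Return a copy of the environment with thread settings applied.

    Assumption: the benchmark honours OMP_NUM_THREADS / MKL_NUM_THREADS.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["OMP_NUM_THREADS"] = str(num_threads)
    env["MKL_NUM_THREADS"] = str(num_threads)
    # Ask MKL not to lower the thread count on its own.
    env["MKL_DYNAMIC"] = "FALSE"
    if affinity is not None:
        # Example value only; the right setting depends on the machine.
        env["KMP_AFFINITY"] = affinity
    return env


def reported_cpu_usage(busy_threads, logical_cpus):
    """Percent usage an OS monitor would show if each busy thread
    fully occupies one logical CPU."""
    return 100.0 * min(busy_threads, logical_cpus) / logical_cpus


# 6 cores with HyperThreading on: 12 logical CPUs, 6 compute threads.
print(reported_cpu_usage(6, 12))  # 50.0

# Environment that forces 2 threads per core for a comparison run.
env = mkl_thread_env(12, affinity="granularity=fine,compact")
```

The returned dictionary could be passed as the `env` argument of `subprocess.run` when launching the benchmark executable, so the two configurations can be compared without changing system-wide settings.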

You didn't say what you meant by "expected Gflops." Quoted performance and efficiencies of Intel CPUs in the published linpack ratings normally would be achieved with HyperThreading disabled, as the peak floating point performance would be based on 1 double precision parallel multiply and 1 add per clock cycle per core, without allowance for interference between HyperThreads.
As to Turbo Boost, it might give you a small gain when HT is disabled, but could be expected to show a reduced "efficiency," if the efficiency assumes a linear speedup, or if you are considering an efficiency based on power consumption.
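To make the "expected Gflops" arithmetic concrete, here is a small Python sketch of the peak calculation described above: one parallel multiply and one parallel add per clock per core. The figures are assumptions for illustration, not numbers from this thread: 6 cores at a 3.33 GHz base clock for the 980X, and a "parallel" instruction taken to mean 2 doubles per 128-bit SSE register. HyperThreading and Turbo Boost are deliberately ignored.

```python
def peak_gflops(cores, clock_ghz, simd_width=2, ops_per_cycle=2):
    """Theoretical double precision peak in GFlops.

    simd_width:    doubles per vector instruction (assumed 2 for SSE)
    ops_per_cycle: vector instructions issued per clock (1 multiply + 1 add)
    """
    return cores * clock_ghz * simd_width * ops_per_cycle


def efficiency(measured_gflops, peak):
    """Fraction of the theoretical peak that a benchmark run achieved."""
    return measured_gflops / peak


# Assumed 980X figures: 6 physical cores, 3.33 GHz base clock.
peak = peak_gflops(cores=6, clock_ghz=3.33)
print(round(peak, 2))  # 79.92
```

Dividing the GFlops figure reported by the benchmark by this peak with `efficiency` gives the kind of ratio quoted in the published ratings; a result counted against 12 logical CPUs instead of 6 cores would overstate what the hardware can do.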