
LINPACK results

Fuad_O_
Beginner

Hi,

I have a Xeon Phi 5110P card. To test the setup I am using the LINPACK benchmark provided by Intel: http://software.intel.com/en-us/articles/intel-math-kernel-library-linpack-download

1. Copy runme_mic, xlinpack_mic and lininput_mic from benchmark/linpack to the MIC card.

2. Copy libiomp5.so to the MIC card.

3. Execute runme_mic in native mode. Output:

This is a SAMPLE run script for SMP LINPACK. Change it to reflect
the correct number of CPUs/threads, problem input files, etc..
Mon Mar 24 14:35:54 CET 2014
Intel(R) Optimized LINPACK Benchmark data

Current date/time: Mon Mar 24 14:35:54 2014

CPU frequency:    1.053 GHz
Number of CPUs: 1
Number of cores: 240
Number of threads: 240

Parameters are set to:

Number of tests: 14
Number of equations to solve (problem size) : 2048  4096  6144  8192  10240 12288 14336 16384 18432 20480 22528 24576 26624 28672
Leading dimension of array                  : 2112  6208  6208  8256  10304 12352 14400 18496 18496 20544 22592 26688 26688 28736
Number of trials to run                     : 3     3     3     3     3     3     3     3     3     3     3     3     3     3    
Data alignment value (in Kbytes)            : 4     4     4     4     4     4     4     4     4     4     4     4     4     4    

Maximum memory requested that can be used=6591927552, at the size=28672

=================== Timing linear equation system solver ===================

Size   LDA    Align. Time(s)    GFlops   Residual     Residual(norm) Check
2048   2112   4      5.925      0.9679   4.795780e-12 3.950479e-02   pass
2048   2112   4      0.073      78.5080  4.795780e-12 3.950479e-02   pass
2048   2112   4      0.243      23.6291  4.795780e-12 3.950479e-02   pass
4096   6208   4      0.612      74.9120  2.216840e-11 4.613649e-02   pass
4096   6208   4      0.203      226.0199 2.216840e-11 4.613649e-02   pass
4096   6208   4      0.758      60.4580  2.216840e-11 4.613649e-02   pass
6144   6208   4      0.888      174.2260 3.562570e-11 3.301736e-02   pass
6144   6208   4      0.444      348.0285 3.562570e-11 3.301736e-02   pass
6144   6208   4      0.446      347.1975 3.562570e-11 3.301736e-02   pass
8192   8256   4      1.686      217.4767 7.232445e-11 3.782865e-02   pass
8192   8256   4      1.437      255.0615 7.232445e-11 3.782865e-02   pass
8192   8256   4      1.408      260.3055 7.232445e-11 3.782865e-02   pass
10240  10304  4      6.238      114.7933 1.010026e-10 3.389721e-02   pass
10240  10304  4      1.379      519.2215 1.010026e-10 3.389721e-02   pass
10240  10304  4      1.379      519.4054 1.010026e-10 3.389721e-02   pass
12288  12352  4      2.611      473.8617 1.454923e-10 3.393283e-02   pass
12288  12352  4      2.206      560.8873 1.454923e-10 3.393283e-02   pass
12288  12352  4      2.205      561.0014 1.454923e-10 3.393283e-02   pass
14336  14400  4      4.327      454.0611 2.006193e-10 3.448820e-02   pass
14336  14400  4      3.175      618.8803 2.006193e-10 3.448820e-02   pass
14336  14400  4      3.176      618.5288 2.006193e-10 3.448820e-02   pass
16384  18496  4      7.426      394.9037 2.524725e-10 3.324476e-02   pass
16384  18496  4      4.521      648.7009 2.524725e-10 3.324476e-02   pass
16384  18496  4      4.525      648.0508 2.524725e-10 3.324476e-02   pass
./runme_mic: line 44: 13324 Killed                  ./xlinpack_$arch lininput_$arch
Done: Mon Mar 24 14:40:22 CET 2014

 

My maximum is 648 GFLOPS. The only reference I found is 769 GFLOPS: https://www-ssl.intel.com/content/www/us/en/benchmarks/xeon-phi-product-family-performance-brief.html

Was that result produced with a different input file? Do I need to check my setup, or is my result normal?
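
For readers who want to repeat this comparison on their own log, here is a minimal Python sketch. It assumes the eight-column result rows shown above, takes the best GFlops value per problem size, and compares the overall best with the 769 GFLOPS figure quoted in the post. The function name and the sample text are illustrative, not part of the LINPACK package.

```python
# A few result rows copied from the log above; paste a full log here instead.
SAMPLE_ROWS = """\
Size   LDA    Align. Time(s)    GFlops   Residual     Residual(norm) Check
10240  10304  4      6.238      114.7933 1.010026e-10 3.389721e-02   pass
10240  10304  4      1.379      519.2215 1.010026e-10 3.389721e-02   pass
16384  18496  4      7.426      394.9037 2.524725e-10 3.324476e-02   pass
16384  18496  4      4.521      648.7009 2.524725e-10 3.324476e-02   pass
16384  18496  4      4.525      648.0508 2.524725e-10 3.324476e-02   pass
"""

REFERENCE_GFLOPS = 769.0  # reference figure quoted in the post


def best_gflops_per_size(text):
    """Return {problem size: best GFlops} for rows whose check column is 'pass'."""
    best = {}
    for line in text.splitlines():
        fields = line.split()
        # Result rows have exactly 8 columns and end in 'pass';
        # the header and any other log lines are skipped.
        if len(fields) != 8 or fields[-1] != "pass":
            continue
        size = int(fields[0])
        gflops = float(fields[4])
        best[size] = max(gflops, best.get(size, 0.0))
    return best


best = best_gflops_per_size(SAMPLE_ROWS)
peak_size, peak = max(best.items(), key=lambda kv: kv[1])
ratio = peak / REFERENCE_GFLOPS
print(f"best: {peak:.1f} GFLOPS at size {peak_size} "
      f"({ratio:.1%} of {REFERENCE_GFLOPS:.0f} GFLOPS)")
```

Taking the best trial per size, rather than the average, sidesteps the slow first trial that appears at several sizes in the log.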

TaylorIoTKidd
New Contributor I

Hi Fuad,

We are looking at your question and will get back to you soon.

Regards
--
Taylor
 

Zhang_Z_Intel
Employee

The output shows that the benchmark did not run to completion. It was killed after completing size 16384, roughly halfway through the range of sizes. Check whether your KNC 5110P coprocessor has enough free memory, and make sure no other applications or zombie processes are running when you run LINPACK. The performance reported by Intel was obtained on a coprocessor with 8GB memory.
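
To make the memory point concrete, the following Python sketch estimates how much memory each problem size in the posted input needs. It assumes the dominant allocation is a single LDA x N matrix of 8-byte doubles; the right-hand-side vector, alignment padding, and anything else the benchmark allocates are ignored, so the figures are lower bounds. The helper name is illustrative.

```python
# Problem sizes and leading dimensions copied from the log in the question.
SIZES = [2048, 4096, 6144, 8192, 10240, 12288, 14336,
         16384, 18432, 20480, 22528, 24576, 26624, 28672]
LDAS = [2112, 6208, 6208, 8256, 10304, 12352, 14400,
        18496, 18496, 20544, 22592, 26688, 26688, 28736]

BYTES_PER_DOUBLE = 8


def matrix_bytes(n, lda):
    """Bytes for an n-column, double-precision matrix with leading dimension lda.

    Assumption: this single matrix dominates the benchmark's memory use.
    """
    return n * lda * BYTES_PER_DOUBLE


for n, lda in zip(SIZES, LDAS):
    gib = matrix_bytes(n, lda) / 2**30
    print(f"size {n:6d}  LDA {lda:6d}  ~{gib:5.2f} GiB")
```

For the largest size this estimate lands within about 0.01% of the 6591927552 bytes that the benchmark itself reports, which suggests the assumption is reasonable. Under the same assumption, the last completed size (16384) needs roughly 2.3 GiB and the first one that never reported a result (18432) roughly 2.5 GiB, so comparing those numbers with the free memory on the card at launch time is a sensible first check.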

 

Phil_R_
Beginner

libiomp5.so seems to be getting much harder to find for running Intel LINPACK on the Phi these days. The first 2.8GB trial I downloaded to obtain a copy (System Studio, I believe) had a version of libiomp5 that could not be loaded on the MIC; hopefully the next 2.8GB download (Composer C++) will suffice.

It seems unlikely that the Intel LINPACK distribution will ever include libiomp5 for the Phi, so would it be possible to someday have a release that does not require libiomp5 to run on the Phi in the first place?

TaylorIoTKidd
New Contributor I

What version of the compiler do you have? I'm looking at Composer XE 15.0.090 and see lib/mic/libiomp5.so.

I may not understand your issue. Can you elaborate?

Regards
--
Taylor
 

JJK
New Contributor III

It is also possible to use the Redistributable Library Package, so that you don't have to install the Intel C compiler suite on every host. For more information, see this thread: https://software.intel.com/en-us/forums/topic/534589
