Guys, if you have some time and could provide some performance numbers obtained with any
version of MKL, I would really appreciate it! If you can't, sorry that my post took a couple of seconds
of your valuable time.
THIS IS WHAT I NEED:
I wonder if somebody who has MKL could do a performance evaluation of a matrix multiplication function?
Test-Case:
- Both matrices are 2048 x 2048
- Data type 'float'
- All Elements Initialized to 1.0f
Please report the time (in seconds) to calculate the product of the two matrices, and some details about your CPU:
frequency, memory in GB, etc.
I'm not interested in the result of the multiplication. I'm interested in how long it takes to calculate it on
different computers with different CPUs using Intel's MKL.
Thank you in advance.
Best regards,
Sergey
We post a number of benchmarks on our website, but we don't expect that they will ever cover all customer questions. There are simply too many permutations.
Even your question above leads to other questions. What OS? Are the matrices transposed or not? You say both matrices, so is the third matrix in SGEMM, "C", zeroed, with beta equal to 0?
And then, naturally, full documentation and disclaimers are required whenever Intel posts a benchmark number.
So you see, what seems like a simple request can become a slightly bigger one. We do our best here to provide representative performance numbers that indicate the kinds of results you can get with Intel MKL. For other cases, we provide a free, fully functional evaluation copy of Intel MKL so that you can try it on the case that is important to you.
Todd
These benchmarks are in 'Gflops', not in 'Seconds'.
>>Even your question above, leads to some other question... What OS?
Any OS. No special requirements; whatever is best for you. A computer with the latest or
an older (1 - 2 year old) Intel CPU would be OK.
>>Are matrices transposed or not?
No. All matrix elements are initialized to 1.0. Both matrices are square, 2048 by 2048, which means that
it doesn't matter whether you transpose either matrix or not. It will be the same.
>>You say both matrices, so is the third matrix in SGEMM, "C" zeroed with beta equal to 0?
Here is C pseudo-code:
...
float fA[2048][2048]; // Matrix A
float fB[2048][2048]; // Matrix B
float fC[2048][2048]; // Matrix C
for( int i = 0; i < 2048; i++ )
{
    for( int j = 0; j < 2048; j++ )
    {
        fA[i][j] = 1.0f;
        fB[i][j] = 1.0f;
        fC[i][j] = 0.0f; // assumed: the initial value of C was lost from the post
    }
}
t1 = GetTime();
fC = < MKLMatrixMultiply >( fA, fB ); // Any MKL version
t2 = GetTime();
Delta = t2 - t1; // Time to multiply ( in seconds, for example )
...
As you can see I don't need something really special.
Best regards,
Sergey
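For anyone who wants to run this test case themselves, here is a minimal C sketch of the pseudo-code above. It is an illustration under stated assumptions, not a tuned benchmark. The names `fill`, `multiply` and `time_multiply` are made up for this example. The matrices are allocated on the heap because three 16 MiB arrays will not fit on a typical stack, and timing uses C11 `timespec_get`. When built with `-DUSE_MKL` and linked against MKL, `multiply` calls `cblas_sgemm` with alpha = 1 and beta = 0. Otherwise it falls back to a plain loop that only stands in for the library call and says nothing about MKL's speed.

```c
#include <assert.h>
#include <stdlib.h>
#include <time.h>

#ifdef USE_MKL
#include <mkl.h>   /* provides cblas_sgemm */
#endif

/* Fill a buffer of 'count' floats with one value. */
static void fill(float *m, size_t count, float value)
{
    for (size_t i = 0; i < count; i++)
        m[i] = value;
}

/* C = A * B for n x n row-major matrices (alpha = 1, beta = 0). */
static void multiply(const float *a, const float *b, float *c, int n)
{
#ifdef USE_MKL
    cblas_sgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans,
                n, n, n, 1.0f, a, n, b, n, 0.0f, c, n);
#else
    /* Stand-in so the sketch builds without MKL: plain i-k-j loop. */
    for (int i = 0; i < n; i++) {
        for (int j = 0; j < n; j++)
            c[(size_t)i * n + j] = 0.0f;
        for (int k = 0; k < n; k++) {
            float aik = a[(size_t)i * n + k];
            for (int j = 0; j < n; j++)
                c[(size_t)i * n + j] += aik * b[(size_t)k * n + j];
        }
    }
#endif
}

/* Returns the wall-clock seconds taken by one n x n multiply, or -1.0 if
   allocation fails. If first_element is non-NULL it receives C[0][0]. */
double time_multiply(int n, float *first_element)
{
    assert(n > 0);
    size_t count = (size_t)n * (size_t)n;
    float *a = malloc(count * sizeof *a);
    float *b = malloc(count * sizeof *b);
    float *c = malloc(count * sizeof *c);
    if (!a || !b || !c) {
        free(a); free(b); free(c);
        return -1.0;
    }

    fill(a, count, 1.0f);   /* all elements 1.0f, as in the test case */
    fill(b, count, 1.0f);
    fill(c, count, 0.0f);

    struct timespec t1, t2;
    timespec_get(&t1, TIME_UTC);
    multiply(a, b, c, n);
    timespec_get(&t2, TIME_UTC);

    if (first_element)
        *first_element = c[0];

    free(a); free(b); free(c);
    return (double)(t2.tv_sec - t1.tv_sec)
         + (double)(t2.tv_nsec - t1.tv_nsec) / 1e9;
}
```

Calling `time_multiply(2048, NULL)` reproduces the 2048 x 2048 case. A single run includes warm-up effects, so averaging several runs would give a steadier number.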
Hi Sergey,
We report the performance numbers in flops (flop/sec), which is the number of floating point operations (flop) per second (sec). You can find the time required for a routine if you know its flop count and its flop/sec rate.
For example, the number of floating point operations to compute SGEMM with M=N=K=2048, beta=0.0, alpha=1.0 is given as:
2*M*N*K = 2*2048*2048*2048 = 17179869184 flop ~= 17.180 Giga-Flop (GFlop)
Now, if SGEMM runs at 200 GFlop/sec (or GFlops), then the time for SGEMM will be:
17.180 / 200 = 0.0859 secs
Double-precision GEMM (DGEMM) is shown on the performance charts, and as a rule of thumb, single-precision performance is twice the double-precision performance. Therefore, you can multiply the DGEMM GFlops by two to get an estimate of SGEMM GFlops.
Best wishes,
Efe
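The arithmetic in the post above can be written as a few small C helpers. This is only a sketch of the conversion from a GFlop/sec rate to seconds. The function names are invented for this example, and the factor of two between SGEMM and DGEMM is the rule of thumb quoted above, not a measured value.

```c
#include <assert.h>

/* Floating point operations for GEMM with an M x K matrix A and a
   K x N matrix B, using the conventional 2*M*N*K count. */
double gemm_flop_count(double m, double n, double k)
{
    return 2.0 * m * n * k;
}

/* Seconds needed for 'flop' operations at a sustained GFlop/sec rate. */
double gemm_seconds(double flop, double gflop_per_sec)
{
    assert(gflop_per_sec > 0.0);
    return flop / (gflop_per_sec * 1e9);
}

/* Rule of thumb from the post: SGEMM rate is about 2 x the DGEMM rate. */
double sgemm_rate_from_dgemm(double dgemm_gflop_per_sec)
{
    return 2.0 * dgemm_gflop_per_sec;
}
```

For the 2048 x 2048 case, `gemm_seconds(gemm_flop_count(2048, 2048, 2048), 200.0)` gives roughly 0.0859 seconds, matching the figure in the post.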
Even if it is some kind of "calculated performance", not measured, it gives me a better idea about the performance of MKL.
I have a question. What is the number '2' in:
2*M*N*K= 2*2048*2048*2048 = 17179869184 flop ~= 17.180 Giga-Flop (GFlop)
^
Thank you for your time!
Best regards,
Sergey
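One way to see where the factor 2 comes from is to count operations in the textbook triple loop: each of the M*N elements of C takes K multiplications and K additions. The sketch below does only the counting, not the arithmetic. `count_gemm_flops` is a name made up for this example, and it assumes the usual convention of counting one multiply and one add per inner-loop step.

```c
#include <assert.h>

/* Counts the multiplications and additions performed by the textbook
   triple loop for an M x K times K x N matrix product. */
unsigned long long count_gemm_flops(int m, int n, int k)
{
    assert(m >= 0 && n >= 0 && k >= 0);
    unsigned long long mults = 0, adds = 0;
    for (int i = 0; i < m; i++) {
        for (int j = 0; j < n; j++) {
            for (int p = 0; p < k; p++) {
                /* stands for: c[i][j] += a[i][p] * b[p][j]; */
                mults++;   /* one multiplication */
                adds++;    /* one addition       */
            }
        }
    }
    return mults + adds;   /* equals 2*M*N*K */
}
```

For small sizes this can be checked by hand: a 3 x 5 times 5 x 4 product gives 2*3*4*5 = 120 operations.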
Question 1:
Which modern Intel CPUs provide such performance?
Question 2:
I would also like to compare performance gains relative to some older Intel CPUs, for example
Pentium 4 or Atom N270. So, how fast are they in terms of the number of floating point operations per second?
Best regards,
Sergey
>>Question 1:
>>What modern Intel's CPUs provide such performance?
>>Question 2:
>>I also would like to compare performance gains relative to some older Intel CPUs, for example
>>Pentium 4 or Atom N270. So, how fast are they in terms of number of floating point operations in a second?
Most of the recent new entries on the Top500 list exceed 200 Gflops DGEMM per node (2 CPUs) at 80% "efficiency" (actual vs. peak rated performance), and that is sustained across more than 10000 cores.
This (for P4, Atom) has been covered many times over in public internet posts.
It looks like the famous T = O*(n^3), where O equals '2'.
I'm not convinced that the classic (single-threaded) algorithm for matrix multiplication is at the core of MKL's
SGEMM or DGEMM functions. I think the Strassen or Strassen-Winograd algorithms have to be used to boost the
speed of calculations.
Thanks to everybody who responded to my posts.
Best regards,
Sergey
