Intel® Integrated Performance Primitives
Deliberate problems developing high-performance vision, signal, security, and storage applications.

SVD performance

steffenroeber
Beginner
326 Views
Hi,
up to now I have used the ippsSVD_64f_D2 function from IPP 6.1. I have now migrated to IPP 7.0, where no SVD is available, so I tried the SVD from LAPACK instead: once with the default LAPACK and once through the MKL interface. All three approaches give the same results. My input is a 3x3 matrix. The problem is performance: IPP 6.1 takes about 3 seconds for 1000000 calls to ippsSVD_64f_D2.
The other two implementations each take about 8 seconds for 1000000 calls to LAPACKE_dgesvd.
That is nearly 3 times slower. Does anybody know what the reason could be?
3 Replies
Sridevi_A_Intel
Employee
326 Views

Hi,

Could you please let me know which version of MKL you used? Also, could you please provide a test case and describe how you measured the performance? Please also provide your system configuration and CPU information.

Thanks,
Sridevi

IDZ_A_Intel
Employee
326 Views

IPP is designed for small data sizes, which is beneficial when the number of calls is very high, as in your case, whereas MKL performs better for very large data sizes. That should be the reason for the performance difference you see between the two libraries.

--Vipin

steffenroeber
Beginner
326 Views
Hi, here is the example code. I'm using the newest available 32-bit version of MKL, VS2010, Windows XP, and an Intel Core2 Duo E8500 @ 3.16GHz with 4 GB RAM.

[cpp] cDblMatrix A(3, 3);
 A.init() = 1, 2, 3,
            4, 5, 6,
            7, 8, 9;
 cDblMatrix VT(3, 3);
 cDblMatrix W(3, 1);
 cDblMatrix U(3, 3);
 cStopWatch sw; // uses QueryPerformanceCounter()
 for (uint32 i = 0; i < 1000000; ++i)
 {
   IppStatus status = ippsSVD_64f_D2(&A(0, 0),
     &U(0, 0),
     A.getRowCount(),
     &W(0, 0),
     &VT(0, 0),
     A.getColumnCount(),
     A.getColumnCount(),
     1000);
 }
 cTracer::cout << sw.elapsed() << TREND; //3.46376
 sw.start();
 float64 superb[100];
 for (uint32 i = 0; i < 1000000; ++i)
 {
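   cMatrix ATemp(A); // dgesvd overwrites its input, so work on a copy
   int M = A.getRowCount();       // number of rows
   int N = A.getColumnCount();    // number of columns
   int lda = A.getColumnCount();  // row-major: leading dimension is the column count
   int ldu = A.getColumnCount();
   int ldvt = A.getColumnCount();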
   LAPACKE_dgesvd(LAPACK_ROW_MAJOR, 'S', 'S', M, N, (float64*)&ATemp(0, 0), lda, &W(0, 0), &U(0, 0), ldu, &VT(0, 0), ldvt, superb);
 }
 cTracer::cout << sw.elapsed() << TREND; //7.97314[/cpp]
Unfortunately, the LAPACK version makes it necessary to copy the matrix, because it overwrites the input matrix A.
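
One thing that may be worth ruling out is the cost of the copy itself. The sketch below assumes that constructing a cMatrix inside the loop allocates memory (the class is not shown, so this is a guess) and replaces it with a fixed-size stack buffer that is refreshed before every call. `Mat3`, `time_destructive_calls`, and the `svd` callable are hypothetical names used for illustration only; `svd` stands in for whatever wrapper invokes the real routine. This would not necessarily explain the whole gap, but it separates the copy overhead from the SVD itself.

```cpp
#include <algorithm>
#include <array>
#include <chrono>
#include <cstddef>

// 3x3 matrix stored row-major in a fixed-size array (lives on the stack,
// so copying it never touches the heap).
using Mat3 = std::array<double, 9>;

// Times `calls` invocations of `svd`, a callable that may overwrite the
// buffer it is given (as dgesvd does with its input matrix).
// The untouched input is copied into one reusable scratch buffer before
// each call. Returns the elapsed wall-clock time in seconds.
template <typename Svd>
double time_destructive_calls(const Mat3& input, std::size_t calls, Svd svd)
{
    Mat3 scratch;
    const auto start = std::chrono::steady_clock::now();
    for (std::size_t i = 0; i < calls; ++i)
    {
        scratch = input;      // 72-byte copy, no allocation
        svd(scratch.data());  // hypothetical wrapper around the real SVD call
    }
    const std::chrono::duration<double> elapsed =
        std::chrono::steady_clock::now() - start;
    return elapsed.count();
}
```

In the benchmark above, the callable would be a lambda that forwards the pointer to LAPACKE_dgesvd with the same arguments as before, for example `time_destructive_calls(a, 1000000, [&](double* p) { /* call LAPACKE_dgesvd on p */ });`.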