Slow MKL

sveina · ‎07-14-2007

Hi,

I tried to submit the message below yesterday but can't see it in the forum, so here is another try.

I have just bought IVF 10.0 with MKL and am comparing the performance of the MKL Lapack with Netlib Lapack. I have experimented with a generalized eigenvalue problem (DGGEVX, 500x500 system) and a linear system (DGESVX, 2000x2000).

For DGESVX the execution times are: Netlib 24.5s; MKL 3.7s; Netlib Lapack, MKL BLAS 4.55s.

For DGGEVX the execution times are: Netlib 13.5s; MKL 13.6s; Netlib Lapack, MKL BLAS 13.4s.
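To make the comparison easier to read, the timings above can be turned into speedups relative to the pure Netlib build. This is only a small arithmetic sketch: the dictionary keys and the `speedups` helper are names made up for illustration, and the numbers are copied from the two lines above.

```python
# Execution times in seconds, copied from the post above.
# The key names are illustrative labels, not anything defined by MKL or Netlib.
timings = {
    "DGESVX": {"netlib": 24.5, "mkl": 3.7, "netlib_lapack_mkl_blas": 4.55},
    "DGGEVX": {"netlib": 13.5, "mkl": 13.6, "netlib_lapack_mkl_blas": 13.4},
}


def speedups(row):
    """Return the speedup of each configuration relative to the pure Netlib time."""
    base = row["netlib"]
    return {name: base / t for name, t in row.items() if name != "netlib"}


for routine, row in timings.items():
    for name, factor in speedups(row).items():
        print(f"{routine} {name}: {factor:.2f}x")
```

With these figures DGESVX comes out at roughly 6.6x (full MKL) and 5.4x (Netlib Lapack on MKL BLAS), while both DGGEVX ratios stay within about 1% of 1.0x.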

What is the problem in the case of DGGEVX? I used different matrix sizes and the picture didn't change.

By the way, I used Netlib Lapack 3.0. When I used Lapack 3.1 there was a problem in DGGEVX: all the eigenvalues came out as infinite (1E308). Has anyone else experienced problems with version 3.1?

My system is XP, Intel 4.

Svein-Atle Engeseth

Intel_C_Intel · ‎07-18-2007

Well, the simple answer is that the linear solver has been optimized, as the performance numbers you show confirm, and the eigensolver hasn't, so you are running the same code in both cases. The only differences are the compiler switches and, of course, the degree to which the solver does or does not use MKL.

We have been going through all of the LAPACK routines to address performance and threading issues, and if you look across the spectrum of routines in MKL 9.1, you will see numerous improvements over MKL 9.0. The next release will have a number of additional performance enhancements for LAPACK, but DGGEVX will not be among the routines improved. Longer term, we plan to address the performance of DGGEVX as well, but not yet.

Bruce

sveina · ‎07-20-2007

Hi,

Thanks for the answer. I thought the whole of Lapack was optimized. I don't know the details of the subroutines in Lapack, but I thought that they should use the BLAS, and especially BLAS3. So even if the Lapack routine itself is not optimized, one should gain from having an optimized BLAS available.
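One way to check this expectation against the timings earlier in the thread is Amdahl's law: if swapping in an optimized BLAS speeds up only a fraction f of the original runtime, the measured gain puts a bound on f. The sketch below is a rough estimate, not a profile. The function name and the `blas_speedup` parameter are hypothetical, the BLAS speedup factor itself is not reported anywhere in the thread, and timing noise of a tenth of a second is ignored.

```python
def accelerated_fraction(t_before, t_after, blas_speedup=float("inf")):
    """Amdahl's law solved for f, the share of the original runtime that was sped up.

    t_before     : runtime with Netlib Lapack on Netlib BLAS (seconds)
    t_after      : runtime with Netlib Lapack on MKL BLAS (seconds)
    blas_speedup : ASSUMED speedup of the accelerated part; the default
                   (infinite) gives a lower bound on f.
    """
    if blas_speedup <= 1.0:
        raise ValueError("blas_speedup must be greater than 1")
    saved = 1.0 - t_after / t_before          # share of runtime that disappeared
    return saved / (1.0 - 1.0 / blas_speedup)


# Timings from the first post: Netlib only vs. Netlib Lapack on MKL BLAS.
print(f"DGESVX: at least {accelerated_fraction(24.5, 4.55):.1%} of the runtime was accelerated")
print(f"DGGEVX: at least {accelerated_fraction(13.5, 13.4):.1%} of the runtime was accelerated")
```

Under these assumptions, the DGESVX numbers imply that over 80% of the Netlib runtime sat in calls that the MKL BLAS accelerated, whereas the DGGEVX numbers imply under 1% did. That would be consistent with DGGEVX spending little time in the BLAS calls that benefit most from optimization, although the thread does not confirm this.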

Does the documentation say which routines have been optimized?

Svein