Wei_W_6
Beginner
134 Views

dgeev is much slower than matlab eig

I tested a random 5000-by-5000 matrix with Intel MKL dgeev and with MATLAB on the same machine (Intel(R) Core(TM) i3-4150 CPU @ 3.50GHz) and recorded the CPU time for the eigendecomposition step only. When I compile with icc .... -mkl:parallel, it takes 541s; when I compile with icc ... -mkl:sequential, it takes 232s. MATLAB's eig, however, takes just 70s.

Thus I have two questions:

1. Why is sequential so much faster than parallel?

2. According to MATLAB, it uses Intel(R) Math Kernel Library Version 11.1.1 for the eigendecomposition. Why is it so much faster than the dgeev call in my C++ code?

Can you give me any ideas? Any suggestions on how to make the eigendecomposition faster in C++?

4 Replies
Gennady_F_Intel
Moderator
134 Views

That's unexpected behavior. For this problem size, threaded mode should be faster than sequential mode. Which version of MKL do you use? Could you provide a reproducer so that we can check the problem on our side?

Dmitry_B_Intel
Employee
134 Views

Hi Wei W,

A parallel program may take less wall time at the expense of consuming more CPU time.

Also, your C++ program must be correct, because an incorrect program may be very slow simply because it does more computation (or it may be very fast, too). As a simple test, the program can be given a diagonal matrix with all values different; the eigenvalues it returns should then be exactly the diagonal entries.
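The diagonal-matrix check could be sketched roughly as below. This is only an illustration: `make_diagonal_matrix` and `matches_diagonal` are hypothetical helper names, the column-major layout is an assumption, and the dgeev call itself is left out because it needs MKL. The `wr` and `wi` arguments stand for the real and imaginary eigenvalue parts that dgeev writes.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: column-major n-by-n matrix with diagonal
// entries 1, 2, ..., n (all different) and zeros elsewhere.
std::vector<double> make_diagonal_matrix(std::size_t n) {
    std::vector<double> a(n * n, 0.0);
    for (std::size_t i = 0; i < n; ++i) {
        a[i + i * n] = static_cast<double>(i + 1);
    }
    return a;
}

// Hypothetical helper: returns true if the eigenvalues (wr + i*wi)
// are the values 1..n in any order, within tolerance tol.
// wr is taken by value so it can be sorted without touching the caller's copy.
bool matches_diagonal(std::vector<double> wr,
                      const std::vector<double>& wi,
                      double tol) {
    assert(wr.size() == wi.size());
    // A real diagonal matrix has purely real eigenvalues.
    for (double im : wi) {
        if (std::fabs(im) > tol) return false;
    }
    // The solver may return the eigenvalues in any order, so sort first.
    std::sort(wr.begin(), wr.end());
    for (std::size_t i = 0; i < wr.size(); ++i) {
        if (std::fabs(wr[i] - static_cast<double>(i + 1)) > tol) return false;
    }
    return true;
}
```

In use, the matrix from `make_diagonal_matrix` would be passed to dgeev, and the resulting `wr` and `wi` arrays handed to `matches_diagonal`. A failure would point to a problem in how the matrix or the arguments are set up rather than in the timing.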

Thanks
Dima

Calvin_D_R_
New Contributor I
134 Views

It looks to me like you're getting CPU time rather than "wall" time. In parallel operations, many of the timers report the sum of CPU times.
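To make the difference concrete, here is a minimal sketch that measures the same piece of work with both kinds of timer, using only the C++ standard library. `Timing` and `measure` are hypothetical names, not part of MKL. It assumes a POSIX-like platform, where `std::clock` reports processor time for the whole process; other runtimes may behave differently.

```cpp
#include <cassert>
#include <chrono>
#include <ctime>
#include <functional>

// Wall-clock and CPU seconds consumed by one call (hypothetical type).
struct Timing {
    double wall_seconds;
    double cpu_seconds;
};

// Runs `work` once and measures it with two different clocks.
Timing measure(const std::function<void()>& work) {
    assert(static_cast<bool>(work) && "work must be callable");

    const auto wall_start = std::chrono::steady_clock::now();
    const std::clock_t cpu_start = std::clock();

    work();

    const std::clock_t cpu_end = std::clock();
    const auto wall_end = std::chrono::steady_clock::now();

    Timing t;
    // Elapsed real time, independent of how many threads were busy.
    t.wall_seconds =
        std::chrono::duration<double>(wall_end - wall_start).count();
    // Processor time charged to the process; with several busy threads
    // this can exceed the wall time on POSIX-like systems.
    t.cpu_seconds =
        static_cast<double>(cpu_end - cpu_start) / CLOCKS_PER_SEC;
    return t;
}
```

Wrapping the dgeev call in the callable passed to `measure` would give both numbers from a single run, so a threaded run whose `cpu_seconds` is well above its `wall_seconds` is not necessarily slower in real time.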

Have you tried MKL second/dsecond?

Wei_W_6
Beginner
134 Views

The MKL I use in my C++ code is MKL 11.3, included in Intel Parallel Studio XE 2016 Cluster Edition.

The C++ code I wrote is correct, since I have checked the results. I also re-ran the test and measured wall time for both MKL sequential and MKL parallel on the same 5000-by-5000 matrix. The results are as follows:

sequential: wall time 240.39s, cpu time 232.11s

parallel: wall time 248.45s, cpu time 467.78s
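One way to read these numbers is the ratio of CPU time to wall time, sketched below. `cpu_to_wall_ratio` is a hypothetical helper, and the interpretation in the comments is an assumption rather than a diagnosis.

```cpp
#include <cassert>

// Hypothetical helper: average number of cores kept busy during the run,
// estimated as CPU seconds divided by wall-clock seconds.
double cpu_to_wall_ratio(double cpu_seconds, double wall_seconds) {
    assert(wall_seconds > 0.0);
    return cpu_seconds / wall_seconds;
}

// With the timings reported above:
//   sequential: cpu_to_wall_ratio(232.11, 240.39) is about 0.97
//   parallel:   cpu_to_wall_ratio(467.78, 248.45) is about 1.88
// A ratio near 2 suggests that roughly two cores were busy in the
// parallel run, even though the wall time did not drop.
```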