Intel® oneAPI Math Kernel Library

Scalability of dense symmetric indefinite factorization

Rehfeldt__Daniel
Beginner
448 Views

Hi,

I am trying to speed up the factorization of dense symmetric indefinite matrices. Their dimension is usually between 10,000 and 20,000.

I am using the LAPACK routine dsytrf from MKL 2018 on a supercomputer node with two Intel Xeon E5-2680 v3 (Haswell) CPUs (2 x 12 cores, 2.5 GHz). I also tried a node with an Intel Xeon Phi 7250-F (Knights Landing) CPU (68 cores, 1.4 GHz). The problem is that the factorization does not scale well with the number of threads: with up to 8 threads I see some improvement (the run time is halved), but beyond that it actually slows down.

Is this expected behavior for this MKL routine? If so, do you know of an alternative that scales better?

Thanks

Daniel
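
The figures reported above can be turned into a rough scaling estimate. The sketch below is not from the thread: it computes the parallel efficiency implied by the halved run time at 8 threads and, assuming Amdahl's law applies (a simplification that cannot explain the slowdown beyond 8 threads), the implied parallel fraction. All function names are illustrative.

```python
def parallel_efficiency(speedup, threads):
    """Speedup divided by thread count (1.0 means perfect scaling)."""
    return speedup / threads


def amdahl_parallel_fraction(speedup, threads):
    """Solve S = 1 / ((1 - f) + f / p) for the parallel fraction f."""
    return (1.0 - 1.0 / speedup) / (1.0 - 1.0 / threads)


def amdahl_speedup(fraction, threads):
    """Speedup predicted by Amdahl's law for parallel fraction f on p threads."""
    return 1.0 / ((1.0 - fraction) + fraction / threads)


# Figures reported in the post: run time halved (speedup 2) with 8 threads.
observed_speedup = 2.0
observed_threads = 8

efficiency = parallel_efficiency(observed_speedup, observed_threads)
fraction = amdahl_parallel_fraction(observed_speedup, observed_threads)

print(f"parallel efficiency at 8 threads: {efficiency:.0%}")
print(f"implied parallel fraction: {fraction:.3f}")
# Model estimates for the full node sizes mentioned in the post.
for p in (24, 68):
    print(f"Amdahl estimate on {p} threads: {amdahl_speedup(fraction, p):.2f}x")
```

Under this simple model, a parallel fraction this small would leave little to gain from more threads, which is consistent with the observation that adding threads beyond 8 does not help.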

7 Replies
Ying_H_Intel
Employee

Hi Daniel,

Do you mean that on both machines the scaling stops at 8 threads? That is not expected. For reference, we publish benchmarks for some factorizations, such as dgetrf, on Xeon and Xeon Phi: https://software.intel.com/en-us/mkl/features/benchmarks

If needed, please submit the exact issue to https://supporttickets.intel.com/?lang=en-US together with a matrix that reproduces it.

Best regards,
Ying

Rehfeldt__Daniel
Beginner

Hi Ying,

Thanks for your help. On both machines the factorization does not scale beyond 8 threads. I will submit the matrix to support, as you suggested.

Best

Daniel

Denis_S_Intel
Employee

Hi Daniel,

We have been working on improving the performance and scalability of this functionality. The optimizations will be available in one of the next releases.

Rehfeldt__Daniel
Beginner

Hi Denis,

Thanks for the information. Do you have an idea of how long this will take (several months, a year, etc.)? I am not familiar with your release cycles. Would you recommend trying an LU factorization until then? According to your benchmarks, that seems to scale beyond 8 threads.

Denis_S_Intel
Employee

Hi Daniel,

The new release is expected this month. As for LU factorization: yes, I think trying LU instead of LDLT is a good approach until the new release is available.
May I ask what you are going to do with the results once you have them?

Rehfeldt__Daniel
Beginner

Hi Denis,

Thanks again. Will your enhancements be mentioned in the release notes? If not, could you let me know once they have been released?

I will try the LU factorization then. Would you expect it to scale to 68 cores (let's say for a matrix of dimension 10,000)?

I use the factorization to solve two to four linear systems with different right-hand sides.
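
The use case described here, factoring once and then solving for a few right-hand sides, follows the same pattern with LU as with LDLT. Below is a minimal pure-Python sketch of that pattern on a tiny symmetric indefinite matrix. It is illustrative only and not how MKL implements it; in LAPACK/MKL the corresponding routines would be dgetrf (factor) and dgetrs (solve), in place of dsytrf and dsytrs. The helper names `lu_factor` and `lu_solve` are made up for this example.

```python
def lu_factor(a):
    """LU factorization with partial pivoting (row swaps), done on a copy.

    Returns (lu, piv): L (unit diagonal) and U packed into one matrix,
    plus the index of the row swapped with row k at each step k.
    """
    n = len(a)
    lu = [row[:] for row in a]
    piv = []
    for k in range(n):
        # Pick the row with the largest entry in column k as the pivot row.
        p = max(range(k, n), key=lambda i: abs(lu[i][k]))
        if lu[p][k] == 0.0:
            raise ValueError("matrix is singular")
        lu[k], lu[p] = lu[p], lu[k]
        piv.append(p)
        for i in range(k + 1, n):
            lu[i][k] /= lu[k][k]  # multiplier, stored in the L part
            for j in range(k + 1, n):
                lu[i][j] -= lu[i][k] * lu[k][j]
    return lu, piv


def lu_solve(lu, piv, b):
    """Solve A x = b using the factors returned by lu_factor."""
    n = len(lu)
    x = list(b)
    for k, p in enumerate(piv):  # apply the recorded row swaps to b
        x[k], x[p] = x[p], x[k]
    for i in range(n):  # forward substitution with unit-diagonal L
        for j in range(i):
            x[i] -= lu[i][j] * x[j]
    for i in reversed(range(n)):  # back substitution with U
        for j in range(i + 1, n):
            x[i] -= lu[i][j] * x[j]
        x[i] /= lu[i][i]
    return x


# Small symmetric indefinite example (zero diagonal, so pivoting is required).
a = [[0.0, 1.0, 2.0],
     [1.0, 0.0, 3.0],
     [2.0, 3.0, 0.0]]
rhs = [[3.0, 4.0, 5.0], [-2.0, -2.0, 2.0]]

lu, piv = lu_factor(a)  # cubic cost, paid once
solutions = [lu_solve(lu, piv, b) for b in rhs]  # quadratic cost per right-hand side
for x in solutions:
    print(x)
```

Note that LU ignores the symmetry of the matrix and therefore performs roughly twice the floating-point operations of an LDLT factorization (about 2n³/3 versus n³/3), so the switch only pays off if LU scales sufficiently better on the target machine.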

Denis_S_Intel
Employee

Hi Daniel,

Yes, the enhancements will be in the release notes and yes, the LU factorization shows good scalability.
