Scalability of dense symmetric indefinite factorization

Rehfeldt__Daniel · ‎02-25-2018

Hi,

I am trying to speed up the factorization of a dense symmetric indefinite matrix. The matrix dimension is usually between 10 k and 20 k.

I am using LAPACK (dsytrf) and MKL 2018, running on a supercomputer node with two Intel Xeon E5-2680 v3 Haswell CPUs (2 x 12 cores, 2.5 GHz). I also tried a node with an Intel Xeon Phi 7250-F Knights Landing CPU (68 cores, 1.4 GHz). The problem is that the factorization does not seem to scale very well with the number of threads: with up to 8 threads I see some improvement (the run time is halved), but beyond that there is even a slowdown.

Is this something that is to be expected from this MKL routine? And if so, do you know of any alternative that scales better?

Thanks

Daniel
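To put the reported numbers in perspective, here is a minimal sketch that turns "run time halved at 8 threads" into a speedup, a parallel efficiency, and an estimated serial fraction (the Karp-Flatt metric). The function names are illustrative, and the absolute timings are placeholders; only their ratio comes from the post. Amdahl's law is used here as a simplifying assumption, and it cannot explain the slowdown seen beyond 8 threads.

```python
def speedup(t_one_thread, t_p_threads):
    """Speedup S = T(1) / T(p)."""
    return t_one_thread / t_p_threads


def parallel_efficiency(s, threads):
    """Fraction of ideal linear scaling achieved: S / p."""
    return s / threads


def serial_fraction(s, threads):
    """Karp-Flatt estimate of the serial fraction from a measured speedup."""
    return (1.0 / s - 1.0 / threads) / (1.0 - 1.0 / threads)


# Placeholder timings: only the ratio (run time halved at 8 threads)
# is taken from the post above.
s8 = speedup(2.0, 1.0)
eff8 = parallel_efficiency(s8, 8)
frac = serial_fraction(s8, 8)

# Under Amdahl's law the speedup can never exceed 1 / serial fraction.
amdahl_limit = 1.0 / frac

print(f"speedup at 8 threads:      {s8:.2f}")
print(f"parallel efficiency:       {eff8:.0%}")
print(f"estimated serial fraction: {frac:.3f}")
print(f"Amdahl speedup limit:      {amdahl_limit:.2f}")
```

An efficiency this low at only 8 threads suggests that a large part of the run time is not parallelized, which is consistent with the lack of further gains on 24 or 68 cores.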

Ying_H_Intel · ‎02-28-2018

Hi Daniel,

Do you mean that on both machines the scaling stops at 8 threads? That is not expected. For your reference, we publish factorization benchmarks on Xeon and Xeon Phi, for example for dgetrf, at https://software.intel.com/en-us/mkl/features/benchmarks. If needed, please submit the exact issue to https://supporttickets.intel.com/?lang=en-US together with a matrix that reproduces it.

Best Regards,

Ying

Rehfeldt__Daniel · ‎03-01-2018

Hi Ying,

Thanks for your help. On both machines the factorization does not scale beyond 8 threads. I will submit the matrix to support, as you suggested.

Best

Daniel

Denis_S_Intel · ‎03-01-2018

Hi Daniel,

We have been working on improving the performance and scalability of this functionality. The optimizations will be available in one of the next releases.

Rehfeldt__Daniel · ‎03-02-2018

Hi Denis,

Thanks for the information. Do you have an idea of how long this will take (several months, a year, etc.)? I am not familiar with your release cycles. Would you recommend trying an LU factorization until then? According to your benchmarks, that seems to scale beyond 8 threads.

Denis_S_Intel · ‎03-02-2018

Hi Daniel,

The new release is expected this month. As for LU factorization: yes, I think trying LU instead of LDLT is a good option until the new release is available.
May I ask what you are going to do with the results once you have them?
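One point worth keeping in mind when swapping LDLT for LU: to leading order, LU (dgetrf) costs about 2n^3/3 floating-point operations, while a symmetric indefinite factorization (dsytrf) costs about n^3/3. LU therefore has to run at more than twice the flop rate to finish sooner. The sketch below works this out for the sizes mentioned in the thread. The function names are illustrative, and lower-order terms and pivoting overhead are ignored.

```python
def lu_flops(n):
    """Leading-order flop count of LU with partial pivoting (dgetrf)."""
    return 2.0 * n**3 / 3.0


def ldlt_flops(n):
    """Leading-order flop count of a symmetric indefinite LDLT (dsytrf)."""
    return n**3 / 3.0


def full_matrix_gib(n):
    """Storage for a full n x n double-precision matrix, in GiB."""
    return n * n * 8 / 2**30


for n in (10_000, 20_000):
    print(
        f"n = {n}: LU {lu_flops(n):.2e} flops, "
        f"LDLT {ldlt_flops(n):.2e} flops, "
        f"ratio {lu_flops(n) / ldlt_flops(n):.1f}, "
        f"full matrix {full_matrix_gib(n):.2f} GiB"
    )
```

Given the scaling reported earlier in the thread (a speedup of about 2 at 8 threads for dsytrf), an LU that scales well across 24 or 68 cores can plausibly still come out ahead despite doing twice the arithmetic, but this should be measured rather than assumed.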

Rehfeldt__Daniel · ‎03-02-2018

Hi Denis,

Thanks again. Will your enhancements be in the release notes? If not, could you let me know once they have been released?

I will try the LU factorization then. Would you expect it to scale to 68 cores (let's say for a 10 k matrix)?

I use the factorization for solving two to four linear systems with different right-hand sides.
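The factor-once, solve-many pattern described here can be sketched in a few lines. The example below is a plain-Python illustration of LU with partial pivoting on a tiny symmetric indefinite matrix, not a substitute for MKL at n = 10 k; the helper names `lu_factor` and `lu_solve` and the test matrix are made up for this sketch. In LAPACK terms it corresponds to one call to dgetrf followed by dgetrs for the right-hand sides (dsytrf pairs with dsytrs in the same way).

```python
def lu_factor(a):
    """Factor a square matrix as P*A = L*U using partial pivoting.

    Returns (lu, piv): lu holds U on and above the diagonal and the
    multipliers of unit-lower-triangular L below it; piv[i] is the index
    of the original row that ended up in position i.
    """
    n = len(a)
    lu = [row[:] for row in a]
    piv = list(range(n))
    for k in range(n):
        # Pick the row with the largest entry in column k as the pivot row.
        p = max(range(k, n), key=lambda i: abs(lu[i][k]))
        if lu[p][k] == 0.0:
            raise ValueError("matrix is singular")
        if p != k:
            lu[k], lu[p] = lu[p], lu[k]
            piv[k], piv[p] = piv[p], piv[k]
        for i in range(k + 1, n):
            lu[i][k] /= lu[k][k]
            for j in range(k + 1, n):
                lu[i][j] -= lu[i][k] * lu[k][j]
    return lu, piv


def lu_solve(lu, piv, b):
    """Solve A*x = b by reusing the factors returned by lu_factor."""
    n = len(lu)
    x = [b[piv[i]] for i in range(n)]      # apply the row permutation
    for i in range(n):                     # forward substitution (unit L)
        for j in range(i):
            x[i] -= lu[i][j] * x[j]
    for i in reversed(range(n)):           # back substitution (U)
        for j in range(i + 1, n):
            x[i] -= lu[i][j] * x[j]
        x[i] /= lu[i][i]
    return x


# Small symmetric indefinite example (zero diagonal), chosen for illustration.
a = [[0.0, 1.0, 2.0],
     [1.0, 0.0, 3.0],
     [2.0, 3.0, 0.0]]
rhs_list = [[3.0, 4.0, 5.0], [-2.0, -2.0, 2.0]]

lu, piv = lu_factor(a)                               # O(n^3), done once
solutions = [lu_solve(lu, piv, b) for b in rhs_list]  # O(n^2) per right-hand side
print(solutions)
```

With only two to four right-hand sides, the O(n^2) solves are negligible next to the O(n^3) factorization, so the factorization's scalability dominates the total run time.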

Denis_S_Intel · ‎03-06-2018

Hi Daniel,

Yes, the enhancements will be in the release notes and yes, the LU factorization shows good scalability.