Intel® oneAPI Math Kernel Library
Ask questions and share information with other developers who use Intel® Math Kernel Library.

## Why is SYTRF/SYTRI much slower than GETRF/GETRI for computing a dense matrix inverse?

Dear MKL experts,

My project needs to invert a complex symmetric dense matrix. To pick the best approach, I tried three different pairs of subroutines: GETRF/GETRI, SYTRF/SYTRI, and SYTRF_ROOK/SYTRI_ROOK. In theory, the subroutines for symmetric matrices should be faster than those for general matrices, as documented in the Intel programmer reference manual:

GETRF(...)
For real flavors, if m = n, the approximate number of floating-point operations is (2/3)n^3.
The number of operations for complex flavors is four times greater, (8/3)n^3.
GETRI(...)
The total number of floating-point operations is approximately (4/3)n^3 for real flavors and (16/3)n^3 for complex flavors.

SYTRF(...)
The total number of floating-point operations is approximately (1/3)n^3 for real flavors or (4/3)n^3 for complex flavors.
SYTRI(...)
The total number of floating-point operations is approximately (2/3)n^3 for real flavors and (8/3)n^3 for complex flavors.

SYTRF_ROOK(...)
[No floating-point operation count is given]
SYTRI_ROOK(...)
The total number of floating-point operations is approximately (2/3)n^3 for real flavors and (8/3)n^3 for complex flavors.
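
To make the expectation concrete, the complex-flavor counts quoted above can be summed for each factorize-then-invert pair. This is only a minimal sketch of the arithmetic: the dictionary and the helper `inverse_flops` are names I made up for illustration, and SYTRF_ROOK is left out because no count is given for it.

```python
from fractions import Fraction

# Leading-order flop coefficients (multiples of n^3) for complex flavors,
# copied from the reference manual figures quoted above.
COMPLEX_FLOPS = {
    "GETRF": Fraction(8, 3),
    "GETRI": Fraction(16, 3),
    "SYTRF": Fraction(4, 3),
    "SYTRI": Fraction(8, 3),
}


def inverse_flops(factor, invert, n):
    """Approximate flops to factorize and then invert an n x n matrix."""
    return (COMPLEX_FLOPS[factor] + COMPLEX_FLOPS[invert]) * n**3


n = 10000
general = inverse_flops("GETRF", "GETRI", n)
symmetric = inverse_flops("SYTRF", "SYTRI", n)
expected_ratio = general / symmetric

print(f"GETRF+GETRI: {float(general):.3e} flops")
print(f"SYTRF+SYTRI: {float(symmetric):.3e} flops")
print(f"general / symmetric = {float(expected_ratio):.1f}")
```

By these counts alone, the symmetric pair should need about half the operations of the general pair, regardless of n.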

In practice, however, GETRF/GETRI is much faster for large dense matrices. My test results are summarized below.
Matrix Size     GETRF          SYTRF          SYTRF_ROOK
1000x1000       0.015          0.016           0.047
2000x2000       0.142          0.141           0.281
5000x5000       1.486          1.406           2.595
10000x10000     9.907          9.283          16.282

Matrix Size     GETRI          SYTRI          SYTRI_ROOK
1000x1000       0.064          0.563           0.595
2000x2000       0.312          4.908           4.938
5000x5000       4.437         74.600          74.490
10000x10000    26.972        625.908         615.346

The factorization times of GETRF and SYTRF are nearly identical, but GETRI is roughly 15 to 20 times faster than SYTRI and SYTRI_ROOK for the larger matrices.
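
For reference, the slowdown can be computed directly from the tables above. This is a small sketch that only divides the timings I measured; the variable names are arbitrary, and no unit is assumed since only ratios are taken.

```python
# Timings copied from the tables above (same unit for every entry).
sizes = [1000, 2000, 5000, 10000]
getrf = [0.015, 0.142, 1.486, 9.907]
sytrf = [0.016, 0.141, 1.406, 9.283]
getri = [0.064, 0.312, 4.437, 26.972]
sytri = [0.563, 4.908, 74.600, 625.908]

# Ratio > 1 means the symmetric routine is slower than the general one.
fact_ratio = [s / g for s, g in zip(sytrf, getrf)]
inv_ratio = [s / g for s, g in zip(sytri, getri)]
# Ratio for the whole inverse (factorization plus inversion).
total_ratio = [
    (sf + si) / (gf + gi)
    for sf, si, gf, gi in zip(sytrf, sytri, getrf, getri)
]

print("n       SYTRF/GETRF  SYTRI/GETRI  total")
for n, f, i, t in zip(sizes, fact_ratio, inv_ratio, total_ratio):
    print(f"{n:<7d} {f:11.2f}  {i:11.2f}  {t:5.2f}")
```

The factorization ratio stays close to 1 at every size, while the inversion ratio keeps growing with n, so the gap comes entirely from the SYTRI step.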

Why is this? Any advice would be appreciated. My test code is attached below.

Thanks.  