I am trying to multiply two matrices using mkl_ddiamm method:
C = A * B
where A is diagonal matrix 3x3 and B is general matrix 3x3. No matter what I try, i get as a result no A*B, but B*A. This is my sample code. It essentially does SVD decomposition of matrix A and checks, if the computed matrices U, S and VT satisfy all requirements according to theory i.e.
1. U * UT = I, where I us identity matrix
2. V * VT = I
3. U * S * VT = A
Result of temporary operation S * VT is not correct. In fact, the function mkl_ddiamm computes VT * S.
// requirement: m >= n int m = 3; int n = 3; double *a = (double *)mkl_malloc(m * n * sizeof(double), 16); double *s = (double *)mkl_malloc(n * sizeof(double), 16); double *u = (double *)mkl_malloc(m * n * sizeof(double), 16); double *vt = (double *)mkl_malloc(n * n * sizeof(double), 16); double *superb = (double *)mkl_malloc((n-1) * sizeof(double), 16); // identity matrix m x m double *unit_m = (double *)mkl_malloc(m * m * sizeof(double), 16); for (int i = 0; i < m; i++) for (int j = 0; j < m; j++) unit_m[i*m+j] = i == j ? 1.0 : 0; // identity matrix n x n double *unit_n = (double *)mkl_malloc(n * n * sizeof(double), 16); for (int i = 0; i < n; i++) for (int j = 0; j < n; j++) unit_n[i*n+j] = i == j ? 1.0 : 0; a = 1; a = 1; a = 1; a = 2.5; a = 3; a = 4; a = 3; a = 2; a = 1; lapack_int res = LAPACKE_dgesvd(LAPACK_ROW_MAJOR, 'S', 'S', m, n, a, n, s, u, n, vt, n, superb); // Checking correctness of SVD calculation ... // u * ut = I double *temp = (double *)mkl_malloc(m * m * sizeof(double), 16); memset(temp, 0, m * m * sizeof(double)); cblas_dgemm(CblasRowMajor, CblasNoTrans, CblasTrans, m, m, n, 1.0, u, n, u, n, 0, temp, m); // v * vt = I memset(temp, 0, n * n * sizeof(double)); cblas_dgemm(CblasRowMajor, CblasTrans, CblasNoTrans, n, n, n, 1.0, vt, n, vt, n, 0, temp, n); // u * s * vt = a memset(temp, 0, n * n * sizeof(double)); int lval = 3; int idiag = 0; int ndiag = 1; double alpha = 1.0; double beta = 0; mkl_ddiamm("N", &n, &n, &n, &alpha, "DLNF", s, &lval, &idiag, &ndiag, vt, &n, &beta, temp, &n); double *temp2 = (double *)mkl_malloc(m * n * sizeof(double), 16); memset(temp2, 0, m * n * sizeof(double)); cblas_dgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans, m, n, n, 1.0, u, n, temp, n, 0, temp2, n);
It is by designed purpose as MKL support both foran interface and C interface. So for these SPARSE BLAS functions,
This routine supports only one-based indexing of the input arrays.
when A*B, if A is claimed in 1 based, then MKL supports only take column-oriented dense matrix B in 1-base case also. So the routine take B as column-oriented, as a result, it looks VT * S.
So maybe you can use the cblas function (c interface) to replace the functionality.
Computes a matrix-matrix product where one input
matrix is triangular.
void cblas_strmm (const CBLAS_LAYOUT Layout, const CBLAS_SIDE side, const CBLAS_UPLO
uplo, const CBLAS_TRANSPOSE transa, const CBLAS_DIAG diag, const MKL_INT m, const
MKL_INT n, const float alpha, const float *a, const MKL_INT lda, float *b, const
now I understand but only partially. So if routine mkl_ddiamm takes matrix B as column oriented, it explains why it seems as if order of multiplication was changed. If my matrix B was saved in column major order, all would be as in documentation. But what has column major order in common with one-based indexing ? Arent these different things ? I am lost in this. I noticed that note about "one-based indexing" in documentation, but didnt understand. There is not any parameter of mkl_ddiamm method like "index of array item" where this note would be appropriate. Perhaps its clear for you, but I am only beginning working with MKL, so forgive my silly questions.
I will use mkl_ddiamm method it seems to me more effective than cblas_?trmm for my case. Instead of multiplicating formula u * s * vt from right u * (s * vt), I will do it from left (u * s) * vt. There should be a little bit more multiplications involved, but the difference isnt significant.
Right, it brings some confusion here. The "one-based indexing" is not apply to the routine, but for all SP BLAS functions. We should add some notes there to clarify. Thanks for asking.
this routine supports only one-based indexing of the input arrays, thus the B and C ( if suitable) were column-oriented.
if considering the matrix structure, the mkl_ddiamm should be almost same effect as cblas_?trmm, But you may caculate the result either from right or left. For example. (u * s) * vt , where (u*s)t=s*ut. here two column-oriented make a row major, so the result should be ok.