Intel® oneAPI Math Kernel Library
Ask questions and share information with other developers who use Intel® Math Kernel Library.
This community is designed for sharing of public information. Please do not share Intel or third-party confidential information here.

mkl_ddiamm bug ?


I am trying to multiply two matrices using mkl_ddiamm method:

C = A * B

where A is diagonal matrix 3x3 and B is general matrix 3x3. No matter what I try, i get as a result no A*B, but B*A. This is my sample code. It essentially does SVD decomposition of matrix A and checks, if the computed matrices U, S and VT satisfy all requirements according to theory i.e.

1. U * UT = I, where I us identity matrix

2. V * VT = I

3. U * S * VT = A

Result of temporary operation S * VT is not correct. In fact, the function mkl_ddiamm computes VT * S.

   // requirement: m >= n
   int m = 3;
   int n = 3;
   double *a = (double *)mkl_malloc(m * n * sizeof(double), 16);
   double *s = (double *)mkl_malloc(n * sizeof(double), 16);
   double *u = (double *)mkl_malloc(m * n * sizeof(double), 16);
   double *vt = (double *)mkl_malloc(n * n * sizeof(double), 16);
   double *superb = (double *)mkl_malloc((n-1) * sizeof(double), 16);
   // identity matrix m x m
   double *unit_m = (double *)mkl_malloc(m * m * sizeof(double), 16);

   for (int i = 0; i < m; i++)
      for (int j = 0; j < m; j++)
         unit_m[i*m+j] = i == j ? 1.0 : 0;

   // identity matrix n x n
   double *unit_n = (double *)mkl_malloc(n * n * sizeof(double), 16);

   for (int i = 0; i < n; i++)
      for (int j = 0; j < n; j++)
         unit_n[i*n+j] = i == j ? 1.0 : 0;

   a[0] = 1;   a[1] = 1; a[2] = 1;
   a[3] = 2.5; a[4] = 3; a[5] = 4;
   a[6] = 3;   a[7] = 2; a[8] = 1;

   lapack_int res = LAPACKE_dgesvd(LAPACK_ROW_MAJOR, 'S', 'S', m, n, a, n, s, u, n, vt, n, superb);

   // Checking correctness of SVD calculation ...

   // u * ut = I
   double *temp = (double *)mkl_malloc(m * m * sizeof(double), 16);
   memset(temp, 0, m * m * sizeof(double));
   cblas_dgemm(CblasRowMajor, CblasNoTrans, CblasTrans, m, m, n, 1.0, u, n, u, n, 0, temp, m);

   // v * vt = I
   memset(temp, 0, n * n * sizeof(double));
   cblas_dgemm(CblasRowMajor, CblasTrans, CblasNoTrans, n, n, n, 1.0, vt, n, vt, n, 0, temp, n);

   // u * s * vt = a
   memset(temp, 0, n * n * sizeof(double));
   int lval = 3;
   int idiag = 0;
   int ndiag = 1;
   double alpha = 1.0;
   double beta = 0;
   mkl_ddiamm("N", &n, &n, &n, &alpha, "DLNF", s, &lval, &idiag, &ndiag, vt, &n, &beta, temp, &n);
   double *temp2 = (double *)mkl_malloc(m * n * sizeof(double), 16);
   memset(temp2, 0, m * n * sizeof(double));
   cblas_dgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans, m, n, n, 1.0, u, n, temp, n, 0, temp2, n);


0 Kudos
4 Replies

Hi Jan, 

It is by designed purpose as MKL support both foran interface and C interface.  So  for these SPARSE BLAS functions,  



This routine supports only one-based indexing of the input arrays.

when A*B,  if  A is claimed  in 1 based,  then  MKL supports only  take column-oriented dense matrix B in 1-base case also. So the routine take B as column-oriented, as a result, it looks VT * S. 

So maybe you can use the cblas  function (c interface) to replace the functionality. 


Computes a matrix-matrix product where one input
matrix is triangular.
void cblas_strmm (const CBLAS_LAYOUT Layout, const CBLAS_SIDE side, const CBLAS_UPLO
uplo, const CBLAS_TRANSPOSE transa, const CBLAS_DIAG diag, const MKL_INT m, const
MKL_INT n, const float alpha, const float *a, const MKL_INT lda, float *b, const
MKL_INT ldb);

Best Regards,



Hello Ying,

now I understand but only partially. So if routine mkl_ddiamm takes matrix B as column oriented, it explains why it seems as if order of multiplication was changed. If my matrix B was saved in column major order, all would be as in documentation. But what has column major order in common with one-based indexing ? Arent these different things ? I am lost in this. I noticed that note about "one-based indexing" in documentation, but didnt understand. There is not any parameter of mkl_ddiamm method like "index of array item" where this note would be appropriate. Perhaps its clear for you, but I am only beginning working with MKL, so forgive my silly questions.


I will use mkl_ddiamm method it seems to me more effective than cblas_?trmm for my case. Instead of multiplicating formula u * s * vt from right u * (s * vt), I will do it from left (u * s) * vt. There should be a little bit more multiplications involved, but the difference isnt significant.


Hi Jan, 

Right, it brings some confusion here. The "one-based indexing" is not apply to  the routine, but for all SP BLAS functions.  We should add some notes there to clarify.  Thanks for asking. 

Please notes,

this routine supports only one-based indexing of the input arrays, thus the B and C ( if suitable) were column-oriented. 


if considering the matrix structure, the mkl_ddiamm should be almost same effect as cblas_?trmm,  But you may caculate  the result either from right or left.   For example.  (u * s) * vt   ,  where (u*s)t=s*ut.  here  two column-oriented make a row major, so the result should be ok. 

Best Regards,