mkl_blas DGEMM routine returns incorrect results

eshamay · ‎08-27-2010

Hopefully this isn't a silly error on my part (most likely), but I've come up with a small code example in order to showcase what I feel is an incorrect result from MKL's BLAS 3 dgemm routine. See below:

1 #include
2 #include
3 #include
4
5
6 void Print (double * m) {
7 for (int i=0;i<3;i++) {
8 for (int j=0;j<3;j++) {
9 printf ("% 5.3f ", m[i+3*j]);
10 }
11 printf ("\\n");
12 }
13 printf ("\\n");
14 }
15
16
17 int main () {
18
19 /* column-major order, as expected for the BLAS routines */
20 double h[9] = {4,3,2,3,4,2,1,2,3};
21 double k[9] = {4,3,2,3,4,2,1,2,3};
22
23 printf ("-----\\nH:\\n");
24 Print(h);
25
26 printf ("-----\\nK:\\n");
27 Print(k);
28
29 cblas_dgemm (CblasColMajor, CblasNoTrans, CblasNoTrans, 3, 3, 3, 1.0, h, 3, k, 3, 0.0, k, 3);
30
31 printf ("-----\\nK after dgemm:\\n");
32 Print(k);
33 return 0;
34 }

I'm running this using MKL 10.2.5.035, using g++ 4.4.1 on a 32-bit i686 Linux box. Other MKL routines work splendidly (i.e. I've performed matrix inversion via LU decomposition successfully with the mkl_lapack routines).

The result of this code is:

-----
H:
4.000 3.000 1.000
3.000 4.000 2.000
2.000 2.000 3.000

-----
K:
4.000 3.000 1.000
3.000 4.000 2.000
2.000 2.000 3.000

-----
K after dgemm:
0.000 0.000 0.000
0.000 0.000 0.000
0.000 0.000 0.000

If I reverse the order of 'h' and 'k' in dgemm such that the call looks like:
29 cblas_dgemm (CblasColMajor, CblasNoTrans, CblasNoTrans, 3, 3, 3, 1.0, k, 3, h, 3, 0.0, k, 3);

the change to the final "K after dgemm" looks like:
-----
K after dgemm:
11.000 167.000 1380.000
16.000 244.000 2016.000
12.000 186.000 1536.000

Both of these results are incorrect. Have I botched the call to the dgemm routine? Is this something more fundamental with the library?

Thanks,
~Eric

mecej4 · ‎08-27-2010

The DGEMM routine takes three matrices A,B,C and two scalars \alpha and \beta as input, computes the expression \alpha op(A).op(B) + \beta C, and returns the result in C.

You have aliased A and C, since you use k in their place. The result is 'implementation dependent'. You should not cause input arguments to clobber one another.

The MKL implementation has this feature. If \beta is zero, as in your example, C need not be set initially, since MKL will notice that \beta is zero, and zero out C before adding \alpha op(A).op(B). However, that has the effect of zeroing your A before it got used, since you made A and C one and the same; that is why the result is a null matrix.

Use another matrix, R, say, in place of the 13th argument, and print that after DGEMM has finished. You will get the correct result.

BLAS routines have been used millions of times over many decades. It is very improbable that errors in them would have remained unnoticed for so long!