Hopefully this is just a silly error on my part (most likely it is), but I've put together a small code example to showcase what I believe is an incorrect result from MKL's BLAS level-3 dgemm routine. See below:
I'm running this with MKL 10.2.5.035, compiled with g++ 4.4.1 on a 32-bit i686 Linux box. Other MKL routines work splendidly (e.g. I've performed matrix inversion via LU decomposition successfully with the mkl_lapack routines).
The DGEMM routine takes three matrices A, B, C and two scalars \alpha and \beta as input, computes the expression \alpha op(A).op(B) + \beta C, and returns the result in C.
You have aliased A and C, since you pass k for both. The result of that is implementation-dependent: input arguments must not clobber one another.
The MKL implementation exploits that freedom. If \beta is zero, as in your example, C need not be initialized at all: MKL notices that \beta is zero and zeroes out C before accumulating \alpha op(A).op(B). But since you made A and C one and the same, that zeroes your A before it gets used; that is why the result is a null matrix.
Use another matrix, say R, as the 13th argument, and print that after DGEMM has finished. You will get the correct result.
BLAS routines have been used millions of times over many decades. It is very improbable that errors in them would have remained unnoticed for so long!