Intel® oneAPI Math Kernel Library

Wrong results from cblas_sgemm


Here is the code:

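#include <cmath>
#include <cstddef>
#include <cstdio>
#include <mkl.h>

// Largest absolute value in vec[0..sz).
float max_val(const float * vec, size_t sz)
{
    float result = 0.0f;
    for (size_t i = 0; i < sz; ++i)
    {
        float val = std::fabs(vec[i]);
        if (val > result)
            result = val;
    }
    return result;
}

int main()
{
    const int M = 64;
    const int N = 50176;
    const int K = 576;
    const float alpha = 1.0f;
    const float beta = 0.0f;

    // A is M x K, B is K x N, C is M x N, all row-major and 32-byte aligned.
    float *A, *B, *C;
    A = (float *)mkl_malloc( M*K*sizeof( float ), 32 );
    B = (float *)mkl_malloc( K*N*sizeof( float ), 32 );
    C = (float *)mkl_malloc( M*N*sizeof( float ), 32 );
    for (size_t i = 0; i < M*K; ++i)
        A[i] = 1.0f;
    for (size_t i = 0; i < K*N; ++i)
        B[i] = 2.0f;
    for (size_t i = 0; i < M*N; ++i)
        C[i] = 1.0f;

    // C = alpha*A*B + beta*C; with beta = 0 the initial contents of C should be ignored.
    cblas_sgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans, M, N, K, alpha, A, K, B,
       N, beta, C, N);
    printf("%f\n", max_val(C,M*N));

    mkl_free(A);
    mkl_free(B);
    mkl_free(C);
    return 0;
}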



I compile it with:

icc -I/opt/intel/compilers_and_libraries_2016.0.109/linux/mkl/include -L/opt/intel/compilers_and_libraries_2016.0.109/linux/mkl/lib/intel64_lin test.cpp -o test_cblas -lmkl_rt

Every element of C should end up as 1152, since each one is the sum of K = 576 products of 1.0 and 2.0. However, when I run this, the program prints 1153.
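The expected value can be cross-checked without MKL. Below is a minimal sketch of a naive row-major reference, assuming densely packed matrices (lda = K, ldb = N, ldc = N). The names ref_sgemm_rowmajor and expected_c are hypothetical helpers made up for this illustration, not MKL functions, and the triple loop is only practical for a reduced N.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Naive row-major reference for C = alpha * A * B + beta * C.
// A is M x K, B is K x N, C is M x N, all densely packed.
// Hypothetical helper for cross-checking; not part of MKL.
void ref_sgemm_rowmajor(int M, int N, int K, float alpha,
                        const float* A, const float* B,
                        float beta, float* C)
{
    assert(M >= 0 && N >= 0 && K >= 0);
    for (int i = 0; i < M; ++i) {
        for (int j = 0; j < N; ++j) {
            float acc = 0.0f;
            for (int k = 0; k < K; ++k)
                acc += A[static_cast<std::size_t>(i) * K + k] *
                       B[static_cast<std::size_t>(k) * N + j];
            float* c = &C[static_cast<std::size_t>(i) * N + j];
            // With beta == 0 the old contents of C are never read.
            *c = (beta == 0.0f) ? alpha * acc : alpha * acc + beta * (*c);
        }
    }
}

// Reproduces the setup from the post (A = 1, B = 2, C = 1, alpha = 1)
// for the given sizes and beta, and returns the resulting C.
std::vector<float> expected_c(int M, int N, int K, float beta)
{
    std::vector<float> A(static_cast<std::size_t>(M) * K, 1.0f);
    std::vector<float> B(static_cast<std::size_t>(K) * N, 2.0f);
    std::vector<float> C(static_cast<std::size_t>(M) * N, 1.0f);
    ref_sgemm_rowmajor(M, N, K, 1.0f, A.data(), B.data(), beta, C.data());
    return C;
}
```

Calling expected_c(4, 8, 576, 0.0f) fills every element with 576 × 2, while passing beta = 1.0f adds the initial value of C on top, which is the same pattern as the stray values described in the post.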

Upon looking closer, it turns out that most of the values in C are 1152 except for a bunch of contiguous chunks that are 1153, or more generally, 1152+(initial_value_of_C_array).
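To see where those chunks sit, one option is to scan C for maximal runs of elements that differ from the expected value. This is a rough sketch only; find_mismatch_runs is a hypothetical helper name, and the exact float comparison is an assumption that holds here because 1152 and 1153 are exactly representable in single precision.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Returns (start, length) for every maximal contiguous run of elements
// in data[0..n) that differ from `expected`.
// Hypothetical diagnostic helper; not part of MKL.
std::vector<std::pair<std::size_t, std::size_t>>
find_mismatch_runs(const float* data, std::size_t n, float expected)
{
    assert(data != nullptr || n == 0);
    std::vector<std::pair<std::size_t, std::size_t>> runs;
    std::size_t i = 0;
    while (i < n) {
        if (data[i] == expected) {
            ++i;
            continue;
        }
        // Start of a run of wrong values: extend it as far as it goes.
        const std::size_t start = i;
        while (i < n && data[i] != expected)
            ++i;
        runs.emplace_back(start, i - start);
    }
    return runs;
}
```

For the program in the post, the call would be find_mismatch_runs(C, (size_t)M*N, 1152.0f). Since C is row-major with leading dimension N, start / N gives the row and start % N the column where each run begins.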

If I do this instead with CblasColMajor (and change the stride values, i.e. the leading dimensions, accordingly), everything works fine.

What is going on??

1 Reply

Hi Jacob,

There was a known {S,D}GEMM issue (beta=0 on Intel AVX2) in MKL 11.3 that was fixed in MKL 11.3.1. Please see the MKL 11.3.1 release notes for more details on the issue.

I verified that your tester fails with MKL 11.3, and switching to MKL 11.3.1 gives correct results.
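A generic way to test for this class of problem is to pre-fill C with NaN before a beta = 0 call: the BLAS convention is that C need not be set on input when beta is zero, so no NaN should survive. The sketch below is an illustration only. count_stale_with_beta_zero is a hypothetical helper, and the callable's parameter order (M, N, K, alpha, A, B, beta, C) is an assumption made for this example, not an MKL interface.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Calls gemm(M, N, K, alpha, A, B, beta, C) with beta = 0 and C pre-filled
// with NaN, then counts how many elements of C are still NaN.
// A correct GEMM overwrites all of C, so the expected result is 0.
// Hypothetical test helper; the callable signature is an assumption.
template <typename Gemm>
std::size_t count_stale_with_beta_zero(Gemm gemm, int M, int N, int K)
{
    assert(M > 0 && N > 0 && K > 0);
    std::vector<float> A(static_cast<std::size_t>(M) * K, 1.0f);
    std::vector<float> B(static_cast<std::size_t>(K) * N, 2.0f);
    std::vector<float> C(static_cast<std::size_t>(M) * N,
                         std::numeric_limits<float>::quiet_NaN());

    gemm(M, N, K, 1.0f, A.data(), B.data(), 0.0f, C.data());

    std::size_t stale = 0;
    for (float v : C)
        if (std::isnan(v))
            ++stale;  // old contents of C leaked into the result
    return stale;
}
```

To exercise MKL, the callable could be a lambda that forwards to cblas_sgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans, M, N, K, alpha, A, K, B, N, beta, C, N), as in the original test program. A nonzero return value would indicate that some part of C was accumulated into rather than overwritten.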

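If you are using MKL 11.3.0, could you update to MKL 11.3.1 or later and run your tester again? You can check which MKL version your program picks up by setting the following environment variable before running it; MKL then prints its version information on the first MKL call: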

export MKL_VERBOSE=1

Thank you,



