Intel® oneAPI Math Kernel Library
Ask questions and share information with other developers who use Intel® Math Kernel Library.

SVD multithreading bug in MKL

YoungTaek_Oh
Beginner
582 Views

Hi,

It seems like that mkl has a serious bug in ?gesvd.

First, I tried to find singular values using the Armadillo library with mkl backends, but results are different according to MKL_NUM_THREADS.

for example,

/* -*- mode: c++; -*- */
#include <armadillo>

int main(int argc, char *argv[])
{
    using namespace arma;

    mat M = randu<mat>(957, 957);
    mat U, V;
    vec s;

    svd(U, s, V, M);
    mat newM = U * diagmat(s) * V.t();

    printf("norm: %f\n", arma::norm(M - newM,2));
    return 0;
}

yields the following results

$ MKL_NUM_THREADS=1 ./main
norm: 0.000000
$ MKL_NUM_THREADS=2 ./main
norm: 0.000000
$ MKL_NUM_THREADS=3 ./main
norm: 0.000000
$ MKL_NUM_THREADS=4 ./main
norm: 0.000000
$ MKL_NUM_THREADS=5 ./main
norm: 371.303371
$ MKL_NUM_THREADS=6 ./main
norm: 138.622780
$ MKL_NUM_THREADS=7 ./main
norm: 138.622780

I wrote another program that uses mkl only, but it was worse.

/* -*- mode: c++; -*- */
#include <mkl.h>
#include <string.h>
#include <math.h>
#include <stdio.h>

int main(int argc, char *argv[])
{
    lapack_int M = 958;
    double *mat = new double[M * M];
    double *mat_orig = new double[M * M];
    double *s = new double;
    double *diags = new double[M*M];
    double *U = new double[M * M];
    double *Vt = new double[M * M];
    double *superb = new double[M-2];

    for (int i=0; i<M*M; ++i)
        mat = static_cast<double>(rand())/RAND_MAX;

    memcpy(mat_orig, mat, sizeof(double) * M * M);
    LAPACKE_dgesvd(LAPACK_COL_MAJOR, 'a', 'a', M, M, mat, M, s, U, M, Vt, M, superb);

    memset(diags, 0, sizeof(double) * M * M);
    for (int i=0; i<M; ++i)
        diags[i * M + i] = s;

    cblas_dgemm(CblasColMajor, CblasNoTrans, CblasNoTrans, M, M, M, 1.0, U, M, diags, M, 0.0, mat, M);
    cblas_dgemm(CblasColMajor, CblasNoTrans, CblasNoTrans, M, M, M, 1.0, mat, M, Vt, M, 0.0, U, M);

    double res = 0.0;
    for (int i=0; i<M*M; ++i) {
        double t = mat_orig - U;
        res += (t*t);
    }
    printf("norm: %f\n", sqrt(res));

    delete[] superb;
    delete[] Vt;
    delete[] U;
    delete[] diags;
    delete[] s;
    delete[] mat_orig;
    delete[] mat;
    return 0;
}

$ MKL_NUM_THREADS=1 ./main2
norm: 0.000000
$ MKL_NUM_THREADS=2 ./main2
norm: 0.000000
$ MKL_NUM_THREADS=3 ./main2
norm: 0.000000
$ MKL_NUM_THREADS=4 ./main2
norm: 0.000000
$ MKL_NUM_THREADS=5 ./main2
norm: 457.024091
$ MKL_NUM_THREADS=6 ./main2
Segmentation fault (core dumped)
$ MKL_NUM_THREADS=7 ./main2
Segmentation fault (core dumped)

What's wrong with this? I'm using the lastest mkl + intel compiler (11.0 update 3) in ubuntu 12.04 LTS, intel i7 990X machine.

0 Kudos
11 Replies
Gennady_F_Intel
Moderator
582 Views

thanks for the report, we will check the problem on our side.

0 Kudos
Gennady_F_Intel
Moderator
582 Views

how did you link the test - statically or dinamically? is that LP64 or ILP64 interfaces?

0 Kudos
YoungTaek_Oh
Beginner
582 Views

I used -mkl option, such as

icpc main.cpp -o main -mkl -O3

0 Kudos
Gennady_F_Intel
Moderator
582 Views

yes, I see the similar behavioir on my side with the latest 11.0 update 3.

icc -mkl test.cpp. 

# export MKL_NUM_THREADS=1
# a.out
norm: 0.000000

# export MKL_NUM_THREADS=2
# ./a.out
norm: 0.000000

# export MKL_NUM_THREADS=3
# ./a.out
norm: 0.000000

# export MKL_NUM_THREADS=4
# ./a.out
norm: 0.000000
# export MKL_NUM_THREADS=5
# ./a.out
norm: 457.024091
# export MKL_NUM_THREADS=6
# ./a.out
Segmentation fault (core dumped)

We will digg the issue and will keep you updated as much as possible.

0 Kudos
Gennady_F_Intel
Moderator
582 Views

the issue is escalated to the owner of SVD. here is the number of the issue for your reference : DPD200335246

0 Kudos
Gennady_F_Intel
Moderator
582 Views

the fix of the problem available in version 11.0 update4 which released the last week.

please check the problem and let us know the results.

0 Kudos
Gennady_F_Intel
Moderator
582 Views

are there any updates regard to this issue?

0 Kudos
mecej4
Honored Contributor III
582 Views

Gennady,

I am reading this thread rather late, and I find it slightly puzzling. I cannot comment about the Armadillo library, since I don't have it. However, the OP's straight MKL example has a user error:

[cpp]double *superb = new double[M-2];[/cpp]

should be

[cpp]double *superb = new double[M-1];[/cpp]

since the documentation states "On exit, superb(0:min(m,n)-2) contains...". 

This error may, of course, be additional to and independent of the suspected threading bug in MKL but, even with the latest MKL update, it caused seg-faults on Windows because the array is under-allocated..

0 Kudos
Gennady_F_Intel
Moderator
582 Views

mecej, thanks a lot. this is an additional probem. You absolutely right. The test is not correct and we didn't check this example after update 4 has been released. 

0 Kudos
Michael_H_5
Beginner
582 Views

I am still observing this problem 11.0 update 5. Can anybody else confirm this. Is this being worked on?

0 Kudos
Gennady_F_Intel
Moderator
582 Views

Michael, 

This is surprise for us. Do you use the example from the original post from YoungTaek O? if not then pls give us the exact example to check the problem on our side.

It would be useful to see the info about version of MKL and Processor type which have been used.

int version(void) {

  MKLVersion Version;
  mkl_get_version(&Version);
  printf("Major version: %d\n",Version.MajorVersion);
  printf("Minor version: %d\n",Version.MinorVersion);
  printf("Update version: %d\n",Version.UpdateVersion);
  printf("Product status: %s\n",Version.ProductStatus);
  printf("Build: %s\n",Version.Build);
  printf("Platform: %s\n",Version.Platform);
  printf("Processor optimization: %s\n",Version.Processor);
  return 0;
}

0 Kudos
Reply