Garnier__Maxime
Beginner
305 Views

Using Intel MKL with Armadillo: segmentation fault

I have been searching for a while now, but nothing I found has solved my problem.

I installed the Intel MKL libraries on my Ubuntu 18.04 machine and they are correctly linked to NumPy and SciPy. Now I want to do the same thing with Armadillo in C++. I installed it following the readme.txt instructions provided by the Armadillo project (using `cmake`). I checked that it correctly detected the presence of MKL, and it did.
To check that it works, I build a matrix and diagonalize it using the following code:

    #include <iostream>
    #include <armadillo>

    using namespace std;
    using namespace arma;

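    int main()
    {
        wall_clock timer;
        int dim = 100; // matrix dimension; the crash appears at dim = 500

        // Random complex matrix and the Hermitian product D = C^H * C
        cx_mat C = randu<cx_mat>(dim, dim);
        cx_mat D = C.t() * C;

        vec eigval2;     // real eigenvalues of the Hermitian matrix
        cx_mat eigvec2;  // complex eigenvectors

        timer.tic(); // Start the clock

        eig_sym(eigval2, eigvec2, D);

        double n = timer.toc(); // Elapsed seconds since tic()

        cout << "Elapsed time: " << n << " seconds" << endl;
        cout << eigval2 << endl;

        return 0;
    }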

which is very basic. The problem is that when I run it with a matrix dimension of 500 (`dim = 500`), I get a segmentation fault (core dumped) and nothing else. I don't know whether this has to do with the linking to MKL or just with the Armadillo install. Note that I don't know how Armadillo "knows" that I want to compile and run using MKL, since I also have OpenBLAS, LAPACK and BLAS installed and I just use

 `g++ example.cpp -o example -O2 -larmadillo && ./example`.

I also tried commenting out `#define ARMA_USE_LAPACK` and `#define ARMA_USE_BLAS` in the "include/armadillo_bits/config.hpp" file and rebuilding everything, but nothing changed.

I see a lot of answers pointing to a linking problem with MKL and redirecting to the Intel Link Line Advisor, but I have no idea what half of the parameters mean or where I have to make the necessary changes.

I would appreciate any hints/advice/references to solve that problem.

Thanks in advance,

 

13 Replies
Gennady_F_Intel
Moderator

Maxime, could you please set the environment variable MKL_VERBOSE (`export MKL_VERBOSE=1`) and run your executable again? If the MKL eigenvalue routine has been called, you will see this information in the output.

Garnier__Maxime
Beginner

Thanks for the quick answer. Doing that yields

MKL_VERBOSE Intel(R) MKL 2018.0 Update 3 Product build 20180406 for Intel(R) 64 architecture Intel(R) Advanced Vector Extensions 512 (Intel(R) AVX-512) enabled processors, Lnx 1.70GHz lp64 intel_thread
MKL_VERBOSE ZHERK(U,C,500,500,0x7fffb6753810,0x7f3d148b3040,500,0x7fffb6753820,0x7f3d124e2040,500) 158.61ms CNR:OFF Dyn:1 FastMM:1 TID:0  NThr:16

which I guess means nothing happened on the MKL side.
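
A note on reading this output, offered as a sketch rather than a description of documented MKL behaviour: each `MKL_VERBOSE` line after the banner names one routine and includes its timing, so as far as I understand a line is written once that call has returned. Here `ZHERK` (presumably the `C.t()*C` product) did run inside MKL, while no `ZHEEVD` line appears, which would be consistent with the crash happening during the eigenvalue call. The helper below is hypothetical (the names `mkl_verbose_routine` and `mkl_verbose_routines` are not part of MKL); it only extracts routine names from captured log text so two runs can be compared.

```cpp
#include <cctype>
#include <sstream>
#include <string>
#include <vector>

// Extract the routine name from one MKL_VERBOSE log line, e.g.
// "MKL_VERBOSE ZHERK(U,C,500,...) 158.61ms ..." -> "ZHERK".
// Returns an empty string for the banner line and for any other line.
std::string mkl_verbose_routine(const std::string& line) {
    const std::string prefix = "MKL_VERBOSE ";
    if (line.compare(0, prefix.size(), prefix) != 0) return "";
    const std::size_t paren = line.find('(', prefix.size());
    if (paren == std::string::npos) return "";
    const std::string name = line.substr(prefix.size(), paren - prefix.size());
    if (name.empty()) return "";
    // Assumption: routine names are upper-case identifiers. This also
    // rejects the banner, whose first '(' follows "Intel".
    for (unsigned char ch : name)
        if (!(std::isupper(ch) || std::isdigit(ch) || ch == '_')) return "";
    return name;
}

// Collect, in order, the routine names found in a multi-line log.
std::vector<std::string> mkl_verbose_routines(const std::string& log) {
    std::vector<std::string> names;
    std::istringstream in(log);
    std::string line;
    while (std::getline(in, line)) {
        const std::string name = mkl_verbose_routine(line);
        if (!name.empty()) names.push_back(name);
    }
    return names;
}
```

Applied to the two lines above, this yields only `ZHERK`; a run that completes the eigenvalue call would be expected to list a second routine as well.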

Gennady_F_Intel
Moderator

Yes, I think so. If MKL prints this output, that means the problem happens later.

Garnier__Maxime
Beginner

Do you mean that MKL was not even called by the program, or that it was called correctly and no problem was detected?

Do you have any idea how to diagnose and fix this? Maybe by doing the linking manually? I have seen people change one of their makefiles, but I don't know exactly how to do that.

Garnier__Maxime
Beginner

A short update: for smaller matrices, there is no segmentation fault and everything works correctly. MKL says:

MKL_VERBOSE Intel(R) MKL 2018.0 Update 3 Product build 20180406 for Intel(R) 64 architecture Intel(R) Advanced Vector Extensions 512 (Intel(R) AVX-512) enabled processors, Lnx 1.70GHz lp64 intel_thread
MKL_VERBOSE ZHERK(U,C,100,100,0x7ffc12cb6cf0,0x7fd9aad66040,100,0x7ffc12cb6d00,0x7fd9aad3e040,100) 22.54ms CNR:OFF Dyn:1 FastMM:1 TID:0  NThr:16
MKL_VERBOSE ZHEEVD(V,U,100,0x7fd9aacc6040,100,0x55e38ede2e60,0x7fd9aac76040,20400,0x7fd9aac25040,41002,0x55e38ede31c0,1509,0) 57.25ms CNR:OFF Dyn:1 FastMM:1 TID:0  NThr:16

Thanks.

Gennady_F_Intel
Moderator

OK, then could you please try making the call sequential? Run `export MKL_NUM_THREADS=1` and check how this code behaves on your side.

Garnier__Maxime
Beginner

Thanks for your answer.

Doing this does indeed solve the problem. Should I then use some OpenMP option to allow the use of all threads?

Gennady_F_Intel
Moderator

1) This affects MKL functions only: all MKL functions will execute on one thread.

2) For C/C++ code, as a temporary workaround you can do the following:

    mkl_set_num_threads(1);

    zheevd(...)

    mkl_set_num_threads(/* number of threads used before the first call to mkl_set_num_threads */);

3) Nevertheless, these symptoms indicate some problem with ?heev, but we are not aware of any problem with this function in MKL 2018 u3. If possible, please create a C or C++ reproducer and share it with us. We will take a look.

--Gennady
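
The set/call/restore pattern in point 2 can be wrapped in a small scope guard so the previous thread count is restored even on an early return or exception. This is only a sketch: `ScopedThreadCount` is a hypothetical name, and the getter and setter are passed in as callables so that the block compiles without MKL. With MKL one would presumably pass `mkl_get_max_threads` and `mkl_set_num_threads` from `<mkl.h>`; that pairing is an assumption about how to read back "the number of threads you used before".

```cpp
#include <functional>
#include <utility>

// Sets a thread count on construction and restores the previous value on
// destruction. The getter/setter are injected so this sketch has no
// dependency on MKL itself.
class ScopedThreadCount {
public:
    ScopedThreadCount(std::function<int()> get,
                      std::function<void(int)> set,
                      int n)
        : set_(std::move(set)), previous_(get()) {
        set_(n);  // switch to the requested thread count
    }

    ~ScopedThreadCount() { set_(previous_); }  // restore the saved count

    ScopedThreadCount(const ScopedThreadCount&) = delete;
    ScopedThreadCount& operator=(const ScopedThreadCount&) = delete;

private:
    std::function<void(int)> set_;
    int previous_;
};

// Usage with MKL (assumption: <mkl.h> is included and MKL is linked):
//
//   {
//       ScopedThreadCount guard(mkl_get_max_threads, mkl_set_num_threads, 1);
//       eig_sym(eigval2, eigvec2, D);  // eigenvalue call runs on one thread
//   }                                  // previous thread count restored here
```

Only the guarded call becomes sequential; other MKL calls in the program keep the original thread count.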

Garnier__Maxime
Beginner

I didn't quite understand the reply, sorry.

I tried compiling the code using g++ with the -fopenmp option (with the appropriate installation of Armadillo) and it seemed to work with MKL_NUM_THREADS set to 2. But the same error happened again with 4.

I can create a reproducer. How do you want me to send it? Do you need any other information on the MKL and Armadillo configuration?

 

-- 

Maxime

 

 

Gennady_F_Intel
Moderator

It seems zheevd seg-faulted with more than one thread and problem sizes around 500. We don't see a similar problem on our side. How could we check your case? We need C or Fortran code that we can compile and execute on our side to check whether the problem exists there too.

Garnier__Maxime
Beginner

The C++ code I use is the one I attached to my original post. The compilation command is there as well.

 

 

Gennady_F_Intel
Moderator

The original post contains Armadillo calls, which we cannot work with. We need the code without calls to this library, just pure C/C++ code.

Garnier__Maxime
Beginner

 

OK, but I have no idea how to do that. Sorry, but how will we detect the problem if Armadillo is not called?

Thanks 
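
For what it is worth, a reproducer without Armadillo might look roughly like the sketch below. The idea is that if the same crash occurs when `ZHEEVD` is called directly, Armadillo is ruled out; if it does not, the Armadillo build or its linking becomes the suspect. Everything here is an assumption for illustration: the helper names are made up, `USE_MKL` is a hypothetical macro to pass as `-DUSE_MKL` when building against MKL, and the `LAPACKE_zheevd` branch has not been checked against MKL 2018 Update 3. Without `-DUSE_MKL` the block only builds the Hermitian input and checks it.

```cpp
#include <complex>
#include <cstddef>
#include <random>
#include <vector>

#ifdef USE_MKL  // hypothetical macro: define it only when linking against MKL
#define MKL_Complex16 std::complex<double>  // let MKL use std::complex
#include <mkl.h>
#endif

using cplx = std::complex<double>;

// Build D = C^H * C in column-major order, where C is n x n with real and
// imaginary parts drawn uniformly from [0,1). This mirrors
// randu<cx_mat>(dim,dim) followed by C.t()*C in the Armadillo program.
std::vector<cplx> make_hermitian(int n, unsigned seed = 42) {
    const std::size_t N = static_cast<std::size_t>(n);
    std::mt19937 gen(seed);
    std::uniform_real_distribution<double> u(0.0, 1.0);
    std::vector<cplx> C(N * N);
    for (auto& z : C) z = cplx(u(gen), u(gen));
    std::vector<cplx> D(N * N);
    for (std::size_t j = 0; j < N; ++j)
        for (std::size_t i = 0; i < N; ++i) {
            cplx s(0.0, 0.0);
            for (std::size_t k = 0; k < N; ++k)
                s += std::conj(C[k + i * N]) * C[k + j * N];
            D[i + j * N] = s;  // D(i,j) = sum_k conj(C(k,i)) * C(k,j)
        }
    return D;
}

// Check D(i,j) == conj(D(j,i)) up to an absolute tolerance.
bool is_hermitian(const std::vector<cplx>& D, int n, double tol) {
    const std::size_t N = static_cast<std::size_t>(n);
    for (std::size_t j = 0; j < N; ++j)
        for (std::size_t i = 0; i <= j; ++i)
            if (std::abs(D[i + j * N] - std::conj(D[j + i * N])) > tol)
                return false;
    return true;
}

#ifdef USE_MKL
// Call ZHEEVD through the LAPACKE interface; D is overwritten with the
// eigenvectors and w receives the eigenvalues. Returns LAPACK's info code.
int eigenvalues_zheevd(std::vector<cplx>& D, int n, std::vector<double>& w) {
    w.assign(static_cast<std::size_t>(n), 0.0);
    return static_cast<int>(
        LAPACKE_zheevd(LAPACK_COL_MAJOR, 'V', 'U', n, D.data(), n, w.data()));
}
#endif
```

A `main` that calls `make_hermitian(500)` and then `eigenvalues_zheevd` with `MKL_NUM_THREADS` left unset would exercise the same routine and problem size as the failing run.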

 
