Intel® oneAPI Math Kernel Library
Ask questions and share information with other developers who use Intel® Math Kernel Library.

Possible dgetrf IPIV issue

Thomas_M_9
Beginner
1,048 Views

Hello, I am attempting to use dgetrf to get an LU factorization of a square matrix as part of a large mex program. When I check the output of dgetrf, I find the IPIV contains both a 0 and a number which is size of the matrix. I checked the documentation and it says the zero should not be there.

I have been able to reproduce this error in a smaller test case:

The C script (test_case.c)

#include "mex.h"
#include <mkl.h>
#include <math.h>
#include <stdbool.h>
#include <string.h>
#include <matrix.h>

void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray * prhs[]) {


   int N = (int) mxGetScalar(prhs[0]);
   int M = (int) mxGetScalar(prhs[1]);

   double * A = (double*) malloc(sizeof(double) * N * N);
   memcpy(A, mxGetPr(prhs[2]), sizeof(double) * N * N);

   double * B = (double*) malloc(sizeof(double) * N * M);
   memcpy(B, mxGetPr(prhs[3]),sizeof(double) * N * M);

   mexPrintf("\nA:\n");
   for (int i = 0; i < N; i++) {
     for (int j = 0; j < N; j++) {
       mexPrintf("%f ", A[j * N + i]);
     }
     mexPrintf("\n");
   }

   mexPrintf("\nB:\n");
   for (int i = 0; i < N; i++) {
     for (int j = 0; j < M; j++) {
       mexPrintf("%f ", B[j * N + i]);
     }
     mexPrintf("\n");
   }

   lapack_int * pivots = (lapack_int*)malloc(sizeof(lapack_int)*N);

   int info = LAPACKE_dgetrf(CblasColMajor,N,N,A,N,pivots); //Turn HPH   into is LU decomp
   if (info != 0) {
     fprintf(stderr,"info = %d\n",info);
     mexErrMsgTxt("ERROR:smooth_history_simple-LAPACKE_dgetrf failed");
   }

   mexPrintf("\nLU_A\n");
   for (int i = 0; i < N; i++) {
     for (int j = 0; j < N; j++) {
       mexPrintf("%f ", A[i + j * N]);
     }
     mexPrintf("\n");
   }

   mexPrintf("ipiv:\n");
   for (int i = 0; i < N; i++) {
     mexPrintf("%d ", pivots);
   }

   LAPACKE_dgetrs(CblasColMajor,CblasTrans,N,M,A,N,pivots,B,M);

   mexPrintf("Solution:\n");
   for (int i = 0; i < N ; i++) {
     for (int j = 0; j < M; j++) {
       mexPrintf("%f ", B[i + N * j]);
     }
     mexPrintf("\n");
   }
}

My call to the Mex compiler is

mex CC="icc" CFLAGS="-D_GNU_SOURCE  -fexceptions -fPIC -fno-omit-frame-pointer -pthread -std=c99 -mkl" -v -largeArrayDims -I/opt/intel/mkl/include -L/opt/intel/mkl/lib/intel64 -lmkl_core -lmkl_intel_ilp64 -lmkl_intel_thread -lpthread -lm -liomp5 test_case.c

Then running the Mex script in matlab, I get

>> A = [0,5,5;2,9,0;6,8,8]

    A =

     0     5     5
     2     9     0
     6     8     8

    >> B = [1,0,0;0,1,0;0,0,1]


    B =

     1     0     0
     0     1     0
     0     0     1

>> test_case(3,3,A,B)
test_case(3,3,A,B)

A:
0.000000 5.000000 5.000000 
2.000000 9.000000 0.000000 
6.000000 8.000000 8.000000 

B:
1.000000 0.000000 0.000000 
0.000000 1.000000 0.000000 
0.000000 0.000000 1.000000 

LU_A
6.000000 8.000000 8.000000 
0.333333 6.333333 -2.666667 
0.000000 0.789474 7.105263 
ipiv:
3 0 2 
------------------------------------------------------------------------
       Segmentation violation detected at Mon Jul  6 14:48:38 2015
------------------------------------------------------------------------

Configuration:
  Crash Decoding     : Disabled
  Current Visual     : 0x25 (class 4, depth 24)
  Default Encoding   : US-ASCII
  GNU C Library      : 2.5 stable
  MATLAB Architecture: glnxa64
  MATLAB Root        : /opt/MATLAB/R2013a
  MATLAB Version     : 8.1.0.604 (R2013a)
  Operating System   : Linux 2.6.18-404.el5 #1 SMP Sat Mar 7 04:14:13 EST 2015 x86_64
  Processor ID       : x86 Family 6 Model 45 Stepping 2, GenuineIntel
  Virtual Machine    : Java 1.6.0_17-b04 with Sun Microsystems Inc. Java HotSpot(TM) 64-Bit Server VM mixed mode
  Window System      : Hummingbird - Open Text (13850), display localhost:35.0

Fault Count: 1


Abnormal termination:
Segmentation violation

Register State (from fault):
  RAX = 0000000000000001  RBX = 0000000000000066
  RCX = 00015858106d7848  RDX = 0000000000000003
  RSP = 00002b0b75e504c8  RBP = 0000000000000070
  RSI = 0000000000000003  RDI = 0000000000000000

   R8 = 00002b0b00000003   R9 = 0000000000000003
  R10 = 00002b0b94f288e0  R11 = 0000000000000000
  R12 = 0000000000000003  R13 = 0000000000000003
  R14 = 00000000106d77e0  R15 = 0000000000000003

  RIP = 00002b0b9c2f1aa3  EFL = 0000000000010206

   CS = 0033   FS = 0000   GS = 0000

Stack Trace (from fault):
[  0] 0x00002b0b9c2f1aa3 /opt/intel/composer_xe_2013.2.146/mkl/lib/intel64/libmkl_intel_ilp64.so+03218083 LAPACKE_dge_nancheck+00000035
[  1] 0x00002b0b9c1eccdd /opt/intel/composer_xe_2013.2.146/mkl/lib/intel64/libmkl_intel_ilp64.so+02149597 LAPACKE_dgetrs+00000141
[  2] 0x00002b0b9ab62c4e /lcl/mq/home/m1tjm01/projects/Hess/Kalman_Filter/test_case.mexa64+00007246 mexFunction+00000670
[  3] 0x00002b0b6cdeff8a           /opt/MATLAB/R2013a/bin/glnxa64/libmex.so+00110474 mexRunMexFile+00000090
[  4] 0x00002b0b6cdec0f9           /opt/MATLAB/R2013a/bin/glnxa64/libmex.so+00094457
[  5] 0x00002b0b6cdecf1c           /opt/MATLAB/R2013a/bin/glnxa64/libmex.so+00098076
[  6] 0x00002b0b646e36b2 /opt/MATLAB/R2013a/bin/glnxa64/libmwm_dispatcher.so+00562866 _ZN8Mfh_file11dispatch_fhEiPP11mxArray_tagiS2_+00000594
[  7] 0x00002b0b64b33256 /opt/MATLAB/R2013a/bin/glnxa64/libmwm_interpreter.so+02245206
[  8] 0x00002b0b64abf09d /opt/MATLAB/R2013a/bin/glnxa64/libmwm_interpreter.so+01769629
[  9] 0x00002b0b64ae7b0e /opt/MATLAB/R2013a/bin/glnxa64/libmwm_interpreter.so+01936142
[ 10] 0x00002b0b64ae4993 /opt/MATLAB/R2013a/bin/glnxa64/libmwm_interpreter.so+01923475
[ 11] 0x00002b0b64ae5797 /opt/MATLAB/R2013a/bin/glnxa64/libmwm_interpreter.so+01927063
[ 12] 0x00002b0b64b50e50 /opt/MATLAB/R2013a/bin/glnxa64/libmwm_interpreter.so+02367056
[ 13] 0x00002b0b646e36b2 /opt/MATLAB/R2013a/bin/glnxa64/libmwm_dispatcher.so+00562866 _ZN8Mfh_file11dispatch_fhEiPP11mxArray_tagiS2_+00000594
[ 14] 0x00002b0b64b1fdcb /opt/MATLAB/R2013a/bin/glnxa64/libmwm_interpreter.so+02166219
[ 15] 0x00002b0b64add7cc /opt/MATLAB/R2013a/bin/glnxa64/libmwm_interpreter.so+01894348
[ 16] 0x00002b0b64ad9e1d /opt/MATLAB/R2013a/bin/glnxa64/libmwm_interpreter.so+01879581
[ 17] 0x00002b0b64ada255 /opt/MATLAB/R2013a/bin/glnxa64/libmwm_interpreter.so+01880661
[ 18] 0x00002b0b6cbbdfae      /opt/MATLAB/R2013a/bin/glnxa64/libmwbridge.so+00139182
[ 19] 0x00002b0b6cbbe111      /opt/MATLAB/R2013a/bin/glnxa64/libmwbridge.so+00139537
[ 20] 0x00002b0b6cbbece5      /opt/MATLAB/R2013a/bin/glnxa64/libmwbridge.so+00142565 _Z8mnParserv+00000725
[ 21] 0x00002b0b644183d2         /opt/MATLAB/R2013a/bin/glnxa64/libmwmcr.so+00447442 _ZN11mcrInstance30mnParser_on_interpreter_threadEv+00000034
[ 22] 0x00002b0b643f79ac         /opt/MATLAB/R2013a/bin/glnxa64/libmwmcr.so+00313772
[ 23] 0x00002b0b643f7b88         /opt/MATLAB/R2013a/bin/glnxa64/libmwmcr.so+00314248
[ 24] 0x00002b0b6f5f15c6         /opt/MATLAB/R2013a/bin/glnxa64/libmwuix.so+00480710
[ 25] 0x00002b0b6f5fedf2         /opt/MATLAB/R2013a/bin/glnxa64/libmwuix.so+00536050
[ 26] 0x00002b0b63d68862    /opt/MATLAB/R2013a/bin/glnxa64/libmwservices.so+01845346
[ 27] 0x00002b0b63d6950f    /opt/MATLAB/R2013a/bin/glnxa64/libmwservices.so+01848591 _Z25svWS_ProcessPendingEventsiib+00001615
[ 28] 0x00002b0b643f85ef         /opt/MATLAB/R2013a/bin/glnxa64/libmwmcr.so+00316911
[ 29] 0x00002b0b643f8f5c         /opt/MATLAB/R2013a/bin/glnxa64/libmwmcr.so+00319324
[ 30] 0x00002b0b643f2592         /opt/MATLAB/R2013a/bin/glnxa64/libmwmcr.so+00292242
[ 31] 0x000000347dc0683d                             /lib64/libpthread.so.0+00026685
[ 32] 0x000000347d0d4fcd                                   /lib64/libc.so.6+00872397 clone+00000109


This error was detected while a MEX-file was running. If the MEX-file
is not an official MathWorks function, please examine its source code
for errors. Please consult the External Interfaces Guide for information
on debugging MEX-files.

If this problem is reproducible, please submit a Service Request via:
    http://www.mathworks.com/support/contact_us/

A technical support engineer might contact you with further information.

Thank you for your help.** This crash report has been saved to disk as /mq/home/m1tjm01/matlab_crash_dump.21353-1 **



MATLAB is exiting because of fatal error

From the output, it doesn't appear to be a MATLAB or Mex error. Instead, we see that IPIV produces both a 0 and a 3 where the matrix is 3x3. Does anyone know what is going on here?

0 Kudos
4 Replies
mecej4
Honored Contributor III
1,048 Views

Debugging MEX files is hampered by the fact that MKL error messages to standard error/output are swallowed up by Matlab unless you make special provisions to capture and display such messages.

The second argument to LAPACKE_dgetrs, the "trans" argument, should be a character constant with a value of 'N', 'T' or 'C'. The value that you pass, CblasTrans, is defined in mkl_cblas.h to be 112, which is seen as 'p'. If you had run a C-only version of your example code, you would have seen the error message "Intel MKL ERROR: Parameter 1 was incorrect on entry to DGETRS."

After I replaced the second argument on line-57 to 'T', I built and ran the Mex file on Windows7-X64 using Matlab R2007b-X64 and Intel C 2015. The test script ran without error messages, but I did not check the output results.

The reason that CblasTrans was chosen to be an enum with a value of 112 is probably related to the different parentages of C-BLAS and LAPACK-E. I am slightly curious to know what led you to use CblasTrans as the second argument.

0 Kudos
Thomas_M_9
Beginner
1,048 Views

Thank you for your feed back. I have corrected that issue with the "trans" argument. I think I went with the C-BLAS option because of the fact that there are a bunch of C-BLAS commands prior and I didn't read the doc closely enough.

However, I am still having the same problem as before. As a sanity check, I removed all the MEX parts from test_case.c, added a main, and compiled with icc. That worked fine. However, I went back to MEX and the problem was still there. So my conclusion is that it is a MEX or MATLAB issue. I will have to investigate further.

0 Kudos
mecej4
Honored Contributor III
1,048 Views

I checked the MEX file with ML R2015a-64 bit on Windows 7, as well. It worked and gave the correct results:

ipiv: 
3 2 3 

Solution:
-0.266667 0.059259 0.140741 
0.000000 0.111111 -0.111111 
0.166667 -0.037037 0.037037 

If you pre-multiply the "Solution" by AT, you will get the identity matrix, showing that the solution is correct.

I do not know how much experience you have building Mex files, but there are a few things to be aware of.

Matlab mex files can be built in a number of ways, and Matlab write-protects a mex file that it has just used. Thus, if you run a mex file from Matlab and then attempt to rebuild the mex file using a script without quitting Matlab, the write-protection causes the build to fail. Depending on how the build is done, you may not even see a message to indicate the failure. The net result is that you may be using the stale Mex file in Matlab without being aware of that.

From inside Matlab, you can issue the command

     clear mex

to cause Matlab to clear the lock on the mex file, and then build the new version of the mex file.

0 Kudos
Ying_H_Intel
Employee
1,048 Views

Hi Thomas, 

I can't test, but it seems related to the mode:  ILP64   and LP 64. 

Could you please to try -lmkl_intel_lp64 instead of -lmkl_intel_ilp64 ? 

or if with -lmkl_intel_ilp64, then  in compiler option you need -DMKL_ILP64,  and change the "int " to MKL_INT in test_case.c. 

mex CC="icc" CFLAGS="-D_GNU_SOURCE  -fexceptions -fPIC -fno-omit-frame-pointer -pthread -std=c99 -mkl" -v -largeArrayDims -I/opt/intel/mkl/include -L/opt/intel/mkl/lib/intel64 -lmkl_core -lmkl_intel_ilp64 -lmkl_intel_thread -lpthread -lm -liomp5 test_case.c

Best Regards,

Ying 

0 Kudos
Reply