Yung-Chieh_C_
Beginner
97 Views

compiling with GCC using ilp64 interface, keep getting seg-fault

Here is my C source file (gm.c), which multiplies two n-by-n matrices, A and B, using dgemm() from Intel MKL:

#include <stdio.h>
#include <stdlib.h>

#include "mkl.h"

int main(){

    int n = 500;

    double *a = (double*)mkl_malloc(sizeof(double)*n*n, 64);
    int i;
    for(i = 0;i<n;i++){
        int j;
        for(j = 0;j<n;j++){
            a[n*i+j] = (double)(rand()%10);
        }
    }
    double *b = (double*)mkl_malloc(sizeof(double)*n*n, 64);
    for(i = 0;i<n;i++){
        int j;
        for(j = 0;j<n;j++){
            b[n*i+j] = (double)(rand()%10);
        }
    }
    double *c = (double*)mkl_malloc(sizeof(double)*n*n, 64);
    for(i = 0;i<n;i++){
        int j;
        for(j = 0;j<n;j++)
            c[n*i+j] = (double)0;
    }

    //cblas_dgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans, n, n, n, 1, a, n, b, n, 1, c, n);
    double alpha = 1.0;
    double belta = 0.0;
    dgemm("N", "N", &n, &n, &n, &alpha, a, &n, b, &n, &belta, c, &n);
    //c; row-major
    //C = alpha AB + belta C

    return 0;
}

As you can see, I also tried the CBLAS API.

The problem is that with the following compile-and-link line the build succeeds, but execution ALWAYS ends in a segmentation fault:

gcc -m64 -L$(MKLROOT)/lib/intel64 gm.c $(MKLROOT)/lib/intel64/libmkl_intel_ilp64.so -lmkl_intel_thread  -lmkl_core -liomp5  -lm

This is just the line suggested in the getting-started material for MKL.

 

One observation: if I replace "ilp64" with "lp64" in that line, the program runs fine.

 

What can the bug be?

The source code of this library is not open, so I really don't know how to debug;

I've tested for a while, and I'm sure the fault occurs when running (cblas_)dgemm()...

I've tried things like MKL_INT, mkl_malloc(), etc., but they don't help at all!!!

 

So Technical help needed here...

Thank you so much for helping...this is kinda frustrating

12 Replies
Gennady_F_Intel
Moderator
97 Views

Check the section on compiling with ILP64 libraries in the MKL User's Guide: use the -DMKL_ILP64 compiler option.

Yung-Chieh_C_
Beginner
97 Views

Thank you very much for reminding me of that!!

Actually I've browsed lots of topics on this forum and seen many compile-and-link lines and options,

and I've definitely seen the -DMKL_ILP64 option.

But I am not sure what it means.

Is -D a GCC compile option or a link option? I have had trouble finding this option in the piles of documentation...

And what does it specify or do?

Sorry for bothering you with further questions.

Again thank you so much for helping.

Yung-Chieh_C_
Beginner
97 Views

An update:

I searched a little more, and now I get the point of -Dmacro: it is equivalent to #define macro 1;

and I believe this macro is used in code and library building.

But some things are still unclear:

---> When working with the LP64 interface I didn't define MKL_LP64, yet no such problem arises; why is that?

---> Does this mean that, of the two interfaces, the one with 32-bit integers (LP64) is the default? (That would not be unreasonable.)

Is this understanding OK?

Am I missing any point? If so, please tell me.

Thank you very much, appreciate your help!

mecej4
Black Belt
97 Views

Yung-Chieh C. wrote:

Is -D a GCC compile option or a link option? I have had trouble finding this option in the piles of documentation...

The -D "define" option has been used for the pre-processor phase of practically every C compiler since the late 1970s. See, for example, http://gcc.gnu.org/onlinedocs/gcc-4.8.2/gcc/Preprocessor-Options.html#Preprocessor-Options .
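To illustrate how a -D macro can change code during preprocessing, here is a minimal sketch. DEMO_ILP64, demo_int and demo_int_size are hypothetical names used only for this illustration; they stand in for MKL_ILP64 and MKL_INT, whose real definitions live in the MKL headers and may differ in detail.

```c
#include <stddef.h>
#include <stdint.h>

/* DEMO_ILP64 and demo_int are illustrative stand-ins, not MKL names.
   Building with -DDEMO_ILP64 selects the 64-bit typedef; building
   without it falls back to plain int. */
#ifdef DEMO_ILP64
typedef int64_t demo_int;
#else
typedef int demo_int;
#endif

/* Reports the size in bytes of the integer type chosen at compile time. */
size_t demo_int_size(void)
{
    return sizeof(demo_int);
}
```

Compiled as "gcc -DDEMO_ILP64 ...", demo_int_size() returns 8; compiled without the option it returns sizeof(int), which is 4 with GCC on x86-64. Nothing needs to be defined to get the default branch, which would explain why the LP64 build works without any extra macro.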

Yung-Chieh_C_
Beginner
97 Views

Thank you mecej4, finally I have the official documentation for this option.

Appreciate your help!!

 

Btw, the question of whether the LP64 interface is applied by default remains open...

But I think the answer is yes, because that seems to make sense, at least to me...

Anyone in doubt can assume this, at least until some usage or test result contradicts mine above.

 

TimP
Black Belt
97 Views

lp64 corresponds to the default (plain) int. When linking with gcc, you must specify either the ilp64 or the lp64 library. There is a simplified -mkl link option for icc, which implies lp64.
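As a rough sketch of why the mismatch matters: the program above passes &n where n is a plain int, while an ILP64 library reads integer arguments as 64-bit values. The code below only models that situation without calling MKL. read_dim_as_ilp64 and dim_seen_by_callee are hypothetical helpers, and the "neighbour" value stands for whatever 4 bytes happen to sit next to n in memory.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for an ILP64 routine: it reads its dimension
   argument as a full 64-bit integer, whatever the caller stored there. */
int64_t read_dim_as_ilp64(const void *dim)
{
    int64_t value;
    memcpy(&value, dim, sizeof value);   /* always reads 8 bytes */
    return value;
}

/* Models a caller that stored only a 32-bit n. The second array element
   plays the role of the unrelated bytes that follow n in memory. */
int64_t dim_seen_by_callee(int32_t n, int32_t neighbour)
{
    int32_t slot[2] = { n, neighbour };
    return read_dim_as_ilp64(slot);
}
```

On a little-endian machine such as x86-64, dim_seen_by_callee(500, 0) happens to give 500, but any non-zero neighbour yields a huge dimension, and a routine working with such a dimension can easily touch memory far outside the 500 by 500 arrays. In the real program the neighbouring bytes are not under your control, so the safe fix is to declare the integer arguments as MKL_INT and build with -DMKL_ILP64.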

Bernard
Black Belt
97 Views

 

>>>The problem is that with the following compile-and-link line the build succeeds, but execution ALWAYS ends in a segmentation fault:>>>

Do you have an option to execute your program under GDB?

Yung-Chieh_C_
Beginner
97 Views

@Prince: Thanks for offering that information; it will be helpful for my and others' future use of MKL!!

@illyapolak: That sounds great!

So, I ran the original executable (built without -DMKL_ILP64) under GDB; the result may give some information about where the segfault takes place:

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fffa7fff700 (LWP 26682)]
0x00007fffce956bcf in mkl_blas_avx_dgemm_mscale ()
   from $(MKLROOT)/lib/intel64/libmkl_avx.so

[backtrace]:

#0  0x00007fffce956bcf in mkl_blas_avx_dgemm_mscale ()
   from /tmp2/b99902074/intel/composer_xe_2013_sp1.0.080/composer_xe_2013_sp1.0.080/mkl/lib/intel64/libmkl_avx.so
#1  0x00007ffff677a987 in gemm_host ()
   from /tmp2/b99902074/intel/composer_xe_2013_sp1.0.080/composer_xe_2013_sp1.0.080/mkl/lib/intel64/libmkl_intel_thread.so
#2  0x00007ffff4e6d603 in L_kmp_invoke_pass_parms ()
   from /tmp2/b99902074/intel/composer_xe_2013_sp1.0.080/composer_xe_2013_sp1.0.080/compiler/lib/intel64/libiomp5.so
#3  0x00007fffffffca60 in ?? ()
#4  0x00007fffffffca68 in ?? ()
#5  0x00007fffffffca70 in ?? ()
#6  0x00007fffffffc7c8 in ?? ()
#7  0x00007fffffffc9a8 in ?? ()
#8  0x00007fffffffc998 in ?? ()
#9  0x00007fffffffc9d8 in ?? ()
#10 0x00007fffffffc990 in ?? ()
#11 0x00007fffa7ffeb10 in ?? ()
#12 0x00007ffff7dead37 in _dl_fixup (l=<optimized out>, reloc_arg=<optimized out>) at ../elf/dl-runtime.c:111
#13 0x00007ffff7df1275 in _dl_runtime_resolve () at ../sysdeps/x86_64/dl-trampoline.S:45
#14 0x00007ffff4e49594 in __kmp_invoke_task_func (gtid=-633748272) at ../../src/kmp_runtime.c:8494
#15 0x00007ffff4e484a1 in __kmp_launch_thread (this_thr=0x410bfde7da39c4d0) at ../../src/kmp_runtime.c:7081
#16 0x00007ffff4e6d996 in __kmp_launch_worker (thr=0x410bfde7da39c4d0) at ../../src/z_Linux_util.c:746
#17 0x00007ffff48d0062 in start_thread (arg=0x7fffa7fff700) at pthread_create.c:312
#18 0x00007ffff4604a3d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

It seems to me that the fault is generated while executing the code of that function inside that shared library...

Note that the shared library is still being linked dynamically once the program has started; also, when I use the list command in gdb, it shows JUST the five lines above and below int main(){.

My guess is that the fault is caused, just when LOADING the library along with the function, by not using the -DMKL_ILP64 option;

but this is purely a guess on my part and needs to be confirmed, so take it just as a reference at this point.

Bernard
Black Belt
97 Views

The GDB output seems insufficient to pinpoint the problem effectively. Is there any option to display the register context, the faulting IP, and the arguments of the called functions?

I think the segfault occurred in this function call: #0  0x00007fffce956bcf in mkl_blas_avx_dgemm_mscale ().

The parameters were probably passed by the calling function, #2  0x00007ffff4e6d603 in L_kmp_invoke_pass_parms (). If we could look at the arguments being passed and check the validity of their addresses, it might shed some light on the root cause of the segfault.
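Since the frames inside MKL carry no debug information, one way to look at the arguments is to check them on the caller's side before dgemm() is entered. The following is only a sketch: gemm_nn_args_look_sane is a hypothetical helper, not part of MKL, and it assumes the column-major "N", "N" case, where the reference BLAS requires lda >= max(1, m), ldb >= max(1, k) and ldc >= max(1, m).

```c
#include <stddef.h>

/* Returns 1 if ld * cols elements fit into an array of `elems` elements.
   Division is used so that a garbage ld cannot overflow the product. */
static int fits(long long ld, long long cols, size_t elems)
{
    if (cols == 0)
        return 1;
    return (unsigned long long)ld <= elems / (unsigned long long)cols;
}

/* Hypothetical pre-call check for C = alpha*A*B + beta*C with no transposes,
   column-major storage. a_elems, b_elems and c_elems are the numbers of
   doubles actually allocated. The size test is deliberately conservative. */
int gemm_nn_args_look_sane(long long m, long long n, long long k,
                           long long lda, long long ldb, long long ldc,
                           size_t a_elems, size_t b_elems, size_t c_elems)
{
    if (m < 0 || n < 0 || k < 0)
        return 0;
    if (lda < (m > 1 ? m : 1) || ldb < (k > 1 ? k : 1) || ldc < (m > 1 ? m : 1))
        return 0;
    return fits(lda, k, a_elems) && fits(ldb, n, b_elems) && fits(ldc, n, c_elems);
}
```

For the program in this thread the call would be gemm_nn_args_look_sane(n, n, n, n, n, n, n*n, n*n, n*n) with the arithmetic done in 64 bits. A check like this only sees the values as the caller understands them; it cannot detect that the library reads them with a different integer width, so it complements inspecting the registers in GDB rather than replacing it.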

Yung-Chieh_C_
Beginner
97 Views

Now, here is another problem I've found while testing:

In the compile line above, I link the application with the shared library $(MKLROOT)/lib/intel64/libmkl_intel_ilp64.so,

with the help of the -L$(MKLROOT)/lib/intel64 option.

As far as I know, if -static is not specified, the shared version is preferred when it exists; I checked this by

replacing $(MKLROOT)/lib/intel64/libmkl_intel_ilp64.so with -lmkl_intel_ilp64, and things go as expected, working fine.

 

Now, when I try to use the static version of the library instead, i.e., $(MKLROOT)/lib/intel64/libmkl_intel_ilp64.a, by adding the -static option, like this:

-L$(MKLROOT)/lib/intel64  -static -lmkl_intel_ilp64(or specifically: $(MKLROOT)/lib/intel64/libmkl_intel_ilp64.a),

a problem arises: the build doesn't even complete, and an undefined reference to cblas_dgemm() is reported.

 

Is this normal or not? Does this mean that only the shared version of the library supports the CBLAS API?

But if so, why is the static version also provided? This seems unreasonable to me.

I'm wondering if I'm doing something wrong here... does anyone recognize this situation?

Really, thank you so much for the help!!!

Bernard
Black Belt
97 Views

>>>Is this normal or not? Does this mean that only the shared version of the library supports the CBLAS API?

But if so, why is the static version also provided? This seems unreasonable to me.>>>

Sorry, I do not have an answer.

If you would like to investigate that segfault more deeply and provide more GDB output, I could offer you more help.

TimP
Black Belt
97 Views

When you switch to the MKL static libraries, you must follow the specific advice generated by the link line advisor: https://software.intel.com/en-us/articles/intel-mkl-link-line-advisor . I doubt that you can mix MKL .a and .so objects. You must also take the circular references between the libraries into account, preferably by the method shown in the link line advisor.