Intel® oneAPI Math Kernel Library
Ask questions and share information with other developers who use Intel® Math Kernel Library.

Seg fault that disappears when compiling in debug mode?

Chris_W_8
Beginner
1,528 Views

Hi. I was hoping I might be able to find some guidance here.

I have a code that uses the Intel MKL library. It compiles fine, but when I run it, it either seg faults or produces nonsense results.

However, if I compile in debug mode (that is, with the -g flag), everything seems to work fine.

I was wondering if people had any suggestions as to how I might diagnose this problem. Obviously, if I try to use Inspector without compiling with the -g flag, I don't get the most helpful diagnostics.

I am purposely trying to avoid posting code and having someone tell me the answer. I am more interested in the cause of, and solution to, this kind of problem in general.

Thanks in advance

7 Replies
Zhang_Z_Intel
Employee

Do you know where in your code the seg fault happens? Although you said the code uses MKL, this is unlikely to be an MKL problem. The '-g' flag itself does not affect how MKL works, because MKL is a pre-compiled library. The flag only affects the parts of your program that you compile yourself. A few things you may want to check:

  • Does the debug mode use a different problem size than the release mode?
  • Is your code threaded? Does the sequential code (using one thread) work fine?
  • Are there any data races in your threaded code? Inspector may help you to identify these.
  • Are you referencing out-of-bound elements of arrays? Again, Inspector may help.
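
For the out-of-bounds check in particular, one low-tech way to confirm a suspicion without a debug build is to wrap an allocation in guard bands and test them after the suspect call. The following is only a minimal sketch: the names `guarded_alloc`, `guards_intact` and `guarded_free`, the guard size and the fill pattern are all hypothetical choices for illustration, not part of MKL or Inspector, and the layout assumes `malloc` returns memory aligned to at least 16 bytes.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define HEADER_BYTES 16   /* stores the user size; keeps 16-byte alignment */
#define GUARD_BYTES  64   /* hypothetical guard size on each side */
#define GUARD_FILL   0xA5 /* arbitrary, recognizable byte pattern */

/* Layout: [header: n][front guard][n user bytes][back guard] */
void *guarded_alloc(size_t n)
{
    unsigned char *raw = malloc(HEADER_BYTES + 2 * GUARD_BYTES + n);
    if (raw == NULL)
        return NULL;
    memcpy(raw, &n, sizeof n);                            /* remember n */
    memset(raw + HEADER_BYTES, GUARD_FILL, GUARD_BYTES);  /* front guard */
    memset(raw + HEADER_BYTES + GUARD_BYTES + n,
           GUARD_FILL, GUARD_BYTES);                      /* back guard */
    return raw + HEADER_BYTES + GUARD_BYTES;
}

/* Returns 1 if both guard bands still hold the fill pattern, 0 otherwise. */
int guards_intact(const void *user)
{
    assert(user != NULL);
    const unsigned char *raw =
        (const unsigned char *)user - GUARD_BYTES - HEADER_BYTES;
    size_t n;
    memcpy(&n, raw, sizeof n);
    const unsigned char *front = raw + HEADER_BYTES;
    const unsigned char *back = front + GUARD_BYTES + n;
    for (size_t i = 0; i < GUARD_BYTES; i++) {
        if (front[i] != GUARD_FILL || back[i] != GUARD_FILL)
            return 0;
    }
    return 1;
}

/* Releases a block obtained from guarded_alloc. */
void guarded_free(void *user)
{
    if (user != NULL)
        free((unsigned char *)user - GUARD_BYTES - HEADER_BYTES);
}
```

Calling `guards_intact` on each array right after every suspect routine narrows down which call first writes past the end of which buffer. It only catches writes that land within the guard bands, so it complements a tool like Inspector rather than replacing it.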

There are many possibilities. Without looking at your code, that's all the help I can provide.

Chris_W_8
Beginner

Hmmm. OK. If I run Inspector on the build compiled with the -g flag, it tells me that there is "Memory not deallocated" in the cblas_zcopy function?

I assume there is an existing memory problem that is just getting caught at the zcopy?

What exactly does the -g flag do?

Thanks

Chris_W_8
Beginner

I just enabled the gdb debugger while running Inspector on the code that was compiled without -g.

If I run the backtrace command, it says that I am in the zcsradd function?

This is why I am having trouble diagnosing this error. It always seems to show up inside the MKL functions, though I am sure the bug itself is not in MKL.

Thanks

SergeyKostrov
Valued Contributor II
>>...I am purposely trying to avoid posting code and having some one tell me the answer...

Here is a generic answer: an incorrect input parameter, or set of input parameters, is being passed to some MKL function.
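
Since the backtrace earlier in this thread points at a CSR sparse routine, one way to act on this is to validate the CSR index arrays just before the call. What follows is a rough sketch under assumptions: `check_csr` and `require_csr` are hypothetical helpers, not MKL functions; plain `int` is used where real code would use `MKL_INT`; and the index base is a parameter because the expected base (zero or one) depends on the routine, so check the MKL reference for the one you call.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Checks the row-pointer array ia (rows + 1 entries) and the column-index
 * array ja of a CSR matrix for internal consistency.
 * Returns 0 if consistent, otherwise a code for the first failed check:
 *   1 bad dimensions or NULL array, 2 ia[0] is not the index base,
 *   3 row pointers decrease, 4 column index out of range.
 */
int check_csr(int rows, int cols, int base, const int *ia, const int *ja)
{
    if (rows < 0 || cols < 0 || ia == NULL)
        return 1;
    if (ia[0] != base)
        return 2;
    for (int i = 0; i < rows; i++) {
        if (ia[i + 1] < ia[i])
            return 3;
    }
    if (ia[rows] - base > 0 && ja == NULL)
        return 1;
    for (int i = 0; i < rows; i++) {
        for (int k = ia[i] - base; k < ia[i + 1] - base; k++) {
            if (ja[k] < base || ja[k] >= cols + base)
                return 4;
        }
    }
    return 0;
}

/* Debug-build guard: abort before the library call if the input is bad. */
void require_csr(int rows, int cols, int base, const int *ia, const int *ja)
{
    int rc = check_csr(rows, cols, base, ia, ja);
    assert(rc == 0);
    (void)rc; /* silences the unused warning when NDEBUG disables assert */
}
```

A check like this costs one pass over the index arrays, which is usually cheap next to the sparse operation itself, and it turns a crash deep inside the library into a failure at the call site that produced the bad input.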
mecej4
Honored Contributor III

Chris W. wrote:
I am purposely trying to avoid posting code and having someone tell me the answer. I am more interested in the cause and solution to this kind of problem in general.

There are at least two explanations for seg faults going away when the -g (or /Zi) compiler option is used.

  • Any change in compiler options can change the patterns of memory access and the memory layout of subprogram arguments, local variables and temporaries. If the program uses uninitialized variables or contains array subscript errors, it is common for seg faults to appear or disappear depending on which compiler options are used.
  • Heisenbugs caused by the optimizer itself are rare, but crop up now and then. If the optimizer is generating incorrect code, it may be necessary to debug at the assembly level to localize and catch such bugs. A reasonable first step is to experiment with different optimization levels and code-generation options. An example of such a bug is one that brought me to these forums: http://software.intel.com/en-us/forums/topic/270147 .
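
For the first explanation, a simple way to make an uninitialized read show up the same way at every optimization level is to poison freshly allocated arrays so that any element never written stays recognizable. The sketch below assumes IEEE 754 doubles, where a value with all bits set is a NaN; `alloc_poisoned`, `count_nan` and `require_no_nan` are hypothetical helper names made up for this illustration.

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Allocates n doubles and sets every byte to 0xFF, which reads back as NaN. */
double *alloc_poisoned(size_t n)
{
    if (n > SIZE_MAX / sizeof(double))
        return NULL;                      /* size would overflow */
    double *p = malloc(n * sizeof *p);
    if (p != NULL)
        memset(p, 0xFF, n * sizeof *p);
    return p;
}

/* Counts how many of the n elements are NaN, i.e. possibly never written. */
size_t count_nan(const double *x, size_t n)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        if (isnan(x[i]))
            count++;
    }
    return count;
}

/* Debug-build guard to place before handing an array to a library routine. */
void require_no_nan(const double *x, size_t n)
{
    assert(count_nan(x, n) == 0);
}
```

With plain `malloc`, the garbage in an unwritten element depends on whatever the heap happened to contain, which is one reason the symptom can move around between builds. With the poison pattern, the NaN propagates through the arithmetic and the nonsense result becomes reproducible. The check cannot tell poison apart from a NaN the program computed legitimately.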
TimP
Honored Contributor III

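With icc, specifying -g without an explicit -O option changes the default optimization level from -O2 to -O0. That can hide some bugs, such as incorrect data initialization or, as mecej4 pointed out, those which vary with data placement. Adding -O2 explicitly alongside -g keeps the optimized code while still producing debug symbols.

It's also possible that less stack is used at -O0, leaving MKL more space for allocations.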

Chris_W_8
Beginner

OK. Thanks guys. I'll take a deeper look at my code.

Hopefully I'll let you know soon what was causing my error.
