As part of a 2D spline interpolation routine, I'm calling dgesv(). That routine is giving me a segmentation fault in MKL_get_N_Cores(). The debugger output is:
Dump of assembler code for function MKL_get_N_Cores:
0x0063f190 <+0>: push %ebx
0x0063f191 <+1>: push %esi
0x0063f192 <+2>: push %edi
0x0063f193 <+3>: push %ebp
0x0063f194 <+4>: sub $0x4ecc,%esp
0x0063f19a <+10>: call 0x63f19f
0x0063f19f <+15>: pop %edi
0x0063f1a0 <+16>: lea 0x4671c9(%edi),%edi
0x0063f1a6 <+22>: cmpl $0x1,0xa1f74(%edi)
0x0063f1ad <+29>: je 0x63f1c1
0x0063f1af <+31>: mov %edi,%ebx
0x0063f1b1 <+33>: call 0x63b820
0x0063f1b6 <+38>: mov %eax,%esi
0x0063f1b8 <+40>: cmpl $0xffffffff,0x3658(%edi)
0x0063f1bf <+47>: je 0x63f1cc
0x0063f1c1 <+49>: add $0x4ecc,%esp
0x0063f1c7 <+55>: pop %ebp
0x0063f1c8 <+56>: pop %edi
0x0063f1c9 <+57>: pop %esi
0x0063f1ca <+58>: pop %ebx
0x0063f1cb <+59>: ret
The crash occurs at the line: call . This is running on Ubuntu with g++ compiler. It was using the static libraries. I switched to the dynamic libraries and got the same fault. The 2D spline code is included in another app. I fed the same input file to the 2nd app and it works fine. I verified with the debugger that the arguments to dgesv with the two apps were identical. The app that crashes uses about 1.2GB of RAM while the app that doesn't crash uses about 100MB.
Any idea what's causing this, or suggestions for a workaround?
I'll try. I thought of another difference between the 2 apps. The app that fails is using real-time extensions (Xenomai). And the app that works is not. I'll write a small test app. If it works, I'll add a few Xenomai calls and see if it fails.
Thanks for taking the time and effort to create a small test case to reproduce the problem.
This is just my guess, however: the MKL_get_N_Cores function tries to recognize the CPU topology. If the Xenomai framework changes it by shadowing some CPU parameters (for example, CPU affinity), then MKL might get confused somehow, but it must not crash in any case.
Found the answer. When a Xenomai task is created, you tell it the
amount of stack space to allocate. From the documentation: "The size of
the stack (in bytes) for the new task. If zero is passed, a reasonable
pre-defined size will be substituted." We were passing 0. When I
increased the stack size to 1MB, the MKL calls no longer crashed.
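
For anyone who hits the same thing: the disassembly above shows MKL_get_N_Cores reserving 0x4ecc (20172) bytes of stack in its prologue (sub $0x4ecc,%esp), so a task with a small default stack could plausibly fault at the first call after it. Below is a minimal sketch of the same idea using plain POSIX threads rather than Xenomai. The names big_frame_worker and run_with_stack are made up for illustration, and the worker only imitates a routine with a large frame; it is not MKL code. With Xenomai, the equivalent knob would be the stack size argument passed at task creation (for example stksize in rt_task_create, if that is the call in use).

```cpp
#include <pthread.h>
#include <cassert>
#include <cstddef>

// Frame size taken from the disassembly: sub $0x4ecc,%esp (20172 bytes).
constexpr std::size_t kFrameBytes = 0x4ecc;

// Hypothetical stand-in for a routine with a large stack frame.
// It touches every byte of a local buffer so the stack is really used.
static void* big_frame_worker(void* arg) {
    volatile unsigned char scratch[kFrameBytes];
    for (std::size_t i = 0; i < kFrameBytes; ++i) {
        scratch[i] = static_cast<unsigned char>(i & 0xff);
    }
    unsigned long sum = 0;
    for (std::size_t i = 0; i < kFrameBytes; ++i) {
        sum += scratch[i];
    }
    *static_cast<unsigned long*>(arg) = sum;
    return nullptr;
}

// Runs the worker on a thread whose stack size is requested explicitly
// instead of relying on a default. Returns 0 on success, otherwise the
// pthread error code.
int run_with_stack(std::size_t stack_bytes, unsigned long* out) {
    assert(out != nullptr);
    pthread_attr_t attr;
    int rc = pthread_attr_init(&attr);
    if (rc != 0) {
        return rc;
    }
    rc = pthread_attr_setstacksize(&attr, stack_bytes);
    if (rc == 0) {
        pthread_t tid;
        rc = pthread_create(&tid, &attr, big_frame_worker, out);
        if (rc == 0) {
            rc = pthread_join(tid, nullptr);
        }
    }
    pthread_attr_destroy(&attr);
    return rc;
}
```

Calling run_with_stack(1 << 20, &sum) requests a 1MB stack, the size that fixed the crash here. Build with g++ -pthread. The right size for a real task depends on everything it calls, so treat 1MB as the value that worked in this case, not a general rule.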
As far as I can tell the files are there. Not sure what I need to do so you can access the files.