Intel® oneAPI Math Kernel Library
Ask questions and share information with other developers who use Intel® Math Kernel Library.

Segmentation fault in MKL_get_N_Cores

bruce1661
Beginner
792 Views
As part of a 2D spline interpolation routine, I'm calling dgesv(). That routine is giving me a segmentation fault in MKL_get_N_Cores(). The debugger output is:

Dump of assembler code for function MKL_get_N_Cores:
0x0063f190 <+0>: push %ebx
0x0063f191 <+1>: push %esi
0x0063f192 <+2>: push %edi
0x0063f193 <+3>: push %ebp
0x0063f194 <+4>: sub $0x4ecc,%esp
0x0063f19a <+10>: call 0x63f19f
0x0063f19f <+15>: pop %edi
0x0063f1a0 <+16>: lea 0x4671c9(%edi),%edi
0x0063f1a6 <+22>: cmpl $0x1,0xa1f74(%edi)
0x0063f1ad <+29>: je 0x63f1c1
0x0063f1af <+31>: mov %edi,%ebx
0x0063f1b1 <+33>: call 0x63b820
0x0063f1b6 <+38>: mov %eax,%esi
0x0063f1b8 <+40>: cmpl $0xffffffff,0x3658(%edi)
0x0063f1bf <+47>: je 0x63f1cc
0x0063f1c1 <+49>: add $0x4ecc,%esp
0x0063f1c7 <+55>: pop %ebp
0x0063f1c8 <+56>: pop %edi
0x0063f1c9 <+57>: pop %esi
0x0063f1ca <+58>: pop %ebx
0x0063f1cb <+59>: ret

The crash occurs at the line: call . This is running on Ubuntu with the g++ compiler. The app was originally linked against the static libraries; I switched to the dynamic libraries and got the same fault. The 2D spline code is also included in another app. I fed the same input file to that second app and it works fine. I verified with the debugger that the arguments to dgesv() were identical in the two apps. The app that crashes uses about 1.2GB of RAM, while the app that doesn't crash uses about 100MB.

Any idea what's causing this, or suggestions for a workaround?

Bruce
0 Kudos
6 Replies
Sridevi_A_Intel
Employee
Dear Bruce,

Could you please give me a test case so that I can reproduce the crash and dig into what the problem could be?

Thanks,
Sridevi
bruce1661
Beginner
I'll try. I thought of another difference between the two apps: the app that fails uses real-time extensions (Xenomai), and the app that works does not. I'll write a small test app. If it works, I'll add a few Xenomai calls and see if it fails.

This may take a few days.

Bruce
barragan_villanueva_
Valued Contributor I
Bruce,

Thanks for your time and effort in creating a small test case to reproduce the problem.

This is just a guess, but the MKL_get_N_Cores function tries to detect the CPU topology. If the Xenomai framework changes it by shadowing some CPU parameters (for example, CPU affinity), then MKL might get confused. Even so, it should not crash.
bruce1661
Beginner
Attached are 2 test cases. The one built as a Linux program (mkltest) works. The Xenomai version (mklxentest) fails with the same fault.

I took the dgesv example and turned it into a function. In the Xenomai version, main() makes a couple of Xenomai calls to create the function as a task and run it.

Bruce
barragan_villanueva_
Valued Contributor I
Bruce,

Nothing was attached in your previous post. Please try again.

Also, it would be helpful to add a description of how to run your tests, e.g. how to create the Xenomai environment and run the second test.
bruce1661
Beginner
Found the answer. When a Xenomai task is created, you tell it how much stack space to allocate. From the documentation: "The size of the stack (in bytes) for the new task. If zero is passed, a reasonable pre-defined size will be substituted." We were passing 0. When I increased the stack size to 1MB, the MKL calls stopped crashing.
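
For anyone hitting this outside Xenomai, here is a rough sketch of the same idea using plain POSIX threads. It is an analogue, not the Xenomai API. The disassembly above shows MKL_get_N_Cores reserving 0x4ecc (20172) bytes of stack with `sub $0x4ecc,%esp`, so a task whose stack is smaller than that frame would plausibly fault on entry. The names `big_frame_worker` and `run_with_stack` are made up for illustration, and the worker is only a stand-in for a routine with a large frame; it does not call MKL.

```c
#include <pthread.h>
#include <stddef.h>

/* Frame size taken from the disassembly: sub $0x4ecc,%esp (20172 bytes). */
#define FRAME_BYTES 0x4ecc

/* Hypothetical stand-in for a routine with a large stack frame.
 * It touches every byte of a local buffer so the stack is really used. */
static void *big_frame_worker(void *arg)
{
    volatile unsigned char scratch[FRAME_BYTES];
    unsigned long sum = 0;
    size_t i;

    for (i = 0; i < FRAME_BYTES; i++)
        scratch[i] = (unsigned char)(i & 0xff);
    for (i = 0; i < FRAME_BYTES; i++)
        sum += scratch[i];

    *(unsigned long *)arg = sum;
    return NULL;
}

/* Runs the worker on a thread whose stack size is set explicitly instead
 * of relying on a default. Returns 0 on success, or the error number
 * reported by the first pthread call that failed. */
int run_with_stack(size_t stack_bytes, unsigned long *result)
{
    pthread_attr_t attr;
    pthread_t tid;
    int rc;

    rc = pthread_attr_init(&attr);
    if (rc != 0)
        return rc;

    /* Fails with EINVAL if stack_bytes is below the platform minimum. */
    rc = pthread_attr_setstacksize(&attr, stack_bytes);
    if (rc == 0)
        rc = pthread_create(&tid, &attr, big_frame_worker, result);
    pthread_attr_destroy(&attr);
    if (rc != 0)
        return rc;

    return pthread_join(tid, NULL);
}
```

Calling `run_with_stack(1024 * 1024, &sum)` from main() mirrors the 1MB setting that fixed the crash here. The point of the design is to pick the stack size deliberately whenever a task calls into library code whose stack needs you do not control.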

As far as I can tell the files are there. Not sure what I need to do so you can access the files.