Intel® oneAPI Math Kernel Library

Scipy Crash with MKL

John_D_
Beginner
916 Views

hi guys

SciPy crashes with a segmentation fault when built against MKL on RHEL 6.5.

Here is the configuration I use when I build SciPy:

  • CFLAGS : -m64
  • LDFLAGS : -lmkl_gf_lp64 -lmkl_gnu_thread -lmkl_core -lgomp -lpthread -lm

The Numpy site.cfg has this configuration:

mkl_libs = 'mkl_gf_lp64, mkl_gnu_thread, mkl_core, mkl_def'
lapack_libs = 'mkl_lapack95_lp64'

I saw that a bug (segmentation fault) has been fixed in the latest release of MKL:

DPD200405571 CDOTU produces segmentation fault when compiled by gfortran linked with -lmkl_rt

Unfortunately, that fix only covers mkl_rt. As you can see, I still hit the bug with the non-rt libraries.

I also tried to add FFLAGS as suggested here:

https://software.intel.com/en-us/articles/mkl-single-dynamic-library-libmkl-rtso-does-not-conform-to-the-gfortran-calling-convention

That did not help either.

When I launch the SciPy tests, this is the segmentation fault I get:

python -c "import scipy; scipy.test(verbose=2)"

... # some tests running
test_nonlin.TestJacobianDotSolve.test_anderson ... Segmentation fault

So, what do you think? I have tried MKL 11.1 on RHEL 6.5 and MKL 10.3 update 3 on Debian.

Current configuration on RHEL is:

  • gcc/gfortran version 4.4.7 20120313 (Red Hat 4.4.7-4) (GCC)
  • Red Hat Enterprise Linux Server release 6.5 (Santiago)
  • Using BLAS from the Red Hat installation

I can't install MKL 11.1.3 (the latest version) for evaluation because the license key was never sent to me.

Anyway, how can I work around this?

Is there a known solution, or a known issue, regarding this crash?

Thanks !

Chao_Y_Intel
Moderator

Hello,

For the linkage line, actually the last "mkl_def" is not needed:
mkl_libs = 'mkl_gf_lp64, mkl_gnu_thread, mkl_core, mkl_def'

It needs to change to:
mkl_libs = 'mkl_gf_lp64, mkl_gnu_thread, mkl_core'

Also, does this problem happen with one particular MKL function, or do all functions fail?

Thanks,
Chao

John_D_
Beginner

Hello Chao

Thanks for the recommendation, I will remove the mkl_def linkage in Scipy.

Regarding your question, this always happens in the same test, in the same function.

Here is the stack trace with gdb attached to Python:

(gdb) where
#0  0x00007fffec60c208 in L_not_one_step_loopgas_1 () from /.../mkl/lib/intel64/libmkl_def.so
#1  0x00000000006010a0 in ?? ()
#2  0x00000000014c0d00 in ?? ()
#3  0x00000000013d7df0 in ?? ()
#4  0x00000000013d7df0 in ?? ()
#5  0x00000000006010a0 in ?? ()
#6  0x00007ffff073600e in zdotc_gf () from /.../mkl/lib/intel64/libmkl_gf_lp64.so
#7  0x00007ffff073631e in zdotc_ () from /.../mkl/lib/intel64/libmkl_gf_lp64.so
#8  0x0000000000000000 in ?? ()

 

As you can see, it seems zdotc is the cause of the problem.
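To narrow this down without running the whole test suite, the routine from the backtrace can be called directly. The following is only a sketch: it assumes the installed SciPy exposes zdotc through scipy.linalg.blas, and check_zdotc is a hypothetical helper name, not something from SciPy or MKL.

```python
import numpy as np
from scipy.linalg import blas


def check_zdotc(n=8):
    """Call BLAS zdotc directly and compare it with a NumPy-only reference.

    check_zdotc is a hypothetical helper used for illustration.
    """
    # Small complex128 vectors with integer-valued parts, so the result is exact.
    x = np.arange(1, n + 1) + 1j * np.arange(n, 0, -1)
    y = np.arange(n, 0, -1) - 2j * np.arange(1, n + 1)

    # zdotc computes sum(conj(x) * y); this is the call seen in the backtrace.
    got = blas.zdotc(x, y)

    # Reference computed element-wise, without going through a BLAS dot routine.
    expected = np.sum(np.conj(x) * y)
    return got, expected
```

If the BLAS linkage is healthy, both values agree. A crash or a wrong value from this call alone would point at how the complex return value of zdotc is passed back, rather than at the SciPy test itself.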

I'm using mkl 11.1.0.08 on Rhel and tried mkl 10.3.3.174 on Debian 5 with the same results.

Thanks

Chao_Y_Intel
Moderator

Hi, 
I see the error happens in
/.../mkl/lib/intel64/libmkl_def.so

libmkl_def.so contains the default code path, optimized for older processors. If you have a recent processor, MKL is not expected to take this path.

Please try removing libmkl_def.so from the link line and see whether the code works.

Thanks,
Chao

John_D_
Beginner

hi Chao,

I have removed the mkl_def from the Numpy site.cfg.

When I execute the command:

python -c "import scipy; scipy.test(verbose=2)"

I have the following error:

Intel MKL FATAL ERROR: Cannot load libmkl_avx.so or libmkl_def.so.

I have checked that my LD_LIBRARY_PATH is correctly set. Which library do I need to include? I tried with mkl_avx but that does not work for Scipy. 
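One quick way to double-check the LD_LIBRARY_PATH claim is to look for the two libraries named in the error message in each directory on the path. This is a rough sketch using only the Python standard library; find_on_library_path is a hypothetical helper, and it does not model everything the dynamic loader does (rpath, ld.so.cache), so a miss here is a hint rather than proof.

```python
import os


def find_on_library_path(names, search_path=None):
    """Return {library name: [full paths found]} for each name in `names`.

    Hypothetical helper: it only scans the directories listed in
    LD_LIBRARY_PATH (or in `search_path` if given).
    """
    if search_path is None:
        search_path = os.environ.get("LD_LIBRARY_PATH", "")

    found = {name: [] for name in names}
    for directory in search_path.split(os.pathsep):
        if not directory:
            continue  # skip empty entries such as a trailing ':'
        for name in names:
            candidate = os.path.join(directory, name)
            if os.path.isfile(candidate):
                found[name].append(candidate)
    return found
```

Calling find_on_library_path(["libmkl_avx.so", "libmkl_def.so"]) from the same shell that runs the tests shows whether either file is visible there. An empty list for both would be consistent with the fatal error above.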

Thanks

VipinKumar_E_Intel

Hi John,

   If you are building NumPy/SciPy with the GNU compiler chain, you must link with mkl_rt; no other linking model will work.

Please change the site.cfg as mentioned in the article https://software.intel.com/en-us/articles/numpyscipy-with-intel-mkl

[mkl]
library_dirs = <mkl installation folder>/lib/intel64
include_dirs = <mkl installation folder>/include
mkl_libs = mkl_rt
lapack_libs =

Since you are using GNU OpenMP, set the MKL_THREADING_LAYER=GNU environment variable.

Also, since SciPy and NumPy also provide BLAS functions, using both is not supported by MKL and can cause segfaults. So, when using the SciPy BLAS functions, you should set MKL_INTERFACE_LAYER=GNU.

By the way, the above environment variables are only available as of MKL 11.1 update 3. Since you do not have MKL 11.1.3 installed yet, can you please submit an issue for the eval license at premier.intel.com?
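As a sketch of how these two settings could be applied to the test run without touching the shell profile: the variable names and values below are the ones given in this post, while mkl_env and scipy_test_command are hypothetical helper names.

```python
import os
import sys


def mkl_env(base=None):
    """Return a copy of the environment with the two MKL settings added.

    Hypothetical helper; the values are taken from the post above.
    """
    env = dict(os.environ if base is None else base)
    env["MKL_THREADING_LAYER"] = "GNU"   # use GNU OpenMP threading
    env["MKL_INTERFACE_LAYER"] = "GNU"   # use the gfortran calling convention
    return env


def scipy_test_command():
    """Command line for the SciPy test run used earlier in this thread."""
    return [sys.executable, "-c", "import scipy; scipy.test(verbose=2)"]
```

The tests can then be started in a child process with subprocess.call(scipy_test_command(), env=mkl_env()), so the variables are in place before MKL is loaded.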

--Vipin

John_D_
Beginner

hi Vipin

Thanks for these configuration tips. I have updated my configuration accordingly.

So now I'm back to the segmentation fault that was fixed in MKL 11.1 update 3 (DPD200405571).

I have tried to report the license issue at premier.intel.com, but the interface seems broken. I can't select any product or browse them.

Could you report the issue? Or is it possible to get a trial license for update 3, so I can validate that the problem is solved on the machines I use?

Thanks in advance

Bernard
Valued Contributor I

>>>As you can see, it seems zdotc is the cause of the problem>>>

I think the crash is caused by this function call: #0  0x00007fffec60c208 in L_not_one_step_loopgas_1 () from /.../mkl/lib/intel64/libmkl_def.so

Bernard
Valued Contributor I

@John D

Do you know IP of the crash?

John_D_
Beginner

Hi iliyapolak

Yes, I have the IP of the machine, although I'm not sure I would be allowed to share it publicly. Would that help?

Thanks

mecej4
Honored Contributor III

I think Iliya has in mind Instruction Pointer (the address at which the crash occurred) rather than Internet Protocol or Intellectual Property!

Too many acronyms to disambiguate?

John_D_
Beginner

hi guys

It would be easier if I could try the new MKL 11.1.3 or above, which fixes the issue. Is it possible to get an evaluation version? The registration on the Intel website seems to be broken, as the evaluation license file is never sent.

Thanks

@mecej4 : I guess so!! I don't know where my mind was .. :)

Zhang_Z_Intel
Employee

@John D.: Intel no longer provides an evaluation license for standalone MKL. You'll have to get it through the Intel Parallel Studio XE suite. Check out this "Try & Buy" page: https://software.intel.com/en-us/intel-parallel-studio-xe/try-buy

John_D_
Beginner

@Zhang : thanks. Actually, I tried with Parallel Studio as well, but it ends up the same: I never receive the evaluation license file.

Bernard
Valued Contributor I

John D. wrote:

Hi iliyapolak

Yes, I have the IP of the machine, although I'm not sure I would be allowed to share it publicly. Would that help?

Thanks

Sorry for the late answer.

Yes, of course I meant the Instruction Pointer of the crash.

John_D_
Beginner

hi guys

Here's my update on this crash:

I have tested with MKL 11.2.1 and the crash still exists. It happens with the mkl_rt library.

I can't link MKL any other way than through mkl_rt, as the MKL Link Line Advisor and the bundled tools do not give correct information (it is either partial or invalid).

Also, different options are returned for the same configuration depending on whether you use the LLA on the website (https://software.intel.com/en-us/articles/intel-mkl-link-line-advisor) or the tools provided in the package.

I'm giving up for now, as no clear solution seems to exist regarding this segmentation fault.

cheers

TimP
Honored Contributor III

How are we to know if you followed any of the usual checklists on segfaults? Stack limit settings usually come near the top: both the shell stack limit and OMP_STACKSIZE.
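For reference, both settings can be read from inside the same Python process that runs the tests. This is a minimal sketch using the standard library resource module (Linux/Unix only); describe_limit and stack_settings are hypothetical helper names.

```python
import os
import resource


def describe_limit(value):
    """Format a limit given in bytes, or 'unlimited' for RLIM_INFINITY."""
    if value == resource.RLIM_INFINITY:
        return "unlimited"
    return "{0} KiB".format(value // 1024)


def stack_settings(environ=None):
    """Report the process stack limit and the OMP_STACKSIZE variable."""
    environ = os.environ if environ is None else environ
    soft, hard = resource.getrlimit(resource.RLIMIT_STACK)
    return {
        "soft": describe_limit(soft),   # what `ulimit -s` reports
        "hard": describe_limit(hard),   # ceiling the soft limit can be raised to
        "OMP_STACKSIZE": environ.get("OMP_STACKSIZE", "(not set)"),
    }
```

Printing stack_settings() just before the failing test shows what the process actually inherited, which may differ from the interactive shell if the tests are started from a script or a scheduler.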
