SciPy crashes when built against MKL on RHEL 6.5.
Here is my configuration when I build SciPy:
- CFLAGS : -m64
- LDFLAGS : -lmkl_gf_lp64 -lmkl_gnu_thread -lmkl_core -lgomp -lpthread -lm
The NumPy site.cfg has this configuration:
mkl_libs = 'mkl_gf_lp64, mkl_gnu_thread, mkl_core, mkl_def'
lapack_libs = 'mkl_lapack95_lp64'
I saw that a segmentation-fault bug was fixed in the latest MKL release:
|DPD200405571||CDOTU produces segmentation fault when compiled by gfortran linked with -lmkl_rt|
Unfortunately, that fix only covers mkl_rt. As you can see, I still hit the bug with the non-rt libraries.
I also tried to add FFLAGS as suggested here:
But that did not help either.
When I launch the SciPy tests, this is the segmentation fault I get:
python -c "import scipy; scipy.test(verbose=2)"
... # some tests running
test_nonlin.TestJacobianDotSolve.test_anderson ... Segmentation fault
So, what do you think? I have tried MKL 11.1 on RHEL 6.5 and MKL 10.3 update 3 on Debian.
Current configuration on RHEL is:
- gcc/gfortran version 4.4.7 20120313 (Red Hat 4.4.7-4) (GCC)
- Red Hat Enterprise Linux Server release 6.5 (Santiago)
- Using BLAS from the Red Hat installation
I can't install MKL 11.1.3 (the latest version) for evaluation because the license key was never sent to me.
Anyway, how can I work around this?
Is there a known solution, or a known issue, regarding this crash?
Regarding the link line, the trailing "mkl_def" is actually not needed:
mkl_libs = 'mkl_gf_lp64, mkl_gnu_thread, mkl_core, mkl_def'
It should be changed to:
mkl_libs = 'mkl_gf_lp64, mkl_gnu_thread, mkl_core'
Also, does this problem happen with one particular MKL function, or do all functions fail?
Thanks for the recommendation, I will remove the mkl_def linkage in SciPy.
Regarding your question, this always happens in the same test, in the same function.
Here is the stack trace with gdb attached to Python:
(gdb) where
#0  0x00007fffec60c208 in L_not_one_step_loopgas_1 () from /.../mkl/lib/intel64/libmkl_def.so
#1  0x00000000006010a0 in ?? ()
#2  0x00000000014c0d00 in ?? ()
#3  0x00000000013d7df0 in ?? ()
#4  0x00000000013d7df0 in ?? ()
#5  0x00000000006010a0 in ?? ()
#6  0x00007ffff073600e in zdotc_gf () from /.../mkl/lib/intel64/libmkl_gf_lp64.so
#7  0x00007ffff073631e in zdotc_ () from /.../mkl/lib/intel64/libmkl_gf_lp64.so
#8  0x0000000000000000 in ?? ()
As you can see, zdotc seems to be the cause of the problem.
I'm using MKL 11.1.0.08 on RHEL and tried MKL 10.3.3.174 on Debian 5 with the same results.
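A minimal sketch, not taken from this thread, for checking whether zdotc on its own triggers the crash outside the full test suite. It assumes Python 3.7 or later on Linux and that scipy.linalg.blas.zdotc ends up in the MKL build in question; run_isolated and ZDOTC_SNIPPET are hypothetical names chosen for illustration. The call runs in a child interpreter so that a segmentation fault is reported instead of killing the session.

```python
import signal
import subprocess
import sys

# Hypothetical reproducer: a single zdotc call through SciPy's BLAS wrappers.
ZDOTC_SNIPPET = """
import numpy as np
from scipy.linalg.blas import zdotc
x = np.arange(1, 5, dtype=np.complex128) * (1 + 2j)
y = np.ones(4, dtype=np.complex128)
print(zdotc(x, y))
"""


def run_isolated(code):
    """Run `code` in a child interpreter so a crash cannot kill this process.

    Returns (returncode, crashed, stdout, stderr); `crashed` is True when the
    child was terminated by SIGSEGV (POSIX reports that as a negative code).
    """
    proc = subprocess.run(
        [sys.executable, "-X", "faulthandler", "-c", code],
        capture_output=True,
        text=True,
    )
    crashed = proc.returncode == -signal.SIGSEGV
    return proc.returncode, crashed, proc.stdout, proc.stderr


if __name__ == "__main__":
    rc, crashed, out, err = run_isolated(ZDOTC_SNIPPET)
    print("segmentation fault" if crashed else "exit code %d" % rc)
    print(out.strip() or err.strip())
```

If the child segfaults here too, the problem is isolated to the zdotc call path rather than to anything specific in test_nonlin.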
I see the error happens with
libmkl_def.so contains the default code path, optimized for older processors. If you have a recent processor, execution should not go down this path.
Please remove libmkl_def.so from the link line and see how the code behaves.
I have removed mkl_def from the NumPy site.cfg.
When I execute the command:
python -c "import scipy; scipy.test(verbose=2)"
I have the following error:
Intel MKL FATAL ERROR: Cannot load libmkl_avx.so or libmkl_def.so.
I have checked that my LD_LIBRARY_PATH is correctly set. Which library do I need to include? I tried with mkl_avx, but that does not work for SciPy.
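One way to double-check the LD_LIBRARY_PATH claim is to list where the libraries named in the error message actually sit on that path. This is only a sketch: find_libs is a hypothetical helper, and it checks plain file presence in each directory, which is not the same as what the dynamic loader or MKL's own dispatcher does at run time.

```python
import os


def find_libs(names, search_path):
    """Return {name: [full paths]} for each library file found on search_path.

    `search_path` is a string of directories joined by os.pathsep, in the
    same form as LD_LIBRARY_PATH. Empty entries are skipped.
    """
    found = {name: [] for name in names}
    for directory in search_path.split(os.pathsep):
        if not directory:
            continue
        for name in names:
            candidate = os.path.join(directory, name)
            if os.path.isfile(candidate):
                found[name].append(candidate)
    return found


if __name__ == "__main__":
    # Library names taken from the error message in this thread.
    wanted = ["libmkl_avx.so", "libmkl_def.so", "libmkl_core.so"]
    hits = find_libs(wanted, os.environ.get("LD_LIBRARY_PATH", ""))
    for name, paths in hits.items():
        print(name, "->", paths or "not found on LD_LIBRARY_PATH")
```

A library that shows up in several directories may point to two MKL versions being mixed on the same path.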
If you are building NumPy/SciPy with the GNU compiler chain, you must link with mkl_rt; no other link line will work.
Please change site.cfg as described in the article https://software.intel.com/en-us/articles/numpyscipy-with-intel-mkl
[mkl]
library_dirs = <mkl installation folder>/lib/intel64
include_dirs = <mkl installation folder>/include
mkl_libs = mkl_rt
lapack_libs =
Since you are using GNU OpenMP, set the MKL_THREADING_LAYER=GNU environment variable.
Also, since SciPy and NumPy also have BLAS functions, using both is not supported by MKL and can cause segfaults. So, when using SciPy BLAS functions, you should set MKL_INTERFACE_LAYER=GNU.
By the way, these environment variables are only available as of MKL 11.1 update 3. Since you do not yet have MKL 11.1.3 installed, can you please submit an issue for the eval license at premier.intel.com?
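A small sketch of applying the two settings above when launching the tests from Python. It assumes the second variable is spelled MKL_INTERFACE_LAYER and that both take the value GNU as stated in this reply; mkl_gnu_env and run_with_mkl_gnu are hypothetical helper names. The variables are placed in the child's environment before it starts, on the assumption that MKL reads them when the library is loaded.

```python
import os
import subprocess
import sys


def mkl_gnu_env(base=None):
    """Return a copy of the environment with the GNU layer settings added."""
    env = dict(os.environ if base is None else base)
    env["MKL_THREADING_LAYER"] = "GNU"   # value as given in this thread
    env["MKL_INTERFACE_LAYER"] = "GNU"   # value as given in this thread
    return env


def run_with_mkl_gnu(code):
    """Run `code` in a fresh interpreter that starts with those settings."""
    return subprocess.run(
        [sys.executable, "-c", code],
        env=mkl_gnu_env(),
        capture_output=True,
        text=True,
    )


if __name__ == "__main__":
    proc = run_with_mkl_gnu("import scipy; scipy.test(verbose=2)")
    print("exit code:", proc.returncode)
```

Exporting the same two variables in the shell before running python should have the same effect; the helper only makes the setup repeatable.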
Thanks for these configuration tips. I have updated my configuration accordingly.
So now I'm back to the segmentation fault that was corrected in MKL 11.1 update 3 (DPD200405571).
I have tried to report the license issue in premier.intel.com, but the interface seems broken. I can't select any product nor browse them.
Could you report the issue? Or is it possible to get a trial license for update 3, so I can validate that the problem is solved on the machines I use?
Thanks in advance
>>>As you can see, it seems zdtoc is the cause of the problem>>>
I think that the crash is caused by this function call: #0 0x00007fffec60c208 in L_not_one_step_loopgas_1 () from /.../mkl/lib/intel64/libmkl_def.so
I think Iliya has in mind Instruction Pointer (the address at which the crash occurred) rather than Internet Protocol or Intellectual Property!
Too many acronyms to disambiguate?
It would be easier if I could try with the new MKL 11.1.3 or above, which fixes the issue. Is it possible to get an evaluation version? The registration on the Intel website seems to be broken, as the evaluation license file is never sent.
@mecej4: I guess so!! I don't know where my mind was... :)
@John D.: Intel no longer provides an evaluation license for standalone MKL. You'll have to get it through the Intel Parallel Studio XE suite. Check out this "Try & Buy" page: https://software.intel.com/en-us/intel-parallel-studio-xe/try-buy
John D. wrote:
Yes, I have the IP of the machine. Although I'm not sure I would be allowed to communicate it publicly. Would that help?
Sorry for the late answer.
Yes of course I meant Instruction Pointer of the crash.
Here's my update on this crash:
I have tested with MKL 11.2.1 and the crash still exists. It happens with the mkl_rt library.
I can't link MKL any way other than through mkl_rt, as the MKL Link Line Advisor and tools do not give correct information (it is either partial or invalid) for installation.
Also, the LLA on the website (https://software.intel.com/en-us/articles/intel-mkl-link-line-advisor) and the tools provided in the package return different options for the same configuration.
I'm giving up for now, as no clear solution seems to exist regarding this segmentation fault.