Hi all
I am using the FFTW wrapper of the MKL library in a static library (let's call it mylib.lib), which I link into a shared object (let's call it app.so). It works fine when I link to app.so from a C++ executable, but when I load the same app.so from Python, I get a segmentation violation (SIGSEGV) in libmkl_def.so when calling fftwf_execute(..).
Any ideas?
I am using version 10.5.6 of MKL on Unix (Ubuntu 9.1).
Thank you,
Ita.
15 Replies
Hi Ita
Intel MKL has a layered structure made up of several shared objects, one of which is libmkl_def.so. This layered structure requires appropriate linking; if it is not linked properly, an application may show issues like the SEGV you describe. It would help if you posted more details about how you link your application. Ideally, a self-contained example would help a lot.
Another point is that MKL version 10.2 and later have the FFTW3 wrappers integrated, so one may call FFTW functions directly.
Could you also check which MKL version you are actually using? Perhaps it is 10.2.6, because 10.5.6 is a strange number.
Thanks
Dima
Hi Dima and thank you for your answer.
- The version number is 10.2.6.038. Sorry for the mistake.
- Regarding the linkage details:
  - From Python I link to a shared object using:
    [bash]_libraries['libstaapp.so'] = CDLL('libstaapp.so')[/bash]
  - From this shared object I link to a static library called 'stalib.a', in which the MKL functions are called (and the MKL h files are included), and to the following MKL shared objects:
    - mkl_lapack
    - mkl_intel_ilp64
    - mkl_core
    - mkl_sequential
    - mkl_mc3
    - mkl_def
- I cannot at the moment supply a self-contained example, because I cannot share the source code. I will try to create such an example soon, but it will take me a few days to get into it.
- I do not completely understand your remark about the FFTW3 wrapper, but anyway, I am using the FFTW3 wrapper.
Thanks again,
Ita.
Ita, could you please try to relink with mkl_intel_lp64 instead of mkl_intel_ilp64?
The same error occurs (at the same point), when linking with mkl_intel_lp64
Some more information/thoughts:
- The problem occurs on a computer where the Python code is called locally. On two other computers (on which the MKL version is 10.2.5.035), where the Python code is called from a web server, it works OK. I suspect there might be a version issue on the computer with the problem; maybe the user mixed .so objects from different versions.
- I suspect the problem might be connected to a 16-byte memory alignment issue, which behaves differently when called from Python.
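One way to check the mixed-version suspicion above is to look at which MKL shared objects the running process has actually mapped. A minimal Python sketch, assuming a Linux system where /proc/<pid>/maps is available; the function name `loaded_libraries` and the default "libmkl" filter are made up for illustration:

```python
def loaded_libraries(maps_text, needle="libmkl"):
    """Return the sorted, de-duplicated paths found in /proc/<pid>/maps
    text whose pathname contains `needle`."""
    paths = set()
    for line in maps_text.splitlines():
        # Each line is: address perms offset dev inode [pathname]
        fields = line.split(None, 5)
        if len(fields) == 6 and needle in fields[5]:
            paths.add(fields[5].strip())
    return sorted(paths)

# Typical use, after the library has been loaded into this process:
#     with open("/proc/self/maps") as f:
#         for path in loaded_libraries(f.read()):
#             print(path)
```

If the printed paths point into more than one MKL installation directory, that would support the idea that .so files from different versions are being mixed.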
Hi Ita,
Suspecting a SIMD alignment issue is a good idea! Could you hint at which FFT problem the failure occurs on?
Namely, what are the precision, kind, dimension, sizes, and placement of the transform that fails?
Getting this information may be easier than making a small project.
Thanks
Dima
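The alignment hypothesis can be probed from the Python side by checking whether a buffer handed to the library starts on a 16-byte boundary. A rough sketch using only ctypes; the helper names are hypothetical, and 16 bytes is assumed because that is what aligned SSE loads expect:

```python
import ctypes

def is_aligned(address, boundary=16):
    """True if `address` is a multiple of `boundary` bytes."""
    return address % boundary == 0

def aligned_float_buffer(n, boundary=16):
    """Over-allocate a byte buffer and return (raw, view), where `view`
    is an array of n C floats starting at an aligned address inside `raw`."""
    raw = (ctypes.c_char * (n * ctypes.sizeof(ctypes.c_float) + boundary))()
    offset = (-ctypes.addressof(raw)) % boundary  # bytes to skip to reach the boundary
    view = (ctypes.c_float * n).from_buffer(raw, offset)
    return raw, view
```

Calling `is_aligned(ctypes.addressof(buf))` on the buffer that is actually passed to the library shows whether misalignment can be ruled out for that call.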
Hi Dima,
Actually I have some more information:
- The problem seems to occur not only when calling from Python (that was the case on a specific computer, but I have since reproduced it without any Python code involved, only C++).
- If an instance of the Unix machine is created from an image we have, then on some instances it happens and on some it does not, which makes me think it is a memory leak (could it still be the SIMD alignment...?)
- It turns out that on this image (created by a coworker) the regular MKL installation was not run; instead, the relevant .h and .so files were copied 'by hand'.
[cpp]
#include "fftw3.h"  //--- FFTW wrapper for the MKL library

//--- Complex data types
template <typename T_Float> struct ComplexT2;
template <> struct ComplexT2<float>  { typedef fftwf_complex Type; };
template <> struct ComplexT2<double> { typedef fftw_complex Type; };

//--- Per-precision dispatch to the FFTW API
template <typename T_Float> class StaFFT_fn;

template <> class StaFFT_fn<float>
{
public:
    typedef ComplexT2<float>::Type ComplexT;
    typedef fftwf_plan plan;
    static void * malloc (size_t n) { return fftwf_malloc (n * sizeof(float)); }
    static fftwf_plan fft_plan_r2c (int nfft, float *in, ComplexT * spec, unsigned int flag) { return fftwf_plan_dft_r2c_1d (nfft, in, spec, flag); }
    static fftwf_plan fft_plan_c2r (int nfft, ComplexT * spec, float *in, unsigned int flag) { return fftwf_plan_dft_c2r_1d (nfft, spec, in, flag); }
    static void execute (const fftwf_plan p) { fftwf_execute (p); }
    static void free (void *p) { fftwf_free (p); }
    static void destroy_plan (fftwf_plan p) { fftwf_destroy_plan (p); }
};

template <> class StaFFT_fn<double>
{
public:
    typedef ComplexT2<double>::Type ComplexT;
    typedef fftw_plan plan;
    static void * malloc (size_t n) { return fftw_malloc (n * sizeof(double)); }
    static fftw_plan fft_plan_r2c (int nfft, double *in, ComplexT * spec, unsigned int flag) { return fftw_plan_dft_r2c_1d (nfft, in, spec, flag); }
    static fftw_plan fft_plan_c2r (int nfft, ComplexT * spec, double *in, unsigned int flag) { return fftw_plan_dft_c2r_1d (nfft, spec, in, flag); }
    static void execute (const fftw_plan p) { fftw_execute (p); }
    static void free (void *p) { fftw_free (p); }
    static void destroy_plan (fftw_plan p) { fftw_destroy_plan (p); }
};

template <typename T_Float> class StaFFT
{
public:
    StaFFT () { tdata = 0; spec = 0; _plan_c2r = 0; _plan_r2c = 0; }
    ~StaFFT () { free (); }

    typedef typename ComplexT2<T_Float>::Type ComplexT;  //--- complex data type
    typedef StaFFT_fn<T_Float> FFT;

    void alloc (size_t nfft)
    {
        _nfft = nfft;
        int specLen = _nfft/2 + 1;
        free ();
        tdata = (T_Float *)FFT::malloc (_nfft);
        spec = (ComplexT *)FFT::malloc (2*specLen);
        // NOTE: mkl's fftw-wrapper ignores the FFTW_ flag
        _plan_r2c = FFT::fft_plan_r2c (_nfft, tdata, spec, FFTW_ESTIMATE);
        _plan_c2r = FFT::fft_plan_c2r (_nfft, spec, tdata, FFTW_ESTIMATE);
    }

    void free (void)
    {
        FFT::free (tdata); tdata = 0;
        FFT::free (spec); spec = 0;
        if (_plan_c2r != 0) { FFT::destroy_plan (_plan_c2r); _plan_c2r = 0; }
        if (_plan_r2c != 0) { FFT::destroy_plan (_plan_r2c); _plan_r2c = 0; }
    }

    void execute_r2c () { FFT::execute (_plan_r2c); }
    void execute_c2r () { FFT::execute (_plan_c2r); }

    //--- public data members
    T_Float * tdata;   //--- time-domain data
    ComplexT * spec;   //--- spec-domain data

private:
    size_t _nfft;
    typename FFT::plan _plan_r2c;  //--- forward plan (fft)
    typename FFT::plan _plan_c2r;  //--- backward plan (ifft)
};
[/cpp]
A typical use of the wrapper will look like:
[cpp]
StaFFT<float> fft;
float *x;
StaFFT<float>::ComplexT *X;

fft.alloc (nFft);
x = fft.tdata;
X = fft.spec;
// [ copy some data into vector x ]
fft.execute_r2c ();
// [ some operations on the spectrum X ]
fft.execute_c2r ();
fft.free ();
[/cpp]
I hope it helps.
Hi Ita,
Thank you for the test. I played with it a lot using MKL 10.2.6 on an x86_64 system. I tried both the float and double specializations of StaFFT, misaligned memory, a variety of sizes, and linking with libmkl_intel_lp64/ilp64, and yet I could not reproduce the problem: neither a SEGV nor memory leaks.
Additionally, Intel MKL contains memory management software that speeds up memory allocations for the library. This memory manager is on by default and can be disabled by setting the environment variable MKL_DISABLE_FAST_MM=1. You can find details in the MKL User's Guide. I have tried this control too, and still could not reproduce your issue.
If you manage to reproduce the SEGV with a C++ program, could you look at the backtrace of the failure? ('gdb a.out', then 'run', then 'bt')
Thanks
Dima.
Hi Dima,
I am pasting here the backtrace from gdb:
Program received signal SIGSEGV, Segmentation fault.
0x00007ffff4155b0d in var000C () from /usr/lib/libmkl_def.so
(gdb) bt
#0 0x00007ffff4155b0d in var000C () from /usr/lib/libmkl_def.so
#1 0x0000000000947300 in ?? ()
#2 0x000000000093a1c0 in ?? ()
#3 0x000000000093e8e0 in ?? ()
#4 0x00007ffff7df35f5 in ?? () from /lib64/ld-linux-x86-64.so.2
#5 0x00007ffff4166907 in W6_ippsFFTFwd_RToPerm_32f () from /usr/lib/libmkl_def.so
#6 0x00007ffff4130769 in W6_ippsDFTFwd_RToCCS_32f () from /usr/lib/libmkl_def.so
#7 0x00007ffff53220ad in mkl_dft_xipps_fwd_rtocomplex_32f () from /usr/lib/libmkl_mc3.so
#8 0x00007ffff53099d4 in mkl_dft_compute_fwd_s_r2c_1d_o () from /usr/lib/libmkl_mc3.so
#9 0x00007ffff65e9763 in DftiComputeForward_1 () from /usr/lib/libmkl_intel_ilp64.so
#10 0x00007ffff65f15c7 in execute_fo () from /usr/lib/libmkl_intel_ilp64.so
#11 0x00007ffff65efd35 in fftwf_execute () from /usr/lib/libmkl_intel_ilp64.so
#12 0x00007ffff6932f24 in CenterCut::process(InStreamBuf&, InStreamBuf&, OutStreamBuf&) () from /usr/lib/libstaapp.so
#13 0x00007ffff6933863 in StereoSplit::process(InStreamBuf&, InStreamBuf&, OutStreamBuf&, OutStreamBuf&) () from /usr/lib/libstaapp.so
#14 0x00007ffff68f65fd in StereoSplitTransformer::process(InStreamBuf*) () from /usr/lib/libstaapp.so
#15 0x00007ffff68f9b8d in StreamProcNode::process(InStreamBuf*, unsigned long) () from /usr/lib/libstaapp.so
#16 0x00007ffff6901edd in StreamRootsProcessor::process(ChannelBuffer*, unsigned long) () from /usr/lib/libstaapp.so
#17 0x00007ffff68fee7b in StreamProcessor::process(float const*, unsigned long) () from /usr/lib/libstaapp.so
#18 0x0000000000406ef9 in main ()
Thanks,
Ita.
P.S. I installed the 10.2.6 version using the install script, so it is not an issue of mis-installed files or the like.
Hi Ita,
It looks like the problem comes from libstaapp.so. Frames #6 and #7 come from different .so files, which should not happen. When you link libstaapp.so, do you really need to link in libmkl_def.so and libmkl_mc3.so? I suggest you drop them from the build of libstaapp.so.
Thanks
Dima
Hi again,
It seems that removing the shared objects you suggested solved the problem for the executable.
The call from the Python code still failed, though not with the segmentation violation: some symbols weren't found at runtime. Linking to mkl_mc.so solved that as well, so it seems all is working well right now (I still have to verify it on a few instances, etc.).
Btw, how come it worked on some machines/instances and failed on others?
Thank you very much for the support.
Ita.
Hi again,
If you link libstaapp.so with MKL's .so libraries, you will likely get the issue anyway, depending on the use case.
If you don't link libstaapp.so with CPU-specific core libraries (libmkl_mc.so and such), an application linked with libstaapp.so still works on all platforms, because the loader adds all necessary symbols from the MKL libraries (-lmkl_intel_ilp64, -lmkl_sequential, and -lmkl_core) into the application's global namespace, and libmkl_core manages to pick the appropriate function.
However, using this libstaapp.so from Python fails, because Python loads libstaapp.so in such a way that the symbols are not added to the global namespace. You partially solved the issue by additionally linking libstaapp.so to a CPU-specific library, thus fetching some dependent symbols on load, but this fails on platforms where that CPU-specific library is not supported. Adding more CPU-specific libraries confuses the loader, resulting in the strange SEGVs that you've observed.
There are currently two ways to solve the issue (see also discussion in the thread dlopen woes).
1. Instead of linking libstaapp.so to MKL .so libraries, you create your own mkl_custom.so, using tools/builder in MKL distribution.
2. (should work) You add initialization code into libstaapp.so, that brings MKL's .so symbols into global namespace, something like this:
[cpp]
#include <dlfcn.h>

#if defined(__cplusplus)
// Runs when libstaapp.so is loaded: brings MKL's symbols into the global namespace.
struct OnOpen
{
    void *dlh1, *dlh2;
    OnOpen() : dlh1(0), dlh2(0)
    {
        int flags = RTLD_GLOBAL|RTLD_LAZY;
        // Assign to the members (not to new locals) so the destructor can close the handles.
        dlh1 = dlopen("libmkl_sequential.so", flags);
        dlh2 = dlopen("libmkl_core.so", flags);
    }
    ~OnOpen()
    {
        if (dlh1) dlclose(dlh1);
        if (dlh2) dlclose(dlh2);
    }
} ______onOpen;
#else
/* use __attribute__((constructor))
 * and __attribute__((destructor)) */
#endif
[/cpp]
Thanks
Dima
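On the Python side, the same idea as option 2 can be sketched with ctypes, whose CDLL accepts a mode argument: preload the MKL layers with RTLD_GLOBAL, then load libstaapp.so. This is only a sketch. The set of libraries and their order are assumptions taken from the names in this thread and may need adjusting, and the `loader` parameter is a hypothetical hook that exists only so the function can be exercised without MKL installed:

```python
import ctypes

# Assumed preload list, based on the library names mentioned in this thread.
MKL_PRELOAD = ("libmkl_core.so", "libmkl_sequential.so")

def load_with_global_deps(target, deps=MKL_PRELOAD, loader=ctypes.CDLL):
    """Load each dependency with RTLD_GLOBAL so its symbols become visible
    to libraries loaded later, then load `target` with the default mode.
    Returns (target_library, dependency_handles)."""
    handles = [loader(name, mode=ctypes.RTLD_GLOBAL) for name in deps]
    lib = loader(target)
    return lib, handles

# Intended use (requires the libraries to be on the loader search path):
#     lib, _mkl_handles = load_with_global_deps("libstaapp.so")
```

Keeping the returned dependency handles referenced for as long as the target library is in use avoids relying on when ctypes objects are garbage collected.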
Hi
You are completely right, it still fails in some use-cases when called from python, and after reading your explanation I start to understand why.
I will try the solutions you suggested and report.
Thank you again,
Ita.
Hi again
After a week or so of tests, it seems everything is working OK now. I used the second method you suggested, i.e. using initialization code.
Thanks again for the effort.
Ita.
I had the same problem, with a similar resolution:
http://software.intel.com/en-us/forums/showthread.php?t=78131&p=1#132389
Except that -z initfirst was needed in my case.