I've recently ported some code to a 64-bit CentOS 6 server that supports AVX instructions, and I think I have encountered a bug in the MKL DFT routines when threading is enabled. When I try to take an 80640-point complex 1D forward DFT, I get a segfault if I call mkl_set_num_threads with any number greater than 1, yet the code works fine with mkl_set_num_threads(1). I'm not sure whether this has been documented or encountered by others, but for me it seems to be limited to my 64-bit AVX platform: when I compile on a 64-bit SSE4.2 platform, the code runs fine with no segfault. I've attached the test code that I've been running to debug. For reference, I am compiling with:
icpc -O3 -xHost test.cpp -openmp -liomp5 -lpthread -lm -lmkl_core -lmkl_intel_lp64 -lmkl_intel_thread
Here are my system stats:
Compiler: Intel Composer XE 2013.0.79 (MKL v11.0)
OS: 64-bit Linux CentOS 6.4
CPU: Xeon E5 [...] GHz
Also, when I run the core dump through gdb, I get the following back-trace:
Is this a bug or am I just doing something wrong with my DFT?
Could you check with the latest MKL 11.0.3 release and see whether the problem persists? We noticed a bug in the original MKL 11.0 release in the AVX threading code for some problem sizes, and it has been fixed since MKL 11.0 update 2. Please let us know if you still see any problem with the new release.
[tim@tim-cp net]$ ./a.out
With either of the 2 most recent compiler/MKL releases
cpu family : 6
model : 62
model name : Genuine Intel(R) CPU @ 2.50GHz
stepping : 2
Chao - Thank you for the suggestion, and sorry for my delay in responding. I am waiting for my sys admin to update our MKL install to 11.0 update 2 or 3 (we need explicit approval before updating software). As soon as he does that, I'll verify whether the bug still occurs.
Sergey - I will check OMP_STACKSIZE tomorrow as well.
iliyapolak - Unfortunately, posting the full backtrace is difficult since I don't have network access to the server I'm running on (which is at my office), and I don't have an easy way to move an electronic copy of the backtrace out of the office. If updating MKL does not resolve the issue, I'll try to post a full backtrace.
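A minimal sketch of the OMP_STACKSIZE check mentioned above, assuming a bash-like shell. OMP_STACKSIZE is the standard OpenMP environment variable for the stack size of worker threads; the 64M value is only an illustrative guess, not a figure from this thread, and ./a.out is the default output name produced by the compile line in the first post.

```shell
# Request a larger stack for each OpenMP worker thread.
# 64M is an illustrative value (an assumption, not from the thread).
export OMP_STACKSIZE=64M

# Show the setting that the OpenMP runtime will see.
echo "OMP_STACKSIZE=${OMP_STACKSIZE}"

# Rerun the threaded test only if the binary exists in this directory.
if [ -x ./a.out ]; then
    ./a.out
fi
```

If the segfault disappears with a larger stack, thread stack exhaustion would be a plausible cause; if it persists, the stack size is probably not the issue.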
Very sorry for the delayed response; my sys admin just installed MKL 11.0 update 3 on Friday (we have a lengthy software approval process). I can confirm that upgrading to update 3 has fixed the problem: I am no longer able to reproduce the segfault regardless of threading or DFT size. Thanks for your help.