I have a program that uses MKL 9.1 for DFTs, built with icc 11.1 on 32-bit Linux. If I run a particularly large calculation and press Ctrl-C mid-calculation, the Ctrl-C is ignored and the program then seems to go into an endless loop, using 100% CPU but never returning from the DFT call. It is quite reproducible. The only way to get rid of the hung process at this point is 'kill -9' or similar. Attaching the debugger at this point shows the following stack trace:
#0 0x0050e424 in __kernel_vsyscall ()
#1 0x001eed9c in sched_yield () from /lib/libc.so.6
#2 0x08daaeca in __kmp_yield ()
#3 0x08d94028 in __kmp_join_barrier ()
#4 0x08d94a51 in __kmp_join_call ()
#5 0x08d87b63 in __kmpc_fork_call ()
#6 0x083cc559 in mkl_dft_compute_backward_z_par ()
I can single-step up to the sched_yield(), at which point the debugger also becomes useless. Any help would be greatly appreciated.
As I understand it, the program is reproducibly inside the lengthy DFT call when you press Ctrl-C. An example program (FFT sizes, call sequence) would help us reproduce the problem. Installing a signal handler for SIGINT will probably help the program react to the interrupt without leaving hung processes. Keep in mind that MKL 9.1 is a rather old version that may have compatibility issues with the more recent OpenMP runtime of icc 11.1. Could you check which shared objects your application depends on (ldd a.out)?
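For reference, a minimal sketch of such a SIGINT handler, assuming a POSIX system. The names interrupt_requested, on_sigint and install_sigint_handler are hypothetical and not taken from the application in question. The handler only sets a flag, since very little is safe to do inside a signal handler; the main code would have to poll the flag between DFT calls. Whether this avoids the hang inside MKL has not been verified here.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

/* Hypothetical flag: set by the handler, polled by the main code. */
static volatile sig_atomic_t interrupt_requested = 0;

/* Handler does the minimum: record that SIGINT arrived. */
static void on_sigint(int signo)
{
    (void)signo;
    interrupt_requested = 1;
}

/* Install the handler with sigaction(); returns 0 on success, -1 on error. */
static int install_sigint_handler(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigint;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    return sigaction(SIGINT, &sa, NULL);
}
```

The application would call install_sigint_handler() once at startup, then check interrupt_requested after each DFT call returns and shut down cleanly if it is set.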
I tried a simple example and it does NOT reproduce the problem, so it may have something to do with the fact that our application uses threads and also catches signals. I'll try a few more things and post again when I have more useful info.
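If the combination of threads and signal handling does turn out to matter, one common POSIX pattern is to block SIGINT before any threads are created (new threads inherit the creating thread's signal mask) and to collect the signal synchronously in one dedicated thread with sigwait(). The sketch below only illustrates that pattern; it is an assumption that it applies here, it is not a confirmed fix for the hang, and the function names are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <signal.h>

/* Block SIGINT in the calling thread. Threads created afterwards
   inherit this mask, so they will not be interrupted by SIGINT. */
static int block_sigint_in_this_thread(void)
{
    sigset_t set;
    sigemptyset(&set);
    sigaddset(&set, SIGINT);
    return pthread_sigmask(SIG_BLOCK, &set, NULL);
}

/* Wait synchronously for SIGINT (it must already be blocked).
   Returns the signal number, or -1 if sigwait() fails. */
static int wait_for_sigint(void)
{
    sigset_t set;
    int signo = 0;
    sigemptyset(&set);
    sigaddset(&set, SIGINT);
    if (sigwait(&set, &signo) != 0)
        return -1;
    return signo;
}
```

In this arrangement the main thread would call block_sigint_in_this_thread() before starting any worker threads, and a single dedicated thread would sit in wait_for_sigint() and request an orderly shutdown when it returns.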