I have a program using MKL 9.1 for DFT and icc 11.1 on 32-bit Linux. If I run a particularly large calculation and hit Ctrl-C in mid-calculation, the Ctrl-C is ignored, and the program then seems to go into an endless loop, using 100% CPU but never returning from the DFT call. It is quite reproducible. The only way to get rid of the zombie at this point is a 'kill -9' or similar.
Attaching with the debugger at this point shows the following stack trace:
#0 0x0050e424 in __kernel_vsyscall ()
#1 0x001eed9c in sched_yield () from /lib/libc.so.6
#2 0x08daaeca in __kmp_yield ()
#3 0x08d94028 in __kmp_join_barrier ()
#4 0x08d94a51 in __kmp_join_call ()
#5 0x08d87b63 in __kmpc_fork_call ()
#6 0x083cc559 in mkl_dft_compute_backward_z_par ()
I can single-step up to the sched_yield(), at which point the debugger also becomes useless.
Any help would be greatly appreciated.
2 Replies
Hi,
As I understand it, the program is reproducibly inside the lengthy DFT call when you hit Ctrl-C. An example program (FFT sizes, call sequence) would help us reproduce the problem.
Installing a signal handler for SIGINT will probably let the program react to the interrupt without leaving zombie processes.
Keep in mind that MKL 9.1 is a rather old version and may have compatibility issues with the more recent OpenMP runtime of icc 11.1. Could you check which shared objects your application depends on (ldd a.out)?
Thanks
Dima
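
The SIGINT handler suggested above could look roughly like the sketch below. It is only an illustration under stated assumptions: the handler sets a flag and the application polls that flag between units of work. The names on_sigint, install_sigint_handler, compute_chunk and run_chunks are hypothetical and not taken from the original program, and compute_chunk merely stands in for one DFT call. A handler like this does not by itself abort a library call that is already in progress.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

/* Set by the handler, polled by the compute loop. */
static volatile sig_atomic_t interrupted = 0;

/* Handler does nothing except set a flag, which is async-signal-safe. */
static void on_sigint(int signo)
{
    (void)signo;
    interrupted = 1;
}

/* Install the handler for SIGINT; returns 0 on success, -1 on failure. */
static int install_sigint_handler(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigint;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;   /* restart interrupted system calls */
    return sigaction(SIGINT, &sa, NULL);
}

/* Hypothetical stand-in for one unit of work (e.g. a single DFT call). */
static void compute_chunk(int i)
{
    (void)i;
}

/* Run up to n chunks, stopping early once Ctrl-C has been seen.
   Returns the number of chunks actually completed. */
static int run_chunks(int n)
{
    int done = 0;
    for (int i = 0; i < n && !interrupted; ++i) {
        compute_chunk(i);
        ++done;
    }
    return done;
}
```

A program would call install_sigint_handler() once at startup and then run its work through a loop like run_chunks(), cleaning up and exiting normally when the flag is set.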
Hi Dima,
I tried a simple example and it does NOT reproduce the problem, so it may have something to do with the fact that our application uses threads and also catches signals. I'll try a few more things and post again when I have more useful info.
Thanks
Howard
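
For an application that both uses threads and catches signals, one common POSIX pattern is to block SIGINT before any threads are created and to receive it synchronously in a single dedicated thread. The sketch below shows that pattern only; nothing in this thread confirms that it is the cause of, or a fix for, the hang described above. The names block_sigint, signal_waiter, start_signal_waiter and stop_requested are hypothetical. Threads created with pthread_create inherit the creating thread's signal mask; whether the OpenMP runtime's worker threads are affected in the same way is an assumption here.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <signal.h>
#include <stdatomic.h>

static sigset_t sigint_set;            /* holds just SIGINT */
static atomic_int stop_requested = 0;  /* polled by worker code */

/* Block SIGINT in the calling thread. Call this before creating any
   other threads so that they inherit the blocked mask. Returns 0 on
   success, an error number otherwise. */
static int block_sigint(void)
{
    sigemptyset(&sigint_set);
    sigaddset(&sigint_set, SIGINT);
    return pthread_sigmask(SIG_BLOCK, &sigint_set, NULL);
}

/* Dedicated thread: waits synchronously for SIGINT, then sets the flag.
   No asynchronous handler runs in any thread. */
static void *signal_waiter(void *arg)
{
    int signo = 0;
    (void)arg;
    if (sigwait(&sigint_set, &signo) == 0 && signo == SIGINT)
        atomic_store(&stop_requested, 1);
    return NULL;
}

/* Start the waiter thread; returns 0 on success. */
static int start_signal_waiter(pthread_t *tid)
{
    return pthread_create(tid, NULL, signal_waiter, NULL);
}
```

The compute code would then check stop_requested between calls and shut down cleanly, so that no thread is interrupted asynchronously while it is inside a library routine.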