Intel® C++ Compiler
Community support and assistance for creating C++ code that runs on platforms based on Intel® processors.

OpenMP hangs on pthread_cond_wait

Javier_T_
Beginner

Hello,

We are trying to speed up a parallel program using the Intel Compiler and the OpenMP library.

We have observed that the program hangs in one of the parallel loops after running fine for 3-4 days. The binary keeps running but stays in a permanent waiting state. Here is the gdb stack trace:

#0 0x00007fefbc1bf705 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1 0x00007fefbc98e9ce in __kmp_suspend_template (th_gtid=<optimized out>, flag=<optimized out>) at ../../src/z_Linux_util.c:1819
#2 __kmp_suspend_64 (th_gtid=-1145176444, flag=0x80) at ../../src/z_Linux_util.c:1874
#3 0x00007fefbc92fe08 in suspend (this=<optimized out>, th_gtid=<optimized out>) at ../../src/kmp_wait_release.h:405
#4 __kmp_wait_template (this_thr=<optimized out>, flag=<optimized out>, final_spin=<optimized out>, itt_sync_obj=<optimized out>) at ../../src/kmp_wait_release.h:224
#5 wait (this=<optimized out>, this_thr=<optimized out>, final_spin=<optimized out>, itt_sync_obj=<optimized out>) at ../../src/kmp_wait_release.h:414
#6 __kmp_hyper_barrier_gather (bt=3149790852, this_thr=0x80, gtid=1, tid=-1, reduce=0x7fefbbbdfe00, itt_sync_obj=0x0) at ../../src/kmp_barrier.cpp:510
#7 0x00007fefbc9331c3 in __kmp_join_barrier (gtid=-1145176444) at ../../src/kmp_barrier.cpp:1364
#8 0x00007fefbc959ee2 in __kmp_internal_join (id=0x7fefbbbdfe84, gtid=128, team=0x1) at ../../src/kmp_runtime.c:7142
#9 0x00007fefbc9609a4 in __kmp_join_call (loc=0x7fefbbbdfe84, gtid=128, exit_teams=1) at ../../src/kmp_runtime.c:2322
#10 0x00007fefbc9345bd in __kmpc_fork_call (loc=0x7fefbbbdfe84, argc=128, microtask=0x1) at ../../src/kmp_csupport.c:326
#11 0x000000000045f4fb in CovarianceMatrixCxx::kalman (......) at G2/CovarianceMatrixCxx.cpp:295
 

So far, we have observed this problem on Scientific Linux 7 (glibc 2.17), but not on Scientific Linux 6 (glibc 2.12), when compiling with "icc -openmp". The behaviour is the same with "icc -fopenmp".

When we use the GNU C++ Compiler with OpenMP (g++ -fopenmp), the program runs fine in both SL6 and SL7. 

 

In summary, this looks like a problem that appears when the Intel Compiler and its OpenMP library are used on SL7. This is the ICC version we are using:

> icc --version
icc (ICC) 15.0.1 20141023

Unfortunately, the program is quite complex and we have not been able to produce a simplified version that reproduces the problem easily. Below is a snippet of the loop that hangs.

Are you aware of any kind of problem similar to this one?

Any help is appreciated,

Javier

#pragma omp parallel for private(i,j,k)
for(i=0; i
{
        v_temp = 0.0;

        for(k=0; k
                j = index_OfNonZero;

                if(j >= i) {
                        v_temp += v_Matrix*v_KalmanVec.d_A;
                } else {
                        v_temp += v_Matrix*v_KalmanVec.d_A;
                }

        }//for k
}//for i
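
When the full program cannot be shared, a stress loop that does nothing but enter parallel regions is one way to check whether a hang depends on the OpenMP runtime rather than on the loop body. The following is a minimal sketch under that assumption; `hammer_parallel_regions` is a hypothetical helper, not code from the program above, and it is not known to reproduce this particular hang.

```cpp
#include <cassert>
#include <cstdint>

// Enters `count` trivial parallel regions back to back and returns how many
// of them completed. If the OpenMP runtime hangs at a fork or join barrier,
// this function never returns, which is the symptom being tested for.
// Built without OpenMP support, the pragmas are ignored and the loop runs serially.
inline std::uint64_t hammer_parallel_regions(std::uint64_t count) {
    std::uint64_t completed = 0;
    for (std::uint64_t n = 0; n < count; ++n) {
        int threads_seen = 0;  // shared by the whole team inside the region
        #pragma omp parallel
        {
            #pragma omp atomic
            ++threads_seen;    // each thread in the team checks in once
        }
        // Reaching this point means the implicit barrier at the end of the
        // region was passed, so at least one thread must have checked in.
        assert(threads_seen >= 1);
        ++completed;
    }
    return completed;
}
```

Calling `hammer_parallel_regions` with a large count from `main`, built with "icc -openmp" or "g++ -fopenmp" as in the post, exercises the same fork/join path that appears in the stack trace without involving any application data.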

Dhairya_M_
Beginner

I have a similar issue and have reproduced it on two machines with different compiler versions

  1. Intel Xeon E5-2687W with icpc version 16.0.0
  2. Intel Xeon E5-2680 with icpc version 15.0.2

I also tried KMP_BLOCKTIME=infinite with intel/16.0, but it had no effect. I used pstack to get the following stack trace:

cafu:~>pstack 10619
Thread 17 (Thread 0x7ffb73f1d700 (LWP 10621)):
#0  0x00007ffb6de9ba82 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007ffb70509e14 in __kmp_launch_monitor (thr=0x7ffb70780a44 <__kmp_wait_cv+4>) at ../../src/z_Linux_util.c:938
#2  0x00007ffb6de97dc5 in start_thread () from /lib64/libpthread.so.0
#3  0x00007ffb6b32fced in clone () from /lib64/libc.so.6
Thread 16 (Thread 0x7ffb653fe780 (LWP 10622)):
#0  0x00007ffb6de9b6d5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007ffb7050eebe in __kmp_suspend_template (th_gtid=<optimized out>, flag=<optimized out>) at ../../src/z_Linux_util.c:1851
#2  __kmp_suspend_64 (th_gtid=1700669828, flag=0x80) at ../../src/z_Linux_util.c:1906
#3  0x00007ffb704ae847 in suspend (this=<optimized out>, th_gtid=<optimized out>) at ../../src/kmp_wait_release.h:549
#4  __kmp_wait_template (this_thr=<optimized out>, flag=<optimized out>, final_spin=<optimized out>, itt_sync_obj=<optimized out>) at ../../src/kmp_wait_release.h:342
#5  wait (this=<optimized out>, this_thr=<optimized out>, final_spin=<optimized out>, itt_sync_obj=<optimized out>) at ../../src/kmp_wait_release.h:558
#6  __kmp_hyper_barrier_release (bt=1700669828, this_thr=0x80, gtid=66617, tid=-1, propagate_icvs=1700669696, itt_sync_obj=0x821c) at ../../src/kmp_barrier.cpp:778
#7  0x00007ffb704afd59 in __kmp_fork_barrier (gtid=1700669828, tid=128) at ../../src/kmp_barrier.cpp:1923
#8  0x00007ffb704d9f20 in __kmp_launch_thread (this_thr=0x7ffb655e2984) at ../../src/kmp_runtime.c:6008
#9  0x00007ffb70508ee3 in __kmp_launch_worker (thr=0x7ffb655e2984) at ../../src/z_Linux_util.c:786
#10 0x00007ffb6de97dc5 in start_thread () from /lib64/libpthread.so.0
#11 0x00007ffb6b32fced in clone () from /lib64/libc.so.6
Thread 15 (Thread 0x7ffb64ffd800 (LWP 10623)):
#0  0x00007ffb6de9b6d5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007ffb7050eebe in __kmp_suspend_template (th_gtid=<optimized out>, flag=<optimized out>) at ../../src/z_Linux_util.c:1851
#2  __kmp_suspend_64 (th_gtid=1700664452, flag=0x80) at ../../src/z_Linux_util.c:1906
#3  0x00007ffb704ae847 in suspend (this=<optimized out>, th_gtid=<optimized out>) at ../../src/kmp_wait_release.h:549
#4  __kmp_wait_template (this_thr=<optimized out>, flag=<optimized out>, final_spin=<optimized out>, itt_sync_obj=<optimized out>) at ../../src/kmp_wait_release.h:342
#5  wait (this=<optimized out>, this_thr=<optimized out>, final_spin=<optimized out>, itt_sync_obj=<optimized out>) at ../../src/kmp_wait_release.h:558
#6  __kmp_hyper_barrier_release (bt=1700664452, this_thr=0x80, gtid=66565, tid=-1, propagate_icvs=1700664320, itt_sync_obj=0x8202) at ../../src/kmp_barrier.cpp:778
#7  0x00007ffb704afd59 in __kmp_fork_barrier (gtid=1700664452, tid=128) at ../../src/kmp_barrier.cpp:1923
#8  0x00007ffb704d9f20 in __kmp_launch_thread (this_thr=0x7ffb655e1484) at ../../src/kmp_runtime.c:6008
#9  0x00007ffb70508ee3 in __kmp_launch_worker (thr=0x7ffb655e1484) at ../../src/z_Linux_util.c:786
#10 0x00007ffb6de97dc5 in start_thread () from /lib64/libpthread.so.0
#11 0x00007ffb6b32fced in clone () from /lib64/libc.so.6
Thread 14 (Thread 0x7ffb64bfc880 (LWP 10624)):
#0  0x00007ffb6de9b6d5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007ffb7050eebe in __kmp_suspend_template (th_gtid=<optimized out>, flag=<optimized out>) at ../../src/z_Linux_util.c:1851
#2  __kmp_suspend_64 (th_gtid=1700593284, flag=0x80) at ../../src/z_Linux_util.c:1906
#3  0x00007ffb704ae847 in suspend (this=<optimized out>, th_gtid=<optimized out>) at ../../src/kmp_wait_release.h:549
#4  __kmp_wait_template (this_thr=<optimized out>, flag=<optimized out>, final_spin=<optimized out>, itt_sync_obj=<optimized out>) at ../../src/kmp_wait_release.h:342
#5  wait (this=<optimized out>, this_thr=<optimized out>, final_spin=<optimized out>, itt_sync_obj=<optimized out>) at ../../src/kmp_wait_release.h:558
#6  __kmp_hyper_barrier_release (bt=1700593284, this_thr=0x80, gtid=66557, tid=-1, propagate_icvs=1700593152, itt_sync_obj=0x81fe) at ../../src/kmp_barrier.cpp:778
#7  0x00007ffb704afd59 in __kmp_fork_barrier (gtid=1700593284, tid=128) at ../../src/kmp_barrier.cpp:1923
#8  0x00007ffb704d9f20 in __kmp_launch_thread (this_thr=0x7ffb655cfe84) at ../../src/kmp_runtime.c:6008
#9  0x00007ffb70508ee3 in __kmp_launch_worker (thr=0x7ffb655cfe84) at ../../src/z_Linux_util.c:786
#10 0x00007ffb6de97dc5 in start_thread () from /lib64/libpthread.so.0
#11 0x00007ffb6b32fced in clone () from /lib64/libc.so.6
Thread 13 (Thread 0x7ffb57efe900 (LWP 10625)):
#0  0x00007ffb6de9b6d5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007ffb7050eebe in __kmp_suspend_template (th_gtid=<optimized out>, flag=<optimized out>) at ../../src/z_Linux_util.c:1851
#2  __kmp_suspend_64 (th_gtid=1700587908, flag=0x80) at ../../src/z_Linux_util.c:1906
#3  0x00007ffb704ab241 in suspend (this=<optimized out>, th_gtid=<optimized out>) at ../../src/kmp_wait_release.h:549
#4  __kmp_wait_template (this_thr=<optimized out>, flag=<optimized out>, final_spin=<optimized out>, itt_sync_obj=<optimized out>) at ../../src/kmp_wait_release.h:342
#5  wait (this=<optimized out>, this_thr=<optimized out>, final_spin=<optimized out>, itt_sync_obj=<optimized out>) at ../../src/kmp_wait_release.h:558
#6  __kmp_hyper_barrier_gather (bt=1700587908, this_thr=0x80, gtid=71613, tid=-1, reduce=0x7ffb655ce900, itt_sync_obj=0x8bde) at ../../src/kmp_barrier.cpp:704
#7  0x00007ffb704af432 in __kmp_join_barrier (gtid=1700587908) at ../../src/kmp_barrier.cpp:1742
#8  0x00007ffb704d9ff8 in __kmp_launch_thread (this_thr=0x7ffb655ce984) at ../../src/kmp_runtime.c:6056
#9  0x00007ffb70508ee3 in __kmp_launch_worker (thr=0x7ffb655ce984) at ../../src/z_Linux_util.c:786
#10 0x00007ffb6de97dc5 in start_thread () from /lib64/libpthread.so.0
#11 0x00007ffb6b32fced in clone () from /lib64/libc.so.6
Thread 12 (Thread 0x7ffb57afd980 (LWP 10626)):
#0  0x00007ffb6de9b6d5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007ffb7050eebe in __kmp_suspend_template (th_gtid=<optimized out>, flag=<optimized out>) at ../../src/z_Linux_util.c:1851
#2  __kmp_suspend_64 (th_gtid=1700582532, flag=0x80) at ../../src/z_Linux_util.c:1906
#3  0x00007ffb704ae847 in suspend (this=<optimized out>, th_gtid=<optimized out>) at ../../src/kmp_wait_release.h:549
#4  __kmp_wait_template (this_thr=<optimized out>, flag=<optimized out>, final_spin=<optimized out>, itt_sync_obj=<optimized out>) at ../../src/kmp_wait_release.h:342
#5  wait (this=<optimized out>, this_thr=<optimized out>, final_spin=<optimized out>, itt_sync_obj=<optimized out>) at ../../src/kmp_wait_release.h:558
#6  __kmp_hyper_barrier_release (bt=1700582532, this_thr=0x80, gtid=66627, tid=-1, propagate_icvs=1700582400, itt_sync_obj=0x8221) at ../../src/kmp_barrier.cpp:778
#7  0x00007ffb704afd59 in __kmp_fork_barrier (gtid=1700582532, tid=128) at ../../src/kmp_barrier.cpp:1923
#8  0x00007ffb704d9f20 in __kmp_launch_thread (this_thr=0x7ffb655cd484) at ../../src/kmp_runtime.c:6008
#9  0x00007ffb70508ee3 in __kmp_launch_worker (thr=0x7ffb655cd484) at ../../src/z_Linux_util.c:786
#10 0x00007ffb6de97dc5 in start_thread () from /lib64/libpthread.so.0
#11 0x00007ffb6b32fced in clone () from /lib64/libc.so.6
Thread 11 (Thread 0x7ffb576fca00 (LWP 10627)):
#0  0x00007ffb6de9b6d5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007ffb7050eebe in __kmp_suspend_template (th_gtid=<optimized out>, flag=<optimized out>) at ../../src/z_Linux_util.c:1851
#2  __kmp_suspend_64 (th_gtid=1700560516, flag=0x80) at ../../src/z_Linux_util.c:1906
#3  0x00007ffb704ae847 in suspend (this=<optimized out>, th_gtid=<optimized out>) at ../../src/kmp_wait_release.h:549
#4  __kmp_wait_template (this_thr=<optimized out>, flag=<optimized out>, final_spin=<optimized out>, itt_sync_obj=<optimized out>) at ../../src/kmp_wait_release.h:342
#5  wait (this=<optimized out>, this_thr=<optimized out>, final_spin=<optimized out>, itt_sync_obj=<optimized out>) at ../../src/kmp_wait_release.h:558
#6  __kmp_hyper_barrier_release (bt=1700560516, this_thr=0x80, gtid=66551, tid=-1, propagate_icvs=1700560384, itt_sync_obj=0x81fb) at ../../src/kmp_barrier.cpp:778
#7  0x00007ffb704afd59 in __kmp_fork_barrier (gtid=1700560516, tid=128) at ../../src/kmp_barrier.cpp:1923
#8  0x00007ffb704d9f20 in __kmp_launch_thread (this_thr=0x7ffb655c7e84) at ../../src/kmp_runtime.c:6008
#9  0x00007ffb70508ee3 in __kmp_launch_worker (thr=0x7ffb655c7e84) at ../../src/z_Linux_util.c:786
#10 0x00007ffb6de97dc5 in start_thread () from /lib64/libpthread.so.0
#11 0x00007ffb6b32fced in clone () from /lib64/libc.so.6
Thread 10 (Thread 0x7ffb572fba80 (LWP 10628)):
#0  0x00007ffb6de9b6d5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007ffb7050eebe in __kmp_suspend_template (th_gtid=<optimized out>, flag=<optimized out>) at ../../src/z_Linux_util.c:1851
#2  __kmp_suspend_64 (th_gtid=1700555140, flag=0x80) at ../../src/z_Linux_util.c:1906
#3  0x00007ffb704ae847 in suspend (this=<optimized out>, th_gtid=<optimized out>) at ../../src/kmp_wait_release.h:549
#4  __kmp_wait_template (this_thr=<optimized out>, flag=<optimized out>, final_spin=<optimized out>, itt_sync_obj=<optimized out>) at ../../src/kmp_wait_release.h:342
#5  wait (this=<optimized out>, this_thr=<optimized out>, final_spin=<optimized out>, itt_sync_obj=<optimized out>) at ../../src/kmp_wait_release.h:558
#6  __kmp_hyper_barrier_release (bt=1700555140, this_thr=0x80, gtid=19589, tid=-1, propagate_icvs=1700555008, itt_sync_obj=0x2642) at ../../src/kmp_barrier.cpp:778
#7  0x00007ffb704afd59 in __kmp_fork_barrier (gtid=1700555140, tid=128) at ../../src/kmp_barrier.cpp:1923
#8  0x00007ffb704d9f20 in __kmp_launch_thread (this_thr=0x7ffb655c6984) at ../../src/kmp_runtime.c:6008
#9  0x00007ffb70508ee3 in __kmp_launch_worker (thr=0x7ffb655c6984) at ../../src/z_Linux_util.c:786
#10 0x00007ffb6de97dc5 in start_thread () from /lib64/libpthread.so.0
#11 0x00007ffb6b32fced in clone () from /lib64/libc.so.6
Thread 9 (Thread 0x7ffb56efab00 (LWP 10629)):
#0  0x00007ffb6de9b6d5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007ffb7050eebe in __kmp_suspend_template (th_gtid=<optimized out>, flag=<optimized out>) at ../../src/z_Linux_util.c:1851
#2  __kmp_suspend_64 (th_gtid=1700549764, flag=0x80) at ../../src/z_Linux_util.c:1906
#3  0x00007ffb704ae847 in suspend (this=<optimized out>, th_gtid=<optimized out>) at ../../src/kmp_wait_release.h:549
#4  __kmp_wait_template (this_thr=<optimized out>, flag=<optimized out>, final_spin=<optimized out>, itt_sync_obj=<optimized out>) at ../../src/kmp_wait_release.h:342
#5  wait (this=<optimized out>, this_thr=<optimized out>, final_spin=<optimized out>, itt_sync_obj=<optimized out>) at ../../src/kmp_wait_release.h:558
#6  __kmp_hyper_barrier_release (bt=1700549764, this_thr=0x80, gtid=66457, tid=-1, propagate_icvs=1700549632, itt_sync_obj=0x81cc) at ../../src/kmp_barrier.cpp:778
#7  0x00007ffb704afd59 in __kmp_fork_barrier (gtid=1700549764, tid=128) at ../../src/kmp_barrier.cpp:1923
#8  0x00007ffb704d9f20 in __kmp_launch_thread (this_thr=0x7ffb655c5484) at ../../src/kmp_runtime.c:6008
#9  0x00007ffb70508ee3 in __kmp_launch_worker (thr=0x7ffb655c5484) at ../../src/z_Linux_util.c:786
#10 0x00007ffb6de97dc5 in start_thread () from /lib64/libpthread.so.0
#11 0x00007ffb6b32fced in clone () from /lib64/libc.so.6
Thread 8 (Thread 0x7ffb56af9b80 (LWP 10630)):
#0  0x00007ffb6de9b6d5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007ffb7050eebe in __kmp_suspend_template (th_gtid=<optimized out>, flag=<optimized out>) at ../../src/z_Linux_util.c:1851
#2  __kmp_suspend_64 (th_gtid=1681800836, flag=0x80) at ../../src/z_Linux_util.c:1906
#3  0x00007ffb704ae847 in suspend (this=<optimized out>, th_gtid=<optimized out>) at ../../src/kmp_wait_release.h:549
#4  __kmp_wait_template (this_thr=<optimized out>, flag=<optimized out>, final_spin=<optimized out>, itt_sync_obj=<optimized out>) at ../../src/kmp_wait_release.h:342
#5  wait (this=<optimized out>, this_thr=<optimized out>, final_spin=<optimized out>, itt_sync_obj=<optimized out>) at ../../src/kmp_wait_release.h:558
#6  __kmp_hyper_barrier_release (bt=1681800836, this_thr=0x80, gtid=66617, tid=-1, propagate_icvs=1681800704, itt_sync_obj=0x821c) at ../../src/kmp_barrier.cpp:778
#7  0x00007ffb704afd59 in __kmp_fork_barrier (gtid=1681800836, tid=128) at ../../src/kmp_barrier.cpp:1923
#8  0x00007ffb704d9f20 in __kmp_launch_thread (this_thr=0x7ffb643e3e84) at ../../src/kmp_runtime.c:6008
#9  0x00007ffb70508ee3 in __kmp_launch_worker (thr=0x7ffb643e3e84) at ../../src/z_Linux_util.c:786
#10 0x00007ffb6de97dc5 in start_thread () from /lib64/libpthread.so.0
#11 0x00007ffb6b32fced in clone () from /lib64/libc.so.6
Thread 7 (Thread 0x7ffb566f8c00 (LWP 10631)):
#0  0x00007ffb6de9b6d5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007ffb7050eebe in __kmp_suspend_template (th_gtid=<optimized out>, flag=<optimized out>) at ../../src/z_Linux_util.c:1851
#2  __kmp_suspend_64 (th_gtid=1681795460, flag=0x80) at ../../src/z_Linux_util.c:1906
#3  0x00007ffb704ae847 in suspend (this=<optimized out>, th_gtid=<optimized out>) at ../../src/kmp_wait_release.h:549
#4  __kmp_wait_template (this_thr=<optimized out>, flag=<optimized out>, final_spin=<optimized out>, itt_sync_obj=<optimized out>) at ../../src/kmp_wait_release.h:342
#5  wait (this=<optimized out>, this_thr=<optimized out>, final_spin=<optimized out>, itt_sync_obj=<optimized out>) at ../../src/kmp_wait_release.h:558
#6  __kmp_hyper_barrier_release (bt=1681795460, this_thr=0x80, gtid=66617, tid=-1, propagate_icvs=1681795328, itt_sync_obj=0x821c) at ../../src/kmp_barrier.cpp:778
#7  0x00007ffb704afd59 in __kmp_fork_barrier (gtid=1681795460, tid=128) at ../../src/kmp_barrier.cpp:1923
#8  0x00007ffb704d9f20 in __kmp_launch_thread (this_thr=0x7ffb643e2984) at ../../src/kmp_runtime.c:6008
#9  0x00007ffb70508ee3 in __kmp_launch_worker (thr=0x7ffb643e2984) at ../../src/z_Linux_util.c:786
#10 0x00007ffb6de97dc5 in start_thread () from /lib64/libpthread.so.0
#11 0x00007ffb6b32fced in clone () from /lib64/libc.so.6
Thread 6 (Thread 0x7ffb562f7c80 (LWP 10632)):
#0  0x00007ffb6de9b6d5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007ffb7050eebe in __kmp_suspend_template (th_gtid=<optimized out>, flag=<optimized out>) at ../../src/z_Linux_util.c:1851
#2  __kmp_suspend_64 (th_gtid=1681790084, flag=0x80) at ../../src/z_Linux_util.c:1906
#3  0x00007ffb704ae847 in suspend (this=<optimized out>, th_gtid=<optimized out>) at ../../src/kmp_wait_release.h:549
#4  __kmp_wait_template (this_thr=<optimized out>, flag=<optimized out>, final_spin=<optimized out>, itt_sync_obj=<optimized out>) at ../../src/kmp_wait_release.h:342
#5  wait (this=<optimized out>, this_thr=<optimized out>, final_spin=<optimized out>, itt_sync_obj=<optimized out>) at ../../src/kmp_wait_release.h:558
#6  __kmp_hyper_barrier_release (bt=1681790084, this_thr=0x80, gtid=66597, tid=-1, propagate_icvs=1681789952, itt_sync_obj=0x8212) at ../../src/kmp_barrier.cpp:778
#7  0x00007ffb704afd59 in __kmp_fork_barrier (gtid=1681790084, tid=128) at ../../src/kmp_barrier.cpp:1923
#8  0x00007ffb704d9f20 in __kmp_launch_thread (this_thr=0x7ffb643e1484) at ../../src/kmp_runtime.c:6008
#9  0x00007ffb70508ee3 in __kmp_launch_worker (thr=0x7ffb643e1484) at ../../src/z_Linux_util.c:786
#10 0x00007ffb6de97dc5 in start_thread () from /lib64/libpthread.so.0
#11 0x00007ffb6b32fced in clone () from /lib64/libc.so.6
Thread 5 (Thread 0x7ffb55ef6d00 (LWP 10633)):
#0  0x00007ffb6de9b6d5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007ffb7050eebe in __kmp_suspend_template (th_gtid=<optimized out>, flag=<optimized out>) at ../../src/z_Linux_util.c:1851
#2  __kmp_suspend_64 (th_gtid=1681669764, flag=0x80) at ../../src/z_Linux_util.c:1906
#3  0x00007ffb704ab241 in suspend (this=<optimized out>, th_gtid=<optimized out>) at ../../src/kmp_wait_release.h:549
#4  __kmp_wait_template (this_thr=<optimized out>, flag=<optimized out>, final_spin=<optimized out>, itt_sync_obj=<optimized out>) at ../../src/kmp_wait_release.h:342
#5  wait (this=<optimized out>, this_thr=<optimized out>, final_spin=<optimized out>, itt_sync_obj=<optimized out>) at ../../src/kmp_wait_release.h:558
#6  __kmp_hyper_barrier_gather (bt=1681669764, this_thr=0x80, gtid=71615, tid=-1, reduce=0x7ffb643c3e00, itt_sync_obj=0x8bdf) at ../../src/kmp_barrier.cpp:704
#7  0x00007ffb704af432 in __kmp_join_barrier (gtid=1681669764) at ../../src/kmp_barrier.cpp:1742
#8  0x00007ffb704d9ff8 in __kmp_launch_thread (this_thr=0x7ffb643c3e84) at ../../src/kmp_runtime.c:6056
#9  0x00007ffb70508ee3 in __kmp_launch_worker (thr=0x7ffb643c3e84) at ../../src/z_Linux_util.c:786
#10 0x00007ffb6de97dc5 in start_thread () from /lib64/libpthread.so.0
#11 0x00007ffb6b32fced in clone () from /lib64/libc.so.6
Thread 4 (Thread 0x7ffb55af5d80 (LWP 10634)):
#0  0x00007ffb6de9b6d5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007ffb7050eebe in __kmp_suspend_template (th_gtid=<optimized out>, flag=<optimized out>) at ../../src/z_Linux_util.c:1851
#2  __kmp_suspend_64 (th_gtid=1681664388, flag=0x80) at ../../src/z_Linux_util.c:1906
#3  0x00007ffb704ae847 in suspend (this=<optimized out>, th_gtid=<optimized out>) at ../../src/kmp_wait_release.h:549
#4  __kmp_wait_template (this_thr=<optimized out>, flag=<optimized out>, final_spin=<optimized out>, itt_sync_obj=<optimized out>) at ../../src/kmp_wait_release.h:342
#5  wait (this=<optimized out>, this_thr=<optimized out>, final_spin=<optimized out>, itt_sync_obj=<optimized out>) at ../../src/kmp_wait_release.h:558
#6  __kmp_hyper_barrier_release (bt=1681664388, this_thr=0x80, gtid=66651, tid=-1, propagate_icvs=1681664256, itt_sync_obj=0x822d) at ../../src/kmp_barrier.cpp:778
#7  0x00007ffb704afd59 in __kmp_fork_barrier (gtid=1681664388, tid=128) at ../../src/kmp_barrier.cpp:1923
#8  0x00007ffb704d9f20 in __kmp_launch_thread (this_thr=0x7ffb643c2984) at ../../src/kmp_runtime.c:6008
#9  0x00007ffb70508ee3 in __kmp_launch_worker (thr=0x7ffb643c2984) at ../../src/z_Linux_util.c:786
#10 0x00007ffb6de97dc5 in start_thread () from /lib64/libpthread.so.0
#11 0x00007ffb6b32fced in clone () from /lib64/libc.so.6
Thread 3 (Thread 0x7ffb556f4e00 (LWP 10635)):
#0  0x00007ffb6de9b6d5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007ffb7050eebe in __kmp_suspend_template (th_gtid=<optimized out>, flag=<optimized out>) at ../../src/z_Linux_util.c:1851
#2  __kmp_suspend_64 (th_gtid=1681659012, flag=0x80) at ../../src/z_Linux_util.c:1906
#3  0x00007ffb704ae847 in suspend (this=<optimized out>, th_gtid=<optimized out>) at ../../src/kmp_wait_release.h:549
#4  __kmp_wait_template (this_thr=<optimized out>, flag=<optimized out>, final_spin=<optimized out>, itt_sync_obj=<optimized out>) at ../../src/kmp_wait_release.h:342
#5  wait (this=<optimized out>, this_thr=<optimized out>, final_spin=<optimized out>, itt_sync_obj=<optimized out>) at ../../src/kmp_wait_release.h:558
#6  __kmp_hyper_barrier_release (bt=1681659012, this_thr=0x80, gtid=66615, tid=-1, propagate_icvs=1681658880, itt_sync_obj=0x821b) at ../../src/kmp_barrier.cpp:778
#7  0x00007ffb704afd59 in __kmp_fork_barrier (gtid=1681659012, tid=128) at ../../src/kmp_barrier.cpp:1923
#8  0x00007ffb704d9f20 in __kmp_launch_thread (this_thr=0x7ffb643c1484) at ../../src/kmp_runtime.c:6008
#9  0x00007ffb70508ee3 in __kmp_launch_worker (thr=0x7ffb643c1484) at ../../src/z_Linux_util.c:786
#10 0x00007ffb6de97dc5 in start_thread () from /lib64/libpthread.so.0
#11 0x00007ffb6b32fced in clone () from /lib64/libc.so.6
Thread 2 (Thread 0x7ffb552f3e80 (LWP 10636)):
#0  0x00007ffb6de9b6d5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007ffb7050eebe in __kmp_suspend_template (th_gtid=<optimized out>, flag=<optimized out>) at ../../src/z_Linux_util.c:1851
#2  __kmp_suspend_64 (th_gtid=1686027908, flag=0x80) at ../../src/z_Linux_util.c:1906
#3  0x00007ffb704ae847 in suspend (this=<optimized out>, th_gtid=<optimized out>) at ../../src/kmp_wait_release.h:549
#4  __kmp_wait_template (this_thr=<optimized out>, flag=<optimized out>, final_spin=<optimized out>, itt_sync_obj=<optimized out>) at ../../src/kmp_wait_release.h:342
#5  wait (this=<optimized out>, this_thr=<optimized out>, final_spin=<optimized out>, itt_sync_obj=<optimized out>) at ../../src/kmp_wait_release.h:558
#6  __kmp_hyper_barrier_release (bt=1686027908, this_thr=0x80, gtid=19533, tid=-1, propagate_icvs=1686027776, itt_sync_obj=0x2626) at ../../src/kmp_barrier.cpp:778
#7  0x00007ffb704afd59 in __kmp_fork_barrier (gtid=1686027908, tid=128) at ../../src/kmp_barrier.cpp:1923
#8  0x00007ffb704d9f20 in __kmp_launch_thread (this_thr=0x7ffb647ebe84) at ../../src/kmp_runtime.c:6008
#9  0x00007ffb70508ee3 in __kmp_launch_worker (thr=0x7ffb647ebe84) at ../../src/z_Linux_util.c:786
#10 0x00007ffb6de97dc5 in start_thread () from /lib64/libpthread.so.0
#11 0x00007ffb6b32fced in clone () from /lib64/libc.so.6
Thread 1 (Thread 0x7ffb73ef7d80 (LWP 10619)):
#0  0x00007ffb6de9b6d5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007ffb7050eebe in __kmp_suspend_template (th_gtid=<optimized out>, flag=<optimized out>) at ../../src/z_Linux_util.c:1851
#2  __kmp_suspend_64 (th_gtid=1700673412, flag=0x80) at ../../src/z_Linux_util.c:1906
#3  0x00007ffb704ab241 in suspend (this=<optimized out>, th_gtid=<optimized out>) at ../../src/kmp_wait_release.h:549
#4  __kmp_wait_template (this_thr=<optimized out>, flag=<optimized out>, final_spin=<optimized out>, itt_sync_obj=<optimized out>) at ../../src/kmp_wait_release.h:342
#5  wait (this=<optimized out>, this_thr=<optimized out>, final_spin=<optimized out>, itt_sync_obj=<optimized out>) at ../../src/kmp_wait_release.h:558
#6  __kmp_hyper_barrier_gather (bt=1700673412, this_thr=0x80, gtid=52113, tid=-1, reduce=0x7ffb655e3700, itt_sync_obj=0x65c8) at ../../src/kmp_barrier.cpp:704
#7  0x00007ffb704af432 in __kmp_join_barrier (gtid=1700673412) at ../../src/kmp_barrier.cpp:1742
#8  0x00007ffb704d64d2 in __kmp_internal_join (id=0x7ffb655e3784, gtid=128, team=0xcb91) at ../../src/kmp_runtime.c:7791
#9  0x00007ffb704dcbc6 in __kmp_join_call (loc=0x7ffb655e3784, gtid=128, exit_teams=52113) at ../../src/kmp_runtime.c:2681
#10 0x00007ffb704b06e1 in __kmpc_fork_call (loc=0x7ffb655e3784, argc=128, microtask=0xcb91) at ../../src/kmp_csupport.c:342
#11 0x00000000004e5676 in SphericalHarmonics<double>::RotateAll(pvfmm::Vector<double> const&, long, long, pvfmm::Vector<double>&) ()
#12 0x00000000004eae5e in void SphericalHarmonics<double>::StokesSingularInteg_<false, true>(pvfmm::Vector<double> const&, long, long, pvfmm::Vector<double>&, pvfmm::Vector<double>&) ()
#13 0x00000000004d71ee in SphericalHarmonics<double>::StokesSingularInteg(pvfmm::Vector<double> const&, long, long, pvfmm::Vector<double>*, pvfmm::Vector<double>*) ()
#14 0x00000000004cb657 in StokesVelocity<double>::operator()() ()
#15 0x00000000005fabcb in StokesVelocity<double>::MonitorError() ()
#16 0x00000000004c2bfa in InterfacialVelocity<Surface<Scalars<double, Device<(DeviceType)0>, cpu>, Vectors<double, Device<(DeviceType)0>, cpu> >, VesInteraction<double> >::Prepare(SolverScheme const&) const ()
#17 0x00000000004c0ea7 in InterfacialVelocity<Surface<Scalars<double, Device<(DeviceType)0>, cpu>, Vectors<double, Device<(DeviceType)0>, cpu> >, VesInteraction<double> >::updateImplicit(Surface<Scalars<double, Device<(DeviceType)0>, cpu>, Vectors<double, Device<(DeviceType)0>, cpu> > const&, double const&, Vectors<double, Device<(DeviceType)0>, cpu>&) ()
#18 0x0000000000420548 in EvolveSurface<double, Device<(DeviceType)0>, cpu, VesInteraction<double>, Repartition<double> >::Evolve() ()
#19 0x0000000000409c37 in main ()
jimdempseyatthecove
Honored Contributor III

Please read posts #8 and #10

Your call stack is listing source lines (debug info) for kmp_... routines.
And the dump does not show "... from ... /libiomp5.so"

Where are your __kmp_... functions located? They should be in libiomp5.so, under the /opt/intel/compilers_and_libraries... path.

Jim Dempsey

Martyn_C_Intel
Employee

As per my previous post, please use the version 16.0.1 compiler or later.

Dhairya_M_
Beginner

The files libiomp5.a, libiomp5.dbg, libiomp5.so are located in: /opt/apps/sysnet/intel/16/compilers_and_libraries_2016.0.109/linux/compiler/lib/intel64_lin/

Additional information: CPU: dual socket Intel Xeon CPU E5-2687W

Operating system: CentOS Linux release 7.2.1511 (Core)

Kernel version: 3.10.0-327.18.2.el7.x86_64

I get the following information from setting KMP_VERSION environment variable:

Intel(R) OMP Copyright (C) 1997-2015, Intel Corporation. All Rights Reserved.
Intel(R) OMP version: 5.0.20150609
Intel(R) OMP library type: performance
Intel(R) OMP link type: dynamic
Intel(R) OMP build time: 2015-06-09 16:42:18 UTC
Intel(R) OMP build compiler: Intel C++ Compiler 15.0
Intel(R) OMP alternative compiler support: yes
Intel(R) OMP API version: 4.0 (201307)
Intel(R) OMP dynamic error checking: no
Intel(R) OMP thread affinity support: not used
Intel(R) OMP debugger support version: 1.1
Intel(R) OMP Intel(R) RML support: not using
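
The date code in the `API version: 4.0 (201307)` line follows the same yyyymm convention as the standard `_OPENMP` macro. As a small illustration, the hypothetical helper below maps the published date codes to specification versions; it is a lookup sketch, not part of any Intel library.

```cpp
// Maps a yyyymm date code (the form used by the standard _OPENMP macro) to
// the OpenMP specification version it denotes. Codes that do not match a
// published specification yield "unknown".
constexpr const char* openmp_spec_from_date(long yyyymm) {
    return yyyymm == 200505 ? "2.5"
         : yyyymm == 200805 ? "3.0"
         : yyyymm == 201107 ? "3.1"
         : yyyymm == 201307 ? "4.0"
         : yyyymm == 201511 ? "4.5"
         : yyyymm == 201811 ? "5.0"
         : yyyymm == 202011 ? "5.1"
         : yyyymm == 202111 ? "5.2"
         : "unknown";
}
```

In a translation unit compiled with OpenMP enabled, `openmp_spec_from_date(_OPENMP)` gives the version the compiler targets, which can then be compared with what `KMP_VERSION` reports for the runtime library.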

 

Dhairya_M_
Beginner

Thanks, Martyn. I was hoping there would be a workaround (like KMP_BLOCKTIME=infinite, which didn't work). I am running the code on the Stampede supercomputer at TACC and it may be several months before the system administrators install the newer compiler version.

Martyn_C_Intel
Employee

No, this was a bug where a counter could overflow 32 bits, so there isn't a workaround (except to reduce the number of times you invoke the parallel region). But the compiler with the fix was released in November 2015, which is a while ago (there have been two more updates since). So I think you could make a good case to the Stampede administrators for an update (which can be side-by-side, so no effect on existing compilers).
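
To get a feel for why a 32-bit overflow would only show up after days of running, here is a rough back-of-the-envelope sketch. It assumes, purely for illustration, that the counter advances a fixed number of times per parallel region; the thread does not say which counter is involved or how fast it grows. The names `bump`, `seconds_until_wrap` and `days_until_wrap` are hypothetical.

```cpp
#include <cstdint>

// Number of distinct values a 32-bit unsigned counter holds before it wraps.
constexpr std::uint64_t kWrap32 = std::uint64_t{1} << 32;  // 4294967296

// Incrementing a 32-bit unsigned counter: the maximum value wraps to zero.
constexpr std::uint32_t bump(std::uint32_t c) {
    return static_cast<std::uint32_t>(c + 1u);
}

// Assumption (not stated in the thread): the counter advances
// `steps_per_region` times for every parallel region the program enters.
constexpr double seconds_until_wrap(double regions_per_second,
                                    double steps_per_region = 1.0) {
    return static_cast<double>(kWrap32) / (regions_per_second * steps_per_region);
}

// Same estimate expressed in days (86400 seconds per day).
constexpr double days_until_wrap(double regions_per_second,
                                 double steps_per_region = 1.0) {
    return seconds_until_wrap(regions_per_second, steps_per_region) / 86400.0;
}
```

For example, `days_until_wrap(15000.0)` evaluates to roughly 3.3 days, so a program entering on the order of ten thousand parallel regions per second could plausibly reach the wrap after a few days. The rate is an illustrative figure, not a measurement from either program in this thread.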

leyong_t_
Beginner

Hi Everyone, 

Does this bug affect programs linked against an older MKL library (MKL build: 20150730, MKL version: 11.3, MKL Update version: 0)?

I am seeing a very similar issue in my program: it hangs after a while. My program is compiled with GCC 4.9 and linked against Intel MKL and TBB.

Will updating to the latest MKL and TBB libraries fix this? Thanks.

The stack traces of the hung threads look like this:

Thread 405 (LWP 86140):
#0  0x00007f262f3b7d84 in pthread_cond_wait@@GLIBC_2.3.2 () from /local/rekognition/processor/lib/libpthread.so.0
#1  0x00007f262f65a49b in __kmp_suspend_template<kmp_flag_64> (flag=0x7f206cdda630, 
th_gtid=<optimized out>) at /local/p4clients/pkgbuild-AtJyq/workspace/src/OpenMP/build/private/src/src/z_Linux_util.c:1798
#2  __kmp_suspend_64 (th_gtid=<optimized out>, flag=0x7f206cdda630)
    at /local/p4clients/pkgbuild-AtJyq/workspace/src/OpenMP/build/private/src/src/z_Linux_util.c:1847
#3  0x00007f262f66985e in kmp_flag_64::suspend (
th_gtid=8, this=0x7f206cdda630) at /local/p4clients/pkgbuild-AtJyq/workspace/src/OpenMP/build/private/src/src/kmp_wait_release.h:466
#4  __kmp_wait_template<kmp_flag_64> (final_spin=0, itt_sync_obj=0x0, flag=0x7f206cdda630, this_thr=0x7f2294766e80)
    at /local/p4clients/pkgbuild-AtJyq/workspace/src/OpenMP/build/private/src/src/kmp_wait_release.h:258
#5  kmp_flag_64::wait (itt_sync_obj=0x0, final_spin=0, this_thr=0x7f2294766e80, this=0x7f206cdda630)
    at /local/p4clients/pkgbuild-AtJyq/workspace/src/OpenMP/build/private/src/src/kmp_wait_release.h:476
#6  __kmp_hyper_barrier_gather (bt=bs_forkjoin_barrier, reduce=0x0, gtid=8, itt_sync_obj=0x0, tid=0, this_thr=0x7f2294766e80)
    at /local/p4clients/pkgbuild-AtJyq/workspace/src/OpenMP/build/private/src/src/kmp_barrier.cpp:507
#7  __kmp_join_barrier (gtid=gtid@entry=8)
    at /local/p4clients/pkgbuild-AtJyq/workspace/src/OpenMP/build/private/src/src/kmp_barrier.cpp:1458
#8  0x00007f262f63d378 in __kmp_internal_join (
id=<optimized out>, gtid=8, team=0x7f2294761700)
    at /local/p4clients/pkgbuild-AtJyq/workspace/src/OpenMP/build/private/src/src/kmp_runtime.c:7114
#9  0x00007f262f63d696 in __kmp_join_call (loc=0x7f25d77ed6f4 <.2.96_2_kmpc_loc_struct_pack.131>, gtid=8, exit_teams=0)
    at /local/p4clients/pkgbuild-AtJyq/workspace/src/OpenMP/build/private/src/src/kmp_runtime.c:2395
#10 0x00007f262f62f085 in __kmpc_fork_call (loc=0x7f25d77ed6f4 <.2.96_2_kmpc_loc_struct_pack.131>, argc=9, microtask=0x7f25d685ed48 <gemm_omp_driver_v2+552>)
    at /local/p4clients/pkgbuild-AtJyq/workspace/src/OpenMP/build/private/src/src/kmp_csupport.c:356
#11 0x00007f25d685ec71 in gemm_omp_driver_v2 () from /local/rekognition/processor/lib/libmkl_intel_thread.so
#12 0x00007f25d685e70f in mkl_blas_sgemm_host () from /local/rekognition/processor/lib/libmkl_intel_thread.so
#13 0x00007f25d68865f5 in mkl_blas_sgemm () from /local/rekognition/processor/lib/libmkl_intel_thread.so
#14 0x00007f25d5db135f in sgemm_ () from /local/rekognition/processor/lib/libmkl_intel_lp64.so
#15 0x00007f262f7c6610 in sgemm_ () from /local/rekognition/processor/lib/libmkl_rt.so
#16 0x00007f25d5df1154 in cblas_sgemm () from /local/rekognition/processor/lib/libmkl_intel_lp64.so
#17 0x000000000074f03b in void caffe::caffe_cpu_gemm<float>(CBLAS_TRANSPOSE, CBLAS_TRANSPOSE, int, int, int, float, float const*, float const*, float, float*) ()

=================================================================

And like this:

Thread 627 (LWP 86391):
#0  0x00007f262f3b7d84 in pthread_cond_wait@@GLIBC_2.3.2 () from /local/rekognition/processor/lib/libpthread.so.0
#1  0x00007f262f65a49b in __kmp_suspend_template<kmp_flag_64> (flag=0x7f21447f5310, th_gtid=<optimized out>)
    at /local/p4clients/pkgbuild-AtJyq/workspace/src/OpenMP/build/private/src/src/z_Linux_util.c:1798
#2  __kmp_suspend_64 (th_gtid=<optimized out>, flag=0x7f21447f5310)
    at /local/p4clients/pkgbuild-AtJyq/workspace/src/OpenMP/build/private/src/src/z_Linux_util.c:1847
#3  0x00007f262f66c0f4 in kmp_flag_64::suspend (th_gtid=102, this=0x7f21447f5310)
    at /local/p4clients/pkgbuild-AtJyq/workspace/src/OpenMP/build/private/src/src/kmp_wait_release.h:466
#4  __kmp_wait_template<kmp_flag_64> (itt_sync_obj=<optimized out>, final_spin=1, flag=0x7f21447f5310, this_thr=0x7f1e97361ec0)
    at /local/p4clients/pkgbuild-AtJyq/workspace/src/OpenMP/build/private/src/src/kmp_wait_release.h:258
#5  kmp_flag_64::wait (itt_sync_obj=<optimized out>, final_spin=1, this_thr=0x7f1e97361ec0, this=0x7f21447f5310)
    at /local/p4clients/pkgbuild-AtJyq/workspace/src/OpenMP/build/private/src/src/kmp_wait_release.h:476
#6  __kmp_hyper_barrier_release (itt_sync_obj=<optimized out>, propagate_icvs=1, tid=-2, gtid=102, this_thr=0x7f1e97361ec0, 
    bt=bs_forkjoin_barrier) at /local/p4clients/pkgbuild-AtJyq/workspace/src/OpenMP/build/private/src/src/kmp_barrier.cpp:577
#7  __kmp_fork_barrier (gtid=gtid@entry=102, tid=tid@entry=-2)
    at /local/p4clients/pkgbuild-AtJyq/workspace/src/OpenMP/build/private/src/src/kmp_barrier.cpp:1627
#8  0x00007f262f639c9f in __kmp_launch_thread (this_thr=0x7f1e97361ec0)
    at /local/p4clients/pkgbuild-AtJyq/workspace/src/OpenMP/build/private/src/src/kmp_runtime.c:5521
#9  0x00007f262f65b8d2 in __kmp_launch_worker (thr=0x7f1e97361ec0)
    at /local/p4clients/pkgbuild-AtJyq/workspace/src/OpenMP/build/private/src/src/z_Linux_util.c:767
#10 0x00007f262f3b3e9a in start_thread () from /local/rekognition/processor/lib/libpthread.so.0
#11 0x00007f2628e99c9d in clone () from /lib64/libc.so.6
Martyn_C_Intel
Employee
1,040 Views

This bug was not in MKL as such, so updating MKL (or TBB) won't help by itself. The bug was in the OpenMP run-time library, libiomp5, so it depends on which version of that library you are using. (You can get an idea by setting KMP_VERSION=yes in the environment before you run your program.) You should use the version of libiomp5.so shipped with the version 16.0.1 compiler or later; if you don't have the compiler, you can download the corresponding "redistributables", which include the OpenMP runtime. You can get these at https://software.intel.com/en-us/articles/redistributables-for-intel-parallel-studio-xe-2016-composer-edition-for-linux . I would try the package for 16.0 update 3.

If you are using TBB for other reasons, I believe it's possible to link to a version of MKL that uses TBB instead of OpenMP. That might reduce the likelihood of scheduling conflicts between TBB threads and OpenMP threads. I'm not very knowledgeable about that, though.
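
For what it's worth, if the application loads MKL through the single dynamic library (libmkl_rt.so, which does appear in the trace above), the threading layer can be chosen at run time rather than at link time. Here is a minimal sketch, assuming an MKL version that ships a TBB threading layer and that MKL reads the MKL_THREADING_LAYER environment variable on its first call. The helper name request_mkl_tbb_threading is made up for illustration.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical helper (not part of MKL): ask libmkl_rt.so to use its TBB
// threading layer instead of the OpenMP one (libiomp5).
// Assumptions: the program is linked against libmkl_rt.so, and this runs
// before the first MKL call, since the layer is selected only once.
bool request_mkl_tbb_threading() {
    // setenv is POSIX; the last argument (1) overwrites an existing value.
    return setenv("MKL_THREADING_LAYER", "TBB", 1) == 0;
}
```

You would call this at the very start of main(), before the first cblas_sgemm. Exporting MKL_THREADING_LAYER=TBB in the shell before launching should have the same effect, and mkl_set_threading_layer(MKL_THREADING_TBB) from mkl.h is the programmatic equivalent. Whether this avoids the hang is untested here.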

leyong_t_
Beginner
1,040 Views

Hi Martyn,

Thanks for getting back to me.

Here is the OpenMP version information I got after switching to the Update 3 package from the link you gave me:

 

I have two questions:

1. Should the bug be fixed with this version?

2. Do I need to recompile my code? Will the bug be fixed by just replacing the libiomp5.so at run time?

Thanks.

Martyn_C_Intel
Employee
1,040 Views

Hi Leyong,

Yes, I expect that this library fixes the issues discussed in this thread. No, you should not need to recompile your application, assuming that the OpenMP library is linked dynamically. It should be sufficient to pick up the new libiomp5.so at run time. Let us know if it is not.

A new update to the 16.0 compiler, 16.0.4, has been released since my previous post. But the one you are using should be fine.
