We are profiling an in-house code and noticed that the top entry in the profile, taking up 20% of the time, is a function called kmp_hyper_barrier_release. We do not compile our code with OpenMP ourselves, but we do use Pardiso from the MKL library, which I believe uses OpenMP. The puzzling thing is that the code appears to spend only a fraction of its time (~15%) inside the MKL library, so it is strange that this kmp function accounts for so much. Even worse, two more kmp functions show up: kmp_x86_pause (11%) and kmp_execute_tasks (6%). Could anybody explain what these functions do and why they affect the performance of our code so dramatically?
Note that even a bare-bones use of libiompprof5 writes a useful text summary to guide.gvs in your working directory.