VTune shows a very high number of instructions retired in '__kmp_wait_template<kmp_flag_64>'. What is it?

seongyun_k_
Beginner

Hi,

I profiled my C++ application, which is parallelized with OpenMP.

As I wrote in the title, VTune shows a very high number of instructions retired in '__kmp_wait_template<kmp_flag_64>'.
(It is the top-ranked function.)

So I think CPU resources are being wasted in my code.

What exactly does the function '__kmp_wait_template<kmp_flag_64>' do?
Does it mean that there is a large workload skew between the threads?

[Attached screenshot: Screenshot 2017-03-02 11.34.28 AM.png]

2 Replies
TimP
Honored Contributor III

You may have noticed that a bug resembling this was present in libiomp5 prior to the 16.0.1 compiler release. It would help if you stated your compiler version or supplied more information.

Dmitry_P_Intel1
Employee

Hello,

This is time spent inside the OpenMP runtime: either threads spinning at a barrier (load imbalance) or worker threads waiting for parallel work to arrive.

You can find this classification in the column on the right (pink cell).

I would recommend either enabling the "Analyze OpenMP Regions" knob on your analysis type or using HPC Performance Characterization, where it is enabled by default. Both report how efficiently the application uses OpenMP, broken down per lexical OpenMP region.

Thanks & Regards, Dmitry

 
