I am a newbie as far as VTune is concerned. I have an OpenMP application that I am trying to analyse, since the delivered performance is far below what I expected. However, when I run "Locks and Waits" and "Thread Concurrency" I get no transitions whatsoever, even though the check box is selected. Please look at the following picture:
Additionally, I get different execution times than in "Thread Concurrency"; the latter match the execution times I see in practice.
I am using OpenMP to create threads (I have some sync points that VTune does not show at all), and I am compiling the code with g++ without optimizations (-O0).
Am I doing anything wrong?
I don't know why you can't see transitions, but my example works without any problem.
Have you tried the latest VTune Amplifier XE 2013 Update 15? I also used the Intel C/C++ compiler.
[root@prc-mic01 pi_solution]# icc -g omp_pi.c -openmp -openmp-report -o omp_pi
omp_pi.c(16): (col. 1) remark: OpenMP DEFINED LOOP WAS PARALLELIZED
[root@prc-mic01 pi_solution]# export KMP_FORKJOIN_FRAMES=1
[root@prc-mic01 pi_solution]# amplxe-cl -c concurrency -- ./omp_pi
Please try the attached example; I am also attaching a screenshot from my side.
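For readers without access to the attachment, here is a minimal sketch of what a pi program of this kind might look like. It is not the attached omp_pi.c: the function name `compute_pi`, the `OMP_PI_DEMO_MAIN` guard, the step count and the midpoint-rule integration are all assumptions made for illustration.

```c
/* Hypothetical stand-in for omp_pi.c. The attached file is not shown in
   this thread, so every name and constant below is an assumption. */
#include <assert.h>
#include <stdio.h>

/* Approximate pi as the integral of 4/(1+x^2) over [0,1] (midpoint rule). */
double compute_pi(long num_steps)
{
    assert(num_steps > 0);

    double step = 1.0 / (double)num_steps;
    double sum = 0.0;
    long i;

    /* Each thread accumulates a private partial sum and the reduction
       combines them when the loop ends. If OpenMP is not enabled at
       compile time, the pragma is ignored and the loop runs serially. */
    #pragma omp parallel for reduction(+:sum)
    for (i = 0; i < num_steps; i++) {
        double x = ((double)i + 0.5) * step;
        sum += 4.0 / (1.0 + x * x);
    }
    return step * sum;
}

#ifdef OMP_PI_DEMO_MAIN
/* Optional driver so the sketch can be built as a standalone executable. */
int main(void)
{
    printf("pi ~= %.10f\n", compute_pi(100000000L));
    return 0;
}
#endif
```

With GCC this should build with something like `gcc -g -O0 -fopenmp -DOMP_PI_DEMO_MAIN omp_pi_sketch.c -o omp_pi_sketch` (the file name is hypothetical), after which the same `amplxe-cl -c concurrency` command as above can be pointed at the resulting binary.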