I'm profiling a well-threaded rendering application (a CPU raytracer). I'm trying to find out why some of the threads do not keep all CPUs/hyper-threads at 100% all the time. In the platform view I can see that the CPU time of some threads is below 100%, but as far as I can see the threads are running. I've attached a screenshot.
Can someone explain what 'CPU Time' and 'Running' mean in this graph? Also, what does it mean for a thread to be 'Running' while its 'CPU Time' is less than 100%? Is it waiting for memory? Executing pause/halt/no-op instructions?
I'm running on CentOS 6 Linux with VTune 2017 Update 4 on an i7 5820K @ 3.3 GHz (no turbo, no SpeedStep). I've seen similar effects when running on Windows 10, too. At the moment I'm profiling only with the Basic Hotspots collection mode.
In "Basic Hotspots", "Running" only shows the thread's lifetime, i.e. the interval during which the thread exists; it does not mean the thread is on a core. When CPU Time for a thread goes down, the thread has been scheduled off the CPU core (a context switch happened). This can be caused, for example, by explicit synchronization (e.g. a critical section or mutex) or by preemption (other activity on the system). If the thread were "waiting for memory" or hitting other CPU stalls, the CPU Time metric would still show 100%, because a stalled thread keeps occupying the core.
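To illustrate the difference between a thread being alive and a thread consuming CPU time, here is a minimal Python sketch (not VTune code, and not how VTune measures anything). It compares per-thread CPU time with wall-clock time for a thread that spins and for a thread that blocks on a lock. The helper names `cpu_utilization`, `spin`, and `block` are made up for this example, and it assumes Python 3.7+ for `time.thread_time()`.

```python
import threading
import time


def cpu_utilization(work, duration=0.3):
    """Run work(deadline) in a new thread; return its cpu_time / wall_time."""
    result = {}

    def runner():
        wall_start = time.perf_counter()
        cpu_start = time.thread_time()  # CPU time of the calling thread only
        work(wall_start + duration)
        cpu = time.thread_time() - cpu_start
        wall = time.perf_counter() - wall_start
        result["ratio"] = cpu / wall

    t = threading.Thread(target=runner)
    t.start()
    t.join()
    return result["ratio"]


def spin(deadline):
    # Busy loop: the thread stays on a core, so CPU time tracks wall time.
    while time.perf_counter() < deadline:
        pass


def block(deadline):
    # Blocking wait: the thread is alive ("running" in the lifetime sense)
    # but scheduled off the core, so it accumulates almost no CPU time.
    lock = threading.Lock()
    lock.acquire()
    lock.acquire(timeout=max(0.0, deadline - time.perf_counter()))


if __name__ == "__main__":
    print("spinning thread:", round(cpu_utilization(spin), 2))
    print("blocked thread: ", round(cpu_utilization(block), 2))
```

On an otherwise idle machine the spinning thread should report a ratio close to 1 and the blocked thread a ratio close to 0, even though both threads exist for the same wall-clock interval.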
I'd suggest running "Advanced Hotspots" with stacks: this analysis collects context switches, so you'll be able to analyze why CPU Time drops in a particular time frame.
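Independently of VTune, the two causes mentioned above (synchronization vs. preemption) roughly correspond to voluntary and involuntary context switches, which Linux counts per thread. The following is a small Python sketch under that assumption; it is Linux-only because it relies on `resource.RUSAGE_THREAD`, and the helper names `context_switches` and `measure` are hypothetical.

```python
import resource
import time


def context_switches():
    """Return (voluntary, involuntary) context switches of the calling thread."""
    ru = resource.getrusage(resource.RUSAGE_THREAD)  # Linux-only constant
    return ru.ru_nvcsw, ru.ru_nivcsw


def measure(work):
    """Run work() and report how many context switches it caused."""
    vol_before, invol_before = context_switches()
    work()
    vol_after, invol_after = context_switches()
    return {
        # Thread gave up the core itself (blocking on a lock, sleep, I/O).
        "voluntary": vol_after - vol_before,
        # Scheduler took the core away (preemption by other activity).
        "involuntary": invol_after - invol_before,
    }


if __name__ == "__main__":
    blocking = measure(lambda: [time.sleep(0.005) for _ in range(20)])
    busy = measure(lambda: sum(i * i for i in range(2_000_000)))
    print("blocking work:", blocking)
    print("busy work:    ", busy)
```

A mostly voluntary count hints at waits inside the application, while a mostly involuntary count hints at competition from other processes. This only gives totals; a profiler that records context switches with stacks shows where they happen.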