Hi, I am tuning a program on a Core 2 Duo processor. The VTune I use is version 8.0.2 beta for Windows, with RDC installed on a Linux box.
But when I use Tuning Assistant, it says something like:
Bus Utilization is high: 90.13% bus cycles bus is in use
I am wondering which one is correct. And how does Tuning Assistant get this number?
Thanks for your reply.
I am presenting the bus utilization number in an article. So I must get the exact number. Is there anything I can do to decide which number is correct?
The 50% is the overall bus utilization for the entire program. But over 90% of the CPU time is spent in a single function.
I am trying to identify the bottleneck of the program, which is a parallel program using OpenMP. It has poor speedup on our Core 2 Quad machine: a speedup of 4.2 when using 8 threads (our machine has two Core 2 Quad CPUs). My guess is that the program is bandwidth limited. So I used VTune to get the bus utilization ratio, hoping to see an extremely high utilization. For example, if I get an 80% bus utilization ratio at 4 threads and 90% at 8 threads, I can conclude that the program is eating up all the bandwidth of the front-side bus, and hence it's bandwidth limited.
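To put the speedup figure in perspective, here is a minimal Python sketch of the parallel efficiency it implies. The function name `parallel_efficiency` is just an illustrative helper, not something from VTune or OpenMP.

```python
def parallel_efficiency(speedup, threads):
    """Return speedup divided by thread count (1.0 would be ideal scaling)."""
    if threads <= 0:
        raise ValueError("threads must be positive")
    return speedup / threads


# Numbers from the post: speedup of 4.2 with 8 threads.
efficiency = parallel_efficiency(4.2, 8)
print(f"parallel efficiency: {efficiency:.1%}")
```

An efficiency near 50% at 8 threads is consistent with a shared resource such as the bus saturating, but it does not prove it on its own, which is why the utilization number matters.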
The problem is that I don't know which number is correct. The number of BUS_TRANS_ANY.ALL_AGENTS events is 428,400,000, and that of CPU_CLK_UNHALTED.BUS events is 1,551,200,000. So VTune says the bus utilization ratio is 428,400,000*2/1,551,200,000=55.2%. However, the Tuning Assistant reports a "bus utilization" of over 90%. I have to find out which one is right.
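For anyone who wants to check the arithmetic, here is a small Python sketch of the ratio as computed above. The function name and the `cycles_per_transaction` parameter are my own labels; the factor of 2 is simply the multiplier that appears in the formula quoted in this post.

```python
def bus_utilization(bus_trans_any_all_agents, cpu_clk_unhalted_bus,
                    cycles_per_transaction=2):
    """Bus utilization = transactions * cycles per transaction / bus cycles."""
    if cpu_clk_unhalted_bus <= 0:
        raise ValueError("cpu_clk_unhalted_bus must be positive")
    busy_cycles = bus_trans_any_all_agents * cycles_per_transaction
    return busy_cycles / cpu_clk_unhalted_bus


# Event counts reported in this post for the selected process.
ratio = bus_utilization(428_400_000, 1_551_200_000)
print(f"bus utilization: {ratio:.1%}")  # prints "bus utilization: 55.2%"
```

The same function can be applied to event counts from any other process or to system-wide totals, as long as both counts come from the same scope.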
I have found the reason for the inconsistency in "Bus Utilization" between VTune and Tuning Assistant. VTune shows the bus utilization of the currently selected process/module, while Tuning Assistant calculates the total bus utilization of the system, including all processes.
I have found that a process called pid_0x0 causes many BUS_TRANS_ANY.ALL_AGENTS events, sometimes even more than my program. That's why the bus utilization in Tuning Assistant is so much higher. I drilled down into that process and found that most of the bus events are collected in the Linux kernel. I don't know what has caused this. Is it possible that the bus_trans events are actually caused by my program, while VTune attributes them to the kernel? Or is it just some kind of kernel activity? The machine is used only by me, so there is no possibility that other people are using it.
Has anyone encountered this problem before?
The OS I am using is Fedora Core 2, and the processors are 2 Core 2 Quad CPUs. The VTune version is 8.0.2 beta for Windows with RDC installed on the Linux box. I also tried VTune 8.0.4 for Linux, but the result is the same.