pacard
Beginner

How does Tuning Assistant calculate the "Bus Utilization"?

Hi, I am tuning a program on a Core 2 Duo processor. I am using VTune 8.0.2 beta for Windows, with RDC installed on a Linux box.

I tried to measure the bus utilization of the program. VTune reports about 50%, which, according to VTune Help, is calculated as BUS_TRANS_ANY.ALL_AGENTS * 2 / CPU_CLK_UNHALTED.BUS

But when I use Tuning Assistant, it says something like:

Bus Utilization is high: 90.13% bus cycles bus is in use

I am wondering which one is correct, and how Tuning Assistant arrives at its number.

Thx,

Alex

5 Replies
TimP
Black Belt

Either of those answers is high enough to indicate that bus utilization is a major issue in your performance. That is as clear an answer as I would expect from VTune, speaking in the generalities we are dealing with here. If you have 50% bus utilization for an entire program, you are doing very well, and must have higher utilization at times.
pacard
Beginner

Hi tim18,

Thanks for your reply.

I am presenting the bus utilization number in an article, so I need the exact figure. Is there anything I can do to determine which number is correct?

Thx,

Alex

TimP
Black Belt

You haven't told us whether, as I suspect, the 50% is overall bus utilization for the entire program, and a higher number applies to one function where those events occur. If so, that might be part of your answer. I don't know how an exact answer for this will help your paper, when additional questions to be answered would include why does it saturate at this value, is this meaningful in terms of practical performance issues, ....
pacard
Beginner

Hi,

The 50% is the overall bus utilization for the entire program, but over 90% of the CPU time is spent in a single function.

I am trying to identify the bottleneck of the program, which is a parallel program using OpenMP. It scales poorly on our Core 2 Quad machine: the speedup is only 4.2 with 8 threads (the machine has two Core 2 Quad CPUs). My guess is that the program is bandwidth limited, so I used VTune to measure the bus utilization ratio, expecting it to be very high. For example, if I see 80% bus utilization at 4 threads and 90% at 8 threads, I can conclude that the program is consuming all the bandwidth of the front-side bus, and hence is bandwidth limited.

The problem is that I don't know which number is correct. The number of BUS_TRANS_ANY.ALL_AGENTS events is 428,400,000, and the number of CPU_CLK_UNHALTED.BUS events is 1,551,200,000. So VTune gives a bus utilization ratio of 428,400,000 * 2 / 1,551,200,000 = 55.2%. However, Tuning Assistant reports a "bus utilization" of over 90%. I have to find out which one is right.
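
For anyone who wants to recheck the arithmetic, here is a minimal Python sketch of the formula quoted from VTune Help. The function name `bus_utilization` is only illustrative and is not part of VTune; the two counts are the ones given above.

```python
def bus_utilization(bus_trans_any_all_agents: int, cpu_clk_unhalted_bus: int) -> float:
    """Bus utilization as BUS_TRANS_ANY.ALL_AGENTS * 2 / CPU_CLK_UNHALTED.BUS.

    Illustrative helper only; the formula is the one quoted from VTune Help.
    """
    if cpu_clk_unhalted_bus <= 0:
        raise ValueError("CPU_CLK_UNHALTED.BUS count must be positive")
    return bus_trans_any_all_agents * 2 / cpu_clk_unhalted_bus


# Event counts reported in this post.
ratio = bus_utilization(428_400_000, 1_551_200_000)
print(f"{ratio:.1%}")  # prints 55.2%
```

This only reproduces the per-process figure. It says nothing about how Tuning Assistant derives its 90% value.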

Thanks,

Alex

pacard
Beginner

Hi,

I have found the reason for the inconsistency in "Bus Utilization" between VTune and Tuning Assistant. VTune shows the bus utilization of the currently selected process/module, while Tuning Assistant calculates the total bus utilization of the system, including all processes.
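
The difference between the two scopes can be sketched roughly in Python as follows. This is not how VTune or Tuning Assistant is implemented: it assumes both views divide by the same CPU_CLK_UNHALTED.BUS count (which has not been verified), the helper name `utilization_by_scope` is hypothetical, and the `pid_0x0` count is a made-up placeholder rather than a measurement.

```python
from typing import Dict, Tuple


def utilization_by_scope(
    trans_by_process: Dict[str, int], bus_clocks: int
) -> Tuple[Dict[str, float], float]:
    """Return (per-process utilization, system-wide utilization).

    Assumption: every scope uses the same bus-clock denominator and the
    VTune Help formula BUS_TRANS_ANY.ALL_AGENTS * 2 / CPU_CLK_UNHALTED.BUS.
    """
    if bus_clocks <= 0:
        raise ValueError("bus clock count must be positive")
    per_process = {
        name: 2 * count / bus_clocks for name, count in trans_by_process.items()
    }
    system_wide = 2 * sum(trans_by_process.values()) / bus_clocks
    return per_process, system_wide


counts = {
    "my_program": 428_400_000,  # measured count from the earlier post
    "pid_0x0": 300_000_000,     # PLACEHOLDER value, not a measurement
}
per_process, system_wide = utilization_by_scope(counts, 1_551_200_000)
for name, value in per_process.items():
    print(f"{name}: {value:.1%}")
print(f"system-wide: {system_wide:.1%}")
```

Under that assumption the system-wide figure is simply the sum of the per-process figures, so a second process with heavy bus traffic is enough to push the total well above the number shown for one process.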

I have found that a process called pid_0x0 causes many BUS_TRANS_ANY.ALL_AGENTS events, sometimes even more than my program. That's why the bus utilization in Tuning Assistant is so much higher. I drilled down into that process and found that most of the bus events are collected in the Linux kernel. I don't know what causes this. Is it possible that the bus_trans events are actually caused by my program, while VTune attributes them to the kernel? Or is it just some kind of kernel activity? I am the only user of the machine, so nobody else could be running anything on it.

Has anyone encountered this problem before?

The OS I am using is Fedora Core 2, and the machine has two Core 2 Quad CPUs. The VTune version is 8.0.2 beta for Windows, with RDC installed on the Linux box. I also tried VTune 8.0.4 for Linux, but the result is the same.

Thx,

Alex
