
Average Bandwidth on Xeon Machine

Ayam
Beginner
404 Views

Hello,

I am running the bandwidth analysis on a Xeon machine using Intel VTune.

The result summary shows the following average bandwidth:

Average Bandwidth
Package    Bandwidth, GB/sec
package_0    6.718
package_1    7.657
 

Could you please explain what "package" refers to here? If the application runs with a single thread, which package's value should I use for the bandwidth?

Regards,

 

7 Replies
Peter_W_Intel
Employee
404 Views

You need to look at both values, and at the cores as well as the packages.

Even a single-threaded application can run on more than one core unless you pin it with a processor-affinity function.
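
As an illustration of pinning, here is a minimal sketch for Linux using Python's `os.sched_setaffinity`. The helper name `pin_to_cpu` is made up for this example, and the call is not available on every platform, so treat it as one possible approach rather than the recommended one.

```python
import os


def pin_to_cpu(cpu: int) -> set[int]:
    """Restrict the calling process to one logical CPU and return the new affinity set.

    Hypothetical helper for illustration; relies on os.sched_setaffinity (Linux).
    """
    if not hasattr(os, "sched_setaffinity"):
        raise OSError("os.sched_setaffinity is not available on this platform")
    os.sched_setaffinity(0, {cpu})  # pid 0 means "the calling process"
    return os.sched_getaffinity(0)


if __name__ == "__main__" and hasattr(os, "sched_getaffinity"):
    # Pick a CPU the process is already allowed to run on, then pin to it.
    allowed = os.sched_getaffinity(0)
    print("allowed before:", sorted(allowed))
    print("allowed after: ", sorted(pin_to_cpu(min(allowed))))
```

Once pinned, the thread should stay on one core of one package, which makes the per-package numbers easier to interpret.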

Dmitry_P_Intel1
Employee
404 Views

You can see how your application's threads migrated across packages and cores by changing the viewpoint to "Hotspots" and the timeline grouping to "Thread/HW Context". If you have 2 packages with 4 cores each, VTune will list them as cpu_0 ... cpu_3 for the first package and cpu_4 ... cpu_7 for the second one.
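
The numbering in that example can be sketched as a small lookup. This assumes the contiguous layout described above (4 cores per package, packages numbered in order); real systems may enumerate CPUs differently, and `package_of` is a made-up helper name.

```python
def package_of(cpu_label: str, cores_per_package: int = 4) -> int:
    """Map a label such as 'cpu_5' to a package index.

    Assumes contiguous numbering: cpu_0..cpu_3 -> package 0, cpu_4..cpu_7 -> package 1.
    """
    cpu_index = int(cpu_label.rsplit("_", 1)[1])
    return cpu_index // cores_per_package


if __name__ == "__main__":
    for label in ("cpu_0", "cpu_3", "cpu_4", "cpu_7"):
        print(label, "-> package", package_of(label))
```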

Ayam
Beginner
404 Views

So if I have to report the average bandwidth of the application, I would take the average of the two package values above. Is that right?

Bernard
Valued Contributor I
404 Views

A package is a collection of physical cores.

Regarding your last post (#4):

I think that holds only if your application's threads get scheduled to run on the second package.

 

Ayam
Beginner
404 Views

Thank you so much for the explanation.

Bernard
Valued Contributor I
404 Views

You are welcome.

McCalpinJohn
Honored Contributor III
404 Views

The average bandwidth is the sum of the values for the two packages -- not the average of the averages.
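
Worked out for the numbers in the original post, as a small sketch (the variable names are arbitrary):

```python
# Per-package average bandwidth from the VTune summary in the original post, in GB/sec.
package_bandwidth = {"package_0": 6.718, "package_1": 7.657}

# System-wide average bandwidth: add the packages together.
total = sum(package_bandwidth.values())

# Averaging the two values instead would report only about half of the traffic.
mean_of_packages = total / len(package_bandwidth)

print(f"sum of packages:  {total:.3f} GB/sec")   # prints 14.375
print(f"mean of packages: {mean_of_packages:.4f} GB/sec")
```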

Even a single thread can generate memory traffic on both packages in several ways:

  1. The thread might move from one package to another and instantiate memory pages while running in package 0 and while running in package 1.
  2. The NUMA memory setting for the job (or for the system) might request interleaved pages.  This would usually result in approximately equal bandwidth utilization on the two packages.  (The values above are within 15% of each other, so interleaving is plausible.)
  3. The process might request more memory than is available in the package that it starts on, so that additional memory will be allocated on the other package.  You can monitor this on Linux systems by running "numastat" before and after your job and looking for large increases in the "numa_miss" output.
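
The before/after "numastat" comparison from item 3 could be scripted roughly as below. This is a sketch that assumes the default "numastat" layout (a header row of node names followed by one row per counter); the sample text and all function names are invented for illustration.

```python
import subprocess


def read_numastat() -> str:
    """Run `numastat` with no arguments (Linux) and return its text output."""
    return subprocess.run(
        ["numastat"], capture_output=True, text=True, check=True
    ).stdout


def parse_numastat(text: str) -> dict[str, dict[str, int]]:
    """Parse the assumed layout: header of node names, then 'counter v0 v1 ...' rows."""
    rows = [line.split() for line in text.splitlines() if line.strip()]
    nodes = rows[0]
    return {
        fields[0]: {node: int(value) for node, value in zip(nodes, fields[1:])}
        for fields in rows[1:]
    }


def counter_increase(before: str, after: str, counter: str = "numa_miss") -> dict[str, int]:
    """Per-node increase of one counter between two numastat snapshots."""
    b = parse_numastat(before)[counter]
    a = parse_numastat(after)[counter]
    return {node: a[node] - b[node] for node in a}


# Made-up snapshots, only to show the shape of the data.
SAMPLE_BEFORE = """\
                           node0           node1
numa_hit                 1000000         2000000
numa_miss                    100              50
"""
SAMPLE_AFTER = """\
                           node0           node1
numa_hit                 1500000         2000400
numa_miss                   5100              50
"""

if __name__ == "__main__":
    print(counter_increase(SAMPLE_BEFORE, SAMPLE_AFTER))
```

On a real system you would call `read_numastat()` once before and once after the job and pass the two results to `counter_increase`.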

It is also possible that some of the memory traffic is due to other processes or to operating system activity.  The values here seem too high to blame on OS activity, but it is theoretically possible.   It is not generally possible to assign DRAM traffic to particular processes, so you need to ensure that you are running on an otherwise idle system to get reliable measurements.
