Export memory bandwidth data?

Tristan_J_1 · ‎02-25-2013

I'm using Intel VTune Amplifier XE 2013 to gather memory bandwidth usage for a particular application and was wondering if there's some way to export this data for additional analysis with other tools. In particular, I'd love to be able to export a particular capture to an Excel file so that I can calculate things like average and standard deviation on the results over time. Right now I only seem to be able to browse the data in the VTune application and drill down on individual points.

I haven't been able to find anything in the UI that would enable this. Maybe there is a command line option somewhere?

Many thanks!

Peter_W_Intel · ‎02-25-2013

Please try it this way, from the command line:

# amplxe-cl -collect snb-bandwidth -- ./program

# amplxe-cl -report hw-events -format csv -csv-delimiter=","

Note: 1. You can redirect the output to a file and open it in Excel. 2. The report contains only summary data for hardware events (LLC misses, local or remote, that caused a memory access); there is no timestamp information.
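Once the report has been redirected to a file, a short script can pull out the per-event counts for further processing. The following is a minimal Python sketch and not part of VTune. It assumes the CSV header uses the same column names as the text report posted later in this thread ("Hardware Event Type" and "Hardware Event Count:Self"), so check them against the first line of your own file. The function name load_event_counts is illustrative only.

```python
import csv
import io

# Assumed column headers, taken from the text report shown later in this
# thread; verify them against the header line of your own CSV export.
EVENT_COL = "Hardware Event Type"
COUNT_COL = "Hardware Event Count:Self"


def load_event_counts(csv_text, delimiter=","):
    """Return {event name: summed count} from a hw-events CSV report."""
    counts = {}
    reader = csv.DictReader(io.StringIO(csv_text), delimiter=delimiter)
    for row in reader:
        name = (row.get(EVENT_COL) or "").strip()
        value = (row.get(COUNT_COL) or "").strip()
        if not name or not value.isdigit():
            continue  # skip blank lines, separators, and status messages
        counts[name] = counts.get(name, 0) + int(value)
    return counts


# Sample row copied from the report posted later in this thread, re-typed as CSV.
sample = (
    "Hardware Event Type,Hardware Event Count:Self,"
    "Hardware Event Sample Count:Self,Events Per Sample\n"
    "CPU_CLK_UNHALTED,9140000000,914,10000000\n"
)
counts = load_event_counts(sample)
print(counts)
```

To use it on a real export, read the redirected file into a string and pass that string to load_event_counts. Because the report is a summary, this gives one total per event for the whole run, not a series over time.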

Tristan_J_1 · ‎02-26-2013

Thanks for the reply Peter.

I tried that, but that command only appears to output the aggregate data from the run, which is what appears at the bottom of a memory bandwidth analysis. The data that I'm most interested in exporting is the observed memory bandwidth over time, which is the part that is graphed at the top of a memory bandwidth analysis.

I've attached an image from the VTune Amplifier UI that highlights the data that I'm looking to export.

Thanks!

Peter_W_Intel · ‎02-26-2013

I understand that you need the data over time, but exporting it is not currently possible. As I said in my last post, only summary data can be exported. From it you can find hot functions and see how frequently they access local or remote DRAM.

Tristan_J_1 · ‎03-06-2013

Thanks. I would definitely vote for making the bandwidth-over-time data exportable in a future update. I would find real value in using that data for more in-depth analysis.

Peter_W_Intel · ‎03-06-2013

I have escalated this feature request to the dev team. I will update this thread when the feature is ready.

Surya_Narayanan_N_ · ‎01-14-2014

Is this feature ready yet?

Peter_W_Intel · ‎01-14-2014

It seems the timeline report is not yet available from the command line.

Surya_Narayanan_N_ · ‎01-16-2014

OK. I would like to know how the bandwidth is computed in the knc-bandwidth analysis. The summary looks like this:

CPU
---
Parameter          bw_org_2
-----------------  -----------------------------
Frequency          1052000000
Logical CPU Count  240
Name               Intel(R) Xeon(R) E5 processor

Summary
-------
Elapsed Time: 2.984
CPU Usage: 2.893

Event summary
-------------
Hardware Event Type  Hardware Event Count:Self  Hardware Event Sample Count:Self  Events Per Sample
-------------------  -------------------------  --------------------------------  -----------------
CPU_CLK_UNHALTED     9140000000                 914                               10000000

Uncore Event summary
--------------------
Hardware Event Type Hardware Event Count:Self
----------------------------- -------------------------
UNC_F_CH0_NORMAL_WRITE[UNIT0] 10103918
UNC_F_CH0_NORMAL_WRITE[UNIT1] 10109727
UNC_F_CH0_NORMAL_WRITE[UNIT2] 10095707
UNC_F_CH0_NORMAL_WRITE[UNIT3] 10102520
UNC_F_CH0_NORMAL_WRITE[UNIT4] 10095936
UNC_F_CH0_NORMAL_WRITE[UNIT5] 10100786
UNC_F_CH0_NORMAL_WRITE[UNIT6] 10109940
UNC_F_CH0_NORMAL_WRITE[UNIT7] 10100599
UNC_F_CH0_NORMAL_READ[UNIT0] 8574334
UNC_F_CH0_NORMAL_READ[UNIT1] 8588694
UNC_F_CH0_NORMAL_READ[UNIT2] 8562949
UNC_F_CH0_NORMAL_READ[UNIT3] 8611755
UNC_F_CH0_NORMAL_READ[UNIT4] 8566964
UNC_F_CH0_NORMAL_READ[UNIT5] 8573352
UNC_F_CH0_NORMAL_READ[UNIT6] 8590854
UNC_F_CH0_NORMAL_READ[UNIT7] 8589209
UNC_F_CH1_NORMAL_WRITE[UNIT0] 10100922
UNC_F_CH1_NORMAL_WRITE[UNIT1] 10101943
UNC_F_CH1_NORMAL_WRITE[UNIT2] 10105199
UNC_F_CH1_NORMAL_WRITE[UNIT3] 10100272
UNC_F_CH1_NORMAL_WRITE[UNIT4] 10106579
UNC_F_CH1_NORMAL_WRITE[UNIT5] 10123764
UNC_F_CH1_NORMAL_WRITE[UNIT6] 10115382
UNC_F_CH1_NORMAL_WRITE[UNIT7] 10100624
UNC_F_CH1_NORMAL_READ[UNIT0] 8576649
UNC_F_CH1_NORMAL_READ[UNIT1] 8566361
UNC_F_CH1_NORMAL_READ[UNIT2] 8592849
UNC_F_CH1_NORMAL_READ[UNIT3] 8591451
UNC_F_CH1_NORMAL_READ[UNIT4] 8577494
UNC_F_CH1_NORMAL_READ[UNIT5] 8624924
UNC_F_CH1_NORMAL_READ[UNIT6] 8615670
UNC_F_CH1_NORMAL_READ[UNIT7] 8585177
amplxe: Executing actions 100 % done

But when I load the result file in the GUI, I see:

Average Bandwidth
Package	Bandwidth, GB/sec
package_0	6.414

How is this 6.414 GB/sec computed?

Peter_W_Intel · ‎01-16-2014

This is an internal analysis type, and an internal formula computes the average bandwidth from the counters. To learn more, see the file vtune_amplifier_xe_2013/config/query_library/uncore_metrics.cfg.

Note that you should not modify this file; doing so may cause unexpected results.
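As a sanity check, the counts posted above reproduce the GUI figure if every UNC_F_CH*_NORMAL_READ and UNC_F_CH*_NORMAL_WRITE event is assumed to transfer one 64-byte cache line and GB is taken as 10^9 bytes. Both are assumptions that happen to match the numbers in this thread, not a copy of the formula in uncore_metrics.cfg. A minimal Python sketch of that arithmetic (the function name bandwidth_gb_per_sec is illustrative):

```python
import re

# Assumption: each NORMAL_READ / NORMAL_WRITE event moves one 64-byte line.
CACHE_LINE_BYTES = 64

# Elapsed time and uncore counts copied from the report posted above.
REPORT = """\
Elapsed Time: 2.984
UNC_F_CH0_NORMAL_WRITE[UNIT0] 10103918
UNC_F_CH0_NORMAL_WRITE[UNIT1] 10109727
UNC_F_CH0_NORMAL_WRITE[UNIT2] 10095707
UNC_F_CH0_NORMAL_WRITE[UNIT3] 10102520
UNC_F_CH0_NORMAL_WRITE[UNIT4] 10095936
UNC_F_CH0_NORMAL_WRITE[UNIT5] 10100786
UNC_F_CH0_NORMAL_WRITE[UNIT6] 10109940
UNC_F_CH0_NORMAL_WRITE[UNIT7] 10100599
UNC_F_CH0_NORMAL_READ[UNIT0] 8574334
UNC_F_CH0_NORMAL_READ[UNIT1] 8588694
UNC_F_CH0_NORMAL_READ[UNIT2] 8562949
UNC_F_CH0_NORMAL_READ[UNIT3] 8611755
UNC_F_CH0_NORMAL_READ[UNIT4] 8566964
UNC_F_CH0_NORMAL_READ[UNIT5] 8573352
UNC_F_CH0_NORMAL_READ[UNIT6] 8590854
UNC_F_CH0_NORMAL_READ[UNIT7] 8589209
UNC_F_CH1_NORMAL_WRITE[UNIT0] 10100922
UNC_F_CH1_NORMAL_WRITE[UNIT1] 10101943
UNC_F_CH1_NORMAL_WRITE[UNIT2] 10105199
UNC_F_CH1_NORMAL_WRITE[UNIT3] 10100272
UNC_F_CH1_NORMAL_WRITE[UNIT4] 10106579
UNC_F_CH1_NORMAL_WRITE[UNIT5] 10123764
UNC_F_CH1_NORMAL_WRITE[UNIT6] 10115382
UNC_F_CH1_NORMAL_WRITE[UNIT7] 10100624
UNC_F_CH1_NORMAL_READ[UNIT0] 8576649
UNC_F_CH1_NORMAL_READ[UNIT1] 8566361
UNC_F_CH1_NORMAL_READ[UNIT2] 8592849
UNC_F_CH1_NORMAL_READ[UNIT3] 8591451
UNC_F_CH1_NORMAL_READ[UNIT4] 8577494
UNC_F_CH1_NORMAL_READ[UNIT5] 8624924
UNC_F_CH1_NORMAL_READ[UNIT6] 8615670
UNC_F_CH1_NORMAL_READ[UNIT7] 8585177
"""


def bandwidth_gb_per_sec(report):
    """Return read and write bandwidth in GB/sec (1 GB = 1e9 bytes)."""
    elapsed = float(re.search(r"Elapsed Time:\s*([\d.]+)", report).group(1))
    totals = {"READ": 0, "WRITE": 0}
    pattern = re.compile(r"UNC_F_CH\d+_NORMAL_(READ|WRITE)\[UNIT\d+\]\s+(\d+)")
    for kind, count in pattern.findall(report):
        totals[kind] += int(count)  # sum over all channels and units
    to_gb_s = CACHE_LINE_BYTES / elapsed / 1e9
    return {kind: n * to_gb_s for kind, n in totals.items()}


bw = bandwidth_gb_per_sec(REPORT)
total = bw["READ"] + bw["WRITE"]
print(f"read {bw['READ']:.3f}  write {bw['WRITE']:.3f}  total {total:.3f} GB/sec")
```

With these inputs the total comes out at about 6.414 GB/sec, which agrees with the Average Bandwidth shown for package_0.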

Surya_Narayanan_N_ · ‎01-22-2014

Thank you. I also have a question about how the core bandwidth and uncore bandwidth numbers are validated. I tried running just the Copy part of the STREAM benchmark. On Xeon Phi it usually reaches a maximum of about 140 GB/sec with 240 threads. I get similar results using the core calculation, but the uncore bandwidth shows more than 250 GB/sec for 128 threads, which does not match the claim that both ways of measuring bandwidth give similar numbers.

Peter_W_Intel · ‎01-23-2014

The .cfg file I mentioned earlier has sections for different processor types, e.g. snb/ivybridge, haswell, snbep/ivytown, core7b, knc, etc. The knc entries are:

knc - DataWrittenGB,

knc - DataReadGB,

knc - DataTransferGB,

If you have concerns about the VTune results, please submit a ticket to Intel Premier Support and attach your result directory so that it can be investigated.