I'm using Intel VTune Amplifier XE 2013 to gather memory bandwidth usage for a particular application and was wondering if there's some way to export this data for additional analysis with other tools. In particular, I'd love to be able to export a particular capture to an Excel file so that I can calculate things like the average and standard deviation of the results over time. Right now I only seem to be able to browse the data in the VTune application and drill down on individual points.
I haven't been able to find anything in the UI that would enable this. Maybe there is a command line option somewhere?
Many thanks!
Please try it this way, from the command line:
# amplxe-cl -collect snb-bandwidth -- ./program
# amplxe-cl -report hw-events -format csv -csv-delimiter=","
Notes: 1. You can redirect the output to a file and open it in Excel. 2. All of this is summary data for hardware events (LLC misses, local or remote, that caused a memory access); there is no timestamp information.
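Once the report has been redirected to a file, the statistics mentioned in the question can also be computed outside Excel. The following is only a minimal sketch: the column names and the sample rows are made-up placeholders, not real VTune output, and `column_stats` is a hypothetical helper. Replace them with whatever header your own CSV report actually contains.

```python
import csv
import io
import statistics


def column_stats(csv_text, column, delimiter=","):
    """Return (mean, sample standard deviation) of one numeric CSV column."""
    reader = csv.DictReader(io.StringIO(csv_text), delimiter=delimiter)
    # Skip rows where the column is missing or empty.
    values = [float(row[column]) for row in reader if row.get(column)]
    return statistics.mean(values), statistics.stdev(values)


# Made-up placeholder data, NOT real VTune output.
# "Function" and "Event Count" are hypothetical column names.
sample = "Function,Event Count\nfoo,100\nbar,200\nbaz,300\n"

mean, stdev = column_stats(sample, "Event Count")
print(f"mean={mean:.1f} stdev={stdev:.1f}")
```

For a real report, read the redirected file into a string (for example with `open(path).read()`) and pass the delimiter you chose on the command line.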
Thanks for the reply Peter.
I tried that, but that command only appears to output the aggregate data from the run that appears at the bottom of a memory bandwidth analysis. The data that I'm most interested in exporting is the observed memory bandwidth over time... which is the part that is graphed at the top of a memory bandwidth analysis.
I've attached an image from the VTune Amplifier UI that highlights the data that I'm looking to export.
Thanks!
Tristan J. wrote:
Thanks for the reply Peter.
I tried that, but that command only appears to output the aggregate data from the run that appears at the bottom of a memory bandwidth analysis. The data that I'm most interested in exporting is the observed memory bandwidth over time... which is the part that is graphed at the top of a memory bandwidth analysis.
I've attached an image from the VTune Amplifier UI that highlights the data that I'm looking to export.
Thanks!
I know that you need to see the data over time, but exporting it is not feasible. As I said in my last post, only summary data can be exported; from it you can find hot functions and see how frequently they access local or remote DRAM.
Thanks. I would definitely vote for making the bandwidth data over time exportable in a future update. I would definitely find value in using that data for more in-depth analysis.
Tristan J. wrote:
Thanks. I would definitely vote for making the bandwidth data over time exportable in a future update. I would definitely find value in using that data for more in-depth analysis.
I have escalated this feature request to the dev team. I will update this thread if the feature becomes available.
Is this feature ready yet?
Surya Narayanan N. wrote:
Is this feature ready yet?
It seems the timeline report is not yet available from the command line.
OK. I would like to know how the bandwidth is computed when using knc-bandwidth. The summary looks like this:
CPU
---
Parameter bw_org_2
----------------- -----------------------------
Frequency 1052000000
Logical CPU Count 240
Name Intel(R) Xeon(R) E5 processor
Summary
-------
Elapsed Time: 2.984
CPU Usage: 2.893
Event summary
-------------
Hardware Event Type Hardware Event Count:Self Hardware Event Sample Count:Self Events Per Sample
------------------- ------------------------- -------------------------------- -----------------
CPU_CLK_UNHALTED 9140000000 914 10000000
Uncore Event summary
--------------------
Hardware Event Type Hardware Event Count:Self
----------------------------- -------------------------
UNC_F_CH0_NORMAL_WRITE[UNIT0] 10103918
UNC_F_CH0_NORMAL_WRITE[UNIT1] 10109727
UNC_F_CH0_NORMAL_WRITE[UNIT2] 10095707
UNC_F_CH0_NORMAL_WRITE[UNIT3] 10102520
UNC_F_CH0_NORMAL_WRITE[UNIT4] 10095936
UNC_F_CH0_NORMAL_WRITE[UNIT5] 10100786
UNC_F_CH0_NORMAL_WRITE[UNIT6] 10109940
UNC_F_CH0_NORMAL_WRITE[UNIT7] 10100599
UNC_F_CH0_NORMAL_READ[UNIT0] 8574334
UNC_F_CH0_NORMAL_READ[UNIT1] 8588694
UNC_F_CH0_NORMAL_READ[UNIT2] 8562949
UNC_F_CH0_NORMAL_READ[UNIT3] 8611755
UNC_F_CH0_NORMAL_READ[UNIT4] 8566964
UNC_F_CH0_NORMAL_READ[UNIT5] 8573352
UNC_F_CH0_NORMAL_READ[UNIT6] 8590854
UNC_F_CH0_NORMAL_READ[UNIT7] 8589209
UNC_F_CH1_NORMAL_WRITE[UNIT0] 10100922
UNC_F_CH1_NORMAL_WRITE[UNIT1] 10101943
UNC_F_CH1_NORMAL_WRITE[UNIT2] 10105199
UNC_F_CH1_NORMAL_WRITE[UNIT3] 10100272
UNC_F_CH1_NORMAL_WRITE[UNIT4] 10106579
UNC_F_CH1_NORMAL_WRITE[UNIT5] 10123764
UNC_F_CH1_NORMAL_WRITE[UNIT6] 10115382
UNC_F_CH1_NORMAL_WRITE[UNIT7] 10100624
UNC_F_CH1_NORMAL_READ[UNIT0] 8576649
UNC_F_CH1_NORMAL_READ[UNIT1] 8566361
UNC_F_CH1_NORMAL_READ[UNIT2] 8592849
UNC_F_CH1_NORMAL_READ[UNIT3] 8591451
UNC_F_CH1_NORMAL_READ[UNIT4] 8577494
UNC_F_CH1_NORMAL_READ[UNIT5] 8624924
UNC_F_CH1_NORMAL_READ[UNIT6] 8615670
UNC_F_CH1_NORMAL_READ[UNIT7] 8585177
amplxe: Executing actions 100 % done
But when I load the result file in the GUI, I see:
Average Bandwidth | |
Package | Bandwidth, GB/sec |
package_0 | 6.414 |
How is this 6.414 GB/sec computed?
This is an internal analysis type, and VTune uses an internal formula to compute the average bandwidth from the counters. See the file vtune_amplifier_xe_2013/config/query_library/uncore_metrics.cfg for details.
Note that you should not modify this file; doing so will cause unexpected results.
Thank you. I also have a question about how the core bandwidth and uncore bandwidth numbers are validated. I tried running just the copy part of the STREAM benchmark. On Xeon Phi it usually reaches a maximum of 140 GB/sec with 240 threads. I get similar results when using the core calculation, but the uncore bandwidth shows more than 250 GB/sec with 128 threads, which does not match the claim that both ways of measuring bandwidth give similar numbers.
The .cfg file I mentioned in my previous post has sections for different processor types, e.g. snb/ivybridge, haswell, snbep/ivytown, core7b, knc, etc.
knc - DataWrittenGB,
<valueEval><![CDATA[ ( ( query("/UncoreEventCount/UncoreEventType[UNC_F_CH0_NORMAL_WRITE[UNIT0]]") + query("/UncoreEventCount/UncoreEventType[UNC_F_CH1_NORMAL_WRITE[UNIT0]]") + query("/UncoreEventCount/UncoreEventType[UNC_F_CH0_NORMAL_WRITE[UNIT1]]") + query("/UncoreEventCount/UncoreEventType[UNC_F_CH1_NORMAL_WRITE[UNIT1]]") + query("/UncoreEventCount/UncoreEventType[UNC_F_CH0_NORMAL_WRITE[UNIT2]]") + query("/UncoreEventCount/UncoreEventType[UNC_F_CH1_NORMAL_WRITE[UNIT2]]") + query("/UncoreEventCount/UncoreEventType[UNC_F_CH0_NORMAL_WRITE[UNIT3]]") + query("/UncoreEventCount/UncoreEventType[UNC_F_CH1_NORMAL_WRITE[UNIT3]]") + query("/UncoreEventCount/UncoreEventType[UNC_F_CH0_NORMAL_WRITE[UNIT4]]") + query("/UncoreEventCount/UncoreEventType[UNC_F_CH1_NORMAL_WRITE[UNIT4]]") + query("/UncoreEventCount/UncoreEventType[UNC_F_CH0_NORMAL_WRITE[UNIT5]]") + query("/UncoreEventCount/UncoreEventType[UNC_F_CH1_NORMAL_WRITE[UNIT5]]") + query("/UncoreEventCount/UncoreEventType[UNC_F_CH0_NORMAL_WRITE[UNIT6]]") + query("/UncoreEventCount/UncoreEventType[UNC_F_CH1_NORMAL_WRITE[UNIT6]]") + query("/UncoreEventCount/UncoreEventType[UNC_F_CH0_NORMAL_WRITE[UNIT7]]") + query("/UncoreEventCount/UncoreEventType[UNC_F_CH1_NORMAL_WRITE[UNIT7]]") ) * 64 ) / 1000000000 ]]></valueEval>
knc - DataReadGB,
<derivedQuery idToOverwrite="DataReadGB">
<valueEval><![CDATA[ ( ( query("/UncoreEventCount/UncoreEventType[UNC_F_CH0_NORMAL_READ[UNIT0]]") + query("/UncoreEventCount/UncoreEventType[UNC_F_CH1_NORMAL_READ[UNIT0]]") + query("/UncoreEventCount/UncoreEventType[UNC_F_CH0_NORMAL_READ[UNIT1]]") + query("/UncoreEventCount/UncoreEventType[UNC_F_CH1_NORMAL_READ[UNIT1]]") + query("/UncoreEventCount/UncoreEventType[UNC_F_CH0_NORMAL_READ[UNIT2]]") + query("/UncoreEventCount/UncoreEventType[UNC_F_CH1_NORMAL_READ[UNIT2]]") + query("/UncoreEventCount/UncoreEventType[UNC_F_CH0_NORMAL_READ[UNIT3]]") + query("/UncoreEventCount/UncoreEventType[UNC_F_CH1_NORMAL_READ[UNIT3]]") + query("/UncoreEventCount/UncoreEventType[UNC_F_CH0_NORMAL_READ[UNIT4]]") + query("/UncoreEventCount/UncoreEventType[UNC_F_CH1_NORMAL_READ[UNIT4]]") + query("/UncoreEventCount/UncoreEventType[UNC_F_CH0_NORMAL_READ[UNIT5]]") + query("/UncoreEventCount/UncoreEventType[UNC_F_CH1_NORMAL_READ[UNIT5]]") + query("/UncoreEventCount/UncoreEventType[UNC_F_CH0_NORMAL_READ[UNIT6]]") + query("/UncoreEventCount/UncoreEventType[UNC_F_CH1_NORMAL_READ[UNIT6]]") + query("/UncoreEventCount/UncoreEventType[UNC_F_CH0_NORMAL_READ[UNIT7]]") + query("/UncoreEventCount/UncoreEventType[UNC_F_CH1_NORMAL_READ[UNIT7]]") ) * 64 ) / 1000000000 ]]></valueEval>
knc - DataTransferredGB,
<derivedQuery idToOverwrite="DataTransferredGB">
<valueEval><![CDATA[ ( ( query("/UncoreEventCount/UncoreEventType[UNC_F_CH0_NORMAL_READ[UNIT0]]") + query("/UncoreEventCount/UncoreEventType[UNC_F_CH1_NORMAL_READ[UNIT0]]") + query("/UncoreEventCount/UncoreEventType[UNC_F_CH0_NORMAL_READ[UNIT1]]") + query("/UncoreEventCount/UncoreEventType[UNC_F_CH1_NORMAL_READ[UNIT1]]") + query("/UncoreEventCount/UncoreEventType[UNC_F_CH0_NORMAL_READ[UNIT2]]") + query("/UncoreEventCount/UncoreEventType[UNC_F_CH1_NORMAL_READ[UNIT2]]") + query("/UncoreEventCount/UncoreEventType[UNC_F_CH0_NORMAL_READ[UNIT3]]") + query("/UncoreEventCount/UncoreEventType[UNC_F_CH1_NORMAL_READ[UNIT3]]") + query("/UncoreEventCount/UncoreEventType[UNC_F_CH0_NORMAL_READ[UNIT4]]") + query("/UncoreEventCount/UncoreEventType[UNC_F_CH1_NORMAL_READ[UNIT4]]") + query("/UncoreEventCount/UncoreEventType[UNC_F_CH0_NORMAL_READ[UNIT5]]") + query("/UncoreEventCount/UncoreEventType[UNC_F_CH1_NORMAL_READ[UNIT5]]") + query("/UncoreEventCount/UncoreEventType[UNC_F_CH0_NORMAL_READ[UNIT6]]") + query("/UncoreEventCount/UncoreEventType[UNC_F_CH1_NORMAL_READ[UNIT6]]") + query("/UncoreEventCount/UncoreEventType[UNC_F_CH0_NORMAL_READ[UNIT7]]") + query("/UncoreEventCount/UncoreEventType[UNC_F_CH1_NORMAL_READ[UNIT7]]") + query("/UncoreEventCount/UncoreEventType[UNC_F_CH0_NORMAL_WRITE[UNIT0]]") + query("/UncoreEventCount/UncoreEventType[UNC_F_CH1_NORMAL_WRITE[UNIT0]]") + query("/UncoreEventCount/UncoreEventType[UNC_F_CH0_NORMAL_WRITE[UNIT1]]") + query("/UncoreEventCount/UncoreEventType[UNC_F_CH1_NORMAL_WRITE[UNIT1]]") + query("/UncoreEventCount/UncoreEventType[UNC_F_CH0_NORMAL_WRITE[UNIT2]]") + query("/UncoreEventCount/UncoreEventType[UNC_F_CH1_NORMAL_WRITE[UNIT2]]") + query("/UncoreEventCount/UncoreEventType[UNC_F_CH0_NORMAL_WRITE[UNIT3]]") + query("/UncoreEventCount/UncoreEventType[UNC_F_CH1_NORMAL_WRITE[UNIT3]]") + query("/UncoreEventCount/UncoreEventType[UNC_F_CH0_NORMAL_WRITE[UNIT4]]") + query("/UncoreEventCount/UncoreEventType[UNC_F_CH1_NORMAL_WRITE[UNIT4]]") + 
query("/UncoreEventCount/UncoreEventType[UNC_F_CH0_NORMAL_WRITE[UNIT5]]") + query("/UncoreEventCount/UncoreEventType[UNC_F_CH1_NORMAL_WRITE[UNIT5]]") + query("/UncoreEventCount/UncoreEventType[UNC_F_CH0_NORMAL_WRITE[UNIT6]]") + query("/UncoreEventCount/UncoreEventType[UNC_F_CH1_NORMAL_WRITE[UNIT6]]") + query("/UncoreEventCount/UncoreEventType[UNC_F_CH0_NORMAL_WRITE[UNIT7]]") + query("/UncoreEventCount/UncoreEventType[UNC_F_CH1_NORMAL_WRITE[UNIT7]]") ) * 64 ) / 1000000000 ]]></valueEval>
If you have concerns about the VTune results, please submit a ticket to Intel Premier Support and include your result directory so that it can be investigated.
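As a cross-check, the knc expressions quoted above can be applied by hand to the uncore counts posted earlier in the thread. The sketch below sums the 32 counters, multiplies by 64 as the .cfg expressions do, and divides by 1000000000. Dividing the result by the reported Elapsed Time to get the average bandwidth is an assumption on my part (that step is not shown in the quoted .cfg excerpt), but it does reproduce the 6.414 GB/sec shown in the GUI.

```python
# Uncore event counts copied from the knc-bandwidth summary earlier in the
# thread, listed as UNIT0..UNIT7 for each event.
CH0_WRITE = [10103918, 10109727, 10095707, 10102520,
             10095936, 10100786, 10109940, 10100599]
CH0_READ = [8574334, 8588694, 8562949, 8611755,
            8566964, 8573352, 8590854, 8589209]
CH1_WRITE = [10100922, 10101943, 10105199, 10100272,
             10106579, 10123764, 10115382, 10100624]
CH1_READ = [8576649, 8566361, 8592849, 8591451,
            8577494, 8624924, 8615670, 8585177]

BYTES_PER_EVENT = 64     # the "* 64" factor in the quoted expressions
ELAPSED_SECONDS = 2.984  # "Elapsed Time" from the summary


def gigabytes(*count_lists):
    """Sum all counts, scale by 64 bytes, and convert to GB (1e9 bytes)."""
    total_events = sum(sum(counts) for counts in count_lists)
    return total_events * BYTES_PER_EVENT / 1_000_000_000


data_read_gb = gigabytes(CH0_READ, CH1_READ)
data_written_gb = gigabytes(CH0_WRITE, CH1_WRITE)
data_transferred_gb = gigabytes(CH0_READ, CH1_READ, CH0_WRITE, CH1_WRITE)

# Assumption: average bandwidth = data transferred / elapsed time.
avg_bandwidth_gb_per_sec = data_transferred_gb / ELAPSED_SECONDS

print(f"DataReadGB        = {data_read_gb:.3f}")
print(f"DataWrittenGB     = {data_written_gb:.3f}")
print(f"DataTransferredGB = {data_transferred_gb:.3f}")
print(f"Average bandwidth = {avg_bandwidth_gb_per_sec:.3f} GB/sec")
```

With the posted counts, the total transferred comes to about 19.14 GB over 2.984 seconds.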