Extract/dump bandwidth timeline data

Pramod_K_
Beginner

Hello All,

Is it possible to dump bandwidth data over simulation time? (For example, when I run a memory-bandwidth analysis, I can see the read and write bandwidth for the entire simulation time.)

I know that one can measure the related performance counters and calculate the bandwidth, but I am wondering if there is an easy way to extract it directly from the timeline view. (I want to plot memory bandwidth variations for various optimisation types.)

Thanks!

Peter_W_Intel
Employee

> Is it possible to dump bandwidth data over simulation time?

That is not possible. VTune(TM) Amplifier only collects raw data at run time and analyzes it after data collection has finished. I think the reason is to reduce VTune's overhead in the service routine.

Pramod_K_
Beginner

Thanks Peter! By "bandwidth data over simulation time" I mean: once the simulation is complete, is there a VTune command to post-analyze the raw data and dump the bandwidth numbers that the GUI shows as a timeline?

Peter_W_Intel
Employee

It seems amplxe-cl has no report type that exports data on a time scale, and there is no way to do this from the timeline panel in the GUI either. I remember that this feature request has been escalated to engineering.

You can display overall data by using this:

# amplxe-cl -R summary -r r000bw

...

Summary
-------
Elapsed Time:       10.001

...

 

Uncore Event summary
--------------------
Hardware Event Type  Hardware Event Count:Self
-------------------  -------------------------
UNC_IMC_GT_REQUESTS                          0
UNC_IMC_IA_REQUESTS                     278943
UNC_IMC_IO_REQUESTS                   28189838
UNC_IMC_DATA_READS                    28360062
UNC_IMC_DATA_WRITES                     109036
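
As a rough cross-check, the counts above can be converted into an average bandwidth for the whole run. This is only a sketch: it assumes that each UNC_IMC_DATA_READS or UNC_IMC_DATA_WRITES count represents one 64-byte cache-line transfer, which the report itself does not state, and the helper name is made up for illustration. It gives a single average, not a timeline.

```python
# Assumption (not stated in the report): one count = one 64-byte cache line.
BYTES_PER_REQUEST = 64


def average_bandwidth_mb_s(event_count, elapsed_s,
                           bytes_per_request=BYTES_PER_REQUEST):
    """Average bandwidth in MB/s (10**6 bytes per second) over the whole run."""
    return event_count * bytes_per_request / elapsed_s / 1e6


elapsed = 10.001  # "Elapsed Time" from the summary above, in seconds
read_mb_s = average_bandwidth_mb_s(28360062, elapsed)   # UNC_IMC_DATA_READS
write_mb_s = average_bandwidth_mb_s(109036, elapsed)    # UNC_IMC_DATA_WRITES

print(f"read:  {read_mb_s:.1f} MB/s")
print(f"write: {write_mb_s:.1f} MB/s")
```

If the 64-byte assumption does not hold for your processor, pass a different `bytes_per_request`.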

 

Dmitry_R_Intel1
Employee

It is possible to get bandwidth-over-time data from the CLI using the experimental time report. Here is an example of how to use it to get bandwidth:

set AMPLXE_EXPERIMENTAL=time-cl

amplxe-cl -r <path_to_result> -R time -r-k column-by=OvertimeBandwidth -r-k bin-count=50

The bin-count=50 parameter means that the whole duration will be split into 50 equal time ranges and you'll see bandwidth data for each of them. You can specify whatever number of bins you want.
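
To get a feel for the resolution a given bin count provides, the arithmetic can be sketched as below. The 110-second duration is a hypothetical value and `bin_edges` is an illustrative helper; how the tool itself aligns or rounds the bin boundaries is not documented here.

```python
def bin_edges(duration_s, bin_count):
    """Return (start, end) pairs in seconds for equal-width time bins."""
    width = duration_s / bin_count
    return [(i * width, (i + 1) * width) for i in range(bin_count)]


# Hypothetical 110-second run split into 50 bins
edges = bin_edges(110.0, 50)
print(len(edges))            # number of time ranges
print(edges[0], edges[-1])   # first and last time range
```

With these numbers each bin covers about 2.2 seconds, so bursts shorter than that are averaged into a single value.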

Pramod_K_
Beginner

Dear Dmitry,

Thank you very much! I can now extract the bandwidth with the above command. An additional question:

Is it possible to specify the time window? Our simulation spends the first 100 seconds on initialisation and the next 5-10 seconds in the actual solver (the test case for performance analysis). Is it possible to skip the first 100 seconds? Otherwise it seems I have to dump data with a very large bin count.

Peter_W_Intel
Employee

Can you add the option "-time-filter=5:10" to the CLI report command? It works for the hotspots and hw-events reports.

Pramod_K_
Beginner

Peter, what is the unit for the time range? Does that mean show values from 5 milliseconds to 10 milliseconds on the timeline? Or is it in seconds? Is this documented somewhere?

Peter_W_Intel
Employee

Here is what I copied from the online help:

Description
Use the time-filter option to filter the report and display data for the specified time range only. For example, -time-filter=2.3:5.4 reports data collected from 2.3 seconds to 5.4 seconds of Elapsed time.

Examples
$ amplxe-cl -R hotspots -time-filter=2.3:5.4 

 

Pramod_K_
Beginner

Hi Dmitry, Peter,

I don't understand this completely: for some bandwidth analysis runs I can see the values reported by "amplxe-cl -r <path_to_result> -R time -r-k column-by=OvertimeBandwidth -r-k bin-count=N", but many times it doesn't show anything. See the screenshot, where the timeline shows the bandwidth values but the command line doesn't show any. I tried different bin counts (very small to very large).

Any idea?

intel_bandwidth_q.png

Peter_W_Intel
Employee

I have had no luck using this unpublished feature - "amplxe-cl -r <path_to_result> -R time -r-k column-by=OvertimeBandwidth -r-k bin-count=N" - but you said, "I can now extract the bandwidth with above command."

I just answered your additional question about displaying the report for a time range.

Dmitry_R_Intel1
Employee

Can you provide a result directory for which the report shows no data?

As for specifying a time window: currently the report doesn't support the '-time-filter' option, but you can use the report-specific '-r-k start=<start_time>' and '-r-k end=<end_time>' options. The time values should be specified in 10 GHz ticks (so 1 second corresponds to 10 billion ticks). E.g. the following command:

amplxe-cl -r <path_to_result> -R time -r-k column-by=OvertimeBandwidth -r-k bin-count=50 -r-k start=10000000000 -r-k end=30000000000

will provide data for time window from 1 to 3 seconds.
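
Converting seconds to ticks by hand is error-prone, so here is a small sketch that builds the same command line from start and end times in seconds. The helper names are made up for illustration, and "r000bw" is simply the result directory name used earlier in this thread; the options themselves are the ones shown above.

```python
import shlex

TICKS_PER_SECOND = 10_000_000_000  # 10 GHz ticks, as described above


def seconds_to_ticks(seconds):
    """Convert a time in seconds to 10 GHz ticks."""
    return round(seconds * TICKS_PER_SECOND)


def time_report_args(result_dir, start_s, end_s, bin_count):
    """Build the argument list for the experimental time report."""
    return [
        "amplxe-cl", "-r", result_dir, "-R", "time",
        "-r-k", "column-by=OvertimeBandwidth",
        "-r-k", f"bin-count={bin_count}",
        "-r-k", f"start={seconds_to_ticks(start_s)}",
        "-r-k", f"end={seconds_to_ticks(end_s)}",
    ]


args = time_report_args("r000bw", 1, 3, 50)
print(shlex.join(args))  # command line for the 1 to 3 second window
```

For a run with 100 seconds of initialisation followed by a 10-second solver phase, passing `start_s=100` and `end_s=110` would restrict the report to the solver. AMPLXE_EXPERIMENTAL=time-cl still has to be set in the environment before the command is run.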

 

Please note that since this report is a preview feature, it may or may not appear in future product versions and may undergo significant changes.

Pradeep_R_
Beginner

This is a very cool feature, in my opinion, as it gives us the ability to generate plots the way we like to analyze the data!

However, I am having some trouble understanding this command line. When I ran a particular executable on a HSW i7 system, the VTune bandwidth analysis GUI showed peaks close to 10GBps; the total execution time was ~50s. However, when I dumped the bandwidth into a text file on a per-millisecond basis using the command line and loaded it into Excel, I saw considerably higher peaks of close to 20GBps. Then, when I selected a time window where Excel showed a higher peak and zoomed into the same window in the VTune GUI, I saw that the higher peaks are present there as well, but had been lost to averaging in VTune.

Has anyone else experienced this?

My question is: when the VTune GUI displays bandwidth, what window does it use to compute the average? It may be useful to provide a feature that lets us select the window size in the GUI so that issues like this aren't missed when doing a bandwidth analysis.

Pradeep.
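
The effect described above, where short peaks disappear when the averaging window grows, can be reproduced with a few lines. The data below is synthetic, not measured, and the sketch says nothing about which window the GUI actually uses; it only shows how the same samples yield different peaks at different window sizes.

```python
def rebin_mean(samples, factor):
    """Average consecutive groups of `factor` samples (a trailing partial group is dropped)."""
    return [
        sum(samples[i:i + factor]) / factor
        for i in range(0, len(samples) - factor + 1, factor)
    ]


# Synthetic data: 1 ms samples at 4 GB/s with a single 2 ms burst at 20 GB/s
per_ms = [4.0] * 1000
per_ms[500] = per_ms[501] = 20.0

peak_1ms = max(per_ms)                     # peak at 1 ms resolution
peak_10ms = max(rebin_mean(per_ms, 10))    # peak after averaging over 10 ms
peak_100ms = max(rebin_mean(per_ms, 100))  # peak after averaging over 100 ms

print(peak_1ms, peak_10ms, peak_100ms)
```

In this made-up series the 20 GB/s burst shrinks to roughly 7 GB/s at a 10 ms window and is almost indistinguishable from the 4 GB/s baseline at 100 ms.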

Peter_W_Intel
Employee

> I saw on the vtunes bandwidth analysis GUI that the peaks were close to 10GBps; the total execution time was ~50s. However, when I dumped out the bandwidth into a text file on a per-millisecond basis by using the command line and taking it to excel, I see considerably higher peaks of close to 20GBps.

It would be better if you could provide a memory analysis result so we can look into it...

The important thing is that this feature is still not public, which means the user needs to run "export AMPLXE_EXPERIMENTAL=time-cl" first and then use "amplxe-cl -r r00?macc -R time -r-k column-by=OvertimeBandwidth -r-k bin-count=N -r-k start=m1 -r-k end=m2"; m1 and m2 are on an ms basis.

Pradeep_R_
Beginner

Peter,

Thanks for your response. I understand that this is not yet a public feature, and I am using the right command lines. However, I would like to confirm that, although it isn't a public feature, it should be doing the right thing, right :-)?

Also, I'm not sure how easy it is to share my bandwidth analysis result, as the folder is ~2GB of data! But I bet that this should be visible on any system where there are higher peaks but the average bandwidth is something like 4-5GBps.

I think it would be easier for us to understand if you could answer the follow-on question that I had: "when the VTune GUI displays bandwidth, what window does it use to compute the average?"

 

Thanks,

Pradeep.

Peter_W_Intel
Employee

@ Pradeep

Yes. We absolutely need to do the right things right :-) I ran a simple test case where the peak was 3.6GB and the average was 1.37GB per second. You said the peak in the GUI is 10GBps but 20GBps after exporting the data to an Excel file, which is hard to imagine! That is why I asked for your result... Is it possible that you shortened the data collection duration because of the result directory's size?
