Community support for Analyzers (Intel VTune™ Profiler, Intel Advisor, Intel Inspector)

Calculating Event Ratios with Multiple Runs


I've been trying to use VTune Amplifier XE 2011 to profile the memory access behavior of my application. In particular, I've been trying to calculate the effect of LLC misses on a Nehalem processor as (180 * MEM_LOAD_RETIRED.LLC_MISS) / CPU_CLK_UNHALTED.THREAD using the Memory Access collection type.

Based on some of the documentation I read, I set "Allow multiple runs" to avoid multiplexing and get more accurate results, but it's not clear to me how the results from multiple runs are aggregated. In particular, it seems that the event counts listed in the "Summary" window are simply the sum of each count over all runs. Since some of the events are only counted during certain runs, it seems wrong to compute the ratio using these summary values. For example, MEM_LOAD_RETIRED.LLC_MISS is collected for only 1 of the 3 runs, while CPU_CLK_UNHALTED.THREAD is collected for all 3 runs, so we can't just compare the total counts.
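To make the concern concrete, here is a small sketch with made-up numbers (the counts and the `runs` structure are hypothetical, not VTune output). It assumes three identical runs in which the clockticks event is collected every time but the LLC miss event is collected only once, and compares the ratio built from summed counts with the ratio built from counts taken in the same run.

```python
# Hypothetical per-run event counts; None means the event was not
# collected in that run. These numbers are invented for illustration.
runs = [
    {"CPU_CLK_UNHALTED.THREAD": 3_000_000_000, "MEM_LOAD_RETIRED.LLC_MISS": 1_000_000},
    {"CPU_CLK_UNHALTED.THREAD": 3_000_000_000, "MEM_LOAD_RETIRED.LLC_MISS": None},
    {"CPU_CLK_UNHALTED.THREAD": 3_000_000_000, "MEM_LOAD_RETIRED.LLC_MISS": None},
]

LLC_MISS_PENALTY_CYCLES = 180  # penalty used in the formula from the question


def summed(event):
    """Sum an event over all runs, treating 'not collected' as zero."""
    return sum(run[event] or 0 for run in runs)


# Ratio built from summed counts: misses come from 1 run, clockticks from 3.
summary_ratio = (
    LLC_MISS_PENALTY_CYCLES
    * summed("MEM_LOAD_RETIRED.LLC_MISS")
    / summed("CPU_CLK_UNHALTED.THREAD")
)

# Ratio built from the single run in which both events were collected.
first = runs[0]
same_run_ratio = (
    LLC_MISS_PENALTY_CYCLES
    * first["MEM_LOAD_RETIRED.LLC_MISS"]
    / first["CPU_CLK_UNHALTED.THREAD"]
)

print(f"summed counts: {summary_ratio:.3f}")
print(f"same run:      {same_run_ratio:.3f}")
```

Under these assumptions the summed ratio is diluted by the number of runs that counted clockticks but not misses, which is a factor of three here.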

What's the correct way to compute an accurate ratio? Should I average each event count over the runs in which it was collected and then compute the LLC miss impact from the averages?

1 Reply

Hi Ben,

If you turn "Allow multiple runs" on and you have many events selected (usually in a user-defined hardware event-based sampling data collection), the events will be grouped into several runs.

For example, I have a simple program that has 5 threads during data collection. I selected 40 events, which resulted in 12 runs, and the report showed 60 threads in the summary. Some events are collected only in a specific run: in the timeline, you can see UOPS_RETIRED.ANY in the last run but not in any other time range.

You have to select the time range and then do "Zoom-in/Filter-in on selection" to get the CPU_CLK_UNHALTED.THREAD count for that time range (not from the Summary), then use it in the formula.

If CPU_CLK_UNHALTED.THREAD was not counted in that time range, you can estimate it as delta_time_range (s) * CPU_Frequency.
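The procedure above can be sketched as a small helper. This is only an illustration: the function name, its parameters, and the sample numbers are hypothetical, and the counts are assumed to have been read by hand from the filtered time range in the GUI. When no clocktick count is available, it falls back to the delta_time_range * CPU_Frequency estimate, which is an approximation.

```python
from typing import Optional

LLC_MISS_PENALTY_CYCLES = 180  # penalty used in the formula from the question


def llc_miss_impact(
    llc_misses: float,
    clockticks: Optional[float] = None,
    range_seconds: Optional[float] = None,
    cpu_hz: Optional[float] = None,
) -> float:
    """Return (180 * LLC misses) / clockticks for one time range.

    llc_misses and clockticks should both come from the same filtered
    time range. If clockticks was not collected there, it is estimated
    as range_seconds * cpu_hz.
    """
    if clockticks is None:
        if range_seconds is None or cpu_hz is None:
            raise ValueError("need clockticks, or both range_seconds and cpu_hz")
        clockticks = range_seconds * cpu_hz  # estimate, not a measured count
    if clockticks <= 0:
        raise ValueError("clockticks must be positive")
    return LLC_MISS_PENALTY_CYCLES * llc_misses / clockticks


# Hypothetical values for a single selected time range.
measured = llc_miss_impact(1_000_000, clockticks=3_000_000_000)
estimated = llc_miss_impact(1_000_000, range_seconds=1.0, cpu_hz=3.0e9)
print(f"measured clockticks:  {measured:.3f}")
print(f"estimated clockticks: {estimated:.3f}")
```

Keeping both counts tied to one time range avoids mixing a numerator from a single run with a denominator accumulated over every run.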

Regards, Peter
