Stall Decomposition, "Cycle Accounting" Whitepaper, and FE+Scoreboard
I am profiling an application on Core 2 using the methodology presented in "Cycle Accounting, analysis on Intel Core2 Processors" and "Introduction to performance analysis on Intel Core 2 Duo Processors", both by Dr. Levinthal (available on the VTune website).
Almost all of my stalled cycles fall into the "FE + Scoreboard" category. It seems that I'm saturating the FSB: bus utilization, derived from BUS_TRANS_ANY.ALL_AGENTS, is about 80%. Is that correlation all I can get from the Core 2 PMU, or is there some measurement that would give more direct evidence that the FSB is indeed the culprit?
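
For context, here is a minimal sketch of how a utilization figure like the 80% above might be derived from raw counts. It assumes the ratio 2 * BUS_TRANS_ANY.ALL_AGENTS / CPU_CLK_UNHALTED.BUS (roughly two bus clocks per transaction), which is my understanding of the bus utilization ratio in Intel's optimization manual; the post does not state how the number was computed. The function name and the sample counts are hypothetical.

```python
def bus_utilization(bus_trans_any_all_agents: int, cpu_clk_unhalted_bus: int) -> float:
    """Estimate FSB utilization as a percentage of bus clocks.

    Assumption (not from the original post): each bus transaction
    occupies about 2 bus clocks, so utilization is
    2 * BUS_TRANS_ANY.ALL_AGENTS / CPU_CLK_UNHALTED.BUS.
    """
    if cpu_clk_unhalted_bus <= 0:
        raise ValueError("bus cycle count must be positive")
    return 100.0 * 2 * bus_trans_any_all_agents / cpu_clk_unhalted_bus


# Hypothetical counts, chosen only to illustrate the arithmetic.
sample_transactions = 400_000
sample_bus_cycles = 1_000_000

utilization = bus_utilization(sample_transactions, sample_bus_cycles)
print(f"Estimated bus utilization: {utilization:.1f}%")
```

A high value from this ratio only shows that the bus is busy over the sampled interval; it does not by itself attribute the stalled cycles to the FSB, which is the gap the question is asking about.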