Hi,
I know how to measure the L2 cache misses of my functions. Is there a way to measure the CPU stalls those misses cause (if any), in msec or CPU cycles?
If they cannot be measured directly, can I estimate them from other events or ratios such as CPI?
Thanks,
Amit
3 Replies
Hi Amit,
L2 cache misses introduce stalls in the CPU pipeline. The stall time is roughly proportional to the number of L2 cache miss events, with the miss penalty as the factor. On a Core2 system the penalty is about 130 CPU clockticks (worst case); check the optimization manual for your particular microarchitecture. So a rough estimate (in CPU cycles) can be made by multiplying the number of L2 cache miss events by the penalty. Please make sure you are counting events, not samples. You can see the number of events directly in the VTune Hotspot results, or multiply the number of samples by the SAV (Sample After Value) for the event.
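The arithmetic described above can be sketched in a few lines of Python. This is only an illustration: the function names, the sample count, the SAV and the 2.5 GHz clock are made-up assumptions, and the 130-cycle penalty is the worst-case Core2 figure quoted above, so the result is an upper bound rather than a measurement.

```python
# Rough worst-case estimate of stall cycles caused by L2 cache misses.
# All names and input numbers here are hypothetical, for illustration only.

def events_from_samples(samples: int, sample_after_value: int) -> int:
    """Convert a sample count into an event count (events = samples * SAV)."""
    return samples * sample_after_value

def estimate_l2_stall_cycles(l2_miss_events: int, penalty_cycles: int = 130) -> int:
    """Upper bound on stall cycles: every miss is charged the full penalty."""
    return l2_miss_events * penalty_cycles

def cycles_to_msec(cycles: int, cpu_hz: float) -> float:
    """Convert a cycle count to milliseconds for a given core clock frequency."""
    return cycles * 1000.0 / cpu_hz

# Assumed inputs: 250 samples collected with a SAV of 10,000 on a 2.5 GHz core.
misses = events_from_samples(samples=250, sample_after_value=10_000)
stall_cycles = estimate_l2_stall_cycles(misses)
stall_msec = cycles_to_msec(stall_cycles, cpu_hz=2.5e9)

print(misses, stall_cycles, stall_msec)
```

With these assumed inputs the estimate comes to 2,500,000 miss events, 325,000,000 stall cycles, or about 130 msec. Swap in the penalty from the optimization manual for your own microarchitecture.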
Amit,
Vladimir has pointed out how to estimate the worst-case impact. The out-of-order engine might hide some of the latency. An event that can give you more insight is RS_UOPS_DISPATCHED.CYCLES_NONE. It measures the cycles in which no micro-op is dispatched for execution, i.e. the execution units are waiting for work. Obviously, there might be reasons for this other than cache misses, but this event can show you whether you have an issue.
Kind regards
Thomas
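One way to read that event is as a fraction of all unhalted core cycles. The sketch below assumes you have collected RS_UOPS_DISPATCHED.CYCLES_NONE together with a cycle count such as CPU_CLK_UNHALTED.CORE, and that both have already been converted from samples to events. The function name and the counter values are made up for illustration.

```python
# Share of core cycles in which no micro-op was dispatched.
# The function name and the counter values below are hypothetical.

def no_dispatch_fraction(cycles_none_events: int, unhalted_cycle_events: int) -> float:
    """Return RS_UOPS_DISPATCHED.CYCLES_NONE as a fraction of unhalted cycles."""
    if unhalted_cycle_events <= 0:
        raise ValueError("unhalted cycle count must be positive")
    return cycles_none_events / unhalted_cycle_events

# Assumed counts for one function: 400M idle-dispatch cycles out of 1,000M cycles.
fraction = no_dispatch_fraction(400_000_000, 1_000_000_000)
print(f"{fraction:.0%} of cycles dispatched no micro-ops")
```

A high fraction only says the execution units were often idle. It does not by itself show that L2 misses were the cause, so it is best read alongside the miss-based worst-case estimate.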
Hello Amit,
You can also look at the cycle accounting analysis from David Levinthal.
It will give you a better idea of where your CPU cycles are going:
assets.devx.com/goparallel/17775.pdf
Thanks,
Regards,
Dny