I found this thread (What set of events to use to profile the intra-processor and inter-processor NUMA cache coherence overhead), which has some helpful suggestions for NUMA cache coherence towards the bottom, but I am also looking for general memory information.
Also, some of the suggestions in that thread are dated and don't appear to exist in Update 2 (REMOTE_CACHE_LOCAL_HOME_HIT), or perhaps they just aren't available on my processor.
I am relatively new to VTune so I hope this doesn't appear overly naive :-) Any help you can give would be greatly appreciated.
Hi,
First of all, I recommend this article for your reference.
If NUMA is enabled in the BIOS, the following performance events can be used:
OFFCORE_RESPONSE_0.ANY_REQUEST.LOCAL_DRAM
OFFCORE_RESPONSE_0.ANY_REQUEST.REMOTE_DRAM
The events above count memory accesses for all offcore cache-line traffic. Similar events can also be used:
MEM_UNCORE_RETIRED.LOCAL_DRAM
MEM_UNCORE_RETIRED.REMOTE_DRAM
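Once both MEM_UNCORE_RETIRED counts are collected, one simple way to read them together is the share of DRAM-serviced loads that went to the remote node. The sketch below is only an illustration: this ratio is not taken from the article, and the function name and sample counts are hypothetical.

```python
def remote_dram_share(local_dram: int, remote_dram: int) -> float:
    """Fraction of DRAM-serviced loads that were satisfied by remote DRAM.

    Both arguments are raw event counts:
      local_dram  -> MEM_UNCORE_RETIRED.LOCAL_DRAM
      remote_dram -> MEM_UNCORE_RETIRED.REMOTE_DRAM
    """
    total = local_dram + remote_dram
    if total == 0:
        # No DRAM-serviced loads were counted, so there is nothing to split.
        return 0.0
    return remote_dram / total


# Hypothetical counts, made up for illustration (not measured data).
local = 750_000
remote = 250_000
print(f"remote share of DRAM loads: {remote_dram_share(local, remote):.1%}")
```

A high share suggests that threads are often reading memory allocated on the other socket.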
Additionally, the article provides latency (penalty) figures for many kinds of offcore memory access.
To evaluate the Data Latency Analysis ratio caused by "Remote DRAM", the formula is:
"LLC Load Driven Misses - Remote DRAM" = 275 * MEM_UNCORE_RETIRED.REMOTE_DRAM / CPU_CLK_UNHALTED.THREAD
For using these performance events directly from the command line in VTune Amplifier XE 2011 Update, please refer to this article.
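The "Remote DRAM" ratio above can be evaluated from the two collected counts with a few lines of code. This is a minimal sketch: the 275-cycle penalty is the constant from the formula, while the function name and the sample counts are hypothetical and only there to show the arithmetic.

```python
# Penalty in cycles per remote DRAM load, taken from the formula above.
REMOTE_DRAM_PENALTY_CYCLES = 275


def remote_dram_ratio(remote_dram_loads: int, unhalted_cycles: int) -> float:
    """Estimate the fraction of cycles spent on loads served by remote DRAM.

    remote_dram_loads -> MEM_UNCORE_RETIRED.REMOTE_DRAM
    unhalted_cycles   -> CPU_CLK_UNHALTED.THREAD
    """
    if unhalted_cycles <= 0:
        raise ValueError("CPU_CLK_UNHALTED.THREAD count must be positive")
    return REMOTE_DRAM_PENALTY_CYCLES * remote_dram_loads / unhalted_cycles


# Hypothetical counts, made up for illustration (not measured data).
ratio = remote_dram_ratio(remote_dram_loads=2_000_000,
                          unhalted_cycles=5_500_000_000)
print(f"LLC Load Driven Misses - Remote DRAM: {ratio:.1%}")
```

With these made-up counts, roughly a tenth of the unhalted cycles would be attributed to remote DRAM loads.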
Regards, Peter