Hi,
I ran the VTune Profiler to measure the 'DRAM Cache Hit Ratio' in Memory mode with Optane DCPMM. But I found that 'Local DRAM Access Count' and 'DRAM Cache Hits+Misses' were very different. I think the sum of 'DRAM Cache Hits' and 'DRAM Cache Misses' should be the same as 'Local DRAM Access Count', but 'DRAM Cache Hits' is much larger than 'Local DRAM Access Count'.
This figure shows the result of VTune attached to a process that performs a 16 GB sequential read. Local DRAM Access Count (255,017,850) multiplied by the 64-byte cache line size is close to 16 GB, but DRAM Cache Hits (1,301,206,695) is much higher than the access count.
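The arithmetic above can be checked with a short sketch. It assumes that every counted access transfers exactly one 64-byte cache line, which is an assumption of this check and not something VTune reports. The helper name `accesses_to_bytes` is only illustrative.

```python
# Assumption: each counted access moves one 64-byte cache line.
CACHE_LINE_BYTES = 64

# Values quoted in the post.
local_dram_accesses = 255_017_850   # 'Local DRAM Access Count'
dram_cache_hits = 1_301_206_695     # 'DRAM Cache Hits'


def accesses_to_bytes(count, line_bytes=CACHE_LINE_BYTES):
    """Convert an access count to bytes, one cache line per access."""
    return count * line_bytes


local_bytes = accesses_to_bytes(local_dram_accesses)
hit_bytes = accesses_to_bytes(dram_cache_hits)
ratio = dram_cache_hits / local_dram_accesses

print(f"Local DRAM accesses: {local_bytes / 1e9:.2f} GB ({local_bytes / 2**30:.2f} GiB)")
print(f"DRAM cache hits:     {hit_bytes / 1e9:.2f} GB ({hit_bytes / 2**30:.2f} GiB)")
print(f"Hits / local accesses: {ratio:.2f}x")
```

Under that assumption the local access count corresponds to about 16.3 GB, while the hit count is roughly five times larger.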
So, what do 'DRAM Cache Hits' and 'DRAM Cache Misses' mean in detail? Why is it different from 'Local DRAM Access Count'?
Also, why is 'LLC Miss Count' different from 'Local DRAM Access Count'? In some runs the two values were exactly the same, even though the experiments were nearly identical. I used only one socket to avoid remote accesses.
Best regards,
Minjae Kim
The DRAM Cache metrics are based on uncore events, which count every memory access that passes through the memory controller. The Local DRAM Access Count, by contrast, is based on core events, which count only accesses caused by demand loads; accesses issued by the hardware prefetcher, for example, are not counted. Hence the difference.
Hi
The difference between LLC Miss Count and Local DRAM Access Count could be because both totals are too small for sampling to provide reliable counts. 250 million is not much, considering that the value is interpolated after multiplexing and summed across all cores.
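To illustrate why a sampled and multiplexed count can be coarse, here is a minimal sketch of one common way such counts are estimated. It is not taken from VTune's implementation. It assumes the total is estimated as samples times the sample-after value, scaled by the fraction of time the event was actually on a counter. The function name and all numbers are hypothetical.

```python
def estimate_event_count(samples, sample_after_value,
                         time_enabled=1.0, time_running=1.0):
    """Estimate a total event count from a number of samples.

    Assumed model: each sample stands for `sample_after_value` events,
    and the result is scaled up by time_enabled / time_running to
    compensate for the time the event was multiplexed off its counter.
    """
    if time_running <= 0:
        raise ValueError("time_running must be positive")
    return samples * sample_after_value * (time_enabled / time_running)


# Hypothetical values, chosen only to land near the 250 million range.
sav = 100_003          # hypothetical sample-after value
enabled, running = 4.0, 1.0   # event on a counter a quarter of the time

estimate = estimate_event_count(638, sav, enabled, running)
one_more = estimate_event_count(639, sav, enabled, running)

print(f"Estimated count: {estimate:,.0f}")
print(f"Change from a single extra sample: {one_more - estimate:,.0f}")
```

In this model a single sample more or less shifts the estimate by several hundred thousand events, so two metrics collected on different events can disagree even when the underlying hardware activity is the same.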
Thanks
Arun
Thank you for the confirmation. If you need any additional information, please submit a new question as this thread will no longer be monitored.
Regards
Arun