I currently use the Intel Performance Counter Monitor tool, and I have found it very helpful for gathering information on DRAM power consumption and on metrics such as the L2 and L3 cache hit ratios. Can this tool also be used to measure the row-buffer hit ratio or the number of bank conflicts? If not, do you know of any other tools or counters I could use to gather this information? In case it makes a difference, my platform uses an Intel E5-26xx series processor running Linux.
Thanks in advance.
The Xeon E5-2600 Family Uncore Performance Monitoring Guide (document 327043-001, March 2012) describes the available performance counters in the memory controller portion of the uncore in section 2.5.
Performance counter events are available to count page hits, page misses, and page conflicts -- BUT you need to pay careful attention to the terminology. What Intel calls a "page miss" is what is usually referred to as a "page conflict": the event counts precharge commands that are issued because the bank has a page open, but with the wrong row in it. The other metrics can be obtained by arithmetic, as discussed below.
There are four programmable performance counters in each of the four memory controller channels on each chip. Fortunately, events are available that allow you to get the most important data in a single run. My standard set is:
The derived events are:
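A minimal sketch of this arithmetic, assuming the four raw counts come from the ACT_COUNT, PRE_COUNT.PAGE_MISS, CAS_COUNT.RD, and CAS_COUNT.WR events in the uncore guide. The function and field names are illustrative only, not part of any tool, and the labels follow the common terminology (a "conflict" is an access that finds the wrong row open, an "empty" is an access that finds no row open):

```python
from dataclasses import dataclass


@dataclass
class PageStats:
    hits: int       # CAS issued to an already-open row
    empties: int    # ACT issued to a bank with no row open
    conflicts: int  # ACT that needed a precharge first (Intel's "page miss")


def derive_page_stats(act_count, pre_page_miss, cas_rd, cas_wr):
    """Derive page hit/empty/conflict counts from raw DRAM command counts.

    Assumes the counts were collected over the same interval on the same
    channel, and ignores precharges caused by PRECHARGE ALL.
    """
    cas_total = cas_rd + cas_wr
    conflicts = pre_page_miss            # precharge because the wrong row was open
    empties = act_count - pre_page_miss  # activates that needed no precharge
    hits = cas_total - act_count         # accesses that needed no activate
    return PageStats(hits=hits, empties=empties, conflicts=conflicts)


def page_hit_ratio(stats):
    """Fraction of all accesses that hit an open row."""
    total = stats.hits + stats.empties + stats.conflicts
    return stats.hits / total if total else 0.0
```

For example, 300 activates, 100 page-miss precharges, 600 read CAS, and 400 write CAS would give 700 hits, 200 empties, and 100 conflicts under these assumptions.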
These events are not perfect -- they don't count precharges that come from the "PRECHARGE ALL" command, for example -- but they seem to be quite reasonable in the tests I have run. ("PRECHARGE ALL" should only happen when refreshing the memory or in some power state changes and these are infrequent -- but there is a separate event for it if you want to see how often it happens.)
Of course these events also give the total memory traffic:
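As a rough sketch of the traffic calculation, assuming each CAS command transfers one 64-byte cache line and that the read and write CAS counts have been collected per channel (the function names and the input layout are hypothetical):

```python
CACHE_LINE_BYTES = 64  # one CAS moves one 64-byte cache line on this platform


def dram_traffic_bytes(per_channel_counts):
    """Sum read/write traffic over channels.

    per_channel_counts: iterable of (cas_rd, cas_wr) tuples, one per channel.
    Returns a dict with read, write, and total bytes.
    """
    read_bytes = sum(rd for rd, _ in per_channel_counts) * CACHE_LINE_BYTES
    write_bytes = sum(wr for _, wr in per_channel_counts) * CACHE_LINE_BYTES
    return {
        "read": read_bytes,
        "write": write_bytes,
        "total": read_bytes + write_bytes,
    }


def bandwidth_gb_per_s(total_bytes, elapsed_seconds):
    """Average bandwidth in GB/s (decimal gigabytes) over the interval."""
    return total_bytes / elapsed_seconds / 1e9
```

The special cases mentioned in this thread (uncached loads, partial streaming stores, IO traffic) are not modeled here, so the result should be treated as an estimate.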
This agrees very closely with expectations in the experiments that I have done, but it is not going to be perfect. There are lots of special cases (uncached loads, partially completed streaming stores, IO references to memory, etc.) that introduce a small amount of variability into the results.
In newer versions of Linux, the "perf" subsystem might support accessing these events relatively directly. I have not used the Intel Performance Counter Monitor tool recently, so I don't know if it has these events built-in.
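If the kernel exposes the memory controller PMUs, a raw-event request to "perf stat" could be assembled along these lines. This is only a sketch: the PMU names (uncore_imc_0 through uncore_imc_3) and the event/umask encodings are assumptions based on my reading of the uncore guide and should be checked against /sys/bus/event_source/devices and the guide itself before use. The helper names are made up for illustration.

```python
import shlex

# (event code, umask) pairs as I read them in the uncore guide; verify before use.
IMC_EVENTS = {
    "ACT_COUNT": (0x01, 0x00),
    "PRE_COUNT.PAGE_MISS": (0x02, 0x01),
    "CAS_COUNT.RD": (0x04, 0x03),
    "CAS_COUNT.WR": (0x04, 0x0C),
}


def perf_event(pmu, event, umask):
    """Format one raw uncore event in perf's pmu/event=..,umask=../ syntax."""
    return f"{pmu}/event={event:#04x},umask={umask:#04x}/"


def perf_stat_command(num_channels=4, workload=("sleep", "1")):
    """Build a system-wide 'perf stat' argument list covering every channel."""
    events = [
        perf_event(f"uncore_imc_{ch}", ev, um)
        for ch in range(num_channels)          # assumed PMU naming per channel
        for ev, um in IMC_EVENTS.values()
    ]
    return ["perf", "stat", "-a", "-e", ",".join(events), "--", *workload]


def command_line(**kwargs):
    """Return the command as a single shell-quoted string for inspection."""
    return shlex.join(perf_stat_command(**kwargs))
```

Printing command_line() shows the full command so it can be reviewed before running it; uncore events generally need root or a relaxed perf_event_paranoid setting.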
Hi, I read the guide and I think the umask is wrong.
I think the umask of WR_CAS is 0x0C. Am I wrong, or did I miss something here?