Hello,
How do L2_TRANS events differ from L2_RQSTS events? For example, for STREAM with 1 thread going to memory, I got the following values:
L2_RQSTS_REFERENCES 2947425959
L2_TRANS_ALL_REQUESTS 5553687404
L2_RQSTS_ALL_DEMAND_DATA_RD 1184296149
L2_TRANS_DEMAND_DATA_RD 1508612495
Best,
Bo
It looks like the "L2_RQSTS" (0x24) events only count transactions that succeed, while "L2_TRANS" (0xF0) events also count attempted transactions that are rejected (and later retried). This interpretation is based on a few words that appear in the descriptions of some of the sub-events for some of the processor families in Chapter 19 of Volume 3 of the Software Developer's Manual, and is supported (for at least a subset of the available sub-events) by my microbenchmark experiments.
Your numbers for Demand Data Read are reasonably consistent with this interpretation: L2_TRANS.DEMAND_DATA_RD is ~27% higher than L2_RQSTS.ALL_DEMAND_DATA_RD, suggesting about 27 retried attempts for every 100 Demand Data Read transactions that succeed. (I have found that this happens when a Demand Data Read from an L1 Data Cache miss tries to access the L2 tags in the same cycle as an L2 HW prefetch, but there are almost certainly many other possible causes of retries.)
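For reference, here is a minimal sketch of the arithmetic behind that estimate. It assumes the interpretation above holds (L2_RQSTS counts only successful transactions, L2_TRANS counts successful plus rejected attempts); the function name `retry_estimate` is just for illustration and is not part of any tool.

```python
# Counts copied from the question (STREAM, 1 thread).
# Assumption: L2_RQSTS counts successful requests only,
# L2_TRANS counts successful requests plus rejected attempts.
l2_rqsts_all_demand_data_rd = 1_184_296_149
l2_trans_demand_data_rd = 1_508_612_495


def retry_estimate(succeeded, attempted):
    """Return (retries, retries per successful request, retries per attempt)."""
    retries = attempted - succeeded
    return retries, retries / succeeded, retries / attempted


retries, per_success, per_attempt = retry_estimate(
    l2_rqsts_all_demand_data_rd, l2_trans_demand_data_rd
)
print(f"estimated retries:        {retries}")
print(f"relative to successes:    {per_success:.1%}")
print(f"relative to all attempts: {per_attempt:.1%}")
```

Note that the percentage depends on the denominator: the retries are about 27% of the successful reads, but only about 21% of all attempts.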
Your numbers for "all requests" are a bit harder to understand. The L2_RQSTS.REFERENCES is about 2.5 times the number of DEMAND_DATA_RD, which does not seem unreasonable. The expected ratio depends on how L1 HW prefetches are counted, how L2 HW prefetches are counted, how streaming stores are counted (if they are used in your STREAM binary), etc. The ratio of L2_TRANS.ALL_REQUESTS to L2_RQSTS.REFERENCES is about 1.88:1, which seems high to me, but it is possible that there are other differences in transactions that are counted by these two events (other than just retries). The documentation is insufficient to conclude much, and it is not clear to me that the counters are actually counting the same low-level transactions from one processor generation to the next. (This could be due to bug fixes in the counter events on newer processors, new bugs in the counter events on newer processors, or changed behavior due to low-level implementation changes in newer processors.)
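The ratios quoted above can be reproduced directly from the posted counts. This is only a sketch of the arithmetic; it says nothing about which low-level transactions each event really includes. The last ratio is not discussed above and is added only for comparison.

```python
# Counts copied from the question (STREAM, 1 thread).
counts = {
    "L2_RQSTS_REFERENCES": 2_947_425_959,
    "L2_TRANS_ALL_REQUESTS": 5_553_687_404,
    "L2_RQSTS_ALL_DEMAND_DATA_RD": 1_184_296_149,
    "L2_TRANS_DEMAND_DATA_RD": 1_508_612_495,
}

# "All" events relative to demand data reads, within each event family.
rqsts_all_per_demand = (
    counts["L2_RQSTS_REFERENCES"] / counts["L2_RQSTS_ALL_DEMAND_DATA_RD"]
)
trans_all_per_demand = (
    counts["L2_TRANS_ALL_REQUESTS"] / counts["L2_TRANS_DEMAND_DATA_RD"]
)

# L2_TRANS relative to L2_RQSTS for the "all" events.
trans_per_rqsts_all = (
    counts["L2_TRANS_ALL_REQUESTS"] / counts["L2_RQSTS_REFERENCES"]
)

print(f"L2_RQSTS: REFERENCES / ALL_DEMAND_DATA_RD = {rqsts_all_per_demand:.2f}")
print(f"L2_TRANS: ALL_REQUESTS / DEMAND_DATA_RD   = {trans_all_per_demand:.2f}")
print(f"L2_TRANS.ALL_REQUESTS / L2_RQSTS.REFERENCES = {trans_per_rqsts_all:.2f}")
```

The "all"-to-demand ratio is noticeably larger for L2_TRANS than for L2_RQSTS, so whatever accounts for the extra L2_TRANS counts falls mostly on the non-demand-read traffic.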