I am trying to figure out the amount of memory bus traffic in
From http://assets.devx.com/goparallel/18027.pdf I thought that
BUS_TRAN_BURST.SELF (multiplied by 64) would be a good measure.
I also expected that this number would be within
2x of MEM_LOAD_RETIRED.L2_LINE_MISS (there are no RFOs
etc.). However, I see that BUS_TRAN_BURST.SELF is ~4 to 5x
MEM_LOAD_RETIRED.L2_LINE_MISS. I have been trying to figure
out where the difference comes from but I have not found a reasonable
I also measured L2_LD.SELF.DEMAND.MESI and L2_LD.SELF.ANY.MESI
and found that L2_LD.SELF.DEMAND.MESI is about half of
L2_LD.SELF.ANY.MESI and that L2_LD.SELF.DEMAND.MESI is about
double BUS_TRAN_BURST.SELF.
The number of L2_M_LINES_OUT.SELF.ANY events is about 1.5x
the number of MEM_LOAD_RETIRED.L2_LINE_MISS events.
Any help would be greatly appreciated.
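To make the arithmetic in the post above concrete, here is a minimal sketch of the bytes-on-the-bus estimate and the ratios being compared. The counts are hypothetical placeholders, chosen only to mirror the ratios described (they are not measured values), and the helper names are made up for illustration. The last line assumes that the ANY count is simply demand plus prefetch loads.

```python
CACHE_LINE_BYTES = 64  # the "multiplied by 64" factor: one burst moves one cache line


def bus_traffic_bytes(bus_tran_burst: int) -> int:
    """Estimate bytes moved over the bus from a BUS_TRAN_BURST.SELF count."""
    return bus_tran_burst * CACHE_LINE_BYTES


def ratio(numerator: int, denominator: int) -> float:
    """How many times larger one event count is than another."""
    return numerator / denominator


# Hypothetical counts (NOT measurements), picked to match the described ratios.
counts = {
    "MEM_LOAD_RETIRED.L2_LINE_MISS": 1_000_000,
    "BUS_TRAN_BURST.SELF": 4_500_000,
    "L2_LD.SELF.DEMAND.MESI": 9_000_000,
    "L2_LD.SELF.ANY.MESI": 18_000_000,
}

traffic = bus_traffic_bytes(counts["BUS_TRAN_BURST.SELF"])
burst_vs_miss = ratio(counts["BUS_TRAN_BURST.SELF"],
                      counts["MEM_LOAD_RETIRED.L2_LINE_MISS"])
# Assumption: ANY = DEMAND + PREFETCH, so the difference is prefetch-initiated loads.
implied_prefetch = counts["L2_LD.SELF.ANY.MESI"] - counts["L2_LD.SELF.DEMAND.MESI"]

print(f"bus traffic: {traffic} bytes")
print(f"BUS_TRAN_BURST.SELF / L2_LINE_MISS: {burst_vs_miss:.2f}")
print(f"implied prefetch L2 loads: {implied_prefetch}")
```

If the implied prefetch count is of the same order as the demand count, prefetching would be a plausible contributor to the gap between bus transactions and retired L2 misses.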
So do the prefetchers kick in for your application? What does L2_LD.SELF.PREFETCH.MESI report? Have you tried the experiments with both L2 prefetchers disabled?
Thank you very much for your suggestion.
I measured L2_LD.SELF.PREFETCH.MESI and L2_LD.SELF.PREFETCH.I_STATE.
Here is a table with the event counts (in GEvents, i.e. 10^9 events):
So it does seem that prefetch is quite active. Does it make sense to say that
the L2_LD.PREFETCH.I_STATE events cause a similar number of cache
I cannot easily disable the prefetchers in the BIOS as this is running on
a remote server. Is there a way to programmatically disable prefetching?
Thank you very much for your post. I will try using http://etallen.com/msr.html to set the proper
MSR bits and will report the profile values after I repeat the experiments. I am not looking to
disable the prefetchers for performance but merely to see how much of the prefetching is wasted.
The application I am profiling is quite large and has some parts where the access patterns are very "random",
and those parts are causing huge numbers of cache misses and bus traffic, even with only one thread.
Gathering this data should help push for a change and will help in our optimization efforts.
Unfortunately, running the msr tool from http://etallen.com/msr.html is not working
on the machines I have access to. I get:
[root@...]# ./msr IA32_MISC_ENABLE.aclp_dis=1
msr: info: IA32_MISC_ENABLE.aclp_dis=1: fell back to numeric interpretation
msr: unable to write msr file at offset 0x000001a0; errno = 9 (Bad file descriptor)
(BTW the MSR module is compiled into the kernel).
I also tried just using wrmsr from asm/msr.h but that just segfaults.
I then tried wrmsr from a stap script ... and killed the machine, ugh.
Can you point me to the appropriate way of manipulating MSRs?
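For reference, one common route on Linux is the msr character device (/dev/cpu/N/msr). A register is read or written as 8 bytes at a file offset equal to the register number, which is where the 0x000001a0 in the error above comes from. The wrmsr instruction itself is privileged, so calling it directly from a user-space program faults. Below is a minimal read-modify-write sketch, assuming the device nodes exist and the script runs as root. The helper names are made up for illustration, and the bit position used in the usage comment is an assumption that must be checked against the Intel manual for the specific CPU. It does not diagnose the errno 9 failure shown above.

```python
import os
import struct

IA32_MISC_ENABLE = 0x1A0  # register number, same as the offset in the error message


def msr_path(cpu: int) -> str:
    """Path of the Linux msr device for one logical CPU."""
    return f"/dev/cpu/{cpu}/msr"


def read_msr(path: str, reg: int) -> int:
    """Read a 64-bit MSR value: 8 bytes at offset == register number."""
    fd = os.open(path, os.O_RDONLY)
    try:
        return struct.unpack("<Q", os.pread(fd, 8, reg))[0]
    finally:
        os.close(fd)


def write_msr(path: str, reg: int, value: int) -> None:
    """Write a 64-bit MSR value: 8 bytes at offset == register number."""
    fd = os.open(path, os.O_WRONLY)
    try:
        os.pwrite(fd, struct.pack("<Q", value), reg)
    finally:
        os.close(fd)


def update_msr_bit(path: str, reg: int, bit: int, enable: bool) -> int:
    """Set or clear one bit, preserving all others. Returns the old value."""
    old = read_msr(path, reg)
    new = old | (1 << bit) if enable else old & ~(1 << bit)
    write_msr(path, reg, new)
    return old


# Usage (commented out on purpose: needs root and changes CPU state).
# ACLP_DIS_BIT = 19 is an ASSUMED position for the adjacent-line prefetch
# disable bit; verify it in the manual for your processor before writing.
# for cpu in range(os.cpu_count()):
#     update_msr_bit(msr_path(cpu), IA32_MISC_ENABLE, 19, True)
```

MSRs are per core, so the write has to be repeated for every logical CPU the application can run on, and saving the returned old value makes it possible to restore the original setting after the experiment.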