Hi,
I was trying to calculate the time my application spends in the cache by using the published specs of 4 cycles in L1, 12 in L2, etc., along with the number of L1, L2 loads that my application makes. I realized that since sandy bridge can issue 2 loads in one cycle, my results were wonky.
My question is: how do I go about calculating the time spent in cache alone, given that there a parallel loads possible to L1?
Regards,
Chinnappa