I have an Intel Xeon server running 64-bit Linux, on which I am trying to measure the cache miss overhead in the L1D cache. The code to do this is shown below.
The code uses a buffer[twice the L1D cache size] and bufindex[2 x associativity of the L1D cache].
There is a function findsets which, given an index into buffer, finds all the indices in buffer that map into the same cache set. These indices (call them m0, m1, m2, ..., m15) are stored in bufindex.
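To make the set-mapping idea concrete, here is a minimal sketch of how such a lookup could work. It is not the actual findsets from the code below. The geometry (32 KiB, 8-way, 64-byte lines) and the names set_of and find_same_set are assumptions for illustration only; the real values should be taken from the CPU in question.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed L1D geometry (hypothetical, check your own CPU). */
enum {
    L1D_SIZE    = 32 * 1024,                        /* bytes */
    L1D_WAYS    = 8,                                /* associativity */
    LINE_SIZE   = 64,                               /* bytes per cache line */
    NUM_SETS    = L1D_SIZE / (L1D_WAYS * LINE_SIZE),/* 64 sets */
    BUF_SIZE    = 2 * L1D_SIZE,                     /* buffer is twice the cache */
    NUM_INDICES = 2 * L1D_WAYS                      /* size of bufindex */
};

/* Set number of a byte offset, relative to the start of the buffer.
 * This equals the hardware set number only if the buffer itself is
 * aligned to NUM_SETS * LINE_SIZE bytes. */
static size_t set_of(size_t offset)
{
    return (offset / LINE_SIZE) % NUM_SETS;
}

/* Collect every offset in the buffer that falls into the same set as
 * `index`. Offsets that differ by NUM_SETS * LINE_SIZE bytes share a set.
 * Returns the number of offsets written to `out`. */
static size_t find_same_set(size_t index, size_t out[NUM_INDICES])
{
    assert(index < BUF_SIZE);
    const size_t stride = (size_t)NUM_SETS * LINE_SIZE; /* 4096 bytes here */
    size_t n = 0;
    for (size_t off = index % stride; off < BUF_SIZE && n < NUM_INDICES; off += stride)
        out[n++] = off;
    return n;
}
```

With this assumed geometry, a buffer of twice the cache size holds exactly 2 x 8 = 16 lines per set, which is where m0 through m15 come from.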
There is also a function measureflush, which measures the time to access elements from buffer.
I make 4 calls to measureflush as follows.

    r = measureflush(0, 8);    /* First call measures the time to access m0 to m8 */
    r ^= measureflush(0, 8);   /* Second call measures the time to access m0 to m8 */
    r ^= measureflush(9, 16);  /* Third call measures the time to access m9 to m15 */
    r ^= measureflush(0, 8);   /* Fourth call measures the time to access m0 to m8 */
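As a sanity check on this access pattern, the four calls can be replayed against a toy model of a single cache set. This is only a sketch under stated assumptions: an 8-way set, a true LRU replacement policy (real hardware may use a different policy), and each call treated as the half-open range [start, end), so (0, 8) touches m0 through m7. The names lru_set, lru_access and count_misses are hypothetical and are not part of the measurement code.

```c
#include <assert.h>
#include <stddef.h>

enum { WAYS = 8 };  /* assumed associativity */

/* One cache set with true LRU ordering (an assumption). */
struct lru_set {
    int lines[WAYS];  /* lines[0] is the most recently used entry */
    int count;        /* number of valid entries */
};

/* Access one line. Returns 1 on a miss, 0 on a hit. */
static int lru_access(struct lru_set *s, int line)
{
    int pos = -1;
    for (int i = 0; i < s->count; i++) {
        if (s->lines[i] == line) { pos = i; break; }
    }
    int miss = (pos < 0);
    if (miss) {
        if (s->count < WAYS)
            s->count++;          /* fill an empty way */
        pos = s->count - 1;      /* new slot, or the LRU victim if full */
    }
    /* Shift younger entries down and place the line at the MRU position. */
    for (int i = pos; i > 0; i--)
        s->lines[i] = s->lines[i - 1];
    s->lines[0] = line;
    return miss;
}

/* Access lines m[start] .. m[end-1] in order and count the misses. */
static int count_misses(struct lru_set *s, int start, int end)
{
    assert(start >= 0 && start <= end);
    int misses = 0;
    for (int m = start; m < end; m++)
        misses += lru_access(s, m);
    return misses;
}
```

Starting from an empty set and replaying the four calls in order, this model gives 8, 0, 7 and 8 misses: the second call hits on every line, while the fourth call misses on every line.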
I would expect that, due to the associativity, the second call to measureflush would take less time than the fourth: the third call touches other lines in the same set and should evict the lines used by the second call. However, this does not seem to be the case, as both calls take roughly the same time (which I think means that I am getting cache hits in both cases). A sample output is shown below. Where am I going wrong? Is there something else I need to do in order to see the cache misses? Any help in this regard would be really useful.