How do I get L1, L2 and L3 cache misses by reading performance counters with the rdpmc instruction?

Shailja_P_

For example, my sample code looks like this:

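long long get_L3_misses(void)
{
  unsigned int a = 0, d = 0, c;

  c = (1 << 30); // what is the counter number for L3 cache misses?
  asm volatile(
        "rdpmc"
            : "=a" (a), "=d" (d)
            : "c" (c)
        );

  // Combine the low (EAX) and high (EDX) halves into one 64-bit value.
  return ((long long)a) | (((long long)d) << 32);
}

// Placeholder for the code being measured.
static void some_function_call(void)
{
}

int main(int argc, char* argv[])
{
  long long start, stop;
  double result;

  start = get_L3_misses();
  some_function_call();
  stop = get_L3_misses();

  // Subtract as integers first, then convert the difference.
  result = (double)(stop - start);

  return 0;
}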

Thank you.

Thomas_W_Intel

Shailja,

The counters must first be programmed with the event they should count. Sample code for doing so can be found here:

https://github.com/opcm/pcm/blob/c21fbce6af8fb2435d390a56c7db75191d1df34f/cpucounters.cpp#L1717
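To give a rough idea of what "programming the counter" involves, here is a minimal C sketch. It is not the PCM code. It assumes Linux with the msr kernel module loaded and root privileges, and it uses the architectural "LLC Misses" event (event select 0x2E, unit mask 0x41) as a stand-in for L3 misses. The helper names perfevtsel_encode, write_msr and program_llc_misses_pmc0 are made up for this example. Since the MSRs are per logical CPU, the measuring thread would also need to be pinned to the CPU that gets programmed.

```c
#define _XOPEN_SOURCE 700
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>

/* MSR addresses from the Intel SDM (architectural performance monitoring). */
#define IA32_PERFEVTSEL0      0x186u
#define IA32_PERF_GLOBAL_CTRL 0x38Fu

/* Architectural "LLC Misses" event, used here as a stand-in for L3 misses. */
#define EVT_LLC_MISSES   0x2Eu
#define UMASK_LLC_MISSES 0x41u

/* Build an IA32_PERFEVTSELx value: event select in bits 0-7, unit mask in
 * bits 8-15, USR in bit 16, OS in bit 17, EN (enable) in bit 22. */
uint64_t perfevtsel_encode(uint8_t event, uint8_t umask, int usr, int os)
{
  uint64_t v = (uint64_t)event | ((uint64_t)umask << 8);
  if (usr) v |= 1ull << 16;   /* count while in user mode   */
  if (os)  v |= 1ull << 17;   /* count while in kernel mode */
  v |= 1ull << 22;            /* enable the counter         */
  return v;
}

/* Hypothetical helper: write one MSR on logical CPU `cpu` through the Linux
 * msr driver. Returns 0 on success, -1 on any failure (no module, no root). */
int write_msr(int cpu, uint32_t msr, uint64_t value)
{
  char path[64];
  snprintf(path, sizeof path, "/dev/cpu/%d/msr", cpu);
  int fd = open(path, O_WRONLY);
  if (fd < 0)
    return -1;
  ssize_t n = pwrite(fd, &value, sizeof value, (off_t)msr);
  close(fd);
  return n == (ssize_t)sizeof value ? 0 : -1;
}

/* Program general-purpose counter 0 on `cpu` to count LLC misses. */
int program_llc_misses_pmc0(int cpu)
{
  uint64_t sel = perfevtsel_encode(EVT_LLC_MISSES, UMASK_LLC_MISSES, 1, 1);
  if (write_msr(cpu, IA32_PERFEVTSEL0, sel) != 0)
    return -1;
  /* Bit 0 of IA32_PERF_GLOBAL_CTRL enables general-purpose counter 0.
   * This sketch overwrites the other enable bits; a careful version
   * would read the MSR first and only set the bit it needs. */
  return write_msr(cpu, IA32_PERF_GLOBAL_CTRL, 1ull);
}
```

Calling program_llc_misses_pmc0(0) would set up counter 0 on logical CPU 0, and it returns -1 if the MSR device cannot be opened or written. L1 and L2 misses need model-specific events, so their event select and unit mask values have to be looked up for the exact processor.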

The counter is then read at this location:

https://github.com/opcm/pcm/blob/c21fbce6af8fb2435d390a56c7db75191d1df34f/cpucounters.cpp#L2991
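For the rdpmc route from the question, reading a programmed counter could look roughly like the C sketch below. It is not taken from PCM. It assumes that general-purpose counter 0 has already been programmed and enabled, that user-mode rdpmc is permitted (CR4.PCE set, which on Linux is usually controlled through the rdpmc sysfs attribute of the cpu PMU), and that the counter is 48 bits wide, which is typical but should be checked with CPUID leaf 0AH. The names rdpmc_gp and pmc_delta are made up for this example.

```c
#include <stdint.h>

#if defined(__x86_64__) || defined(__i386__)
/* Read general-purpose counter `index`: ECX holds the index with bit 30
 * clear. Setting bit 30 of ECX selects a fixed-function counter instead,
 * so (1 << 30) reads fixed counter 0, not a cache-miss event. */
static inline uint64_t rdpmc_gp(uint32_t index)
{
  uint32_t lo, hi;
  __asm__ volatile("rdpmc" : "=a"(lo), "=d"(hi) : "c"(index));
  return ((uint64_t)hi << 32) | lo;
}
#endif

/* Difference between two readings of a counter that is `width` bits wide.
 * Masking to the counter width keeps the result correct if the counter
 * wrapped around once between the two readings. */
uint64_t pmc_delta(uint64_t start, uint64_t stop, unsigned width)
{
  uint64_t mask = (width >= 64) ? ~0ull : ((1ull << width) - 1);
  return (stop - start) & mask;
}
```

A measurement would then read start = rdpmc_gp(0) before the code of interest, stop = rdpmc_gp(0) after it, and take pmc_delta(start, stop, 48). If user-mode rdpmc is not permitted, the instruction faults and the process is killed.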

If you are not interested in the details and just want the number of cache misses, consider using the PCM library as is and calling its high-level functions. An example of calling the library can be found here:

https://software.intel.com/en-us/articles/intel-performance-counter-monitor#calling_pcm

Kind regards

Thomas

 

Shailja_P_

Thank you, Thomas, for the reply. It was really helpful.
