Intel® C++ Compiler
Community support and assistance for creating C++ code that runs on platforms based on Intel® processors.

Measuring LLC cache hit count using rdmsr and wrmsr

Jithin_Parayil_T_

Hi,

I've been trying to figure out an efficient way of instrumenting certain specific array accesses in my program. For each of those array accesses, I need to determine whether it resulted in a request to the LLC and if so, whether it hit in the LLC or missed.

Example:

Consider the following program:

int main(int argc, char** argv) {
  // array is int* A

  for (int i = 0; i < N; i++) {
    sum += A[i]; // Need to instrument each such access separately
  }
}

In this regard, I've been trying to utilize the MSRs. My initial approach was to read/write to /dev/cpu/*/msr to configure the MSRs and get the counts.

However, this method does not seem to have the granularity required to instrument single memory accesses - because the code to read and write from /dev/cpu/*/msr itself seems to generate cache misses.

So, my current approach is to use the rdmsr and wrmsr instructions - by embedding them as asm code within my program. However, I'm hitting an error at the wrmsr instruction and I'm not sure how to debug it. Note that there is no error message - the program just halts at the point of wrmsr.

I'm running the code on a Nehalem processor (Intel(R) Xeon(R) CPU X5550).

So, following are the questions I have:

(i) Is my approach of using rdmsr and wrmsr to determine if a given array access was an LLC hit/miss a valid one?

(ii) Could you post some sample asm using rdmsr and wrmsr that I could use as a template? Maybe I'm not calling them the right way. This is my sample code:

hi=0; lo=0xb;

asm volatile("wrmsr"::"c"(0x38d),"a"(lo),"d"(hi)); // The program just halts at this point

 

Thanks in advance,

Jithin

 

Bernard
Valued Contributor I

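You cannot use privileged instructions like rdmsr and wrmsr from user-mode code (ring 3). Executed outside ring 0 they raise a general-protection fault (#GP), which is most likely why your program stops at wrmsr.

You must write a kernel driver to read MSRs.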

http://wiki.osdev.org/Model_Specific_Registers

http://faydoc.tripod.com/cpu/rdmsr.htm
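
On Linux, the msr kernel module already acts as such a driver: it exposes each logical CPU's MSRs as /dev/cpu/N/msr, and an 8-byte read at file offset R returns the value of MSR R (root privileges are normally required). Below is a minimal sketch of a read through that device. The helper names msr_path and read_msr are made up for illustration and are not part of any library.

```c
#define _XOPEN_SOURCE 700   /* for pread() */
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>

/* Build the device path for one logical CPU, e.g. "/dev/cpu/3/msr".
   Returns 0 on success, -1 if the buffer is too small.
   (Hypothetical helper, for illustration only.) */
static int msr_path(char *buf, size_t len, int cpu) {
    int n = snprintf(buf, len, "/dev/cpu/%d/msr", cpu);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Read one 64-bit MSR through the msr driver; the MSR number is the
   file offset. Returns 0 on success, -1 on any error.
   (Hypothetical helper, for illustration only.) */
static int read_msr(int cpu, uint32_t reg, uint64_t *value) {
    char path[64];
    if (msr_path(path, sizeof path, cpu) != 0)
        return -1;
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;  /* msr module not loaded, no such CPU, or no permission */
    ssize_t got = pread(fd, value, sizeof *value, (off_t)reg);
    close(fd);
    return got == (ssize_t)sizeof *value ? 0 : -1;
}
```

A call such as read_msr(0, 0x38d, &v) would read the register used in the question on CPU 0. Each call goes through open, pread and close in the kernel, so it is far too heavy to bracket a single array access.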

Bernard
Valued Contributor I

@Jithin

Do you use Linux or Windows?

Jithin_Parayil_T_

I'm using Linux.

Going back to my original question: Is it a valid approach to use rdmsr and wrmsr to determine if a given array access was an LLC hit/miss?

Also, am I correct in concluding /dev/cpu/*/msr cannot be used to instrument cache usage at the granularity of individual array accesses?

Thanks.

Bernard
Valued Contributor I

>>>Is it a valid approach to use rdmsr and wrmsr to determine if a given array access was an LLC hit/miss?>>>

I think this is the approach taken by profiling tools such as VTune. Unfortunately, I cannot find the latency of the wrmsr and rdmsr instructions, and I suppose that latency should be taken into account. If it is longer, in CPU cycles, than the execution time of the load/store uops, a few LLC hits or misses may not be reflected in the value read, because rdmsr/wrmsr are decomposed into uops that are themselves at various stages of execution in the hardware. Precise reading of the counters also depends on the latency needed to update a specific counter when an event occurs.
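
For what it is worth, on Linux the counter programming can be left to the kernel through the perf_event_open system call instead of issuing wrmsr/rdmsr yourself. This is a different technique from the one asked about, and the sketch below counts LLC read misses over a whole loop, not per access. It assumes the kernel and CPU support the generic last-level-cache event and that the process may open it; otherwise the function returns -1. The function names are hypothetical.

```c
#define _GNU_SOURCE
#include <linux/perf_event.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Ask the kernel for a counter of last-level-cache read misses on the
   calling thread. Returns a file descriptor, or -1 if unavailable. */
static int open_llc_miss_counter(void) {
    struct perf_event_attr attr;
    memset(&attr, 0, sizeof attr);
    attr.size = sizeof attr;
    attr.type = PERF_TYPE_HW_CACHE;
    attr.config = PERF_COUNT_HW_CACHE_LL
                | (PERF_COUNT_HW_CACHE_OP_READ << 8)
                | (PERF_COUNT_HW_CACHE_RESULT_MISS << 16);
    attr.disabled = 1;        /* start stopped; enabled explicitly below */
    attr.exclude_kernel = 1;  /* count user-mode accesses only */
    attr.exclude_hv = 1;
    /* pid = 0, cpu = -1: this thread, on whichever CPU it runs */
    return (int)syscall(__NR_perf_event_open, &attr, 0, -1, -1, 0UL);
}

/* Sum a[0..n-1] and report how many LLC read misses were counted while
   the loop ran. Returns the miss count, or -1 if the counter could not
   be used; the sum is stored in *sum either way. */
static long long sum_with_llc_misses(const int *a, size_t n, long long *sum) {
    int fd = open_llc_miss_counter();
    if (fd >= 0) {
        ioctl(fd, PERF_EVENT_IOC_RESET, 0);
        ioctl(fd, PERF_EVENT_IOC_ENABLE, 0);
    }
    long long s = 0;
    for (size_t i = 0; i < n; i++)
        s += a[i];
    *sum = s;
    if (fd < 0)
        return -1;
    ioctl(fd, PERF_EVENT_IOC_DISABLE, 0);
    uint64_t misses = 0;
    ssize_t got = read(fd, &misses, sizeof misses);
    close(fd);
    return got == (ssize_t)sizeof misses ? (long long)misses : -1;
}
```

The enable and disable calls are system calls and disturb the cache themselves, so the count is only meaningful over a region much larger than one access. That is the same overhead concern as with the MSR instructions.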

>>>Also, am I correct in concluding /dev/cpu/*/msr cannot be used to instrument cache usage at the granularity of individual array accesses?>>>

Sorry I do not know.
