Software Tuning, Performance Optimization & Platform Monitoring
Discussion regarding monitoring and software tuning methodologies, Performance Monitoring Unit (PMU) of Intel microprocessors, and platform updating.

Confusion about RDPMC, RDMSR and addresses

Silvia
Beginner
3,447 Views

Hi,

I am confused about addresses for reading performance counters (Linux). Sorry if the questions are too basic, I'm fairly new to this! My CPU is an Intel Core i7-7500U (Kaby Lake). I am running it with hyper-threading disabled and only one core enabled.

Program A, using RDMSR: I can read counters via a simple kernel module (written in C). Using inline assembly, I load ECX with address 30AH (this is IA32_FIXED_CTR1, which stores CPU_CLK_UNHALTED.CORE), execute RDMSR, and then read EDX:EAX. This works.

Program B, using RDPMC: I want to read this counter as user, so I have set up the corresponding flag to allow for this by setting CR4.PCE = 1 (I do this indirectly by changing the flag created for perf events at /sys/devices/cpu/rdpmc to 2). After going through the forums I have found that the address for IA32_FIXED_CTR1 seems to be (1<<30) + 1, or 40000001H. This works. And here is where my confusion begins:

If I use RDPMC in program A and I try to read address 30AH, I get a general protection fault. If I try to read the address (1<<30)+1 with RDMSR in program A (kernel level) things go wrong too. Using RDPMC at user level with address 30AH doesn't work either. 

As I understand it, the RDPMC and RDMSR instructions are equivalent, the only difference being their accessibility at user level. Section 2.1, "Architectural MSRs", of the Intel manual, vol. 4 (page 2-3) reads "MSR address range between 40000000H - 400000FFH is marked as a specially reserved range". In the table that follows (2-2), the address to access cycles is marked as 30AH. I haven't found anything in the manual about using addresses 40000000H to 40000002H to access the fixed counters. My questions:

1) Why the different addresses and why do they work for one instruction and not the other?

2) Where can I find that I should use address 40000000H to read the fixed counter?

I suppose from these answers it will become clear why I cannot read the address 30AH with RDPMC at user level?

Many thanks!

Silvia.

0 Kudos
1 Solution
McCalpinJohn
Honored Contributor III
3,447 Views

RDMSR and RDPMC interpret their input argument completely differently.

  • RDMSR interprets its input argument as an MSR number.
  • RDPMC interprets its input argument as a performance counter number.

Performance counter numbers 0,1,2,3 are programmed using MSRs 0x186, 0x187, 0x188, 0x189, and their counts are available from MSRs 0xc1, 0xc2, 0xc3, 0xc4.

So reading the count for performance counter 0 using RDMSR would require an argument of 0xc1, while reading the count for performance counter 0 using RDPMC would use an argument of 0x0.
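
To illustrate the RDMSR side of this, here is a minimal sketch that reads an MSR by number from user space. It assumes the Linux `msr` driver is loaded and that the program runs with sufficient privileges; the original poster used a kernel module instead, so this is an alternative route, not their method. The helper names `read_msr_fd` and `read_msr` are hypothetical.

```c
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>

/* MSR number from the post above: count of programmable counter 0. */
#define MSR_IA32_PMC0 0xC1u

/* Read one 64-bit MSR through an already-open msr device file.
 * The Linux msr driver interprets the file offset as the MSR number,
 * which is the same number RDMSR would take in ECX.
 * Returns 0 on success, -1 on failure. */
static inline int read_msr_fd(int fd, uint32_t msr, uint64_t *value)
{
    ssize_t n = pread(fd, value, sizeof *value, (off_t)msr);
    return n == (ssize_t)sizeof *value ? 0 : -1;
}

/* Convenience wrapper (hypothetical): open /dev/cpu/<cpu>/msr, read, close.
 * Assumes the msr kernel module is loaded and the caller is privileged. */
int read_msr(int cpu, uint32_t msr, uint64_t *value)
{
    char path[64];
    snprintf(path, sizeof path, "/dev/cpu/%d/msr", cpu);

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    int rc = read_msr_fd(fd, msr, value);
    close(fd);
    return rc;
}
```

With this sketch, `read_msr(0, MSR_IA32_PMC0, &count)` would use the MSR number 0xc1. Passing a performance counter number such as 0x0 here would address MSR 0, not counter 0.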

The "fixed-function" performance counters can be read at MSRs 0x309, 0x30a, 0x30b.  The RDPMC instruction allows access to these counters using the special performance counter numbers of 0x40000000, 0x40000001, 0x40000002.
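
The two numbering schemes can be sketched as follows. This is an illustration only: the helper names are hypothetical, the range of four programmable counters simply mirrors the numbers listed above (the actual count depends on the processor), and the `rdpmc` wrapper assumes GCC or Clang on x86. Executing RDPMC at user level raises a general protection fault unless CR4.PCE is set.

```c
#include <assert.h>
#include <stdint.h>

/* MSR numbers from the post above (arguments for RDMSR). */
#define MSR_PMC0        0xC1u   /* count of programmable counter 0 */
#define MSR_FIXED_CTR0  0x309u  /* count of fixed-function counter 0 */

/* For RDPMC, bit 30 of ECX selects the fixed-function counters. */
#define RDPMC_FIXED_FLAG (1u << 30)

/* RDPMC index of programmable counter n (0..3, as listed above). */
static inline uint32_t rdpmc_index_programmable(uint32_t n)
{
    assert(n < 4);
    return n;
}

/* RDPMC index of fixed-function counter n (0..2). */
static inline uint32_t rdpmc_index_fixed(uint32_t n)
{
    assert(n < 3);
    return RDPMC_FIXED_FLAG | n;
}

/* Hypothetical helper: translate a counter's MSR number into the
 * index RDPMC expects. Returns -1 for MSRs that are not counters. */
static inline int64_t msr_to_rdpmc_index(uint32_t msr)
{
    if (msr >= MSR_PMC0 && msr < MSR_PMC0 + 4)
        return (int64_t)(msr - MSR_PMC0);
    if (msr >= MSR_FIXED_CTR0 && msr < MSR_FIXED_CTR0 + 3)
        return (int64_t)(RDPMC_FIXED_FLAG | (msr - MSR_FIXED_CTR0));
    return -1;
}

#if defined(__x86_64__) || defined(__i386__)
/* Execute RDPMC: counter index in ECX, result in EDX:EAX.
 * Faults at user level unless CR4.PCE = 1. */
static inline uint64_t rdpmc(uint32_t index)
{
    uint32_t lo, hi;
    __asm__ volatile("rdpmc" : "=a"(lo), "=d"(hi) : "c"(index));
    return ((uint64_t)hi << 32) | lo;
}
#endif
```

Under these assumptions, the counter that RDMSR reads at 0x30a corresponds to `rdpmc(rdpmc_index_fixed(1))`, that is, index 0x40000001. Passing 0x30a to RDPMC is not a valid counter number, which is consistent with the fault described in the question.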

Silvia
Beginner
3,447 Views

That's great, thanks a lot John. That answers it. 

0 Kudos
Silvia
Beginner
3,447 Views

Quick question to add: any idea of where this appears in the manuals by any chance?

I have searched a lot for it but couldn't find anything about the addresses of the fixed counters with RDPMC (I looked before asking the question and again after your reply, but found nothing). I am thinking it might be implied somewhere else and that I am missing it due to my lack of knowledge. As I read more and more of the manuals, much of the information I couldn't find initially turns up later, often after learning about seemingly unrelated topics. So I am wondering if this could be under a different topic/chapter that a novice would not necessarily relate to counters? Thanks again.

0 Kudos
Matthias_H_Intel
Employee
3,447 Views
0 Kudos
McCalpinJohn
Honored Contributor III
3,447 Views

This material is covered in a surprising variety of locations:

  • Volume 2 of the Intel Architectures Software Developer's Manual (document 325383-068)
    • This is the "Instruction Set Reference"
    • Important information is contained in the instruction descriptions for
      • CPUID
      • RDMSR
      • RDPMC
      • RDTSC
      • RDTSCP
      • WRMSR
  • Volume 3 of the Intel Architectures Software Developer's Manual (document 325384-068)
    • This is the "System Programming Guide"
    • Chapter 18 "Performance Monitoring"
    • Chapter 19 "Performance Monitoring Events"
  • Volume 4 of the Intel Architectures Software Developer's Manual (document 335592-068)
    • This contains lists of the "Model-Specific Registers" for the various processor models.
    • Volume 4 was split out from Volume 3 relatively recently, and the last time I checked there were still a fair number of broken references in Volume 4.  If you see a reference to a chapter number in Volume 4, it probably refers to that chapter in Volume 3.
  • The most up-to-date and comprehensive lists of events for the various processors seem to be at

Section 18.2 on "Architectural Performance Monitoring" introduces the concepts and capabilities, and is definitely the right place to start, but there are some details that are only listed in the instruction descriptions in Volume 2.
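
As one example of what the CPUID description contributes, the number and width of the counters on a given processor can be queried from CPUID leaf 0AH (architectural performance monitoring). The sketch below is a hedged illustration, not taken from the manuals verbatim: the struct and function names are hypothetical, and `__get_cpuid` is assumed to come from the `<cpuid.h>` header shipped with GCC and Clang.

```c
#include <stdint.h>
#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>
#endif

/* Hypothetical container for the fields of CPUID leaf 0AH used here. */
struct perfmon_info {
    unsigned version;         /* EAX[7:0]:   architectural perfmon version */
    unsigned gp_counters;     /* EAX[15:8]:  programmable counters per logical CPU */
    unsigned gp_width;        /* EAX[23:16]: bit width of programmable counters */
    unsigned fixed_counters;  /* EDX[4:0]:   fixed-function counters (version > 1) */
    unsigned fixed_width;     /* EDX[12:5]:  bit width of fixed counters (version > 1) */
};

/* Decode the EAX and EDX values returned by CPUID leaf 0AH. */
static inline struct perfmon_info decode_cpuid_0a(uint32_t eax, uint32_t edx)
{
    struct perfmon_info info;
    info.version        = eax & 0xFFu;
    info.gp_counters    = (eax >> 8) & 0xFFu;
    info.gp_width       = (eax >> 16) & 0xFFu;
    info.fixed_counters = edx & 0x1Fu;
    info.fixed_width    = (edx >> 5) & 0xFFu;
    return info;
}

/* Query the running processor. Returns 0 on success, -1 if leaf 0AH
 * is not available or the build target is not x86. */
int query_perfmon(struct perfmon_info *out)
{
#if defined(__x86_64__) || defined(__i386__)
    unsigned eax, ebx, ecx, edx;
    if (!__get_cpuid(0x0A, &eax, &ebx, &ecx, &edx))
        return -1;
    *out = decode_cpuid_0a(eax, edx);
    return 0;
#else
    (void)out;
    return -1;
#endif
}
```

The reported counts bound the valid RDPMC counter numbers: programmable counters use 0 up to `gp_counters - 1`, and fixed-function counters use 0x40000000 plus 0 up to `fixed_counters - 1`.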

[Comments/Complaints:

  • One frustration is that the text is very inconsistent about providing the MSR numbers for the MSRs being discussed.  Sometimes they are included, but more often you need to look up the MSR name in Volume 4. 
    • This can be challenging because PDF searches don't typically work for words that are split across lines (which is very common in tables), and because many MSRs are referred to using variations on the name, e.g., starting the name with either "IA32_" or "MSR_" in different places. 
    • When doing PDF searches it is often challenging to be sure that you are finding results that are relevant to the processor that you are interested in. I often keep two copies of the document open -- one so I can see the table of contents and the section titles, and another with the search results.
    • Because of the challenges with searching PDFs, I often look up the MSR names in the C header files that I created for my performance-monitoring software (e.g., the "MSR*.h" files in https://github.com/jdmccalpin/periodic-performance-counters).
  • To add to the fun, sometimes an MSR will change names between processor generations (same number, same functionality, different name), and sometimes an MSR will change functionality between processor generations (same number, same name, different functionality).  If I recall correctly, there was at least one time when an MSR changed numbers between generations (same name, same functionality, different number).

\Comments/Complaints]

0 Kudos
Silvia
Beginner
3,447 Views

@Matthias: thanks for the link, I will check it out.

@J. McCalpin: That's great, thank you very much for taking the time to list the resources, your help is much appreciated. I think there is a steep learning curve associated with the reading of the manuals themselves!

0 Kudos