I am using IA32_FIXED_CTR1 to read unhalted core cycles. CPUID leaf 0x0a tells me that the fixed counters are 48 bits wide. I run rdpmc with ecx = 0x40000001 to select this fixed counter, and then take eax and the low 16 bits of edx to get the cycle count.
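For reference, the read described above could look roughly like the sketch below. The function names are made up for illustration, the inline assembly assumes GCC or Clang on x86, and the rdpmc call will fault unless the OS permits rdpmc in user mode.

```cpp
#include <cassert>
#include <cstdint>

// RDPMC selector: bit 30 selects the fixed-function counters and the low
// bits give the counter index, so index 1 yields 0x40000001.
inline uint32_t fixed_counter_selector(uint32_t index) {
    return (1u << 30) | index;
}

// Combine EDX:EAX into one value and keep only the low `width` bits
// (48 here, as reported by CPUID leaf 0x0a).
inline uint64_t combine_counter(uint32_t eax, uint32_t edx, unsigned width = 48) {
    assert(width >= 1 && width <= 64);
    const uint64_t raw = (static_cast<uint64_t>(edx) << 32) | eax;
    return width == 64 ? raw : raw & ((1ULL << width) - 1);
}

#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
// Hypothetical helper: reads fixed counter `index` with RDPMC.
inline uint64_t read_fixed_counter(uint32_t index, unsigned width = 48) {
    uint32_t eax, edx;
    __asm__ volatile("rdpmc"
                     : "=a"(eax), "=d"(edx)
                     : "c"(fixed_counter_selector(index)));
    return combine_counter(eax, edx, width);
}
#endif
```

Masking the combined value to the counter width has the same effect as keeping only the low 16 bits of edx by hand.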
However, just after system startup the value of edx is 65535, or 1111111111111111b. This seems incorrect; shouldn't it take a long time to count up to that value if it starts at zero? Additionally, multiple invocations of the program that reads the cycles (C++ source file attached) show that the lower bits of the count cycle in a matter of seconds - that is, running the program a second time produces a lower cycle count than the first. This shouldn't happen for a 48-bit counter if I am not mistaken.
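To put numbers on that expectation, here is a small sketch. It assumes the counter increments at roughly the X5650's nominal 2.66 GHz, which ignores Turbo and halted time, and the helper names are illustrative only.

```cpp
#include <cassert>
#include <cstdint>

// Seconds until a counter of `width` bits wraps when incremented at `hz`.
inline double seconds_to_wrap(unsigned width, double hz) {
    assert(width >= 1 && width <= 63 && hz > 0.0);
    return static_cast<double>(1ULL << width) / hz;
}

// Elapsed counts between two readings, taken modulo 2^width.
// This is only correct if the counter wrapped at most once in between
// and nothing else rewrote the counter.
inline uint64_t counter_delta(uint64_t start, uint64_t end, unsigned width = 48) {
    assert(width >= 1 && width <= 63);
    const uint64_t mask = (1ULL << width) - 1;
    return (end - start) & mask;
}
```

Under that assumption a free-running 48-bit counter needs about 105,800 seconds (roughly 29 hours) to wrap, so a wrap within seconds suggests the counter is being rewritten by something else.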
I appreciate any advice as to why this is happening.
I am using an Intel Xeon x5650 processor with Ubuntu 12.04 LTS.
A couple of things. You need to check that the fixed counters aren't being used by the OS.
Check whether /proc/sys/kernel/nmi_watchdog contains 1. If so, the OS is using the counters. You can disable the watchdog (as root) with
echo 0 > /proc/sys/kernel/nmi_watchdog
and check that it is now disabled by reading /proc/sys/kernel/nmi_watchdog again; it should contain 0.
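The watchdog check can also be done from inside the measurement program. The following is a minimal sketch; the function names are hypothetical, and it assumes the file holds a single integer as it does on typical Linux kernels.

```cpp
#include <fstream>
#include <sstream>
#include <string>

// Returns 1 if the text parses as a non-zero integer, 0 if it parses as
// zero, and -1 if it is not a number at all.
inline int parse_watchdog_flag(const std::string& text) {
    std::istringstream in(text);
    int value = 0;
    if (!(in >> value)) return -1;
    return value != 0 ? 1 : 0;
}

// Reads the NMI watchdog flag; returns -1 if the file cannot be read.
inline int read_watchdog_flag(
        const std::string& path = "/proc/sys/kernel/nmi_watchdog") {
    std::ifstream file(path);
    if (!file) return -1;
    std::stringstream contents;
    contents << file.rdbuf();
    return parse_watchdog_flag(contents.str());
}
```

A return value of 1 means the watchdog is active and may be reprogramming a counter underneath you.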
Next you need to enable the fixed counters and start the fixed counters. Do you know how to do that?
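As a rough sketch of what enabling and starting involves: each fixed counter has a 4-bit field in IA32_FIXED_CTR_CTRL (MSR 0x38D) and an enable bit in IA32_PERF_GLOBAL_CTRL (MSR 0x38F), starting at bit 32. The code below only computes the values. The helper names are illustrative, it assumes three fixed counters, and actually writing the MSRs needs ring 0 or a tool such as wrmsr from msr-tools.

```cpp
#include <cassert>
#include <cstdint>

constexpr uint32_t MSR_FIXED_CTR_CTRL   = 0x38D;  // IA32_FIXED_CTR_CTRL
constexpr uint32_t MSR_PERF_GLOBAL_CTRL = 0x38F;  // IA32_PERF_GLOBAL_CTRL

// Per-counter flags inside the 4-bit control field.
enum : uint64_t {
    EN_OS      = 1,  // count while in ring 0
    EN_USR     = 2,  // count while in rings 1-3
    ANY_THREAD = 4,  // count for both hardware threads of the core
    PMI        = 8   // raise an interrupt on overflow
};

// Control field for fixed counter `counter`, shifted into position.
inline uint64_t fixed_ctr_ctrl_field(unsigned counter, uint64_t flags) {
    assert(counter < 3 && flags < 16);
    return flags << (4 * counter);
}

// Global enable bit for fixed counter `counter`.
inline uint64_t global_ctrl_fixed_bit(unsigned counter) {
    assert(counter < 3);
    return 1ULL << (32 + counter);
}
```

For example, counting user and kernel cycles on all three fixed counters would mean writing 0x333 to MSR 0x38D and setting bits 32 to 34 in MSR 0x38F, while preserving the other bits of 0x38F that the programmable counters use.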
Then, you need to check whether you are actually able to execute the rdpmc instruction. Most OSes don't allow rdpmc in ring 3. But it seems like you can (since your program didn't crash).
Then (not being one to trust counters too much) I'd pin my thread to one CPU, use fixed_ctr2 (the reference clockticks), read the TSC, read the current counter value, run a busy loop (not sleeping) for 10 seconds, then read the counter and the TSC again and take the difference for each. The TSC difference and the counter difference should be about the same. This will verify that you have the counters working properly.
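The check above could be sketched as follows. This is only an outline under several assumptions: Linux, GCC or Clang on x86-64, rdpmc permitted in user mode, and fixed counter 2 already enabled. The names and the 1% tolerance are arbitrary choices for illustration.

```cpp
#include <cmath>
#include <cstdint>
#if defined(__linux__) && defined(__GNUC__) && defined(__x86_64__)
#include <chrono>
#include <sched.h>
#include <x86intrin.h>
#endif

struct Sample { uint64_t tsc; uint64_t counter; };

// Counter ticks per TSC tick between two samples (48-bit wrap-safe).
inline double counter_per_tsc(const Sample& a, const Sample& b) {
    const uint64_t mask = (1ULL << 48) - 1;
    const uint64_t dtsc = b.tsc - a.tsc;
    const uint64_t dctr = (b.counter - a.counter) & mask;
    return dtsc == 0 ? 0.0 : static_cast<double>(dctr) / static_cast<double>(dtsc);
}

// True if the two deltas match to within `tol` (relative).
inline bool counters_agree(const Sample& a, const Sample& b, double tol = 0.01) {
    return std::fabs(counter_per_tsc(a, b) - 1.0) <= tol;
}

#if defined(__linux__) && defined(__GNUC__) && defined(__x86_64__)
// Pin the calling thread to one CPU so both reads come from the same core.
inline bool pin_to_cpu(int cpu) {
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    return sched_setaffinity(0, sizeof(set), &set) == 0;
}

// Read the TSC and IA32_FIXED_CTR2 (selector 0x40000002) back to back.
inline Sample take_sample() {
    uint32_t eax, edx;
    const uint64_t tsc = __rdtsc();
    __asm__ volatile("rdpmc" : "=a"(eax), "=d"(edx) : "c"(0x40000002u));
    return Sample{tsc, (static_cast<uint64_t>(edx) << 32) | eax};
}

// Busy-loop for `seconds` on `cpu` and compare the two deltas.
inline bool validate_counter(int cpu, int seconds) {
    if (!pin_to_cpu(cpu)) return false;
    const Sample start = take_sample();
    const auto deadline =
        std::chrono::steady_clock::now() + std::chrono::seconds(seconds);
    while (std::chrono::steady_clock::now() < deadline) { /* spin, do not sleep */ }
    return counters_agree(start, take_sample());
}
#endif
```

Calling `validate_counter(0, 10)` would run the 10-second test on CPU 0. A ratio far from 1 suggests the counter is stopped, halted part of the time, or being rewritten by something else.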
If you have msr-tools-1.2 installed, you might want to check the programming of the fixed-function counters -- MSR 38DH controls the fixed-function counters. Bits 3, 7, and 11 control whether performance monitor interrupts are generated on the overflow of the counters (instructions retired, CPU cycles not halted, and reference cycles not halted, respectively). If any of these bits are set, that very strongly implies that some other agent (such as the NMI watchdog) is using that counter and resetting its value to 2^48 minus the desired interrupt interval.
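To make the bit positions and the "2^48 minus the interval" arithmetic concrete, here is a small sketch. The helper names are made up, and the interval in the usage note is a hypothetical example rather than a value taken from any particular kernel.

```cpp
#include <cassert>
#include <cstdint>

// True if the overflow-interrupt (PMI) bit is set for fixed counter
// `counter` in a value read from MSR 0x38D: bit 3, 7, or 11.
inline bool pmi_enabled(uint64_t fixed_ctr_ctrl, unsigned counter) {
    assert(counter < 3);
    return ((fixed_ctr_ctrl >> (4 * counter + 3)) & 1ULL) != 0;
}

// Value a counter of `width` bits must be preloaded with so that it
// overflows after `interval` further counts.
inline uint64_t preload_for_interval(uint64_t interval, unsigned width = 48) {
    assert(width >= 1 && width <= 63);
    assert(interval >= 1 && interval <= (1ULL << width));
    const uint64_t mask = (1ULL << width) - 1;
    return ((1ULL << width) - interval) & mask;
}

// Upper half of the value, i.e. what RDPMC would return in EDX.
inline uint32_t edx_part(uint64_t value) {
    return static_cast<uint32_t>(value >> 32);
}
```

For any interval below 2^32 counts, `edx_part(preload_for_interval(interval))` is 0xFFFF, i.e. 65535, which matches the edx value reported in the question. With msr-tools, the control register can be read with something like `rdmsr -p 0 0x38d`, assuming the msr kernel module is loaded.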
I don't know why, but on some of our systems the NMI watchdog uses the fixed-function counter and on some systems it uses one of the programmable counters. It is something of a pain in either case, but if the watchdog uses the fixed-function counter, then the corresponding programmable performance counter is free.
An alternative set of counters available through the MSR interface is at 0xE7 (IA32_MPERF) and 0xE8 (IA32_APERF). These measure the same "Cycles Not Halted" as fixed-function counters 0x30B (IA32_FIXED_CTR2) and 0x30A (IA32_FIXED_CTR1), respectively, except that the counters roll over differently. (If I am reading the description in V3 of the SWDM correctly, IA32_MPERF is reset when IA32_APERF overflows, and vice versa.)
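A minimal sketch of reading these two MSRs from user space on Linux follows. It assumes the msr kernel module is loaded, the process has permission to open /dev/cpu/N/msr (normally root), and the function names are illustrative only.

```cpp
#include <cstdint>
#include <fcntl.h>
#include <string>
#include <unistd.h>

constexpr uint32_t MSR_IA32_MPERF = 0xE7;
constexpr uint32_t MSR_IA32_APERF = 0xE8;

// Read one 64-bit MSR through an msr device file; the file offset is the
// MSR address. Returns false if the file cannot be opened or read.
inline bool read_msr_file(const std::string& path, uint32_t reg, uint64_t& value) {
    const int fd = open(path.c_str(), O_RDONLY);
    if (fd < 0) return false;
    const ssize_t n = pread(fd, &value, sizeof(value), static_cast<off_t>(reg));
    close(fd);
    return n == static_cast<ssize_t>(sizeof(value));
}

// Convenience wrapper for the usual Linux device path of a given CPU.
inline bool read_msr(int cpu, uint32_t reg, uint64_t& value) {
    return read_msr_file("/dev/cpu/" + std::to_string(cpu) + "/msr", reg, value);
}

// Ratio of actual to reference unhalted cycles over an interval,
// given the two deltas; returns 0 if no reference cycles elapsed.
inline double aperf_mperf_ratio(uint64_t d_aperf, uint64_t d_mperf) {
    return d_mperf == 0 ? 0.0
                        : static_cast<double>(d_aperf) / static_cast<double>(d_mperf);
}
```

Reading both MSRs before and after the code of interest and taking the differences gives the same kind of cycle counts as the fixed-function counters, without depending on how those counters are programmed.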