Intel® ISA Extensions

How to accurately measure CPU frequency in user-mode code?

levicki
Valued Contributor I
1,608 Views
With all the performance counters and MSRs available in Intel CPUs, it still seems impossible to accurately measure CPU frequency in user-mode code:
[bash]
#include <windows.h>
#include <mmsystem.h>
#include <stdio.h>

#pragma comment(lib, "winmm.lib")

typedef unsigned __int64 u64;

/* 32-bit MSVC only: RDTSC leaves the counter in EDX:EAX. */
__declspec(naked) u64 readtsc(void)
{
    __asm {
        rdtsc
        ret
    }
}

double GetCPUFrequency(void)
{
    /* static: the sums accumulate across calls (running average) */
    static u64 r0 = 0, f0 = 0;
    u64 r1 = 0, r2 = 0;
    DWORD t0, t1;
    int i;

    timeBeginPeriod(1);
    for (i = 0; i < 3; i++) {
        /* spin for 20 ms, then time a 40 ms window */
        t0 = timeGetTime();
        t1 = t0;
        while (t1 - t0 < 20) {
            t1 = timeGetTime();
            r1 = readtsc();
        }
        t0 = t1;
        while (t1 - t0 < 40) {
            t1 = timeGetTime();
            r2 = readtsc();
        }
        r0 += r2 - r1;
        f0 += (t1 - t0);
    }
    timeEndPeriod(1);

    return (r0 / f0) / 1000.0; /* ticks per ms -> MHz */
}

int main(int argc, char* argv[])
{
    for (;;) {
        printf("Frequency : %.2f MHz\r", GetCPUFrequency());
        Sleep(1000);
    }
    return 0;
}
[/bash]

The above code returns 3400 MHz for a Core i7 2600K overclocked to 4000 MHz. Tools such as CPU-Z and AIDA64 can measure the frequency accurately, but they use device drivers to execute ring 0 code, which has access to MSRs.

My question is why Intel CPU engineers did not provide user-mode instructions (kind of like RDTSC/RDTSCP) for APERF and MPERF MSRs, but instead left those accessible only from ring 0? What were they thinking?

Finally, is there any way to work around this?
0 Kudos
19 Replies
Bernard
Valued Contributor I
1,608 Views
Maybe your measurements are inaccurate because of context switching and thread blocking induced by the operating system. Try raising the thread priority to above normal and setting the affinity mask to a preferred CPU.
0 Kudos
levicki
Valued Contributor I
1,608 Views
No, that is not the reason.

Measured frequency (3400 MHz) is the default CPU frequency (34x is default multiplier).

The CPU is overclocked to 4000 MHz by setting its turbo multiplier to 40x in the BIOS.

What I measure is probably affected by power saving (SpeedStep, etc).
0 Kudos
Bernard
Valued Contributor I
1,608 Views
Can you write a simple driver that contains only a driver entry function and an add-device function, and accesses the MSR registers with inline assembly? AFAIK you can pass the MSR value to user-mode code via an IRP.
0 Kudos
levicki
Valued Contributor I
1,608 Views
Yes I can, but the driver needs to be digitally signed to load on Windows 7 x64 and I do not have a certificate, because I can't buy a cheap one in Serbia and those I can buy are too expensive.

Yes, I can use open-source signed driver too, but that is not a solution to the main problem -- inability to find out current CPU frequency in user mode. That is a problem plaguing Windows and Linux user mode applications and nothing smart has been done so far to resolve it.

0 Kudos
Bernard
Valued Contributor I
1,608 Views
PatchGuard is blocking unsigned driver installation on 64-bit machines. AFAIK this protection has been broken and there is an option to disable PatchGuard. Read the uninformed.org site; they have info on disabling PatchGuard. Also try the software Driver Signature Enforcement Overrider; it seems that this tool will help you install unsigned drivers.
0 Kudos
levicki
Valued Contributor I
1,608 Views
That is cool, but I can do that only on my computer, not customer's :)
0 Kudos
Bernard
Valued Contributor I
1,608 Views
Can you install "Driver Signature Enforcement Overrider" on customer's machine?
0 Kudos
jimdempseyatthecove
Honored Contributor III
1,608 Views
Igor,

Have you verified that timeGetTime is actually returning milliseconds since boot on an overclocked system?
(IOW is it assuming non-overclocked system)

Jim Dempsey
0 Kudos
levicki
Valued Contributor I
1,608 Views
Jim, that is not relevant because I do not use absolute time, just the period, and AFAIK timeGetTime() is returning monotonically incrementing time.

The only way I know to get frequency is to read APIC and current multiplier which both require kernel mode code.

In my opinion Intel should have created a way for user programs to get this information. Perhaps it is time to ask for a CPUFREQ instruction to be added to the x86 ISA.
0 Kudos
Bernard
Valued Contributor I
1,608 Views
Or we can patch microcode and set rdmsr CPL to 3. :)
0 Kudos
levicki
Valued Contributor I
1,608 Views
That still wouldn't solve the problem of reading BCLK.
0 Kudos
Bernard
Valued Contributor I
1,608 Views
I think that the only solution for now is to write a simple driver, read the MSR and send the values to a userland app. Regarding Windows 7 64-bit, you can try to use the software I mentioned in my previous posts.
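
If a driver does hand the counters to user mode, the remaining calculation is small. The following is only a minimal sketch under stated assumptions: the driver is assumed to return IA32_APERF and IA32_MPERF sampled at two points in time, and the base (TSC) frequency is assumed to have been measured already, for example with the RDTSC loop from the first post. The function name and its parameters are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Estimate the average core frequency between two samples.
 * Assumption: MPERF advances at the fixed base frequency and APERF at
 * the actual core frequency, so the ratio of their deltas scales the
 * base clock. base_mhz is assumed to come from a TSC-based measurement.
 */
double effective_mhz(double base_mhz,
                     uint64_t aperf_start, uint64_t aperf_end,
                     uint64_t mperf_start, uint64_t mperf_end)
{
    /* Unsigned subtraction stays correct across a counter wrap. */
    uint64_t aperf_delta = aperf_end - aperf_start;
    uint64_t mperf_delta = mperf_end - mperf_start;

    assert(mperf_delta != 0); /* the core must have run between samples */
    return base_mhz * (double)aperf_delta / (double)mperf_delta;
}
```

With a measured base of 3400 MHz and a delta ratio of 40:34, this sketch would give roughly 4000 MHz, which matches the multipliers discussed earlier in the thread.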
0 Kudos
jimdempseyatthecove
Honored Contributor III
1,608 Views
>> that is not relevant because I do not use absolute time, just the period, and AFAIK timeGetTime() is returning monotonically incrementing time.

return (r0 / f0) / 1000.0;

Where:

r0 = 3-sum (delta rdtsc over 40-40+ ms interval as returned by timeGetTime())
f0 = 3-sum (delta timeGetTime() over 40-40+ ms interval)

In the event that timeGetTime() is returning counts at the (overclocked) accelerated rate, then the change in the ratio of r0 / f0 will not be observed.

You really want f0 = 120 ms of wall clock time.

Without seeing the code for timeGetTime(), it is not clear whether the ms is a "virtual ms" computed under the assumption of a fixed clock rate that is also not overclocked.

My i7 2600K system is not overclocked, but it would be a relatively easy test for you to make on your overclocked system to assert that 10,000 ms as reported by timeGetTime() == 10 seconds of wall clock time.
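
The arithmetic above can be sketched as follows. This is an illustration only: `mhz_from_sums` mirrors the formula from the original code (using floating-point division instead of integer division), and `reported_mhz` models the hypothetical case where the timer itself runs fast by some factor. Both function names and the `timer_scale` parameter are made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/* ticks / ms gives kHz; dividing by 1000 gives MHz. */
double mhz_from_sums(uint64_t tsc_ticks, uint64_t elapsed_ms)
{
    assert(elapsed_ms != 0);
    return ((double)tsc_ticks / (double)elapsed_ms) / 1000.0;
}

/*
 * Hypothetical scenario: the timer runs `timer_scale` times too fast,
 * so it reports more milliseconds than really elapsed.
 */
double reported_mhz(double true_mhz, double real_ms, double timer_scale)
{
    double ticks;
    double reported_ms;

    assert(real_ms > 0.0 && timer_scale > 0.0);
    ticks = true_mhz * 1000.0 * real_ms;  /* ticks counted in real time */
    reported_ms = real_ms * timer_scale;  /* what the fast timer claims */
    return (ticks / reported_ms) / 1000.0;
}
```

For instance, 408,000,000 ticks over 120 ms corresponds to 3400 MHz. If a 4000 MHz counter were divided by a timer running 40/34 times too fast, the result would also come out near 3400 MHz, which is the effect the 10-second wall clock test is meant to rule out.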

Jim Dempsey
0 Kudos
jimdempseyatthecove
Honored Contributor III
1,608 Views
By the way,

On Windows 7 x64 I looked at QueryPerformanceFrequency on my system and it returns 3,331,259 ticks/second. I thought that this would be the FSB frequency. On my older Q6600 XP x64 it was the FSB frequency. Since the system has Turbo Boost (or, as I prefer to call it, overheat protection slow-down), I cannot say if the FSB fluctuates with Turbo Boost, so the system may have a different means of coming up with a (somewhat) constant high-precision frequency.

You could use a ratio of your RDTSC vs QueryPerformanceFrequency without using a driver (at least on Windows).
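
A minimal sketch of that ratio, assuming the caller has sampled both counters at the start and end of the same interval. On Windows the inputs would presumably come from `__rdtsc()`, `QueryPerformanceCounter()` and `QueryPerformanceFrequency()`; the helper name below is hypothetical and the code only does the conversion.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Convert a TSC delta and a performance-counter delta, both taken over
 * the same interval, into a TSC frequency in MHz.
 * qpc_freq is the performance-counter rate in ticks per second.
 */
double tsc_mhz_from_qpc(uint64_t tsc_delta, int64_t qpc_delta, int64_t qpc_freq)
{
    double seconds;

    assert(qpc_delta > 0 && qpc_freq > 0);
    seconds = (double)qpc_delta / (double)qpc_freq; /* interval length */
    return (double)tsc_delta / seconds / 1e6;       /* Hz -> MHz */
}
```

With the 3,331,259 ticks/second figure quoted above, a delta of 3,331,259 performance-counter ticks is one second, so a TSC delta of 3,400,000,000 over that interval works out to 3400 MHz. This gives the rate of the TSC itself.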

Jim Dempsey
0 Kudos
levicki
Valued Contributor I
1,608 Views
Jim,

timeGetTime() is a Windows multimedia timer API, so my bet is that it is pretty fixed.

Turbo boost does not affect FSB, it affects multiplier.

There is no FSB in i7, there is BCLK and it is 100 MHz.

QueryPerformanceFrequency() most likely returns HPET timer ticks.

You are free to experiment with that code (and overclocking and power management). I'd be grateful if you can make it work :)
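
For reference, the relationship between BCLK and the multiplier mentioned above can be written out as a small sketch. The function name is hypothetical; the 100 MHz BCLK and the 34x and 40x multipliers are the values quoted in this thread.

```c
#include <assert.h>

/* Core clock in MHz = base clock (BCLK) in MHz x multiplier. */
double core_mhz(double bclk_mhz, unsigned multiplier)
{
    assert(bclk_mhz > 0.0);
    return bclk_mhz * (double)multiplier;
}
```

`core_mhz(100.0, 34)` gives the default 3400 MHz, and `core_mhz(100.0, 40)` gives the overclocked 4000 MHz, which is why reading the frequency needs both BCLK and the current multiplier.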
0 Kudos
andysem
New Contributor III
1,608 Views
I think public OS APIs are the best option for this. I didn't check, but on Windows you can try using WMI, and on Linux /proc/cpuinfo seems to report the current CPU frequency. You can also see if /sys/devices/system/cpu/cpu0/cpufreq/* suits you, although on my machine reading these files requires root privileges.
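
A rough sketch of the /proc/cpuinfo route, assuming the file contains lines of the form `cpu MHz : 3400.000` as it does on typical x86 Linux systems. The helper names are hypothetical, and whether the reported value reflects turbo or overclocked multipliers would need to be checked on the target machine.

```c
#include <assert.h>
#include <stdio.h>

/* Parse one /proc/cpuinfo line; returns 1 and stores MHz on a match. */
int parse_cpu_mhz_line(const char *line, double *mhz)
{
    assert(line != NULL && mhz != NULL);
    /* Whitespace in the format matches any run of spaces or tabs. */
    return sscanf(line, "cpu MHz : %lf", mhz) == 1;
}

/* Return the first "cpu MHz" value in the file, or -1.0 if none is found. */
double read_first_cpu_mhz(const char *path)
{
    char line[256];
    double mhz = -1.0;
    FILE *fp = fopen(path, "r");

    if (fp == NULL)
        return -1.0;
    while (fgets(line, sizeof line, fp) != NULL) {
        if (parse_cpu_mhz_line(line, &mhz))
            break;
    }
    fclose(fp);
    return mhz;
}
```

Calling `read_first_cpu_mhz("/proc/cpuinfo")` returns the value for the first listed core, or -1.0 when the file is missing or has no such line.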
0 Kudos
Bernard
Valued Contributor I
1,608 Views

Igor!
As andysem wrote in his post, you can use WMI to obtain accurate info about the CPU frequency.

0 Kudos
levicki
Valued Contributor I
1,608 Views
I just tested this but WMI also reports incorrect frequency -- it is not aware of Turbo Boost multiplier set to 40x and it is showing 3400 MHz instead of 4000 MHz.
0 Kudos
maratyszcza
Beginner
1,608 Views
New CPUs have a "constant timestamp counter frequency" feature. This means that the timer queried by the rdtsc instruction doesn't change its frequency when CPU cores are overclocked or downclocked by Turbo Boost. It also means that you cannot detect the current CPU frequency by comparing rdtsc progress to HPET progress.
0 Kudos