- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Link Copied
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
17.12.1 Invariant TSC
The time stamp counter in newer processors may support an enhancement, referred
to as invariant TSC. Processors support for invariant TSC is indicated by
CPUID.80000007H:EDX[8].
The invariant TSC will run at a constant rate in all ACPI P-, C-. and T-states. This is
the architectural behavior moving forward. On processors with invariant TSC
support, the OS may use the TSC for wall clock timer services (instead of ACPI or
HPET timers). TSC reads are much more efficient and do not incur the overhead
associated with a ring transition or access to a platform resource.
and more available at: http://www.intel.com/content/www/us/en/architecture-and-technology/64-ia-32-architectures-software-developer-manual-325462-rmver.html
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
[bash] Advanced Power Management Features (0x80000007/edx): temperature sensing diode = false frequency ID (FID) control = false voltage ID (VID) control = false thermal trip (TTP) = false thermal monitor = false software thermal control (STC) = false 100 MHz multiplier control = false hardware P-State control = false TscInvariant = true [/bash]Best regards,
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Hi, has anyone been able to used a cpuid utility to determine whether invariant TSC is enabled on windows?
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Does Invariant TSC this mean it TSC will be constant across multiple sockets.
My system is
model name : Intel(R) Xeon(R) CPU E7-4890 v2 @ 2.80GHz
But I still see TSC variations across sockets.
Any idea???
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
I suppose that TSC reading is pinned to single socket.
Btw read @Martin Dixon answer particularly the last sentence.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
For newer processors (at least since Nehalem) there is also an RDTSCP instruction that returns the time stamp counter plus the contents of an additional core-specific register. Linux systems (at least since 2.6.24 or so?) set this additional register to contain the socket number and the core number of the processor core that executed the RDTSCP instruction. The hardware guarantees that the value read from the TSC and the value read from this additional register are done atomically, so it guarantees that you know exactly which core provided the TSC value.
The RDTSCP instruction reads the same TSC as the RDTSC instruction, so if RDTSC is invariant, then RDTSCP will be as well.
RDTSCP is slightly more ordered than RDTSC. RDTSC is not ordered at all, which means that it will execute some time in the out-of-order window of the processor, which may be before or after the instruction(s) that you are interested in timing. RDTSCP will not execute until all prior instructions (in program order) have executed. So it can't execute "early", but there is no guarantee that the execution won't be delayed until after some subsequent (in program order) instructions have executed. In practice I have never seen a problem with hardware reordering of either of these instructions -- most processors tend to execute in FIFO order most of the time, and since these instructions have no input dependencies they tend to get executed pretty close to where they sit in program order. It is hard to tell how long they really require to execute because they are designed to provide monotonically increasing values. The (invariant) TSC is incremented by the base multiplier once every reference clock, so on my Xeon E5-2680 (Sandy Bridge EP) this is an increment of 27 every 10 ns. The only way to avoid getting the same result (which would result in a time difference of zero) is to make sure that the instruction takes at least 10 ns to execute. This is 27 cycles at 2.7 GHz and 31 cycles at the Turbo speed of 3.1 GHz. In practice it takes a few more cycles for RDTSCP since it returns an extra value, and extra cycles are required to store the results.
RDTSCP is also by far the easiest way to determine which socket and which core a process is running on, since most systems allow user-mode execution of the RDTSCP instruction. You can stick it in a simple inline assembler macro and get the TSC, the processor number, and the socket number with an overhead of O(50) cycles.
The version I use is: [cpp]
unsigned long tacc_rdtscp(int *chip, int *core)
{
unsigned long int x;
unsigned a, d, c;
__asm__ volatile("rdtscp" : "=a" (a), "=d" (d), "=c" (c));
*chip = (c & 0xFFF000)>>12;
*core = c & 0xFFF;
return ((unsigned long)a) | (((unsigned long)d) << 32);;
}
[/cpp]
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
John D. McCalpin wrote:
The version I use is:
unsigned long tacc_rdtscp(int *chip, int *core) { unsigned long int x; unsigned a, d, c; __asm__ volatile("rdtscp" : "=a" (a), "=d" (d), "=c" (c)); *chip = (c & 0xFFF000)>>12; *core = c & 0xFFF; return ((unsigned long)a) | (((unsigned long)d) << 32);; }
with the Intel compiler one can simply use the __rdtscp intrinsic for this purpose
Synopsis
#include "immintrin.h"
Instruction: rdtscp
CPUID Flag : RDTSCP
Description
Operation
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
The compiler directive is fine, of course, but I built my own version so that it would decode the socket and core information that Linux puts in the auxiliary register without me needing to remember how the bits are packed.
I have no idea if Windows puts anything in the TSC_AUX MSRs.

- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page