In the Intel® 64 and IA-32 Architectures Software Developer’s Manual Volume 3B Sec 17.13.1 it says:
"On processors with invariant TSC support, the OS may use the TSC for wall clock timer services."
Does this formally imply that the TSC is always synchronized across all cores?
'Invariant TSC' means that the TSC runs at a constant rate in all ACPI P-, C-, and T-states, so it neither changes frequency with the core clock nor stops when the CPU halts.
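In case it helps, here is a minimal sketch of how one might test for the feature from user code. The SDM reports invariant TSC in CPUID.80000007H:EDX[8]. The sketch assumes GCC or Clang on x86, where <cpuid.h> provides __get_cpuid; the function name has_invariant_tsc is just mine for illustration.

```c
#include <cpuid.h>    /* GCC/Clang wrapper around the CPUID instruction */
#include <stdbool.h>

/* Hypothetical helper: true if CPUID.80000007H:EDX[8] (Invariant TSC) is set. */
static bool has_invariant_tsc(void)
{
    unsigned int eax = 0, ebx = 0, ecx = 0, edx = 0;

    /* __get_cpuid returns 0 when the requested leaf is not supported. */
    if (!__get_cpuid(0x80000007u, &eax, &ebx, &ecx, &edx))
        return false;

    return ((edx >> 8) & 1u) != 0;
}
```

Note that a hypervisor may hide or alter this bit, so a result of false inside a VM does not necessarily describe the underlying hardware.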
The TSCs are not guaranteed to be synchronized, although the OS usually does try to synchronize them at boot time. This is one reason for the rdtscp instruction. On Nehalem and later CPUs, rdtscp returns the TSC together with the contents of the IA32_TSC_AUX MSR, which the OS typically initializes so that it identifies the CPU on which you read the TSC. RDTSCP is not a fully serializing instruction, but unlike the regular rdtsc instruction it waits until all previous instructions have executed before reading the counter.
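A rough sketch of how that identifier can be used: take two rdtscp samples and only trust the difference if both carry the same identifier, since otherwise the thread may have migrated to a CPU whose TSC is offset. This assumes GCC or Clang on x86 (the __rdtscp intrinsic from <x86intrin.h>) and assumes the OS has programmed a per-CPU value into IA32_TSC_AUX. The tsc_sample type and both function names are made up for the example.

```c
#include <stdint.h>
#include <x86intrin.h>   /* __rdtscp intrinsic (GCC/Clang) */

/* Hypothetical record: one TSC reading plus the IA32_TSC_AUX value. */
typedef struct {
    uint64_t tsc;
    uint32_t aux;   /* assumed to identify the CPU; this is OS-defined */
} tsc_sample;

/* Read the TSC and the accompanying identifier in one instruction. */
static tsc_sample read_tsc_with_aux(void)
{
    tsc_sample s;
    unsigned int aux = 0;

    s.tsc = __rdtscp(&aux);
    s.aux = aux;
    return s;
}

/* Store end - start in *delta and return 1 only when both samples
 * carry the same identifier; return 0 (delta untouched) otherwise. */
static int tsc_delta_same_cpu(tsc_sample start, tsc_sample end, uint64_t *delta)
{
    if (start.aux != end.aux)
        return 0;
    *delta = end.tsc - start.tsc;
    return 1;
}
```

Typical use would be to call read_tsc_with_aux() before and after the code being timed and discard the measurement when tsc_delta_same_cpu() returns 0. Matching identifiers do not rule out a migration away and back in between, so this is a sanity check rather than a guarantee.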
Thank you for your reply, Patrick. Could you please clarify: if it is established that the OS synchronizes the TSCs at startup and that the system has an invariant TSC, so that the TSCs on different cores will not be skewed by SpeedStep/TurboBoost triggering differently on different cores, is there any way for the TSCs to get out of sync while the system is up for a significant period of time?
Presumably, the TSCs on a single CPU share internal resources. Even between CPUs, they are locked to a single time base, so they should not drift apart after being started by a shared reset signal.
People are still getting tied in knots over this, particularly on Windows, where gfortran is now in the middle of a system_clock strategy change (and ifort doesn't have consistent results for intervals less than 10 ms).
I'm expecting to go some rounds with an Intel expert on Ivy Bridge dual CPU next week. There may be a new round of questions to resolve about synchronization between CPUs, with faster RAM, .... No, I don't intend to try Windows on it.
Measuring a multithreaded application with TSC counters cannot be exactly precise, because user-mode code cannot be guaranteed to keep control of the executing core for the whole sampling period. More privileged code, mainly ISRs and their DPCs, can preempt user-mode code at any moment, and the resulting measurement inaccuracy can be large.
It is possible to spawn a kernel-mode thread and run it on one CPU while the other logical processors spin in a busy-wait loop at DPC level. That way one can literally dedicate a logical processor to performance sampling while other code is stalled.
Sergey Kostrov wrote:
>>...I'm expecting to go some rounds with an Intel expert on Ivy Bridge dual CPU next week...
It would be nice to hear results of your discussion. Thanks in advance.
As you pointed out, this took it well off topic.
Isn't a BIOS vendor free to implement any functionality he wants in his code?
It's a difficult question, which I'm not in a position to discuss in any depth. OEMs do have flexibility in their relationships with their BIOS writers, if they choose not to use whichever BIOS Intel has chosen for a similar platform. I believe there are BIOS writers' guides, which are very closely held, and of course I expect Intel to be passing requirements through OEMs when they are responsible for developing initial versions of products.