Software Tuning, Performance Optimization & Platform Monitoring
Discussion regarding monitoring and software tuning methodologies, Performance Monitoring Unit (PMU) of Intel microprocessors, and platform updating.

TSC Synchronization Across Cores

Samuel_M_1
New Contributor I

In the Intel® 64 and IA-32 Architectures Software Developer’s Manual Volume 3B Sec 17.13.1 it says: 

"On processors with invariant TSC support, the OS may use the TSC for wall clock timer services." 

Does this formally imply that the TSC is always synchronized across all cores?

18 Replies
Patrick_F_Intel1

Hello Samuel,

The 'Invariant TSC' means that the TSC runs at a fixed frequency and doesn't stop when the cpu halts.
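For reference, the SDM reports invariant TSC support in CPUID.80000007H:EDX[8]. A minimal sketch of that check, assuming GCC or Clang on x86 and their `<cpuid.h>` helper; the function name `has_invariant_tsc` is made up for illustration:

```c
#include <cpuid.h>   /* GCC/Clang wrapper around the CPUID instruction */

/* Returns 1 if CPUID.80000007H:EDX[8] (Invariant TSC) is set, 0 otherwise.
 * has_invariant_tsc is a hypothetical name, not an Intel or OS API. */
int has_invariant_tsc(void)
{
    unsigned int eax, ebx, ecx, edx;

    /* __get_cpuid returns 0 when the requested leaf is not supported. */
    if (!__get_cpuid(0x80000007, &eax, &ebx, &ecx, &edx))
        return 0;

    return (edx >> 8) & 1;
}
```

Note that a hypervisor may hide this bit from a guest, so a result of 0 inside a VM does not necessarily describe the underlying hardware.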

The TSCs are not guaranteed to be synchronized, although the OS usually tries to synchronize them at boot time. This is one reason for the rdtscp instruction. On Nehalem and later CPUs, rdtscp returns the TSC together with the contents of IA32_TSC_AUX, which the OS typically sets to an identifier for the CPU on which you read the TSC. RDTSCP also waits until all previous instructions have executed before it reads the counter (although it is not a fully serializing instruction), unlike the regular rdtsc instruction, which imposes no ordering.
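A small sketch of reading the TSC together with the auxiliary value, assuming GCC or Clang and the `__rdtscp` intrinsic from `<x86intrin.h>`. The `tsc_sample` struct and `read_tsc_with_id` name are illustrative only, and how the OS encodes the CPU number in the auxiliary value is OS-specific:

```c
#include <stdint.h>
#include <x86intrin.h>   /* __rdtscp (GCC/Clang) */

/* Illustrative container: one TSC reading plus the IA32_TSC_AUX value
 * that RDTSCP returned alongside it. */
typedef struct {
    uint64_t tsc;
    uint32_t aux;
} tsc_sample;

/* Reads the TSC and the auxiliary value with a single RDTSCP. */
tsc_sample read_tsc_with_id(void)
{
    tsc_sample s;
    unsigned int aux;

    s.tsc = __rdtscp(&aux);
    s.aux = aux;
    return s;
}
```

If two samples carry the same `aux` value, both reads were taken on the same logical CPU (assuming the OS programs a distinct value per CPU), so their difference does not depend on cross-core synchronization.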

Pat

Samuel_M_1
New Contributor I

Thank you for your reply, Patrick. Could you please clarify: if it is established that the OS synchronizes the TSC at startup and that the system has an invariant TSC, so that the TSCs on different cores will not be skewed by SpeedStep/TurboBoost triggering differently on different cores, is there any way for the TSCs to get out of sync while the system is up for a significant period of time?

Patrick_F_Intel1

An Invariant TSC won't vary with SpeedStep or TurboBoost. There shouldn't be any way for the TSCs to get out of sync, but I've never actually checked this.

SergeyKostrov
Valued Contributor II
The smallest difference in TSC values for two different cores I've measured was 708 nanoseconds. It should be less for fast 3rd generation CPUs, like Ivy Bridge, etc.

Take a look at:
Forum Topic: Synchronizing Time Stamp Counter
Web-link: software.intel.com/en-us/forums/topic/332570

Note: A test case is attached to my post dated Tue, 11/06/2012 - 06:49.
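One rough way to take this kind of cross-core measurement is sketched below. It is not the test case from the linked topic. It assumes Linux with glibc (`sched_setaffinity`) and GCC or Clang, and the helper names `pin_to_cpu` and `min_tsc_delta` are hypothetical. Each delta is the migration cost plus any offset between the two TSCs, so it is a noisy estimate rather than a true skew:

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE      /* for sched_setaffinity and the CPU_* macros */
#endif
#include <sched.h>
#include <stdint.h>
#include <x86intrin.h>   /* __rdtscp (GCC/Clang) */

/* Pins the calling thread to one logical CPU; returns 0 on success. */
static int pin_to_cpu(int cpu)
{
    cpu_set_t set;

    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    return sched_setaffinity(0, sizeof(set), &set);
}

/* Reads the TSC on cpu_a, migrates to cpu_b, reads again, and keeps the
 * smallest (b - a) seen over `rounds` attempts.
 * Returns 0 on success, -1 if rounds <= 0 or pinning failed. */
int min_tsc_delta(int cpu_a, int cpu_b, int rounds, int64_t *min_delta)
{
    int64_t best = INT64_MAX;

    if (rounds <= 0)
        return -1;

    for (int i = 0; i < rounds; i++) {
        unsigned int aux;
        uint64_t a, b;
        int64_t d;

        if (pin_to_cpu(cpu_a) != 0)
            return -1;
        a = __rdtscp(&aux);

        if (pin_to_cpu(cpu_b) != 0)
            return -1;
        b = __rdtscp(&aux);

        d = (int64_t)(b - a);
        if (d < best)
            best = d;
    }

    *min_delta = best;
    return 0;
}
```

Running it in both directions (A to B, then B to A) and comparing the two minimums gives a hint about the offset, since the migration cost shows up in both while the offset changes sign.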
TimP
Honored Contributor III

Presumably, the TSCs on a single CPU share internal resources.  Even between CPUs, they are locked to a single time base, so they should not drift apart after being started by a shared reset signal.

People are still getting tied in knots over this, particularly on Windows, where gfortran is now in the middle of a system_clock strategy change (and ifort doesn't have consistent results for intervals less than 10 ms).

I'm expecting to go some rounds with an Intel expert on Ivy Bridge dual CPU next week.  There may be a new round of questions to resolve about synchronization between CPUs, with faster RAM, ....  No, I don't intend to try Windows on it.

SergeyKostrov
Valued Contributor II
>>...Does this formally imply that the TSC is always synchronized across all cores?

Every computer system has one Reset signal, so how is it possible to have different TSC values for different cores? What I see is just measurement error: in a multi-threaded environment it is impossible to accurately measure all TSC values for all CPUs at the same time.
Bernard
Valued Contributor I

Measurements taken with the TSC in a multithreaded application cannot be exactly precise, because user-mode code is not guaranteed to keep control of the executing core for the whole sampling period. More privileged code, mainly ISRs and their DPCs, can preempt user-mode code at any moment, and the resulting measurement inaccuracy can be large.
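User-mode code cannot prevent this, but it can at least detect one symptom. A hedged sketch, assuming GCC or Clang and an OS that programs a distinct IA32_TSC_AUX value per logical CPU: read the TSC with RDTSCP before and after the timed region and discard the sample if the auxiliary value changed. The names `busy_loop` and `time_on_one_cpu` are hypothetical:

```c
#include <stdint.h>
#include <x86intrin.h>   /* __rdtscp (GCC/Clang) */

/* Hypothetical workload, only here so the sketch has something to time. */
void busy_loop(void)
{
    volatile unsigned int sink = 0;

    for (unsigned int i = 0; i < 100000; i++)
        sink += i;
}

/* Times fn() in TSC ticks.
 * Returns 0 and stores the tick count if both reads reported the same
 * auxiliary value; returns -1 if the value changed, meaning the thread
 * was moved to another logical CPU between the two reads. */
int time_on_one_cpu(void (*fn)(void), uint64_t *ticks)
{
    unsigned int aux_before, aux_after;
    uint64_t start, end;

    start = __rdtscp(&aux_before);
    fn();
    end = __rdtscp(&aux_after);

    if (aux_before != aux_after)
        return -1;

    *ticks = end - start;
    return 0;
}
```

This only catches migration visible at the two read points. An interrupt or DPC that runs on the same core in the middle of the region still inflates the tick count, so repeating the measurement and taking the minimum is a common complement.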

SergeyKostrov
Valued Contributor II
>>...I'm expecting to go some rounds with an Intel expert on Ivy Bridge dual CPU next week...

It would be nice to hear the results of your discussion. Thanks in advance.
Bernard
Valued Contributor I

It is possible to spawn a kernel-mode thread and run it on one CPU while the other logical processors spin in a busy-wait loop at DPC level. That way one can literally dedicate the logical processor to performance sampling while other code is stalled.

SergeyKostrov
Valued Contributor II
>>...Does this formally imply that the TSC is always synchronized across all cores?

Samuel, try to imagine they are not synchronized, and what implications that would have for all the other hardware subsystems.
Samuel_M_1
New Contributor I

>> Samuel, try to imagine they are not synchronized, and what implications that would have for all the other hardware subsystems.

Could you please elaborate?

TimP
Honored Contributor III

Sergey Kostrov wrote:

>>...I'm expecting to go some rounds with an Intel expert on Ivy Bridge dual CPU next week...

It would be nice to hear results of your discussion. Thanks in advance.

As you pointed out, this took it well off topic.

Bernard
Valued Contributor I

Hi Tim,

Isn't the BIOS vendor free to implement any functionality they want in their code?

SergeyKostrov
Valued Contributor II
I wonder how Tim's response is relevant to the subject of the thread? I don't think it answers the original question or provides more technical details. Thanks anyway.
TimP
Honored Contributor III

iliyapolak wrote:

Hi Tim,

Isn't the BIOS vendor free to implement any functionality they want in their code?

It's a difficult question, which I'm not in position to discuss in any depth.  OEMs do have flexibility in their relationships with their BIOS writers, if they choose not to use whichever BIOS Intel has chosen for a similar platform.  I believe there are BIOS writers' guides, which are very closely held, and of course I expect Intel to be passing requirements through OEMs when they are responsible for developing initial versions of products.

Bernard
Valued Contributor I

Tim, thanks for the answer.

SergeyKostrov
Valued Contributor II
>>In the Intel(R) 64 and IA-32 Architectures Software Developer’s Manual Volume 3B Sec 17.13.1 it says:
>>
>>"On processors with invariant TSC support, the OS may use the TSC for wall clock timer services."
>>
>>Does this formally imply that the TSC is always synchronized across all cores?

By default the TSC is synchronized across all cores. However, the TSC value of a core could be changed by some software subsystem using the WRMSR instruction. Take a look at the quotes below; I hope they finally answer your question:

Intel(R) 64 and IA-32 Architectures Software Developer’s Manual Volume 3 (3A, 3B & 3C): System Programming Guide
Order Number: 325384-044US
August 2012

Page 571
17.13 TIME-STAMP COUNTER
...
Constant TSC behavior ensures that the duration of each clock tick is uniform and supports the use of the TSC as a wall clock timer even if the processor core changes frequency.
...

Page 572
17.13.3 Time-Stamp Counter Adjustment
...
Software can modify the value of the time-stamp counter (TSC) of a logical processor by using the WRMSR instruction to write to the IA32_TIME_STAMP_COUNTER MSR (address 10H). Because such a write applies only to that logical processor, software seeking to synchronize the TSC values of multiple logical processors must perform these writes on each logical processor. It may be difficult for software to do this in a way that ensures that all logical processors will have the same value for the TSC at a given point in time.

The synchronization of TSC adjustment can be simplified by using the 64-bit IA32_TSC_ADJUST MSR (address 3BH). Like the IA32_TIME_STAMP_COUNTER MSR, the IA32_TSC_ADJUST MSR is maintained separately for each logical processor.
...
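To see whether anything has written a per-CPU adjustment, one option on Linux is to read IA32_TSC_ADJUST (3BH) through the `msr` driver. This is a sketch under stated assumptions: the `msr` kernel module is loaded, the process has the privileges needed to open `/dev/cpu/N/msr`, and the processor supports IA32_TSC_ADJUST. The function name `read_tsc_adjust` is hypothetical:

```c
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>

#define MSR_IA32_TSC_ADJUST 0x3B   /* address 3BH, as quoted from the SDM */

/* Reads IA32_TSC_ADJUST for one logical CPU via the Linux msr driver,
 * where the file offset selects the MSR address.
 * Returns 0 on success, -1 if the device cannot be opened or read
 * (for example: no msr module, insufficient privileges, unsupported MSR). */
int read_tsc_adjust(int cpu, uint64_t *value)
{
    char path[64];
    ssize_t n;
    int fd;

    snprintf(path, sizeof(path), "/dev/cpu/%d/msr", cpu);
    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    n = pread(fd, value, sizeof(*value), MSR_IA32_TSC_ADJUST);
    close(fd);

    return n == (ssize_t)sizeof(*value) ? 0 : -1;
}
```

Comparing the value across all logical CPUs shows whether any of them has had its TSC adjusted differently from the others since reset.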