main:
rdtsc
mov esi, eax
rdtsc
sub eax, esi
; eax now has time difference between the first and second rdtsc
I'm curious about why the instruction latency is as long as 100 cycles on P4 and 60 cycles on Xeon 51xx architectures. If it is true that RDTSC is not serializing (as mentioned in the Intel Software Developer's Manual), why should it take that long? Some potential explanations I gleaned from reading other posts are:
(1) long sequence of microops of RDTSC
http://software.intel.com/en-us/forums//topic/52330
(2) resolution being limited by bus speed
http://software.intel.com/en-us/forums//topic/52330
(3) "synchronization of pipeline". But I thought the manual expressly said that this does not happen?
http://software.intel.com/en-us/forums//topic/52482
Any thoughts as to which of these is the dominant factor? Or perhaps these are not the real reasons? I'd appreciate any help on this.
Thanks in advance...
We forwarded this question to several of our engineering contacts. One replied:
rdtsc does not have this sort of granularity.
For best results, make sure you have at least ~1,000 clocks worth of instructions between consecutive rdtsc calls.
The references cited in the question are accurate: rdtsc is multiple uops, and it does not serialize the machine, but it is serializing with respect to itself. The resolution on newer machines (including those with Core2 Duo Processors) is a multiple of bus clocks as well, just as you noted.
All that said, you will have more repeatable results if you have a large number of clocks between successive calls -- I recommend somewhere in the neighborhood of at least 1000.
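One way to follow that advice, sketched in C rather than the assembly above: time many iterations of the work as a single long region, subtract an estimate of the cost of reading the counter, and divide by the iteration count. This assumes GCC or Clang on x86 for `__rdtsc()` from `<x86intrin.h>`. The names `read_tsc`, `ticks_per_iteration` and `time_region` are hypothetical, and the `clock()` fallback exists only so the sketch compiles on other targets.

```c
#include <assert.h>
#include <stdint.h>

#if defined(__x86_64__) || defined(__i386__)
#include <x86intrin.h>   /* GCC/Clang header that provides __rdtsc() */
/* Read the full 64-bit time-stamp counter. */
static inline uint64_t read_tsc(void) { return __rdtsc(); }
#else
#include <time.h>
/* Assumption: non-x86 fallback so the sketch compiles; not a cycle counter. */
static inline uint64_t read_tsc(void) { return (uint64_t)clock(); }
#endif

/* Hypothetical helper: average ticks per iteration for a region bounded by
   two counter readings, after removing the estimated read overhead. */
static inline double ticks_per_iteration(uint64_t start, uint64_t end,
                                         uint64_t overhead, uint64_t iters)
{
    assert(iters > 0);
    uint64_t total = end - start;
    /* Clamp at zero if the region was shorter than the overhead estimate. */
    total = (total > overhead) ? total - overhead : 0;
    return (double)total / (double)iters;
}

/* Hypothetical helper: run work() iters times between two counter reads,
   so the region is long compared with the counter's granularity. */
static inline double time_region(void (*work)(void), uint64_t iters,
                                 uint64_t overhead)
{
    uint64_t start = read_tsc();
    for (uint64_t i = 0; i < iters; i++)
        work();
    uint64_t end = read_tsc();
    return ticks_per_iteration(start, end, overhead, iters);
}
```

The `overhead` argument could be the 60 to 100 cycle back-to-back delta reported in the question. Choosing `iters` so that the whole region runs well past 1,000 clocks keeps that overhead and the bus-clock granularity small relative to the measurement.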
Another nitpick point: the timestamp counter is a 64-bit quantity. Subtracting only the lower 32 bits is not a safe technique. The code given can occasionally give wild and incorrect results when the counts wrap past 2^32.
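To illustrate the 64-bit point, here is a small sketch in C (helper names are hypothetical, not from this thread) that joins the EDX and EAX halves returned by RDTSC and compares the full difference with the low-half-only difference that the posted snippet computes. Unsigned 32-bit subtraction is still correct modulo 2^32, so the two agree for intervals shorter than 2^32 ticks and diverge once the interval is longer.

```c
#include <assert.h>
#include <stdint.h>

/* RDTSC returns the high 32 bits in EDX and the low 32 bits in EAX. */
static inline uint64_t tsc_combine(uint32_t edx_hi, uint32_t eax_lo)
{
    return ((uint64_t)edx_hi << 32) | eax_lo;
}

/* Full-width difference: valid for any interval the counter can represent. */
static inline uint64_t elapsed64(uint64_t start, uint64_t end)
{
    return end - start;
}

/* What "sub eax, esi" computes: the difference of the low halves only. */
static inline uint32_t elapsed32(uint64_t start, uint64_t end)
{
    return (uint32_t)end - (uint32_t)start;
}

/* Worked example with simulated counter values (no real RDTSC involved). */
static inline void tsc_wrap_demo(void)
{
    uint64_t start = tsc_combine(0x1u, 0xFFFFFF00u);
    uint64_t end   = tsc_combine(0x2u, 0x00000064u); /* low half has wrapped */

    /* Short interval across the wrap: both widths report 356 ticks. */
    assert(elapsed64(start, end) == 356u);
    assert(elapsed32(start, end) == 356u);

    /* Interval longer than 2^32 ticks: the 32-bit result drops the high part. */
    uint64_t late = end + (1ULL << 32);
    assert(elapsed64(start, late) == 4294967652ULL);
    assert(elapsed32(start, late) == 356u);
}
```

If the 32-bit result is ever treated as a signed value, intervals of 2^31 ticks or more would also show up as negative, which is another reason to carry the full EDX:EAX pair.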
==
Lexi S.
Intel Software Network Support
