***** Latency of RDTSC and RDTSCP instructions on Intel CPUs *****

**[ Abstract ]**On Intel CPUs, the Time Stamp Counter ( TSC ) is a special 64-bit register that increments every clock cycle. Two instructions, RDTSC and RDTSCP, read the value of the TSC into General Purpose Registers ( GPR ). Intel doesn't provide any information on the latencies of these two instructions; however, throughputs for both instructions are given in the Intel 64 and IA-32 Architectures Optimization Reference Manual.

**[ List of Abbreviations ]**
CPU - Central Processing Unit
TSC - Time Stamp Counter ( number of clock cycles since the CPU is powered on )
GPR - General Purpose Registers
ATV - Absolute TSC Value
DTV - Difference TSC Value

**[ Computer Systems used for evaluations ]**

**** Dell Precision Mobile M4700 ****
Intel Core i7-3840QM ( 2.80 GHz ) Ivy Bridge / 4 cores / 8 logical CPUs / ark.intel.com/products/70846
32GB RAM
320GB HDD
NVIDIA Quadro K1000M ( 192 CUDA cores / 2GB memory )
Windows 7 Professional 64-bit SP1
Size of L3 Cache = 8MB ( shared between all cores for data & instructions )
Size of L2 Cache = 1MB ( 256KB per core / shared for data & instructions )
Size of L1 Cache = 256KB ( 32KB per core for data & 32KB per core for instructions )
Display resolution: 1366 x 768

**** Dell Dimension 4400 ****
Intel Pentium 4 ( 1.60 GHz / 1 core )
1GB RAM
Seagate 20GB HDD ( * )
Seagate 3TB HDD ( ** )
EVGA GeForce 6200 Video Card 512MB DDR2 AGP 8x Video Card
Windows XP Professional 32-bit SP3
Size of L2 Cache = 256KB
Size of L1 Cache = 8KB
Display resolution: 1440 x 990

( * ) Seagate Barracuda 20GB IDE Hard Disk Drive ST320011A 3.5" 7200 Rpm 2MB Cache IDE Ultra ATA100 / ATA-iV/6
Average Rotational Latency : 4.17 ms
Average Seek Times Read : 9.0ms
Average Seek Times Write : 10.0ms
Maximum Internal Transfer Rate : 69.4MB/sec
Average External Transfer Rate : 100MB/sec ( Read and Write )
Maximum External Transfer Rate : 150MB/sec ( Read )
Note: Barracuda ATA IV Family

( ** ) Seagate Barracuda 3TB IDE Hard Disk Drive ST3000DM001 3.5" 7200 Rpm 64MB Cache SATA III ( 6GB/sec )
Average Rotational Latency : 4.16 ms
Average Seek Times Read : 8.5ms
Average Seek Times Write : 9.5ms
Maximum Internal Transfer Rate : 268MB/sec
Average External Transfer Rate : 156MB/sec ( Read and Write )
Maximum External Transfer Rate : 210MB/sec ( Read )

**[ List of tests ]**Four tests are completed for every CPU tested with different C++ compilers:

**[ Sub-Test002.01.A - RDTSC ]**- pure C language

**[ Sub-Test002.01.B - RDTSC ]**- C language with inline assembler

**[ Sub-Test002.01.C - RDTSCP ]**- pure C language

**[ Sub-Test002.01.D - RDTSCP ]**- C language with inline assembler
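The source does not show the test code, so the following is only a minimal sketch of how a "pure C" RDTSC sub-test might obtain a minimal delta between two consecutive TSC reads. It assumes GCC or Clang on an x86 target, where `__rdtsc()` is declared in `<x86intrin.h>` ( MSVC declares it in `<intrin.h>` ). The function name `min_rdtsc_delta` and the sample count are hypothetical and are not taken from the original tests.

```c
#include <stdint.h>
#include <x86intrin.h>   /* __rdtsc() on GCC/Clang; an assumption about the toolchain */

/* Hypothetical helper: returns the smallest difference observed between
 * two back-to-back TSC reads over 'samples' attempts.
 * Returns UINT64_MAX if 'samples' is not positive. */
static uint64_t min_rdtsc_delta(int samples)
{
    uint64_t best = UINT64_MAX;
    for (int i = 0; i < samples; ++i) {
        uint64_t t0 = __rdtsc();   /* first absolute TSC value  */
        uint64_t t1 = __rdtsc();   /* second absolute TSC value */
        uint64_t delta = t1 - t0;  /* difference TSC value      */
        if (delta < best)
            best = delta;          /* keep the minimum to filter out interrupts */
    }
    return best;
}
```

Calling `min_rdtsc_delta( 1000 )` gives one estimate in clock cycles. Taking the minimum rather than the mean is a design choice of this sketch: it discards samples inflated by interrupts or context switches.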

**[ CPU: Pentium 4 - Microsoft C++ compiler - 32-bit ]**
[ Sub-Test002.01.A - RDTSC ] - Started
TSC Minimal Averaged Delta is 80.00 clock cycles
TSC Minimal Averaged Delta is 80.00 clock cycles
TSC Minimal Averaged Delta is 80.00 clock cycles
TSC Minimal Averaged Delta is 80.00 clock cycles
TSC Minimal Averaged Delta is 80.00 clock cycles
TSC Minimal Averaged Delta is 80.00 clock cycles
TSC Minimal Averaged Delta is 80.00 clock cycles
TSC Minimal Averaged Delta is 80.00 clock cycles
TSC Minimal Averaged Delta is 80.00 clock cycles
TSC Minimal Averaged Delta is 80.00 clock cycles
[ Sub-Test002.01.A - RDTSC ] - Completed
[ Sub-Test002.01.B - RDTSC ] - Started
TSC Minimal Averaged Delta is 79.90 clock cycles
TSC Minimal Averaged Delta is 79.90 clock cycles
TSC Minimal Averaged Delta is 79.90 clock cycles
TSC Minimal Averaged Delta is 79.90 clock cycles
TSC Minimal Averaged Delta is 79.90 clock cycles
TSC Minimal Averaged Delta is 79.90 clock cycles
TSC Minimal Averaged Delta is 79.90 clock cycles
TSC Minimal Averaged Delta is 79.90 clock cycles
TSC Minimal Averaged Delta is 79.90 clock cycles
TSC Minimal Averaged Delta is 79.90 clock cycles
Latency of 'MOV ecx, eax' instruction is 1 clock cycle(s)
[ Sub-Test002.01.B - RDTSC ] - Completed
[ Sub-Test002.01.C - RDTSCP ] - Not Supported
[ Sub-Test002.01.D - RDTSCP ] - Not Supported

**[ CPU: Pentium 4 - Borland C++ compiler - 32-bit ]**
[ Sub-Test002.01.A - RDTSC ] - Started
TSC Minimal Averaged Delta is 80.00 clock cycles
TSC Minimal Averaged Delta is 80.00 clock cycles
TSC Minimal Averaged Delta is 80.00 clock cycles
TSC Minimal Averaged Delta is 80.00 clock cycles
TSC Minimal Averaged Delta is 80.00 clock cycles
TSC Minimal Averaged Delta is 80.00 clock cycles
TSC Minimal Averaged Delta is 80.00 clock cycles
TSC Minimal Averaged Delta is 80.40 clock cycles
TSC Minimal Averaged Delta is 80.00 clock cycles
TSC Minimal Averaged Delta is 80.00 clock cycles
[ Sub-Test002.01.A - RDTSC ] - Completed
[ Sub-Test002.01.B - RDTSC ] - Not Supported
[ Sub-Test002.01.C - RDTSCP ] - Not Supported
[ Sub-Test002.01.D - RDTSCP ] - Not Supported

**[ CPU: Pentium 4 - Intel C++ compiler - 32-bit ]**
[ Sub-Test002.01.A - RDTSC ] - Started
TSC Minimal Averaged Delta is 81.20 clock cycles
TSC Minimal Averaged Delta is 80.00 clock cycles
TSC Minimal Averaged Delta is 80.00 clock cycles
TSC Minimal Averaged Delta is 80.00 clock cycles
TSC Minimal Averaged Delta is 80.00 clock cycles
TSC Minimal Averaged Delta is 80.00 clock cycles
TSC Minimal Averaged Delta is 80.00 clock cycles
TSC Minimal Averaged Delta is 80.00 clock cycles
TSC Minimal Averaged Delta is 80.00 clock cycles
TSC Minimal Averaged Delta is 80.00 clock cycles
[ Sub-Test002.01.A - RDTSC ] - Completed
[ Sub-Test002.01.B - RDTSC ] - Started
TSC Minimal Averaged Delta is 80.30 clock cycles
TSC Minimal Averaged Delta is 79.90 clock cycles
TSC Minimal Averaged Delta is 79.90 clock cycles
TSC Minimal Averaged Delta is 79.90 clock cycles
TSC Minimal Averaged Delta is 79.90 clock cycles
TSC Minimal Averaged Delta is 79.90 clock cycles
TSC Minimal Averaged Delta is 79.90 clock cycles
TSC Minimal Averaged Delta is 79.90 clock cycles
TSC Minimal Averaged Delta is 79.90 clock cycles
TSC Minimal Averaged Delta is 79.90 clock cycles
Latency of 'MOV ecx, eax' instruction is 1 clock cycle(s)
[ Sub-Test002.01.B - RDTSC ] - Completed
[ Sub-Test002.01.C - RDTSCP ] - Not Supported
[ Sub-Test002.01.D - RDTSCP ] - Not Supported

**[ CPU: Pentium 4 - MinGW C++ compiler - 32-bit ]**
[ Sub-Test002.01.A - RDTSC ] - Started
TSC Minimal Averaged Delta is 80.00 clock cycles
TSC Minimal Averaged Delta is 80.00 clock cycles
TSC Minimal Averaged Delta is 80.00 clock cycles
TSC Minimal Averaged Delta is 80.00 clock cycles
TSC Minimal Averaged Delta is 80.00 clock cycles
TSC Minimal Averaged Delta is 80.00 clock cycles
TSC Minimal Averaged Delta is 80.00 clock cycles
TSC Minimal Averaged Delta is 80.00 clock cycles
TSC Minimal Averaged Delta is 80.00 clock cycles
TSC Minimal Averaged Delta is 80.00 clock cycles
[ Sub-Test002.01.A - RDTSC ] - Completed
[ Sub-Test002.01.B - RDTSC ] - Not Supported
[ Sub-Test002.01.C - RDTSCP ] - Not Supported
[ Sub-Test002.01.D - RDTSCP ] - Not Supported

**[ CPU: Pentium 4 - Watcom C++ compiler - 32-bit ]**
[ Sub-Test002.01.A - RDTSC ] - Started
TSC Minimal Averaged Delta is 80.40 clock cycles
TSC Minimal Averaged Delta is 80.00 clock cycles
TSC Minimal Averaged Delta is 80.00 clock cycles
TSC Minimal Averaged Delta is 80.00 clock cycles
TSC Minimal Averaged Delta is 80.00 clock cycles
TSC Minimal Averaged Delta is 80.00 clock cycles
TSC Minimal Averaged Delta is 80.00 clock cycles
TSC Minimal Averaged Delta is 80.00 clock cycles
TSC Minimal Averaged Delta is 80.00 clock cycles
TSC Minimal Averaged Delta is 80.00 clock cycles
[ Sub-Test002.01.A - RDTSC ] - Completed
[ Sub-Test002.01.B - RDTSC ] - Not Supported
[ Sub-Test002.01.C - RDTSCP ] - Not Supported
[ Sub-Test002.01.D - RDTSCP ] - Not Supported
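The logs report the RDTSCP sub-tests as "Not Supported" without saying whether the CPU or the compiler is the limiting factor. On the CPU side, availability of RDTSCP is reported by CPUID leaf 80000001H, bit 27 of EDX. The sketch below shows one way a program might check this at run time. It assumes GCC or Clang, whose `<cpuid.h>` header provides `__get_cpuid()`; the function name `cpu_has_rdtscp` is hypothetical and is not part of the original test suite.

```c
#include <cpuid.h>   /* __get_cpuid() on GCC/Clang; an assumption about the toolchain */

/* Hypothetical helper: returns 1 if CPUID reports the RDTSCP instruction,
 * 0 if it does not or if the extended leaf is unavailable. */
static int cpu_has_rdtscp(void)
{
    unsigned int eax = 0, ebx = 0, ecx = 0, edx = 0;

    /* __get_cpuid() returns 0 when the requested leaf is not supported. */
    if (!__get_cpuid(0x80000001u, &eax, &ebx, &ecx, &edx))
        return 0;

    return (int)((edx >> 27) & 1u);   /* EDX bit 27 = RDTSCP available */
}
```

A test harness could call `cpu_has_rdtscp()` once at start-up and skip the RDTSCP sub-tests when it returns 0, instead of risking an invalid-opcode fault.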

**[ CPU: Ivy Bridge - Microsoft C++ compiler - 32-bit ]**
[ Sub-Test002.01.A - RDTSC ] - Started
TSC Minimal Averaged Delta is 25.00 clock cycles
TSC Minimal Averaged Delta is 25.80 clock cycles
TSC Minimal Averaged Delta is 26.20 clock cycles
TSC Minimal Averaged Delta is 27.40 clock cycles
TSC Minimal Averaged Delta is 28.20 clock cycles
TSC Minimal Averaged Delta is 26.60 clock cycles
TSC Minimal Averaged Delta is 28.20 clock cycles
TSC Minimal Averaged Delta is 26.60 clock cycles
TSC Minimal Averaged Delta is 28.60 clock cycles
TSC Minimal Averaged Delta is 28.20 clock cycles
[ Sub-Test002.01.A - RDTSC ] - Completed
[ Sub-Test002.01.B - RDTSC ] - Started
TSC Minimal Averaged Delta is 27.10 clock cycles
TSC Minimal Averaged Delta is 26.70 clock cycles
TSC Minimal Averaged Delta is 26.70 clock cycles
TSC Minimal Averaged Delta is 27.10 clock cycles
TSC Minimal Averaged Delta is 26.70 clock cycles
TSC Minimal Averaged Delta is 27.50 clock cycles
TSC Minimal Averaged Delta is 26.70 clock cycles
TSC Minimal Averaged Delta is 27.10 clock cycles
TSC Minimal Averaged Delta is 26.70 clock cycles
TSC Minimal Averaged Delta is 26.70 clock cycles
Latency of 'MOV ecx, eax' instruction is 1 clock cycle(s)
[ Sub-Test002.01.B - RDTSC ] - Completed
[ Sub-Test002.01.C - RDTSCP ] - Not Supported
[ Sub-Test002.01.D - RDTSCP ] - Not Supported

**[ CPU: Ivy Bridge - Microsoft C++ compiler - 64-bit ]**
[ Sub-Test002.01.A - RDTSC ] - Started
TSC Minimal Averaged Delta is 26.20 clock cycles
TSC Minimal Averaged Delta is 26.20 clock cycles
TSC Minimal Averaged Delta is 26.60 clock cycles
TSC Minimal Averaged Delta is 26.20 clock cycles
TSC Minimal Averaged Delta is 25.80 clock cycles
TSC Minimal Averaged Delta is 26.20 clock cycles
TSC Minimal Averaged Delta is 26.20 clock cycles
TSC Minimal Averaged Delta is 26.20 clock cycles
TSC Minimal Averaged Delta is 26.20 clock cycles
TSC Minimal Averaged Delta is 25.80 clock cycles
[ Sub-Test002.01.A - RDTSC ] - Completed
[ Sub-Test002.01.B - RDTSC ] - Not Supported
[ Sub-Test002.01.C - RDTSCP ] - Not Supported
[ Sub-Test002.01.D - RDTSCP ] - Not Supported

**[ CPU: Ivy Bridge - Borland C++ compiler - 32-bit ]**
[ Sub-Test002.01.A - RDTSC ] - Started
TSC Minimal Averaged Delta is 25.80 clock cycles
TSC Minimal Averaged Delta is 28.30 clock cycles
TSC Minimal Averaged Delta is 27.00 clock cycles
TSC Minimal Averaged Delta is 27.00 clock cycles
TSC Minimal Averaged Delta is 25.00 clock cycles
TSC Minimal Averaged Delta is 27.00 clock cycles
TSC Minimal Averaged Delta is 25.00 clock cycles
TSC Minimal Averaged Delta is 27.00 clock cycles
TSC Minimal Averaged Delta is 27.00 clock cycles
TSC Minimal Averaged Delta is 27.00 clock cycles
[ Sub-Test002.01.A - RDTSC ] - Completed
[ Sub-Test002.01.B - RDTSC ] - Not Supported
[ Sub-Test002.01.C - RDTSCP ] - Not Supported
[ Sub-Test002.01.D - RDTSCP ] - Not Supported

**[ CPU: Ivy Bridge - Borland C++ compiler - 64-bit ]**
[ Sub-Test002.01.A - RDTSC ] - Not Supported
[ Sub-Test002.01.B - RDTSC ] - Not Supported
[ Sub-Test002.01.C - RDTSCP ] - Not Supported
[ Sub-Test002.01.D - RDTSCP ] - Not Supported

**[ CPU: Ivy Bridge - Intel C++ compiler - 32-bit ]**
[ Sub-Test002.01.A - RDTSC ] - Started
TSC Minimal Averaged Delta is 29.00 clock cycles
TSC Minimal Averaged Delta is 25.40 clock cycles
TSC Minimal Averaged Delta is 32.60 clock cycles
TSC Minimal Averaged Delta is 29.60 clock cycles
TSC Minimal Averaged Delta is 28.60 clock cycles
TSC Minimal Averaged Delta is 26.20 clock cycles
TSC Minimal Averaged Delta is 37.00 clock cycles
TSC Minimal Averaged Delta is 27.00 clock cycles
TSC Minimal Averaged Delta is 28.20 clock cycles
TSC Minimal Averaged Delta is 25.80 clock cycles
[ Sub-Test002.01.A - RDTSC ] - Completed
[ Sub-Test002.01.B - RDTSC ] - Started
TSC Minimal Averaged Delta is 27.10 clock cycles
TSC Minimal Averaged Delta is 27.10 clock cycles
TSC Minimal Averaged Delta is 27.10 clock cycles
TSC Minimal Averaged Delta is 27.10 clock cycles
TSC Minimal Averaged Delta is 27.10 clock cycles
TSC Minimal Averaged Delta is 27.10 clock cycles
TSC Minimal Averaged Delta is 26.70 clock cycles
TSC Minimal Averaged Delta is 27.10 clock cycles
TSC Minimal Averaged Delta is 27.10 clock cycles
TSC Minimal Averaged Delta is 27.10 clock cycles
Latency of 'MOV ecx, eax' instruction is 1 clock cycle(s)
[ Sub-Test002.01.B - RDTSC ] - Completed
[ Sub-Test002.01.C - RDTSCP ] - Started
TSC Minimal Averaged Delta is 33.40 clock cycles
TSC Minimal Averaged Delta is 34.20 clock cycles
TSC Minimal Averaged Delta is 34.20 clock cycles
TSC Minimal Averaged Delta is 33.40 clock cycles
TSC Minimal Averaged Delta is 33.80 clock cycles
TSC Minimal Averaged Delta is 33.80 clock cycles
TSC Minimal Averaged Delta is 34.20 clock cycles
TSC Minimal Averaged Delta is 33.40 clock cycles
TSC Minimal Averaged Delta is 34.20 clock cycles
TSC Minimal Averaged Delta is 33.40 clock cycles
[ Sub-Test002.01.C - RDTSCP ] - Completed
[ Sub-Test002.01.D - RDTSCP ] - Not Supported

**[ CPU: Ivy Bridge - Intel C++ compiler - 64-bit ]**
[ Sub-Test002.01.A - RDTSC ] - Started
TSC Minimal Averaged Delta is 25.00 clock cycles
TSC Minimal Averaged Delta is 27.00 clock cycles
TSC Minimal Averaged Delta is 27.00 clock cycles
TSC Minimal Averaged Delta is 25.00 clock cycles
TSC Minimal Averaged Delta is 25.00 clock cycles
TSC Minimal Averaged Delta is 25.00 clock cycles
TSC Minimal Averaged Delta is 27.00 clock cycles
TSC Minimal Averaged Delta is 25.00 clock cycles
TSC Minimal Averaged Delta is 25.00 clock cycles
TSC Minimal Averaged Delta is 25.00 clock cycles
[ Sub-Test002.01.A - RDTSC ] - Completed
[ Sub-Test002.01.B - RDTSC ] - Not Supported
[ Sub-Test002.01.C - RDTSCP ] - Started
TSC Minimal Averaged Delta is 33.80 clock cycles
TSC Minimal Averaged Delta is 33.80 clock cycles
TSC Minimal Averaged Delta is 33.80 clock cycles
TSC Minimal Averaged Delta is 33.40 clock cycles
TSC Minimal Averaged Delta is 33.40 clock cycles
TSC Minimal Averaged Delta is 33.80 clock cycles
TSC Minimal Averaged Delta is 33.80 clock cycles
TSC Minimal Averaged Delta is 33.40 clock cycles
TSC Minimal Averaged Delta is 33.40 clock cycles
TSC Minimal Averaged Delta is 33.80 clock cycles
[ Sub-Test002.01.C - RDTSCP ] - Completed
[ Sub-Test002.01.D - RDTSCP ] - Started
TSC Minimal Averaged Delta is 34.70 clock cycles
TSC Minimal Averaged Delta is 34.70 clock cycles
TSC Minimal Averaged Delta is 34.70 clock cycles
TSC Minimal Averaged Delta is 34.70 clock cycles
TSC Minimal Averaged Delta is 34.70 clock cycles
TSC Minimal Averaged Delta is 34.70 clock cycles
TSC Minimal Averaged Delta is 34.70 clock cycles
TSC Minimal Averaged Delta is 34.30 clock cycles
TSC Minimal Averaged Delta is 34.70 clock cycles
TSC Minimal Averaged Delta is 34.30 clock cycles
Latency of 'MOV rcx, rax' instruction is 1 clock cycle(s)
[ Sub-Test002.01.D - RDTSCP ] - Completed

**[ CPU: Ivy Bridge - MinGW C++ compiler - 32-bit ]**
[ Sub-Test002.01.A - RDTSC ] - Not Supported
[ Sub-Test002.01.B - RDTSC ] - Not Supported
[ Sub-Test002.01.C - RDTSCP ] - Not Supported
[ Sub-Test002.01.D - RDTSCP ] - Not Supported

**[ CPU: Ivy Bridge - MinGW C++ compiler - 64-bit ]**
[ Sub-Test002.01.A - RDTSC ] - Started
TSC Minimal Averaged Delta is 28.20 clock cycles
TSC Minimal Averaged Delta is 27.00 clock cycles
TSC Minimal Averaged Delta is 25.00 clock cycles
TSC Minimal Averaged Delta is 25.00 clock cycles
TSC Minimal Averaged Delta is 25.00 clock cycles
TSC Minimal Averaged Delta is 25.00 clock cycles
TSC Minimal Averaged Delta is 27.00 clock cycles
TSC Minimal Averaged Delta is 27.00 clock cycles
TSC Minimal Averaged Delta is 25.00 clock cycles
TSC Minimal Averaged Delta is 25.00 clock cycles
[ Sub-Test002.01.A - RDTSC ] - Completed
[ Sub-Test002.01.B - RDTSC ] - Not Supported
[ Sub-Test002.01.C - RDTSCP ] - Not Supported
[ Sub-Test002.01.D - RDTSCP ] - Not Supported

**[ CPU: Ivy Bridge - Watcom C++ compiler - 32-bit ]**
[ Sub-Test002.01.A - RDTSC ] - Started
TSC Minimal Averaged Delta is 25.00 clock cycles
TSC Minimal Averaged Delta is 25.40 clock cycles
TSC Minimal Averaged Delta is 27.00 clock cycles
TSC Minimal Averaged Delta is 24.60 clock cycles
TSC Minimal Averaged Delta is 27.00 clock cycles
TSC Minimal Averaged Delta is 27.00 clock cycles
TSC Minimal Averaged Delta is 25.40 clock cycles
TSC Minimal Averaged Delta is 25.40 clock cycles
TSC Minimal Averaged Delta is 26.20 clock cycles
TSC Minimal Averaged Delta is 27.00 clock cycles
[ Sub-Test002.01.A - RDTSC ] - Completed
[ Sub-Test002.01.B - RDTSC ] - Not Supported
[ Sub-Test002.01.C - RDTSCP ] - Not Supported
[ Sub-Test002.01.D - RDTSCP ] - Not Supported

**[ CPU: Ivy Bridge - Watcom C++ compiler - 64-bit ]**
[ Sub-Test002.01.A - RDTSC ] - Started
TSC Minimal Averaged Delta is 26.60 clock cycles
TSC Minimal Averaged Delta is 27.00 clock cycles
TSC Minimal Averaged Delta is 25.40 clock cycles
TSC Minimal Averaged Delta is 25.40 clock cycles
TSC Minimal Averaged Delta is 25.40 clock cycles
TSC Minimal Averaged Delta is 25.40 clock cycles
TSC Minimal Averaged Delta is 25.40 clock cycles
TSC Minimal Averaged Delta is 27.00 clock cycles
TSC Minimal Averaged Delta is 25.40 clock cycles
TSC Minimal Averaged Delta is 27.00 clock cycles
[ Sub-Test002.01.A - RDTSC ] - Completed
[ Sub-Test002.01.B - RDTSC ] - Not Supported
[ Sub-Test002.01.C - RDTSCP ] - Not Supported
[ Sub-Test002.01.D - RDTSCP ] - Not Supported

The **__rdtscp** intrinsic function needs to be considered. The function is declared as follows:
...
extern unsigned __int64 __ICL_INTRINCC **__rdtscp**( unsigned int * );
...
Note: Let's denote uiTscValue as the **1st value**, and iRetValue as the **2nd value**.

**Use Case 1** - 1st value used / 2nd value used:
...
unsigned int iRetValue = 0;
unsigned __int64 uiTscValue = **__rdtscp**( &iRetValue );
...
The C++ compiler should generate ordered MOV instructions to save the 1st value and the 2nd value at some addresses.

**Use Case 2** - 1st value used / 2nd value not used:
...
unsigned __int64 uiTscValue = **__rdtscp**( NULL );
...
The C++ compiler should not generate MOV instructions to save the 2nd value at the NULL address. Currently, the Intel C++ compiler tries to save the 2nd value to the NULL address, and an Access Violation exception is generated.

**Use Case 3** - 1st value not used / 2nd value used:
...
unsigned int iRetValue = 0;
**__rdtscp**( &iRetValue );
...
The C++ compiler should not generate MOV instructions to save the 1st value at some address.

**Use Case 4** - 1st value not used / 2nd value not used:
...
**__rdtscp**( NULL );
...
The C++ compiler should not generate MOV instructions to save the 1st value and the 2nd value at some addresses.
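Until a compiler handles the NULL argument as described in Use Cases 2 and 4, calling code can avoid the Access Violation by always passing a valid address and discarding the 2nd value when it is not needed. The wrapper below is a minimal sketch of that workaround, assuming GCC or Clang with `<x86intrin.h>` and a CPU that supports RDTSCP. The name `read_tscp` is hypothetical and does not come from the original tests.

```c
#include <stddef.h>
#include <stdint.h>
#include <x86intrin.h>   /* __rdtscp() on GCC/Clang; an assumption about the toolchain */

/* Hypothetical wrapper around __rdtscp() that tolerates a NULL pointer.
 * Returns the 1st value (TSC); stores the 2nd value (the content of
 * IA32_TSC_AUX that RDTSCP places in ECX) only when aux_out is not NULL. */
static uint64_t read_tscp(unsigned int *aux_out)
{
    unsigned int aux = 0;            /* always a valid address for the intrinsic */
    uint64_t tsc = __rdtscp(&aux);   /* never called with NULL */

    if (aux_out != NULL)
        *aux_out = aux;              /* Use Cases 1 and 3: caller wants the 2nd value */

    return tsc;                      /* Use Cases 1 and 2: caller may use the 1st value */
}
```

With this wrapper, `read_tscp( NULL )` covers Use Cases 2 and 4, and `read_tscp( &iRetValue )` covers Use Cases 1 and 3. The cost is one extra local store, which an optimizing compiler may remove when the 2nd value is unused.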

**[ An example of disassembled codes for a test with RDTSC instruction - 32-bit ]**
...
0024AA47 rdtsc
0024AA49 mov ecx, eax
0024AA4B rdtsc
0024AA4D rdtsc
0024AA4F rdtsc
0024AA51 rdtsc
0024AA53 rdtsc
0024AA55 rdtsc
0024AA57 rdtsc
0024AA59 rdtsc
0024AA5B rdtsc
0024AA5D rdtsc
0024AA5F sub eax, ecx
...
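The disassembly shows one RDTSC, a MOV that saves the start value, ten more RDTSC instructions, and a final SUB. One plausible reading, which the source does not confirm, is that the reported averaged delta is the raw difference minus the 1-cycle MOV latency, divided by the ten intervals. The sketch below only illustrates that arithmetic; the function name `averaged_delta` and its parameters are hypothetical.

```c
#include <assert.h>

/* Hypothetical helper illustrating the assumed formula:
 *   averaged delta = ( raw delta - MOV latency ) / number of intervals
 * raw_delta   : value left in EAX after 'sub eax, ecx'
 * intervals   : number of RDTSC instructions executed after the MOV
 * mov_latency : latency of the MOV instruction in clock cycles */
static double averaged_delta(unsigned int raw_delta,
                             unsigned int intervals,
                             unsigned int mov_latency)
{
    assert(intervals > 0);             /* avoid division by zero */
    assert(raw_delta >= mov_latency);  /* avoid unsigned wrap-around */
    return (double)(raw_delta - mov_latency) / (double)intervals;
}
```

Under this assumption, a raw delta of 800 cycles on the Pentium 4 gives ( 800 - 1 ) / 10 = 79.90, which agrees with the inline assembler logs, while the pure C tests report 80.00.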

