Topic: RDTSC variability in Core 2 Quad, in Intel® ISA Extensions
https://community.intel.com/t5/Intel-ISA-Extensions/RDTSC-variability-in-Core-2-Quad/m-p/853145#M2082
Hi,

I am attempting to measure the overhead of using rdtsc to measure ticks. To do this, I set up a loop with two successive rdtsc calls and subtract the results to get the difference. I wrote a kernel module in C to perform this loop:

    #define ITERS 500
    unsigned long arr[ITERS];
    unsigned long el, sl;
    int i;
    for (i = 0; i < ITERS; i++) {
        asm volatile ( "mov %%cr0, %%edi;"   /* intended as a serializing barrier */
                       "rdtsc;"              /* first timestamp, low half in eax */
                       "mov %%eax, %%esi;"   /* save the first timestamp */
                       "mov %%cr0, %%edi;"
                       "rdtsc;"              /* second timestamp */
                       "mov %%cr0, %%edi;"
                       "mov %%eax, %%edi;"
                       "mov %%esi, %0;"      /* store start */
                       "mov %%edi, %1;"      /* store end */
                       :
                       "=m" (sl),
                       "=m" (el)
                       :
                       /* registers overwritten by the asm body */
                       : "eax", "edx", "esi", "edi"
        );
        arr[i] = el - sl;
    }

I basically sandwich two rdtsc's as close together as possible. I add a few instructions to the mix:
 - three moves from cr0: I use these to serialize and prevent reordering
 - an instruction to save the result of the first rdtsc

The majority of the time I get a consistent value (72 ticks on a 1.6 GHz Core 2 Quad E5310). However, I occasionally get 66 or 78 ticks.

I have two questions:

1) The tick counts always seem to be multiples of 6 ticks. I am assuming this is because ticks are measured in front-side bus cycles (1066 MHz, quad-pumped => 266 MHz) and multiplied by the bus-to-core frequency ratio (which is 6 for this processor). Is this correct?

2) It seems that for this simple loop there should be no variation. However, there are frequent outliers in these measurements. Are there non-deterministic factors I am missing here?

Thanks for any help you can offer,
Andrew

agame, Sat, 09 Feb 2008 03:31:00 GMT
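As a worked check of the arithmetic in question 1, here is a minimal C sketch that uses only the figures quoted in the post (1066 MT/s quad-pumped bus, ratio 6). The macro and function names are hypothetical, chosen for illustration. It shows that 1066 / 4 = 266.5 MHz, that 266.5 x 6 = 1599 MHz (about the nominal 1.6 GHz), and that the observed values 66, 72 and 78 would correspond to 11, 12 and 13 bus clocks if the counter really advances in steps of 6.

```c
/* Figures quoted in the post; names are illustrative assumptions. */
#define FSB_TRANSFERS_MHZ 1066.0  /* quad-pumped transfer rate */
#define PUMP_FACTOR       4.0     /* transfers per bus clock */
#define BUS_TO_CORE_RATIO 6       /* core multiplier for this part */

/* Bus clock in MHz: transfer rate divided by the pump factor. */
double bus_clock_mhz(void)
{
    return FSB_TRANSFERS_MHZ / PUMP_FACTOR;
}

/* Core clock in MHz: bus clock times the bus-to-core ratio. */
double core_clock_mhz(void)
{
    return bus_clock_mhz() * BUS_TO_CORE_RATIO;
}

/*
 * Convert a TSC delta to whole bus clocks.
 * Returns -1 if the delta is not a multiple of the ratio, which would
 * contradict the "TSC advances in steps of 6" hypothesis.
 */
long ticks_to_bus_clocks(unsigned long ticks)
{
    if (ticks % BUS_TO_CORE_RATIO != 0)
        return -1;
    return (long)(ticks / BUS_TO_CORE_RATIO);
}
```

Calling `ticks_to_bus_clocks` on each entry of `arr` and checking for a -1 result would be one way to confirm that every sample is a multiple of 6.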
Re: RDTSC variability in Core 2 Quad
https://community.intel.com/t5/Intel-ISA-Extensions/RDTSC-variability-in-Core-2-Quad/m-p/853146#M2083
Yes, all 64-bit Intel CPUs have used the multiplied FSB tick count method in rdtsc.

I don't know enough to explain the variation, but I'm not surprised. I usually time a much longer group of instructions, >= 0.1 microseconds, without bothering with serialization.

Does it make a difference if you set affinity to a single core?

TimP, Sat, 09 Feb 2008 14:21:11 GMT
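To try the affinity suggestion, a rough sketch along these lines might help. It is not the original kernel module: it assumes a user-space program on Linux with GCC or Clang on x86, uses `sched_setaffinity` to pin the thread and the `__rdtsc()` intrinsic from `<x86intrin.h>`, and does no serialization. The function names (`pin_to_cpu`, `sample_rdtsc_deltas`, `delta_range`) are hypothetical.

```c
#define _GNU_SOURCE          /* needed for sched_setaffinity and CPU_* macros */
#include <sched.h>
#include <stddef.h>
#include <stdint.h>
#include <x86intrin.h>       /* __rdtsc(); assumes GCC/Clang on x86 */

/* Pin the calling thread to one CPU (Linux-specific). Returns 0 on success. */
int pin_to_cpu(int cpu)
{
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    return sched_setaffinity(0, sizeof(set), &set);
}

/* Record n back-to-back TSC deltas into out[]. No serialization is used. */
void sample_rdtsc_deltas(uint64_t *out, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        uint64_t start = __rdtsc();
        uint64_t end = __rdtsc();
        out[i] = end - start;
    }
}

/* Find the smallest and largest delta; n must be at least 1. */
void delta_range(const uint64_t *d, size_t n, uint64_t *min, uint64_t *max)
{
    *min = *max = d[0];
    for (size_t i = 1; i < n; i++) {
        if (d[i] < *min) *min = d[i];
        if (d[i] > *max) *max = d[i];
    }
}
```

Running `sample_rdtsc_deltas` once without pinning and once after `pin_to_cpu(0)`, then comparing the two ranges from `delta_range`, would show whether migration between cores contributes to the outliers. Interrupts and other sources of jitter can still affect a pinned user-space thread, so this only isolates one factor.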