<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Functions &amp; Macro-Wrapper using RDTSC instruction in Software Tuning, Performance Optimization &amp; Platform Monitoring</title>
    <link>https://community.intel.com/t5/Software-Tuning-Performance/Functions-Macro-Wrapper-using-RDTSC-instruction/m-p/787558#M490</link>
    <description>For several years, Microsoft and Intel compilers have had a built-in macro __rdtsc, which handles all the platforms.&lt;BR /&gt;</description>
    <pubDate>Thu, 24 Nov 2011 13:14:09 GMT</pubDate>
    <dc:creator>TimP</dc:creator>
    <dc:date>2011-11-24T13:14:09Z</dc:date>
    <item>
      <title>Functions &amp; Macro-Wrapper using RDTSC instruction</title>
      <link>https://community.intel.com/t5/Software-Tuning-Performance/Functions-Macro-Wrapper-using-RDTSC-instruction/m-p/787557#M489</link>
      <description>Hi everybody,&lt;BR /&gt;&lt;BR /&gt;Here are several functions and a macro-wrapper for measuring time intervals as accurately as possible using the &lt;STRONG&gt;RDTSC&lt;/STRONG&gt; instruction:&lt;BR /&gt;&lt;BR /&gt;&lt;P&gt;&lt;STRONG&gt;&lt;SPAN style="text-decoration: underline;"&gt;//*** Version 1 - Win32 ***//&lt;/SPAN&gt;&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;template &amp;lt; class T &amp;gt; inline _RTdeclspec_naked RTuint64 HrtClock( RTvoid )&lt;BR /&gt;{&lt;BR /&gt; //printf( "[ HrtClock ]\n" );&lt;/P&gt;&lt;P&gt; _asm&lt;BR /&gt; {&lt;BR /&gt; rdtsc&lt;BR /&gt; ret&lt;BR /&gt; }&lt;BR /&gt;};&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;&lt;SPAN style="text-decoration: underline;"&gt;//*** Version 2 - Win32 ***//&lt;/SPAN&gt;&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;_RTdeclspec_naked RTuint64 HrtClock( void )&lt;BR /&gt;{&lt;BR /&gt; //printf( "[ HrtClock ]\n" );&lt;/P&gt;&lt;P&gt; _asm&lt;BR /&gt; {&lt;BR /&gt; rdtsc&lt;BR /&gt; ret&lt;BR /&gt; }&lt;BR /&gt;}&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;&lt;SPAN style="text-decoration: underline;"&gt;//*** Version 3 - Win32 ***//&lt;/SPAN&gt;&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;void HrtClock( RTuint64 *puiClock ) &lt;BR /&gt;{&lt;BR /&gt; //printf( "[ HrtClock ]\n" );&lt;/P&gt;&lt;P&gt; RTuint64 v;&lt;BR /&gt; _asm&lt;BR /&gt; {&lt;BR /&gt; push eax&lt;BR /&gt; push edx&lt;BR /&gt; _emit 0x0f&lt;BR /&gt; _emit 0x31&lt;BR /&gt; mov dword ptr v, eax&lt;BR /&gt; mov dword ptr v+4, edx&lt;BR /&gt; pop edx&lt;BR /&gt; pop eax&lt;BR /&gt; }&lt;BR /&gt; *puiClock = v;&lt;BR /&gt;}&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;&lt;SPAN style="text-decoration: underline;"&gt;//*** Version 4 - Linux32 ( with GCC or MinGW ) ***//&lt;/SPAN&gt;&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;RTclock_t HrtClock( RTvoid )&lt;BR /&gt;{&lt;BR /&gt; //printf( "[ HrtClock ]\n" );&lt;/P&gt;&lt;P&gt; RTclock_t ctValue;&lt;BR /&gt; __asm__&lt;BR /&gt; (&lt;BR /&gt; "rdtsc;"&lt;BR /&gt; "mov %%edx, %%ecx;" : "=a" ( ctValue )&lt;BR /&gt; );&lt;BR /&gt; return ( RTclock_t )ctValue;&lt;BR /&gt;};&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;&lt;SPAN style="text-decoration: underline;"&gt;//*** Version 5 - Linux32 ( with GCC or MinGW ) ***//&lt;/SPAN&gt;&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;#define HrtClock( Value ) \&lt;BR /&gt;(\&lt;BR /&gt; {\&lt;BR /&gt; __asm__ __volatile__\&lt;BR /&gt; (\&lt;BR /&gt; ".byte 0x0f; .byte 0x31" \&lt;BR /&gt; : "=A" ( Value ) \&lt;BR /&gt; );\&lt;BR /&gt; }\&lt;BR /&gt;)\&lt;/P&gt;</description>
      <pubDate>Thu, 24 Nov 2011 02:33:25 GMT</pubDate>
      <guid>https://community.intel.com/t5/Software-Tuning-Performance/Functions-Macro-Wrapper-using-RDTSC-instruction/m-p/787557#M489</guid>
      <dc:creator>SergeyKostrov</dc:creator>
      <dc:date>2011-11-24T02:33:25Z</dc:date>
    </item>
    <item>
      <title>Functions &amp; Macro-Wrapper using RDTSC instruction</title>
      <link>https://community.intel.com/t5/Software-Tuning-Performance/Functions-Macro-Wrapper-using-RDTSC-instruction/m-p/787558#M490</link>
      <description>For several years, Microsoft and Intel compilers have had a built-in macro __rdtsc, which handles all the platforms.&lt;BR /&gt;</description>
      <pubDate>Thu, 24 Nov 2011 13:14:09 GMT</pubDate>
      <guid>https://community.intel.com/t5/Software-Tuning-Performance/Functions-Macro-Wrapper-using-RDTSC-instruction/m-p/787558#M490</guid>
      <dc:creator>TimP</dc:creator>
      <dc:date>2011-11-24T13:14:09Z</dc:date>
    </item>
    <item>
      <title>Functions &amp; Macro-Wrapper using RDTSC instruction</title>
      <link>https://community.intel.com/t5/Software-Tuning-Performance/Functions-Macro-Wrapper-using-RDTSC-instruction/m-p/787559#M491</link>
      <description>It is declared in the '&lt;STRONG&gt;intrin.h&lt;/STRONG&gt;' header file, and it is actually an intrinsic function ( &lt;SPAN style="text-decoration: underline;"&gt;not a macro&lt;/SPAN&gt; ):&lt;BR /&gt;&lt;BR /&gt;...&lt;BR /&gt;unsigned __int64 &lt;STRONG&gt;__rdtsc&lt;/STRONG&gt;( void );&lt;BR /&gt;...&lt;BR /&gt;</description>
      <pubDate>Fri, 25 Nov 2011 14:53:11 GMT</pubDate>
      <guid>https://community.intel.com/t5/Software-Tuning-Performance/Functions-Macro-Wrapper-using-RDTSC-instruction/m-p/787559#M491</guid>
      <dc:creator>SergeyKostrov</dc:creator>
      <dc:date>2011-11-25T14:53:11Z</dc:date>
    </item>
    <item>
      <title>Functions &amp; Macro-Wrapper using RDTSC instruction</title>
      <link>https://community.intel.com/t5/Software-Tuning-Performance/Functions-Macro-Wrapper-using-RDTSC-instruction/m-p/787560#M492</link>
      <description>This is a follow-up because I've detected &lt;SPAN style="text-decoration: underline;"&gt;some issues&lt;/SPAN&gt; with the &lt;STRONG&gt;Version 4&lt;/STRONG&gt; example for &lt;STRONG&gt;Linux32&lt;/STRONG&gt; with the &lt;STRONG&gt;GCC&lt;/STRONG&gt; or &lt;STRONG&gt;MinGW&lt;/STRONG&gt; C/C++ compilers.&lt;BR /&gt;&lt;BR /&gt;This is what &lt;STRONG&gt;Intel&lt;/STRONG&gt;'s documentation says about the &lt;STRONG&gt;RDTSC&lt;/STRONG&gt; instruction in the '&lt;STRONG&gt;Instruction Set Reference&lt;/STRONG&gt;', &lt;SPAN style="text-decoration: underline;"&gt;Volume 2B&lt;/SPAN&gt;:&lt;BR /&gt;&lt;BR /&gt; ...&lt;BR /&gt; Loads the current value of the processor's time-stamp counter ( a 64-bit MSR ) &lt;SPAN style="text-decoration: underline;"&gt;into the&lt;/SPAN&gt; &lt;STRONG&gt;&lt;SPAN style="text-decoration: underline;"&gt;EDX:EAX&lt;/SPAN&gt;&lt;/STRONG&gt; &lt;SPAN style="text-decoration: underline;"&gt;registers&lt;/SPAN&gt;.&lt;BR /&gt; ...&lt;BR /&gt;&lt;BR /&gt;&lt;STRONG&gt;The issues found are as follows:&lt;/STRONG&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;STRONG&gt;1.&lt;/STRONG&gt; On &lt;STRONG&gt;32-bit&lt;/STRONG&gt; platforms the '&lt;STRONG&gt;clock_t&lt;/STRONG&gt;' type is declared as the &lt;STRONG&gt;32-bit&lt;/STRONG&gt; type '&lt;STRONG&gt;long&lt;/STRONG&gt;' ( 4 bytes ); I have verified this with all the &lt;STRONG&gt;32-bit&lt;/STRONG&gt; C/C++ compilers I use:&lt;BR /&gt;&lt;BR /&gt; ...&lt;BR /&gt; typedef &lt;STRONG&gt;long&lt;/STRONG&gt; clock_t;&lt;BR /&gt; ...&lt;BR /&gt;&lt;BR /&gt;&lt;STRONG&gt;2.&lt;/STRONG&gt; "&lt;STRONG&gt;=A&lt;/STRONG&gt;" has to be used instead of "&lt;STRONG&gt;=a&lt;/STRONG&gt;".&lt;BR /&gt;&lt;BR /&gt;&lt;STRONG&gt;3.&lt;/STRONG&gt; The register pair '&lt;STRONG&gt;edx&lt;/STRONG&gt;' and '&lt;STRONG&gt;ecx&lt;/STRONG&gt;' is used instead of '&lt;STRONG&gt;edx&lt;/STRONG&gt;' and '&lt;STRONG&gt;eax&lt;/STRONG&gt;', and if you change:&lt;BR /&gt;&lt;BR /&gt; ...&lt;BR /&gt; "mov &lt;SPAN style="text-decoration: underline;"&gt;%%edx, %%ecx&lt;/SPAN&gt;;" : "=A" ( ctValue )&lt;BR /&gt; ...&lt;BR /&gt; to&lt;BR /&gt; ...&lt;BR /&gt; "mov &lt;SPAN style="text-decoration: underline;"&gt;%%edx, %%eax&lt;/SPAN&gt;;" : "=A" ( ctValue )&lt;BR /&gt; ...&lt;BR /&gt;&lt;BR /&gt; it &lt;SPAN style="text-decoration: underline;"&gt;doesn't work&lt;/SPAN&gt;! So, the registers '&lt;STRONG&gt;edx&lt;/STRONG&gt;' and '&lt;STRONG&gt;ecx&lt;/STRONG&gt;' have to be used anyway.&lt;BR /&gt;&lt;BR /&gt;&lt;STRONG&gt;4.&lt;/STRONG&gt; Finally, working C/C++ code looks like this:&lt;BR /&gt;&lt;BR /&gt; inline RTuint64 &lt;STRONG&gt;HrtClock&lt;/STRONG&gt;( RTvoid )&lt;BR /&gt; {&lt;BR /&gt; RTuint64 uiValue;&lt;BR /&gt; __asm__&lt;BR /&gt; (&lt;BR /&gt; "&lt;STRONG&gt;rdtsc&lt;/STRONG&gt;;"&lt;BR /&gt; "mov %%edx, %%ecx;" : "=A" ( uiValue )&lt;BR /&gt; );&lt;BR /&gt; return ( RTuint64 )uiValue;&lt;BR /&gt; }&lt;BR /&gt;</description>
      <pubDate>Sun, 22 Jan 2012 19:51:20 GMT</pubDate>
      <guid>https://community.intel.com/t5/Software-Tuning-Performance/Functions-Macro-Wrapper-using-RDTSC-instruction/m-p/787560#M492</guid>
      <dc:creator>SergeyKostrov</dc:creator>
      <dc:date>2012-01-22T19:51:20Z</dc:date>
    </item>
    <item>
      <title>Functions &amp; Macro-Wrapper using RDTSC instruction</title>
      <link>https://community.intel.com/t5/Software-Tuning-Performance/Functions-Macro-Wrapper-using-RDTSC-instruction/m-p/787561#M493</link>
      <description>&lt;P&gt;Here are two assembler listings generated by the '&lt;STRONG&gt;g++&lt;/STRONG&gt;' C/C++ compiler.&lt;/P&gt;&lt;P&gt;The left example &lt;SPAN style="text-decoration: underline;"&gt;doesn't work&lt;/SPAN&gt;. The right example &lt;SPAN style="text-decoration: underline;"&gt;works&lt;/SPAN&gt;.&lt;/P&gt;&lt;P&gt;...  ...&lt;BR /&gt;.stabn 68,0,284,LM5832-__Z8HrtClockv  .stabn 68,0,284,LM5832-__Z8HrtClockv&lt;BR /&gt;LM5832:  LM5832:&lt;BR /&gt;/APP  /APP&lt;BR /&gt;&lt;STRONG&gt;rdtsc&lt;/STRONG&gt;;mov %&lt;STRONG&gt;edx&lt;/STRONG&gt;, %&lt;STRONG&gt;eax&lt;/STRONG&gt;;  &lt;STRONG&gt;rdtsc&lt;/STRONG&gt;;mov %&lt;STRONG&gt;edx&lt;/STRONG&gt;, %&lt;STRONG&gt;ecx&lt;/STRONG&gt;;&lt;BR /&gt;/NO_APP  /NO_APP&lt;BR /&gt;movl %eax, -8(%ebp)  movl %eax, -8(%ebp)&lt;BR /&gt;movl %edx, -4(%ebp)  movl %edx, -4(%ebp)&lt;BR /&gt;.stabn 68,0,287,LM5833-__Z8HrtClockv  .stabn 68,0,287,LM5833-__Z8HrtClockv&lt;BR /&gt;LM5833:  LM5833:&lt;BR /&gt;movl -8(%ebp), %eax  movl -8(%ebp), %eax&lt;BR /&gt;movl -4(%ebp), %edx  movl -4(%ebp), %edx&lt;BR /&gt;LBE895:  LBE895:&lt;BR /&gt;LBE894:  LBE894:&lt;BR /&gt;.stabn 68,0,288,LM5834-__Z8HrtClockv  .stabn 68,0,288,LM5834-__Z8HrtClockv&lt;BR /&gt;LM5834:  LM5834:&lt;BR /&gt;leave  leave&lt;BR /&gt;ret  ret&lt;BR /&gt;...  ...&lt;/P&gt;&lt;P&gt;It is not clear why the left example doesn't work; it always returns 0.&lt;/P&gt;</description>
      <pubDate>Tue, 24 Jan 2012 00:03:15 GMT</pubDate>
      <guid>https://community.intel.com/t5/Software-Tuning-Performance/Functions-Macro-Wrapper-using-RDTSC-instruction/m-p/787561#M493</guid>
      <dc:creator>SergeyKostrov</dc:creator>
      <dc:date>2012-01-24T00:03:15Z</dc:date>
    </item>
    <item>
      <title>Functions &amp; Macro-Wrapper using RDTSC instruction</title>
      <link>https://community.intel.com/t5/Software-Tuning-Performance/Functions-Macro-Wrapper-using-RDTSC-instruction/m-p/787562#M494</link>
      <description>&lt;P&gt;It is worth mentioning the problem with RDTSC and out of order execution:&lt;BR /&gt;&lt;BR /&gt;&lt;A href="http://en.wikipedia.org/wiki/Time_Stamp_Counter"&gt;http://en.wikipedia.org/wiki/Time_Stamp_Counter&lt;/A&gt;&lt;/P&gt;&lt;P&gt;Starting with the Pentium Pro, Intel processors have supported out-of-order execution, where instructions are not necessarily performed in the order they appear in the executable. This can cause RDTSC to be executed later than expected, producing a misleading cycle count.&lt;SUP class="reference" id="cite_ref-2"&gt;&lt;A href="http://en.wikipedia.org/wiki/Time_Stamp_Counter#cite_note-2"&gt;[3]&lt;/A&gt;&lt;/SUP&gt; This problem can be solved by executing a serializing instruction, such as &lt;A href="http://en.wikipedia.org/wiki/CPUID" title="CPUID"&gt;CPUID&lt;/A&gt;, to force every preceding instruction to complete before allowing the program to continue, or by using the RDTSCP instruction, which is a serializing variant of the RDTSC instruction (starting from Core i7&lt;SUP class="reference" id="cite_ref-3"&gt;&lt;A href="http://en.wikipedia.org/wiki/Time_Stamp_Counter#cite_note-3"&gt;[4]&lt;/A&gt;&lt;/SUP&gt; and starting from AMD Athlon 64 X2 CPUs with AM2 Socket (Windsor &amp;amp; Brisbane)).&lt;/P&gt;</description>
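      <!--
A minimal sketch of the ordering fix described in the post above, assuming a GCC-compatible compiler targeting x86 or x86-64. It is not code from this thread: the name read_tsc_fenced is hypothetical, and LFENCE is swapped in for the CPUID approach the quote mentions, since Intel's instruction reference describes executing LFENCE immediately before RDTSC when the counter should be read only after earlier instructions have executed. Behaviour on non-Intel processors may differ.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: LFENCE waits for earlier instructions to finish,
   then RDTSC reads the time-stamp counter. RDTSC returns the low 32 bits
   in EAX and the high 32 bits in EDX, so the halves are captured in two
   separate outputs; this form compiles for 32-bit and 64-bit targets. */
static inline uint64_t read_tsc_fenced(void)
{
    uint32_t lo, hi;
    __asm__ __volatile__("lfence\n\t"
                         "rdtsc"
                         : "=a"(lo), "=d"(hi)
                         :
                         : "memory");
    return ((uint64_t)hi << 32) | lo;
}
```

A typical measurement would call read_tsc_fenced() before and after the code under test and subtract the two values in 64-bit unsigned arithmetic.
      -->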
      <pubDate>Tue, 24 Jan 2012 18:18:35 GMT</pubDate>
      <guid>https://community.intel.com/t5/Software-Tuning-Performance/Functions-Macro-Wrapper-using-RDTSC-instruction/m-p/787562#M494</guid>
      <dc:creator>A_T_Intel</dc:creator>
      <dc:date>2012-01-24T18:18:35Z</dc:date>
    </item>
    <item>
      <title>Functions &amp; Macro-Wrapper using RDTSC instruction</title>
      <link>https://community.intel.com/t5/Software-Tuning-Performance/Functions-Macro-Wrapper-using-RDTSC-instruction/m-p/787563#M495</link>
      <description>Thank you, Perry.&lt;BR /&gt;&lt;BR /&gt;Unfortunately, I can't use the &lt;STRONG&gt;RDTSCP&lt;/STRONG&gt; instruction.&lt;BR /&gt;&lt;BR /&gt;I had a problem with incorrect values returned from the '&lt;STRONG&gt;HrtClock&lt;/STRONG&gt;' function that uses the &lt;STRONG&gt;RDTSC&lt;/STRONG&gt; instruction. It was related to declaring the return value as '&lt;STRONG&gt;clock_t&lt;/STRONG&gt;' ( long / 32-bit ) instead of '&lt;STRONG&gt;uint64&lt;/STRONG&gt;' ( 64-bit ).&lt;BR /&gt;&lt;BR /&gt;I wanted to report that there is &lt;SPAN style="text-decoration: underline;"&gt;possibly&lt;/SPAN&gt; a bug in the &lt;STRONG&gt;MinGW&lt;/STRONG&gt; C/C++ compiler related to the '%%&lt;STRONG&gt;edx&lt;/STRONG&gt;, %%&lt;STRONG&gt;ecx&lt;/STRONG&gt;' declaration.&lt;BR /&gt;&lt;BR /&gt;Best regards,&lt;BR /&gt;Sergey&lt;BR /&gt;</description>
      <pubDate>Wed, 25 Jan 2012 02:37:43 GMT</pubDate>
      <guid>https://community.intel.com/t5/Software-Tuning-Performance/Functions-Macro-Wrapper-using-RDTSC-instruction/m-p/787563#M495</guid>
      <dc:creator>SergeyKostrov</dc:creator>
      <dc:date>2012-01-25T02:37:43Z</dc:date>
    </item>
    <item>
      <title>Functions &amp; Macro-Wrapper using RDTSC instruction</title>
      <link>https://community.intel.com/t5/Software-Tuning-Performance/Functions-Macro-Wrapper-using-RDTSC-instruction/m-p/787564#M496</link>
      <description>I've encountered no difficulty with rdtsc under mingw:&lt;BR /&gt;&lt;PRE&gt;[bash]unsigned long long int rdtsc( )
{
[/bash]&lt;/PRE&gt; ....&lt;BR /&gt;&lt;PRE&gt;[bash]#elif defined(__GNUC__)
#if defined i386
   long long a;
   asm volatile("rdtsc":"=A" (a));
   return a;
#elif defined __x86_64
   unsigned int _hi,_lo;
   asm volatile("rdtsc":"=a"(_lo),"=d"(_hi));
   return ((unsigned long long int)_hi &amp;lt;&amp;lt; 32) | _lo;&lt;BR /&gt;&lt;BR /&gt;.... &lt;BR /&gt;[/bash]&lt;/PRE&gt; Evidently, tick count differences should be taken in 64-bit unsigned arithmetic.&lt;BR /&gt;The left shift operator displays correctly if you go to edit mode.</description>
      <pubDate>Wed, 25 Jan 2012 12:27:42 GMT</pubDate>
      <guid>https://community.intel.com/t5/Software-Tuning-Performance/Functions-Macro-Wrapper-using-RDTSC-instruction/m-p/787564#M496</guid>
      <dc:creator>TimP</dc:creator>
      <dc:date>2012-01-25T12:27:42Z</dc:date>
    </item>
    <item>
      <title>Functions &amp; Macro-Wrapper using RDTSC instruction</title>
      <link>https://community.intel.com/t5/Software-Tuning-Performance/Functions-Macro-Wrapper-using-RDTSC-instruction/m-p/787565#M497</link>
      <description>Thank you, Tim. I'll try your example of using the &lt;STRONG&gt;RDTSC&lt;/STRONG&gt; instruction.&lt;BR /&gt;&lt;BR /&gt;&amp;gt;&amp;gt;...Evidently, tick count differences &lt;SPAN style="text-decoration: underline;"&gt;should be taken in 64-bit unsigned arithmetic&lt;/SPAN&gt;...&lt;BR /&gt;&lt;BR /&gt;Always! When I tried to use '&lt;STRONG&gt;clock_t&lt;/STRONG&gt;' ( type '&lt;STRONG&gt;long&lt;/STRONG&gt;', 32-bit ) on a 32-bit platform, my '&lt;STRONG&gt;HrtClock&lt;/STRONG&gt;' function &lt;SPAN style="text-decoration: underline;"&gt;sometimes&lt;/SPAN&gt; returned absolutely incorrect values.&lt;BR /&gt;&lt;BR /&gt;Best regards,&lt;BR /&gt;Sergey&lt;BR /&gt;</description>
      <pubDate>Wed, 25 Jan 2012 14:54:54 GMT</pubDate>
      <guid>https://community.intel.com/t5/Software-Tuning-Performance/Functions-Macro-Wrapper-using-RDTSC-instruction/m-p/787565#M497</guid>
      <dc:creator>SergeyKostrov</dc:creator>
      <dc:date>2012-01-25T14:54:54Z</dc:date>
    </item>
    <item>
      <title>Functions &amp; Macro-Wrapper using RDTSC instruction</title>
      <link>https://community.intel.com/t5/Software-Tuning-Performance/Functions-Macro-Wrapper-using-RDTSC-instruction/m-p/787566#M498</link>
      <description>Yes, 32-bit differences (taken from the low-order 32 bits) will work until the counter carries into the high-order bits, which takes only a few seconds.</description>
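      <!--
To make the wraparound in the post above concrete, here is a small sketch in plain C that needs no RDTSC at all. The helper names tsc_delta64 and tsc_delta32 are hypothetical. Keeping only the low 32 bits means an interval is known only modulo 2^32 ticks; at an assumed tick rate of 3 GHz that is about 1.4 seconds, so any longer interval is silently misreported. With a signed 32-bit type such as 'clock_t' the usable range is roughly half of that.

```c
#include <assert.h>
#include <stdint.h>

/* Full-width difference: correct for any interval a 64-bit counter can hold. */
static uint64_t tsc_delta64(uint64_t start, uint64_t end)
{
    return end - start;
}

/* Difference computed from the low 32 bits only. Unsigned subtraction wraps
   modulo 2^32, so the result matches the true interval only when that
   interval is shorter than 2^32 ticks. */
static uint32_t tsc_delta32(uint64_t start, uint64_t end)
{
    return (uint32_t)((uint32_t)end - (uint32_t)start);
}
```

For start = 0 and end = 2^32 + 5, tsc_delta64 returns 4294967301 while tsc_delta32 returns 5.
      -->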
      <pubDate>Wed, 25 Jan 2012 17:31:47 GMT</pubDate>
      <guid>https://community.intel.com/t5/Software-Tuning-Performance/Functions-Macro-Wrapper-using-RDTSC-instruction/m-p/787566#M498</guid>
      <dc:creator>TimP</dc:creator>
      <dc:date>2012-01-25T17:31:47Z</dc:date>
    </item>
    <item>
      <title>Functions &amp; Macro-Wrapper using RDTSC instruction</title>
      <link>https://community.intel.com/t5/Software-Tuning-Performance/Functions-Macro-Wrapper-using-RDTSC-instruction/m-p/787567#M499</link>
      <description>&lt;P&gt;I decided to use&lt;/P&gt;&lt;P&gt; __asm__ volatile( "&lt;STRONG&gt;rdtsc&lt;/STRONG&gt;;" : "=A" ( uiValue ) );  ( &lt;STRONG&gt;version B&lt;/STRONG&gt; )&lt;/P&gt;&lt;P&gt;instead of&lt;/P&gt;&lt;P&gt; __asm__( "&lt;STRONG&gt;rdtsc&lt;/STRONG&gt;;"&lt;BR /&gt;  "&lt;STRONG&gt;mov %%edx, %%ecx&lt;/STRONG&gt;;" : "=A" ( uiValue ) );  ( &lt;STRONG&gt;version A&lt;/STRONG&gt; )&lt;/P&gt;&lt;P&gt;because '&lt;STRONG&gt;version B&lt;/STRONG&gt;' doesn't have '&lt;SPAN style="text-decoration: underline;"&gt;mov %%edx, %%ecx&lt;/SPAN&gt;'.&lt;/P&gt;&lt;P&gt;I haven't found any problems in the assembler code that the &lt;STRONG&gt;MinGW&lt;/STRONG&gt; C/C++ compiler generates for &lt;STRONG&gt;version B&lt;/STRONG&gt;:&lt;/P&gt;&lt;P&gt; ...&lt;BR /&gt; .stabn 68,0,244,LM2990-__Z8HrtClockv&lt;BR /&gt; LM2990:&lt;BR /&gt; pushl %ebp&lt;BR /&gt; movl %esp, %ebp&lt;BR /&gt; subl $8, %esp&lt;BR /&gt; LBB548:&lt;BR /&gt; LBB549:&lt;BR /&gt; .stabn 68,0,258,LM2991-__Z8HrtClockv&lt;BR /&gt; LM2991:&lt;BR /&gt; /APP&lt;BR /&gt; &lt;STRONG&gt;rdtsc&lt;/STRONG&gt;;&lt;BR /&gt; /NO_APP&lt;BR /&gt; movl %eax, -8(%ebp)&lt;BR /&gt; movl %edx, -4(%ebp)&lt;BR /&gt; .stabn 68,0,260,LM2992-__Z8HrtClockv&lt;BR /&gt; LM2992:&lt;BR /&gt; movl -8(%ebp), %eax&lt;BR /&gt; movl -4(%ebp), %edx&lt;BR /&gt; LBE549:&lt;BR /&gt; LBE548:&lt;BR /&gt; .stabn 68,0,261,LM2993-__Z8HrtClockv&lt;BR /&gt; LM2993:&lt;BR /&gt; leave&lt;BR /&gt; ret&lt;BR /&gt; .stabs "uiValue:(93,22)",128,0,249,-8&lt;BR /&gt; ...&lt;/P&gt;&lt;P&gt;and it works. No runtime issues or problems detected so far.&lt;/P&gt;&lt;P&gt;Best regards,&lt;BR /&gt;Sergey&lt;/P&gt;</description>
      <pubDate>Fri, 27 Jan 2012 14:08:13 GMT</pubDate>
      <guid>https://community.intel.com/t5/Software-Tuning-Performance/Functions-Macro-Wrapper-using-RDTSC-instruction/m-p/787567#M499</guid>
      <dc:creator>SergeyKostrov</dc:creator>
      <dc:date>2012-01-27T14:08:13Z</dc:date>
    </item>
    <item>
      <title>Functions &amp; Macro-Wrapper using RDTSC instruction</title>
      <link>https://community.intel.com/t5/Software-Tuning-Performance/Functions-Macro-Wrapper-using-RDTSC-instruction/m-p/787568#M500</link>
      <description>&lt;P&gt;This is a follow-up. All three widely used &lt;STRONG&gt;Visual Studio&lt;/STRONG&gt; versions, &lt;STRONG&gt;2005&lt;/STRONG&gt;, &lt;STRONG&gt;2008&lt;/STRONG&gt; and &lt;STRONG&gt;2010&lt;/STRONG&gt;, declare the '&lt;STRONG&gt;__rdtsc&lt;/STRONG&gt;' intrinsic function in the '&lt;STRONG&gt;intrin.h&lt;/STRONG&gt;' header file:&lt;/P&gt;&lt;P&gt; ...&lt;BR /&gt; __MACHINEI( unsigned __int64 &lt;STRONG&gt;__rdtsc&lt;/STRONG&gt;( void ) )&lt;BR /&gt; ...&lt;/P&gt;</description>
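      <!--
For comparison with the hand-written assembly earlier in the thread, a minimal sketch of a wrapper built on the intrinsic instead. The MSVC branch relies on the intrin.h declaration quoted above; the other branch assumes a GCC or Clang toolchain targeting x86 or x86-64, where __rdtsc is made available through x86intrin.h. The name HrtClockIntrinsic is hypothetical and does not appear in the thread.

```c
#include <assert.h>
#include <stdint.h>

#if defined(_MSC_VER)
#include <intrin.h>      /* declares unsigned __int64 __rdtsc( void ) */
#else
#include <x86intrin.h>   /* assumption: GCC or Clang on x86 / x86-64 */
#endif

/* Hypothetical wrapper: returns the 64-bit time-stamp counter value without
   any inline assembly, so no register constraints have to be chosen. */
static inline uint64_t HrtClockIntrinsic(void)
{
    return (uint64_t)__rdtsc();
}
```

Because the compiler knows what the intrinsic does, the questions about "=A" versus "=a" and about which registers are overwritten do not arise in this form.
      -->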
      <pubDate>Sun, 26 Feb 2012 00:46:06 GMT</pubDate>
      <guid>https://community.intel.com/t5/Software-Tuning-Performance/Functions-Macro-Wrapper-using-RDTSC-instruction/m-p/787568#M500</guid>
      <dc:creator>SergeyKostrov</dc:creator>
      <dc:date>2012-02-26T00:46:06Z</dc:date>
    </item>
  </channel>
</rss>

