<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Fast estimate for division in Intel® Fortran Compiler</title>
    <link>https://community.intel.com/t5/Intel-Fortran-Compiler/Fast-estimate-for-division/m-p/761758#M17245</link>
    <description>&lt;DIV&gt;&lt;/DIV&gt;
&lt;DIV&gt;
&lt;P&gt;&lt;FONT face="Times New Roman" size="2"&gt;Dear GMorris,&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&lt;FONT face="Times New Roman" size="2"&gt;In the context of vector loops, the Intel compiler can use NR improvement of an approximated division. For example, the loop&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&lt;FONT face="Times New Roman"&gt;&lt;FONT size="2"&gt;&lt;SPAN&gt;&lt;/SPAN&gt;PROGRAM JOHO&lt;BR /&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;FONT face="Times New Roman"&gt;&lt;FONT size="2"&gt;&lt;SPAN&gt;&lt;/SPAN&gt;REAL Y(16), C(16), X(16)&lt;BR /&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;FONT face="Times New Roman"&gt;&lt;FONT size="2"&gt;&lt;SPAN&gt;&lt;/SPAN&gt;&lt;SPAN&gt;DO I = 1, 16&lt;BR /&gt;&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;SPAN&gt;&lt;FONT face="Times New Roman"&gt;&lt;FONT size="2"&gt;&lt;SPAN&gt;&lt;/SPAN&gt;Y(I) = C(I) / X(I)&lt;BR /&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;SPAN&gt;&lt;FONT face="Times New Roman"&gt;&lt;FONT size="2"&gt;&lt;SPAN&gt;&lt;/SPAN&gt;ENDDO&lt;BR /&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;FONT face="Times New Roman"&gt;&lt;FONT size="2"&gt;&lt;SPAN&gt;&lt;SPAN&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;END&lt;/FONT&gt;&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&lt;FONT face="Times New Roman" size="2"&gt;is vectorized into one of the following two versions (depending on the switches, here given in Windows syntax):&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&lt;FONT face="Times New Roman" size="2"&gt;&lt;C&gt; ifort&lt;SPAN&gt; &lt;/SPAN&gt;-Qprec-div- -Qunroll0 -QxP -Fa joho.f&lt;/C&gt;&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&lt;FONT face="Times New Roman" size="2"&gt;yields:&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&lt;FONT face="Times New Roman" size="2"&gt;$B1$3:&lt;BR /&gt;&lt;/FONT&gt;&lt;FONT face="Times New Roman"&gt;&lt;FONT size="2"&gt;&lt;SPAN&gt;&lt;/SPAN&gt;movaps&lt;SPAN&gt; &lt;/SPAN&gt;xmm0, XMMWORD PTR JOHO$X$0$0[eax]&lt;BR /&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;FONT face="Times New Roman"&gt;&lt;FONT size="2"&gt;&lt;SPAN&gt;&lt;/SPAN&gt;rcpps&lt;SPAN&gt; &lt;/SPAN&gt;xmm1, xmm0&lt;BR /&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;FONT face="Times New Roman"&gt;&lt;FONT size="2"&gt;&lt;SPAN&gt;&lt;/SPAN&gt;mulps&lt;SPAN&gt; &lt;/SPAN&gt;xmm0, xmm1&lt;BR /&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;FONT face="Times New Roman"&gt;&lt;FONT size="2"&gt;&lt;SPAN&gt;&lt;/SPAN&gt;mulps&lt;SPAN&gt; &lt;/SPAN&gt;xmm0, xmm1&lt;BR /&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;FONT face="Times New Roman"&gt;&lt;FONT size="2"&gt;&lt;SPAN&gt;&lt;/SPAN&gt;addps&lt;SPAN&gt; &lt;/SPAN&gt;xmm1, xmm1&lt;BR /&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;FONT face="Times New Roman"&gt;&lt;FONT size="2"&gt;&lt;SPAN&gt;&lt;/SPAN&gt;subps&lt;SPAN&gt; &lt;/SPAN&gt;xmm1, xmm0&lt;BR /&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;FONT face="Times New Roman"&gt;&lt;FONT size="2"&gt;&lt;SPAN&gt;&lt;/SPAN&gt;mulps&lt;SPAN&gt; &lt;/SPAN&gt;xmm1, XMMWORD PTR JOHO$C$0$0[eax]&lt;BR /&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;FONT face="Times New Roman"&gt;&lt;FONT size="2"&gt;&lt;SPAN&gt;&lt;/SPAN&gt;movaps&lt;SPAN&gt; &lt;/SPAN&gt;XMMWORD PTR JOHO$Y$0$0[eax], xmm1&lt;BR /&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;FONT face="Times New Roman"&gt;&lt;FONT size="2"&gt;&lt;SPAN&gt;&lt;/SPAN&gt;add&lt;SPAN&gt; &lt;/SPAN&gt;eax, 16&lt;BR /&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;FONT face="Times New Roman"&gt;&lt;FONT size="2"&gt;&lt;SPAN&gt;&lt;/SPAN&gt;cmp&lt;SPAN&gt; &lt;/SPAN&gt;eax, 64&lt;BR /&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;FONT face="Times New Roman"&gt;&lt;FONT size="2"&gt;&lt;SPAN&gt;&lt;/SPAN&gt;jb&lt;SPAN&gt; &lt;/SPAN&gt;$B1$3&lt;SPAN&gt; &lt;/SPAN&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&lt;FONT face="Times New Roman" size="2"&gt;&lt;C&gt; ifort&lt;SPAN&gt; &lt;/SPAN&gt;-Qprec-div -Qunroll0 -QxP -Fa joho.f&lt;/C&gt;&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&lt;FONT face="Times New Roman" size="2"&gt;yields:&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&lt;FONT face="Times New Roman" size="2"&gt;$B1$3:&lt;BR /&gt;&lt;/FONT&gt;&lt;FONT face="Times New Roman"&gt;&lt;FONT size="2"&gt;&lt;SPAN&gt;&lt;/SPAN&gt;movaps&lt;SPAN&gt; &lt;/SPAN&gt;xmm0, XMMWORD PTR JOHO$C$0$0[eax]&lt;BR /&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;FONT face="Times New Roman"&gt;&lt;FONT size="2"&gt;&lt;SPAN&gt;&lt;/SPAN&gt;divps&lt;SPAN&gt; &lt;/SPAN&gt;xmm0, XMMWORD PTR JOHO$X$0$0[eax]&lt;BR /&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;FONT face="Times New Roman"&gt;&lt;FONT size="2"&gt;&lt;SPAN&gt;&lt;/SPAN&gt;movaps&lt;SPAN&gt; &lt;/SPAN&gt;XMMWORD PTR JOHO$Y$0$0[eax], xmm0&lt;BR /&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;FONT face="Times New Roman"&gt;&lt;FONT size="2"&gt;&lt;SPAN&gt;&lt;/SPAN&gt;add&lt;SPAN&gt; &lt;/SPAN&gt;eax, 16&lt;BR /&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;FONT face="Times New Roman"&gt;&lt;FONT size="2"&gt;&lt;SPAN&gt;&lt;/SPAN&gt;cmp&lt;SPAN&gt; &lt;/SPAN&gt;&lt;SPAN&gt;&lt;/SPAN&gt;eax, 64&lt;BR /&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;FONT face="Times New Roman"&gt;&lt;FONT size="2"&gt;&lt;SPAN&gt;&lt;/SPAN&gt;jb&lt;SPAN&gt; &lt;/SPAN&gt;$B1$3&lt;/FONT&gt;&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&lt;FONT face="Times New Roman" size="2"&gt;For various reasons, this division control is not available at the scalar level. Because I also believe it should be, I would appreciate it if you submitted a feature request to Premier Support to strengthen my case.&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;&lt;FONT face="Times New Roman"&gt;&lt;FONT size="2"&gt;Aart Bik&lt;BR /&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;A href="http://www.aartbik.com/" target="_blank"&gt;&lt;SPAN&gt;&lt;FONT face="Times New Roman" size="2"&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/A&gt;&lt;A href="http://www.aartbik.com/" target="_blank"&gt;&lt;/A&gt;&lt;A href="http://www.aartbik.com" target="_blank"&gt;http://www.aartbik.com&lt;/A&gt;/&lt;SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/DIV&gt;</description>
    <pubDate>Tue, 07 Jun 2005 05:57:05 GMT</pubDate>
    <dc:creator>Intel_C_Intel</dc:creator>
    <dc:date>2005-06-07T05:57:05Z</dc:date>
    <item>
      <title>Fast estimate for division</title>
      <link>https://community.intel.com/t5/Intel-Fortran-Compiler/Fast-estimate-for-division/m-p/761757#M17244</link>
      <description>I was recently introduced to the Newton-Raphson technique for getting a faster, lower-accuracy estimate of a division on the PowerPC architecture.&lt;BR /&gt;&lt;BR /&gt;Rather than&lt;BR /&gt;y = c / x&lt;BR /&gt;&lt;BR /&gt;use instead:&lt;BR /&gt;&lt;BR /&gt;temp = fres(x)&lt;BR /&gt;temp = temp * (2 - x * temp)&lt;BR /&gt;y = c * temp&lt;BR /&gt;&lt;BR /&gt;where fres is a PowerPC instruction that gives a single-precision estimate of the reciprocal of its argument, and can be called directly by the PowerPC compiler in use. More steps of the N-R iteration can be added if more accuracy is needed.&lt;BR /&gt;&lt;BR /&gt;Is it possible to do the same operation (or get the same behaviour) using ifort8 on the x86 architecture? The problem is what to use in place of fres(). I've found the SSE instruction rcpss, but it's not possible (?) to inline assembly with ifort, is that right?&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;Thanks in advance for any help.</description>
      <pubDate>Tue, 07 Jun 2005 05:20:57 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Fortran-Compiler/Fast-estimate-for-division/m-p/761757#M17244</guid>
      <dc:creator>gmorris</dc:creator>
      <dc:date>2005-06-07T05:20:57Z</dc:date>
    </item>
    <item>
      <title>Re: Fast estimate for division</title>
      <link>https://community.intel.com/t5/Intel-Fortran-Compiler/Fast-estimate-for-division/m-p/761758#M17245</link>
      <description>&lt;DIV&gt;&lt;/DIV&gt;
&lt;DIV&gt;
&lt;P&gt;&lt;FONT face="Times New Roman" size="2"&gt;Dear GMorris,&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&lt;FONT face="Times New Roman" size="2"&gt;In the context of vector loops, the Intel compiler can use NR improvement of an approximated division. For example, the loop&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&lt;FONT face="Times New Roman"&gt;&lt;FONT size="2"&gt;&lt;SPAN&gt;&lt;/SPAN&gt;PROGRAM JOHO&lt;BR /&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;FONT face="Times New Roman"&gt;&lt;FONT size="2"&gt;&lt;SPAN&gt;&lt;/SPAN&gt;REAL Y(16), C(16), X(16)&lt;BR /&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;FONT face="Times New Roman"&gt;&lt;FONT size="2"&gt;&lt;SPAN&gt;&lt;/SPAN&gt;&lt;SPAN&gt;DO I = 1, 16&lt;BR /&gt;&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;SPAN&gt;&lt;FONT face="Times New Roman"&gt;&lt;FONT size="2"&gt;&lt;SPAN&gt;&lt;/SPAN&gt;Y(I) = C(I) / X(I)&lt;BR /&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;SPAN&gt;&lt;FONT face="Times New Roman"&gt;&lt;FONT size="2"&gt;&lt;SPAN&gt;&lt;/SPAN&gt;ENDDO&lt;BR /&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;FONT face="Times New Roman"&gt;&lt;FONT size="2"&gt;&lt;SPAN&gt;&lt;SPAN&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;END&lt;/FONT&gt;&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&lt;FONT face="Times New Roman" size="2"&gt;is vectorized into one of the following two versions (depending on the switches, here given in Windows syntax):&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&lt;FONT face="Times New Roman" size="2"&gt;&lt;C&gt; ifort&lt;SPAN&gt; &lt;/SPAN&gt;-Qprec-div- -Qunroll0 -QxP -Fa joho.f&lt;/C&gt;&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&lt;FONT face="Times New Roman" size="2"&gt;yields:&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&lt;FONT face="Times New Roman" size="2"&gt;$B1$3:&lt;BR /&gt;&lt;/FONT&gt;&lt;FONT face="Times New Roman"&gt;&lt;FONT size="2"&gt;&lt;SPAN&gt;&lt;/SPAN&gt;movaps&lt;SPAN&gt; &lt;/SPAN&gt;xmm0, XMMWORD PTR JOHO$X$0$0[eax]&lt;BR /&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;FONT face="Times New Roman"&gt;&lt;FONT size="2"&gt;&lt;SPAN&gt;&lt;/SPAN&gt;rcpps&lt;SPAN&gt; &lt;/SPAN&gt;xmm1, xmm0&lt;BR /&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;FONT face="Times New Roman"&gt;&lt;FONT size="2"&gt;&lt;SPAN&gt;&lt;/SPAN&gt;mulps&lt;SPAN&gt; &lt;/SPAN&gt;xmm0, xmm1&lt;BR /&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;FONT face="Times New Roman"&gt;&lt;FONT size="2"&gt;&lt;SPAN&gt;&lt;/SPAN&gt;mulps&lt;SPAN&gt; &lt;/SPAN&gt;xmm0, xmm1&lt;BR /&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;FONT face="Times New Roman"&gt;&lt;FONT size="2"&gt;&lt;SPAN&gt;&lt;/SPAN&gt;addps&lt;SPAN&gt; &lt;/SPAN&gt;xmm1, xmm1&lt;BR /&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;FONT face="Times New Roman"&gt;&lt;FONT size="2"&gt;&lt;SPAN&gt;&lt;/SPAN&gt;subps&lt;SPAN&gt; &lt;/SPAN&gt;xmm1, xmm0&lt;BR /&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;FONT face="Times New Roman"&gt;&lt;FONT size="2"&gt;&lt;SPAN&gt;&lt;/SPAN&gt;mulps&lt;SPAN&gt; &lt;/SPAN&gt;xmm1, XMMWORD PTR JOHO$C$0$0[eax]&lt;BR /&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;FONT face="Times New Roman"&gt;&lt;FONT size="2"&gt;&lt;SPAN&gt;&lt;/SPAN&gt;movaps&lt;SPAN&gt; &lt;/SPAN&gt;XMMWORD PTR JOHO$Y$0$0[eax], xmm1&lt;BR /&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;FONT face="Times New Roman"&gt;&lt;FONT size="2"&gt;&lt;SPAN&gt;&lt;/SPAN&gt;add&lt;SPAN&gt; &lt;/SPAN&gt;eax, 16&lt;BR /&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;FONT face="Times New Roman"&gt;&lt;FONT size="2"&gt;&lt;SPAN&gt;&lt;/SPAN&gt;cmp&lt;SPAN&gt; &lt;/SPAN&gt;eax, 64&lt;BR /&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;FONT face="Times New Roman"&gt;&lt;FONT size="2"&gt;&lt;SPAN&gt;&lt;/SPAN&gt;jb&lt;SPAN&gt; &lt;/SPAN&gt;$B1$3&lt;SPAN&gt; &lt;/SPAN&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&lt;FONT face="Times New Roman" size="2"&gt;&lt;C&gt; ifort&lt;SPAN&gt; &lt;/SPAN&gt;-Qprec-div -Qunroll0 -QxP -Fa joho.f&lt;/C&gt;&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&lt;FONT face="Times New Roman" size="2"&gt;yields:&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&lt;FONT face="Times New Roman" size="2"&gt;$B1$3:&lt;BR /&gt;&lt;/FONT&gt;&lt;FONT face="Times New Roman"&gt;&lt;FONT size="2"&gt;&lt;SPAN&gt;&lt;/SPAN&gt;movaps&lt;SPAN&gt; &lt;/SPAN&gt;xmm0, XMMWORD PTR JOHO$C$0$0[eax]&lt;BR /&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;FONT face="Times New Roman"&gt;&lt;FONT size="2"&gt;&lt;SPAN&gt;&lt;/SPAN&gt;divps&lt;SPAN&gt; &lt;/SPAN&gt;xmm0, XMMWORD PTR JOHO$X$0$0[eax]&lt;BR /&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;FONT face="Times New Roman"&gt;&lt;FONT size="2"&gt;&lt;SPAN&gt;&lt;/SPAN&gt;movaps&lt;SPAN&gt; &lt;/SPAN&gt;XMMWORD PTR JOHO$Y$0$0[eax], xmm0&lt;BR /&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;FONT face="Times New Roman"&gt;&lt;FONT size="2"&gt;&lt;SPAN&gt;&lt;/SPAN&gt;add&lt;SPAN&gt; &lt;/SPAN&gt;eax, 16&lt;BR /&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;FONT face="Times New Roman"&gt;&lt;FONT size="2"&gt;&lt;SPAN&gt;&lt;/SPAN&gt;cmp&lt;SPAN&gt; &lt;/SPAN&gt;&lt;SPAN&gt;&lt;/SPAN&gt;eax, 64&lt;BR /&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;FONT face="Times New Roman"&gt;&lt;FONT size="2"&gt;&lt;SPAN&gt;&lt;/SPAN&gt;jb&lt;SPAN&gt; &lt;/SPAN&gt;$B1$3&lt;/FONT&gt;&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&lt;FONT face="Times New Roman" size="2"&gt;For various reasons, this division control is not available at the scalar level. Because I also believe it should be, I would appreciate it if you submitted a feature request to Premier Support to strengthen my case.&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;&lt;FONT face="Times New Roman"&gt;&lt;FONT size="2"&gt;Aart Bik&lt;BR /&gt;&lt;/FONT&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;A href="http://www.aartbik.com/" target="_blank"&gt;&lt;SPAN&gt;&lt;FONT face="Times New Roman" size="2"&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/A&gt;&lt;A href="http://www.aartbik.com/" target="_blank"&gt;&lt;/A&gt;&lt;A href="http://www.aartbik.com" target="_blank"&gt;http://www.aartbik.com&lt;/A&gt;/&lt;SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/DIV&gt;</description>
      <pubDate>Tue, 07 Jun 2005 05:57:05 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Fortran-Compiler/Fast-estimate-for-division/m-p/761758#M17245</guid>
      <dc:creator>Intel_C_Intel</dc:creator>
      <dc:date>2005-06-07T05:57:05Z</dc:date>
    </item>
    <item>
      <title>Re: Fast estimate for division</title>
      <link>https://community.intel.com/t5/Intel-Fortran-Compiler/Fast-estimate-for-division/m-p/761759#M17246</link>
      <description>Thanks for your reply, I found it very informative. I'll submit a Premier Support issue and see what they say.&lt;BR /&gt;&lt;BR /&gt;Playing around, I see that everything is different when double precision variables are involved. I would also be interested in the ability to use rcpss followed by (e.g.) two rounds of N-R iteration to get an estimate of the reciprocal of a double precision variable. (I don't know much about these issues, so it's possible this is not a sensible thing to want to do; but it would be interesting to be able to try!) It sounds like this level of control is not available, though - is that right?</description>
      <pubDate>Tue, 07 Jun 2005 20:54:21 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Fortran-Compiler/Fast-estimate-for-division/m-p/761759#M17246</guid>
      <dc:creator>gmorris</dc:creator>
      <dc:date>2005-06-07T20:54:21Z</dc:date>
    </item>
    <item>
      <title>Re: Fast estimate for division</title>
      <link>https://community.intel.com/t5/Intel-Fortran-Compiler/Fast-estimate-for-division/m-p/761760#M17247</link>
      <description>Whoops, that was a bit dumb. I could just make a single precision copy of the double precision variable and work with that, I suppose.</description>
      <pubDate>Tue, 07 Jun 2005 21:03:09 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Fortran-Compiler/Fast-estimate-for-division/m-p/761760#M17247</guid>
      <dc:creator>gmorris</dc:creator>
      <dc:date>2005-06-07T21:03:09Z</dc:date>
    </item>
    <item>
      <title>Re: Fast estimate for division</title>
      <link>https://community.intel.com/t5/Intel-Fortran-Compiler/Fast-estimate-for-division/m-p/761761#M17248</link>
      <description>Aart quoted Windows compiler options in his response. For the Linux compiler, the options -mp1 or -prec_div suppress the Newton iteration scheme for single-precision parallel (vectorized) code and use the IEEE-accurate instructions instead.&lt;BR /&gt;sqrt() sequences also use N-R when vectorized, and -mp1 (and -mp) are the only options that suppress N-R in those cases.&lt;BR /&gt;Most of the speed gain from the N-R scheme comes from the fact that the IEEE-accurate divide and sqrt() block the pipeline.&lt;BR /&gt;A common objection to the N-R scheme is that the various AMD architectures produce different results, due to the varying (often better) accuracy of their approximate reciprocal. The N-R scheme could be modified to produce correctly rounded results more often, giving up some of the performance benefit.&lt;BR /&gt;As I think both of you are hinting, there are non-vectorizable situations where the N-R scheme could give a performance advantage.</description>
      <pubDate>Tue, 07 Jun 2005 22:54:12 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Fortran-Compiler/Fast-estimate-for-division/m-p/761761#M17248</guid>
      <dc:creator>TimP</dc:creator>
      <dc:date>2005-06-07T22:54:12Z</dc:date>
    </item>
    <item>
      <title>Re: Fast estimate for division</title>
      <link>https://community.intel.com/t5/Intel-Fortran-Compiler/Fast-estimate-for-division/m-p/761762#M17249</link>
      <description>Just for the record, I submitted a feature request to Premier Support about this, as was suggested. They agree that the treatment of vectors and scalars should be consistent, but apparently there is a "related issue" that needs to be solved before this feature can be implemented.</description>
      <pubDate>Mon, 13 Jun 2005 18:46:20 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Fortran-Compiler/Fast-estimate-for-division/m-p/761762#M17249</guid>
      <dc:creator>gmorris</dc:creator>
      <dc:date>2005-06-13T18:46:20Z</dc:date>
    </item>
  </channel>
</rss>

