<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic floating point operations for 1/r in Intel® ISA Extensions</title>
    <link>https://community.intel.com/t5/Intel-ISA-Extensions/floating-point-operations-for-1-r/m-p/772485#M144</link>
    <description>Hi,&lt;DIV&gt;If I have two 3-D vectors x = [x1, x2, x3] and y = [y1, y2, y3], how many flops are needed to compute 1/|x-y| using icc?&lt;/DIV&gt;&lt;DIV&gt;I would compute this as rsqrt((x1-y1)^2 + (x2-y2)^2 + (x3-y3)^2).&lt;/DIV&gt;&lt;DIV&gt;Thanks&lt;/DIV&gt;</description>
    <pubDate>Thu, 02 Dec 2010 19:34:25 GMT</pubDate>
    <dc:creator>pilot117</dc:creator>
    <dc:date>2010-12-02T19:34:25Z</dc:date>
    <item>
      <title>floating point operations for 1/r</title>
      <link>https://community.intel.com/t5/Intel-ISA-Extensions/floating-point-operations-for-1-r/m-p/772485#M144</link>
      <description>Hi,&lt;DIV&gt;If I have two 3-D vectors x = [x1, x2, x3] and y = [y1, y2, y3], how many flops are needed to compute 1/|x-y| using icc?&lt;/DIV&gt;&lt;DIV&gt;I would compute this as rsqrt((x1-y1)^2 + (x2-y2)^2 + (x3-y3)^2).&lt;/DIV&gt;&lt;DIV&gt;Thanks&lt;/DIV&gt;</description>
      <pubDate>Thu, 02 Dec 2010 19:34:25 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-ISA-Extensions/floating-point-operations-for-1-r/m-p/772485#M144</guid>
      <dc:creator>pilot117</dc:creator>
      <dc:date>2010-12-02T19:34:25Z</dc:date>
    </item>
    <item>
      <title>floating point operations for 1/r</title>
      <link>https://community.intel.com/t5/Intel-ISA-Extensions/floating-point-operations-for-1-r/m-p/772486#M145</link>
      <description>I assume you're using single precision; otherwise asking for rsqrt wouldn't make much sense, since the SSE rsqrt instruction only exists for single precision.&lt;BR /&gt;&lt;BR /&gt;It's not a question of which compiler you use, but of the CPU. With any recent SSE-capable CPU you will get the following if you put one 3D vector in one SSE register (a layout I can't generally recommend):&lt;BR /&gt;one subtraction (3 cycles latency, 4 on AMD), then one multiplication to square the differences (4 cycles), then two horizontal add instructions (2 x 5 cycles), and then rsqrt (3 cycles).&lt;BR /&gt;&lt;BR /&gt;If you had 4 x vectors and 4 y vectors, this could be improved by putting the four x1 values in one SSE register, the four x2 values in another, and so on. Then you'd calculate 3 subtractions, which can be pipelined, 3 multiplications, which can be pipelined, two additions and one rsqrt: 9 SIMD instructions for 4 results. Since the vertical additions are faster than the horizontal additions, you'd get the results for 4 pairs of x and y vectors in basically the same time as the single result with the horizontal layout.&lt;BR /&gt;&lt;BR /&gt;Cheers,&lt;BR /&gt;  Matthias</description>
      <pubDate>Thu, 02 Dec 2010 20:55:28 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-ISA-Extensions/floating-point-operations-for-1-r/m-p/772486#M145</guid>
      <dc:creator>Matthias_Kretz</dc:creator>
      <dc:date>2010-12-02T20:55:28Z</dc:date>
    </item>
  </channel>
</rss>

