<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: transcendental speed in Intel® oneAPI Math Kernel Library</title>
    <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/transcendental-speed/m-p/888770#M10213</link>
    <description>Great. Thanks very much!!!!&lt;BR /&gt;</description>
    <pubDate>Fri, 21 Aug 2009 21:31:58 GMT</pubDate>
    <dc:creator>mrentropy1</dc:creator>
    <dc:date>2009-08-21T21:31:58Z</dc:date>
    <item>
      <title>transcendental speed</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/transcendental-speed/m-p/888768#M10211</link>
      <description>This might not belong in the MKL forum, but I'm not sure where else to put it - sorry.&lt;BR /&gt;&lt;BR /&gt;Does anybody know how transcendental function evaluation on Intel 64 compares with floating-point divide, for 64-bit floats? I have a transformation I could write with sines and cosines, or I could do it differently and use a simple divide - but doing it that way requires a lot more work on my part. This is in a numerically intensive code, in what I think may be a significant bottleneck, so faster is better. These will be parallel operations on a very large array.&lt;BR /&gt;&lt;BR /&gt;Thanks,&lt;BR /&gt;Peter&lt;BR /&gt;&lt;BR /&gt;P.S. Based on my own timing, I get a mult : divide : sqrt() : sin() time ratio of&lt;BR /&gt;1 : 1.7 : 2.3 : 5.6&lt;BR /&gt;for 32-bit and&lt;BR /&gt;1 : 2.8 : 3.3 : 6.7&lt;BR /&gt;for 64-bit, on a Core2Duo, but I'm not sure if/when that translates into raw clock cycle ratios, or how other factors might have affected my measurement. That's using Intel Fortran with no compiler flags.&lt;BR /&gt;</description>
      <pubDate>Fri, 21 Aug 2009 19:59:06 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/transcendental-speed/m-p/888768#M10211</guid>
      <dc:creator>mrentropy1</dc:creator>
      <dc:date>2009-08-21T19:59:06Z</dc:date>
    </item>
    <item>
      <title>Re: transcendental speed</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/transcendental-speed/m-p/888769#M10212</link>
      <description>ifort enables auto-vectorization by default, with calls to the SVML (short vector math library). If you have vectorizable loops several thousand elements long, the VML functions in MKL might do better; you could look up the quoted performance figures for VML. In any case, the numbers you quote look reasonable as a rough guide for scalar code.&lt;BR /&gt;</description>
      <pubDate>Fri, 21 Aug 2009 21:03:24 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/transcendental-speed/m-p/888769#M10212</guid>
      <dc:creator>TimP</dc:creator>
      <dc:date>2009-08-21T21:03:24Z</dc:date>
    </item>
    <item>
      <title>Re: transcendental speed</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/transcendental-speed/m-p/888770#M10213</link>
      <description>Great. Thanks very much!!!!&lt;BR /&gt;</description>
      <pubDate>Fri, 21 Aug 2009 21:31:58 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/transcendental-speed/m-p/888770#M10213</guid>
      <dc:creator>mrentropy1</dc:creator>
      <dc:date>2009-08-21T21:31:58Z</dc:date>
    </item>
    <item>
      <title>Re: transcendental speed</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/transcendental-speed/m-p/888771#M10214</link>
      <description>&lt;DIV style="margin:0px;"&gt;The following page (&lt;A href="http://www.intel.com/software/products/mkl/data/vml/functions/_performanceall.htm"&gt;http://www.intel.com/software/products/mkl/data/vml/functions/_performanceall.htm&lt;/A&gt;) gives the cycle counts (per element) for the Intel MKL vector math library functions. A quick review of it will likely give you the insight you need on the best way to code your algorithm. Regards, Shane&lt;/DIV&gt;</description>
      <pubDate>Fri, 21 Aug 2009 21:48:26 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/transcendental-speed/m-p/888771#M10214</guid>
      <dc:creator>Shane_S_Intel</dc:creator>
      <dc:date>2009-08-21T21:48:26Z</dc:date>
    </item>
  </channel>
</rss>

