<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Multiplying large float array with a scalar float in Intel® oneAPI Math Kernel Library</title>
    <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Multiplying-large-float-array-with-a-scalar-float/m-p/916121#M12603</link>
    <description>Hi all,&lt;BR /&gt;I've been using the Intel Math Kernel Library for a full day now, so I apologize if this is a newbie question. Hopefully it is, actually, and there's an easy answer for it.&lt;BR /&gt;&lt;BR /&gt;I have a large (1-D) float array of data. I am running a filter on the data which essentially just modifies each element of data as a sum of current &amp;amp; previous inputs, multiplied by scalar coefficients (filter taps), plus current and previous outputs, multiplied by filter taps. Kind of like this:&lt;BR /&gt;&lt;BR /&gt;// x's are my inputs, y's are my outputs, a's and b's are the coefficients&lt;BR /&gt;float* pData = &amp;amp;myLargeFloatArray[0];&lt;BR /&gt;float x0=0, x1=0, y0=0, y1=0;&lt;BR /&gt;float a0, a1, b0, b1; // Pretend these are filled in to whatever values&lt;BR /&gt;for (DWORD i=0; i&amp;lt;LENGTHOFMYARRAY; i++) {&lt;BR /&gt; x0 = pData[i];&lt;BR /&gt; y0 = x0*a0 + x1*a1 + y1*b1;&lt;BR /&gt; x1 = x0;&lt;BR /&gt; y1 = y0;&lt;BR /&gt;}&lt;BR /&gt;&lt;BR /&gt;Okay. Now, the above for loop gets VERY inefficient when the array size gets large. I assume the line calculating y0 is especially slow. Is there a more efficient way of doing this, other than element by element?&lt;BR /&gt;&lt;BR /&gt;My thoughts:&lt;BR /&gt;1. Maybe there is a function I don't know about that can quickly multiply a scalar times a vector (both floats). vsMul needed two vectors of the same length, and when I tried creating an array containing copies of a0, a1, and b1, I didn't see any performance improvement.&lt;BR /&gt;2. Maybe there are some functions that do IIR filtering?&lt;BR /&gt;&lt;BR /&gt;Thanks for your help! I greatly appreciate it.</description>
    <pubDate>Fri, 22 Feb 2008 00:53:47 GMT</pubDate>
    <dc:creator>unpocoloco1</dc:creator>
    <dc:date>2008-02-22T00:53:47Z</dc:date>
    <item>
      <title>Multiplying large float array with a scalar float</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Multiplying-large-float-array-with-a-scalar-float/m-p/916121#M12603</link>
      <description>Hi all,&lt;BR /&gt;I've been using the Intel Math Kernel Library for a full day now, so I apologize if this is a newbie question. Hopefully it is, actually, and there's an easy answer for it.&lt;BR /&gt;&lt;BR /&gt;I have a large (1-D) float array of data. I am running a filter on the data which essentially just modifies each element of data as a sum of current &amp;amp; previous inputs, multiplied by scalar coefficients (filter taps), plus current and previous outputs, multiplied by filter taps. Kind of like this:&lt;BR /&gt;&lt;BR /&gt;// x's are my inputs, y's are my outputs, a's and b's are the coefficients&lt;BR /&gt;float* pData = &amp;amp;myLargeFloatArray[0];&lt;BR /&gt;float x0=0, x1=0, y0=0, y1=0;&lt;BR /&gt;float a0, a1, b0, b1; // Pretend these are filled in to whatever values&lt;BR /&gt;for (DWORD i=0; i&amp;lt;LENGTHOFMYARRAY; i++) {&lt;BR /&gt; x0 = pData[i];&lt;BR /&gt; y0 = x0*a0 + x1*a1 + y1*b1;&lt;BR /&gt; x1 = x0;&lt;BR /&gt; y1 = y0;&lt;BR /&gt;}&lt;BR /&gt;&lt;BR /&gt;Okay. Now, the above for loop gets VERY inefficient when the array size gets large. I assume the line calculating y0 is especially slow. Is there a more efficient way of doing this, other than element by element?&lt;BR /&gt;&lt;BR /&gt;My thoughts:&lt;BR /&gt;1. Maybe there is a function I don't know about that can quickly multiply a scalar times a vector (both floats). vsMul needed two vectors of the same length, and when I tried creating an array containing copies of a0, a1, and b1, I didn't see any performance improvement.&lt;BR /&gt;2. Maybe there are some functions that do IIR filtering?&lt;BR /&gt;&lt;BR /&gt;Thanks for your help! I greatly appreciate it.</description>
      <pubDate>Fri, 22 Feb 2008 00:53:47 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Multiplying-large-float-array-with-a-scalar-float/m-p/916121#M12603</guid>
      <dc:creator>unpocoloco1</dc:creator>
      <dc:date>2008-02-22T00:53:47Z</dc:date>
    </item>
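    <!--
    A minimal sketch of the loop from the question, written out as a self-contained C++ function. The name filterInPlace, the pointer-plus-length signature, and the write-back pData[i] = y0 are assumptions for illustration: the post says the filter modifies each element, but the posted loop never stores y0. This is plain C++, not an MKL call.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not part of MKL). Applies the recurrence
//   y[n] = a0*x[n] + a1*x[n-1] + b1*y[n-1]
// in place, starting from x[-1] = 0 and y[-1] = 0 as in the post.
void filterInPlace(float* pData, std::size_t length,
                   float a0, float a1, float b1) {
    assert(pData != nullptr || length == 0);
    float x1 = 0.0f;  // previous input
    float y1 = 0.0f;  // previous output
    for (std::size_t i = 0; i < length; ++i) {
        const float x0 = pData[i];
        const float y0 = x0 * a0 + x1 * a1 + y1 * b1;
        pData[i] = y0;  // assumption: the output overwrites the input
        x1 = x0;
        y1 = y0;
    }
}
```

    It would be called as filterInPlace(pData, LENGTHOFMYARRAY, a0, a1, b1). Keeping x1 and y1 in local variables means each iteration touches memory once for the load and once for the store.
    -->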
    <item>
      <title>Re: Multiplying large float array with a scalar float</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Multiplying-large-float-array-with-a-scalar-float/m-p/916122#M12604</link>
      <description>According to what you show, the performance bottleneck would be the serial dependency in the calculation of y0: each output needs the previous output y1, so the iterations cannot run independently. I've seen cases where icc would optimize this better with #pragma unroll(4), to cut the time spent on store forwarding.</description>
      <pubDate>Fri, 22 Feb 2008 01:52:20 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Multiplying-large-float-array-with-a-scalar-float/m-p/916122#M12604</guid>
      <dc:creator>TimP</dc:creator>
      <dc:date>2008-02-22T01:52:20Z</dc:date>
    </item>
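    <!--
    One way to picture the serial dependency described in the reply is to split the recurrence into two passes. The feedforward part a0*x[n] + a1*x[n-1] depends only on inputs, so its iterations are independent and a compiler may vectorize them; the feedback part b1*y[n-1] has to run in order. The name filterTwoPass and its signature are hypothetical, the sketch assumes separate input and output buffers, and whether the split is actually faster depends on the compiler and hardware (not measured here).

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not part of MKL). Computes
//   y[n] = a0*x[n] + a1*x[n-1] + b1*y[n-1]
// with x[-1] = 0 and y[-1] = 0. x and y must not overlap.
void filterTwoPass(const float* x, float* y, std::size_t length,
                   float a0, float a1, float b1) {
    assert(length == 0 || (x != nullptr && y != nullptr && x != y));

    // Pass 1: feedforward terms only. No iteration depends on another.
    for (std::size_t i = 0; i < length; ++i) {
        const float prev = (i == 0) ? 0.0f : x[i - 1];
        y[i] = x[i] * a0 + prev * a1;
    }

    // Pass 2: feedback term. Each y[i] needs the finished y[i-1],
    // so this loop is inherently sequential.
    float y1 = 0.0f;
    for (std::size_t i = 0; i < length; ++i) {
        y[i] += y1 * b1;
        y1 = y[i];
    }
}
```

    The second loop is the part that unrolling or any other loop transformation still has to execute in order; only the first loop is free of the dependency.
    -->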
  </channel>
</rss>

