Intel® oneAPI Math Kernel Library
Ask questions and share information with other developers who use Intel® Math Kernel Library.

Multiplying large float array with a scalar float

unpocoloco1
Beginner
Hi all,
I've been using the Intel Math Kernel Library for a full day now, so I apologize if this is a newbie question. Hopefully it is, actually, and there's an easy answer for it.

I have a large (1-D) float array of data. I am running a filter on the data which essentially just modifies each element of data as a sum of current & previous inputs, multiplied by scalar coefficients (filter taps), plus current and previous outputs, multiplied by filter taps. Kind of like this:

// x's are my inputs, y's are my outputs, a's and b's are the coefficients
float* pData = &myLargeFloatArray[0];
float x0=0, x1=0, y0=0, y1=0;
float a0, a1, b0, b1; // Pretend these are filled in to whatever values
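for (DWORD i=0; i<numSamples; i++) { // numSamples: element count of myLargeFloatArray (name assumed)
    x0 = pData[i];
    y0 = x0*a0 + x1*a1 + y1*b1;
    pData[i] = y0; // write the filtered sample back, since the filter modifies the data in place
    x1 = x0;
    y1 = y0;
}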



Okay. Now in the above for loop, this gets VERY inefficient when the array size gets large. I assume the line calculating y0 is especially slow. Is there a more efficient way of doing this, other than element by element?

My thoughts:
1. Maybe there is a function I don't know about that can quickly multiply a scalar times a vector (both floats). vsMul needed two vectors of the same length, and when I tried creating an array containing copies of a0, a1, and b1, I didn't see any performance improvement.
2. Maybe there are some functions that do IIR filtering?


Thanks for your help! I greatly appreciate it.
TimP
Honored Contributor III
According to what you show, the performance bottleneck would be the serial dependency on y1: each new output needs the previous output, so one iteration cannot start its final add until the one before it has finished. I've seen cases where icc optimizes this better with #pragma unroll(4), which cuts the time spent on store forwarding.
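To make the dependency concrete, here is a minimal sketch in plain C. It is not an MKL routine and not a tuned implementation. It separates the feed-forward terms (a0 and a1, which depend only on inputs) from the feedback term (b1, which depends on the previous output). The function names iir_reference and iir_split, the separate output buffer y, and the zero initial state are assumptions made for illustration. The first pass has no loop-carried dependency, so a compiler is free to vectorize it; the second pass is still serial. Whether the split is faster than the original loop would need to be measured.

```c
#include <assert.h>
#include <stddef.h>

/* Reference: the loop from the question, writing each result to y[i].
   Initial state (x1, y1) is assumed to be zero, as in the original post. */
void iir_reference(const float *x, float *y, size_t n,
                   float a0, float a1, float b1)
{
    assert(x != NULL && y != NULL);
    float x1 = 0.0f, y1 = 0.0f;
    for (size_t i = 0; i < n; ++i) {
        float x0 = x[i];
        float y0 = x0 * a0 + x1 * a1 + y1 * b1;
        y[i] = y0;
        x1 = x0;
        y1 = y0;
    }
}

/* Split version: x and y must be distinct, non-overlapping buffers,
   because pass 1 reads x[i - 1] after y[i - 1] has been written. */
void iir_split(const float *x, float *y, size_t n,
               float a0, float a1, float b1)
{
    assert(x != NULL && y != NULL);
    if (n == 0) {
        return;
    }

    /* Pass 1: feed-forward part. Each y[i] depends only on inputs,
       so iterations are independent of one another. */
    y[0] = x[0] * a0;                 /* previous input taken as zero */
    for (size_t i = 1; i < n; ++i) {
        y[i] = x[i] * a0 + x[i - 1] * a1;
    }

    /* Pass 2: feedback part. Each step needs the previous output,
       so this loop stays serial. */
    float y1 = 0.0f;
    for (size_t i = 0; i < n; ++i) {
        y1 = y[i] + y1 * b1;
        y[i] = y1;
    }
}
```

Calling iir_split(x, y, n, a0, a1, b1) with a separate output array gives the same values as the original loop up to floating-point rounding. The only work left on the critical path is one multiply and one add per sample.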