Intel® oneAPI Math Kernel Library
Ask questions and share information with other developers who use Intel® Math Kernel Library.

A fairly trivial question

Deleted_U_Intel
Employee

A seemingly trivial question really.

I am trying to calculate the sum of squares of a vector of single-precision values while maximizing performance and minimizing the accumulated rounding error typically found when summing floats.

Currently I am using a method like the following:

---------8<---------
// Use vsPowx to square the values in a vector (y[i] = a[i]^2)
vsPowx(n, a, 2.0f, y);

// Sum the squared values in a loop
float sumOfSquares = 0.0f;
for ( int i = 0; i < n; i++ ) {
    sumOfSquares += y[i];
}
---------8<---------

Can anyone think of a faster and/or more accurate MKL way of summing a vector of floats (in this case absolute values) without casting to doubles?

Ilya_B_Intel
Employee

In most cases the following example will be faster:

for ( i = 0; i < n; i++ )
{
    sumOfSquares += a[i] * a[i];
}
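For reference, the loop above can be wrapped in a small self-contained function. This is only a sketch: the name `sumOfSquaresFused` is hypothetical and not part of MKL. It squares and accumulates in a single pass, so the temporary vector `y` from the question is never written or re-read, which is one likely reason this form tends to be faster.

```cpp
#include <cstddef>

// Hypothetical helper (not an MKL routine): squares and accumulates in one
// pass, so no temporary vector of squares is stored.
float sumOfSquaresFused(const float* a, std::size_t n) {
    float sum = 0.0f;
    for (std::size_t i = 0; i < n; ++i) {
        sum += a[i] * a[i];  // square and add in the same iteration
    }
    return sum;
}
```

Calling `sumOfSquaresFused(a, n)` replaces both the `vspowx` call and the summation loop from the question. Whether the compiler vectorises the loop depends on the compiler and its floating-point settings, since vectorising a float reduction changes the order of the additions.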

If your compiler cannot vectorise this loop, the following code might be faster than the previous example (but accuracy will suffer):

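// snrm2 returns the Euclidean norm, sqrt(sum of a[i]^2);
// incx is the stride between elements (1 for a contiguous array)
sumOfSquares = snrm2(&n, a, &incx);
sumOfSquares *= sumOfSquares;  // square the norm to recover the sum of squares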
