Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

- Intel Community
- Software
- Software Development SDKs and Libraries
- Intel® oneAPI Math Kernel Library
- advice for accuracy loss on vector subtraction

- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Mute
- Printer Friendly Page

utab

Beginner

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

04-18-2012
11:38 PM

116 Views

advice for accuracy loss on vector subtraction

In my code, I am doing some vector subtractions in the form of

t = t - alpha * v

where t and v are vectors and alpha is a scalar. I am using daxpy for this however for some problems the norm of the resulting vector t, differs significantly than the value of the norm computed by MATLAB. Well, I tried almost everything I know to overcome the problem, but there is no success so far. What could be the reason of this problem?

Any pointers are appreciated.

Best,

Umut

Link Copied

19 Replies

Sergey_M_Intel2

Employee

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

04-19-2012
12:29 AM

116 Views

While it is difficult to answer without actual dataset, the typical reasonfor accuracy loss in such operation is thecatastrophic cancellation. It happens when t and alpha*v have the same sign and are very close to each other.

Known recipes are as follows:

1) The use of fused-multiply-add (FMA) - it ensures no intermediate roundings, and hence the most accurate results. Downside could be significant performance penalty if the underlying hardware doesn't support FMA

2) Compute alpha * v with some extra precision, e.g. using Dekker's algorithm for exact FP multiplication http://proval.lri.fr/gallery/Dekker.fr.html

More details about cancellation can be found, for example, here http://docs.oracle.com/cd/E19957-01/806-3568/ncg_goldberg.html

Please let me know if that helps,

Sergey

utab

Beginner

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

04-19-2012
01:11 AM

116 Views

Quoting Sergey Maidanov (Intel)

While it is difficult to answer without actual dataset, the typical reasonfor accuracy loss in such operation is thecatastrophic cancellation. It happens when t and alpha*v have the same sign and are very close to each other

Known recipes are as follows:

1) The use of fused-multiply-add (FMA) - it ensures no intermediate roundings, and hence the most accurate results. Downside could be significant performance penalty if the underlying hardware doesn't support FMA

2) Compute alpha * v with some extra precision, e.g. using Dekker's algorithm for exact FP multiplication http://proval.lri.fr/gallery/Dekker.fr.html

More details about cancellation can be found, for example, here http://docs.oracle.com/cd/E19957-01/806-3568/ncg_goldberg.html

Please let me know if that helps,

Sergey

Dear Sergey,

For 1)

I used the -fma option to the compiler but it did not make any significant changes. Well, my cpu is an Intel Core 2 duo which is not that new, so I am not sure if the fma is supported on this cpu. However, it also did not slow down the program that much, at least timings are still fine. My complete options to compiler are

icpc -Wall -g -DNDEBUG -O3 -xhost -fp-model extended -prec-div -prec-sqrt -fp-speculation safe -fma -no-fp-port -mp1 -pc80

For 2)

Well, I am not sure if I understood you completely, the algorithm given multiplies two numbers right? So are you proposing to use this algorithm to multiply each element of v by alpha?

Thanks for your interest, because I am really stuck for a long time, but at least now, I know the source of the problem is vector subtraction.

Some extra information for accuracy loss: I looked at my algorithm and some points where accuracy could be lost in addition to the above:

a.) I am scaling vectors, with reciprocals, such as using dscale in mkl, with '1.e0/scalar', can this be a problem, I am guessing that you would scale the vector with the above multiplication. For instance a C++ template is, mkl_scale( vec, 1.e0/scalar ) which calls dscale in mkl.

b.) I am using some dot products, and taking the square roots of the results of the dot products as the 'scalar' in (a).

Any ideas are appreciated, Thanks in advance.

Umut

TimP

Black Belt

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

04-19-2012
03:42 AM

116 Views

You would need to use the 32-bit icpc and specify -mia32 rather than -xhost in order to get the effects of -fp-model extended. As you say, there is not yet an Intel core CPU with fma implementation.

mkl dscale would not in any case take account of -fp-model extended.

utab

Beginner

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

04-19-2012
04:03 AM

116 Views

Thanks for the reply. At least, I seem to make some little progress, what I understood from you is

1.) do not use dscale and do the multiplications as Sergey suggested with -fp-model extended with and 32 compiler. For the moment I use a compiler for 64-bit, does not this mean that the significant is 53 bits by defaults and setting the -fp-model extended flag makes the effect to perform intermediate computations with extenden precision? This is a question that puzzles me.

There is also an 'fmal' in the cmath library. Shall I understand that since my CPU does not support FMA this function is also of no use? Can you comment on this?

2.) mp1: I get no warnings. Should I?

3.) After long trial and errors I understood that my accuracy is lost in the vector subtractions(used in orthogonalizations basically), this is a naive question but do you think changing environment and compiling on a newer system(from cpu point of view) can make a difference?

Best regards,

Umut

Sergey_M_Intel2

Employee

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

04-19-2012
04:52 AM

116 Views

For 1)

As TimP indicates it's unlikely that -fma will help as the underlying HW doesn't support FMA. What you can do is to try using math library (LIBM) software implementation for fma().

Here's the snapshot from IntelC/C++ compiler documentation:

**fma**

Description: The fma functions return (x*y)+z.

Calling interface:

double fma(double x, double y, double z);

long double fmal(long double x, long double y, long double z);

float fmaf(float x, float y, float double z);

I should say that using this software fma() equivalent will certainly have negative performance impact since your system doesn't have HW support for FMA.For 2)

Under assumption thatthe problem is due to catastrophic cancellation, your source of the problem is in the lack of extra precision when you performs multiply alpha*v

The simplest way to accomplish this is to use long double type. You will also need to use /Qlong-doubleand /Qpc80switches to enable 80-bit FP.

Based on additional information from you I see that Dekker's algorithm implementation will be the challenge, since alpha by itself is the result of a) and b). Try first the simplest method.

SUMMARY:

1) Try fma() math library function

2) Try to switch to long double to ensure alpha and alpha*v

Regards,

Sergey

utab

Beginner

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

04-19-2012
06:10 AM

116 Views

I hope I am converging. Thanks a lot for the detailed overview. There is another question from my side. I call Intel MKL routines from a C++ code, where the matrices are MTL4 matrices however they give direct pointers to internals of the matrices. I can template the matrices with 'long double's however the MKL routines accept doubles only. Does this flag -pc80 for linux take care of this problem as well?

I will let you know about the results.

Thanks a lot for the comments.

Umut

TimP

Black Belt

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

04-19-2012
06:17 AM

116 Views

You said originally you were interested in float data types. If that were the case, double normally would give you plenty of protection for extended precision.

Sergey_M_Intel2

Employee

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

04-19-2012
06:21 AM

116 Views

Unfortunately the answer is negative. -pc80 switch affects x87 code only. MKL routines typically rely on SIMD (SSE2/3/4/AVX, etc) which is not controlled by -pc80 switch.

My assumption is that you don't need to rewrite everything in long double. You need long double in relatively isolated code only.

Before going to long double option have you tried the use of fma() from LIBM? This is probly the least effortful thing to try.

Yes, please let me know your results.

Regards,

Sergey

utab

Beginner

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

04-19-2012
06:27 AM

116 Views

I am not sure if I told I used float, I generally always use doubles and that is what I do here as well. My data definitions are all doubles. So what Sergey proposed is obsolete now, unfortunately.

Edit for Sergey,

I am looking into fma stuff however there is sth that puzzles me, should I use fma in long double right, which is logical however my vectors are represented in 'double' s since MKL expects double pointers. Now the question is that how to conform this two if that is what I have to do?

Any further comments?

Best regards,

Umut

Edit

So here is a bit more information where these two subtraction operations work

1.) Mathematically, I need to do

t = t - alpha*v1 - beta*v2;

In mkl sense, I chain two vector subtractions in a sense that

t = t-alpha*v1;

t = t-beta*v2;

considering the order of left to right association of C/C++.

2. ) I need another projection of the form

t = t - X X^T M t

where X is a dense marix of m X n and M is a sparse matrix of M m X m. What I am doing is also chain the

XX^TMt

as

a.) use sparse symmetric matrix vector multiplication for Mt, and

b.) use dense matrix dense vector multiplication, transposed, for X^T(Mt)

c.) use dense matrix dense vector multiplication for X( X^T(Mt) )

d.) then do the subtraction

So basically these are the points where vector subtractions are directly involved.

I tried fma function in double format with a template like this

template

int fma_axpy( vector& v1,

vector& v2,

double alpha )

{

int m = size(v1), k;

for(k=0;k

return 0;

}

to simulate v1 = v1 + alpha * v2(with a negative alpha while calling the function)

in the subtractions in 1.) and 2.) d.) as explained above, it did not result in an improvement.

Best regards and thanks for the help,

Umut

Sergey_M_Intel2

Employee

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

04-19-2012
07:04 AM

116 Views

[ Assumption is that alpha and v[] are reasonably accurate (more than half bits of mantissa are accurate) ]

Regards,

Sergey

utab

Beginner

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

04-19-2012
08:14 AM

116 Views

Edit: here are two data sets from C++ and MATLAB

C++

7.5193727503693637e+01 4.8828125226718058e-04 4.9239756144253149e-13

MATLAB

7.5193727359034170e+01 4.8828125132781796e-04 1.1648769165048053e-12

First col is the vector norm of a linear system solution

Ax = Mq; // where Mq is a rhs coming from a matrix vector product M*q,

2nd col is

alpha = dot(x, Mq);

and 3rd col is the norm of x_hat given as

x_hat = x - alpha*q

In the first iteration the process starts the diverge for this problem and for other problems a bit later...

TimP

Black Belt

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

04-19-2012
09:33 AM

116 Views

utab

Beginner

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

04-19-2012
09:41 AM

116 Views

I do not understand what you mean now, I never used floats...

Sergey_M_Intel2

Employee

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

04-19-2012
10:58 AM

116 Views

Formally, we need to be really careful when we say 'accurate' or 'inaccurate'. WhatI getis that MATLAB and C++ results significantly deviate. It doesn't necessarily mean that C++ code is inaccurate. It also doesn't necessarily mean that MATLAB code is inaccurate. It fairly might be that both are inaccurate. The reason for that could easily be the nature of floating-point arithmetic (catastrophic cancellation) - when computation is numericallysensitive to FPformat(which is the case when catastrophic cancellation happens)then minor deviations in how computation is done may result in significantly different answers.

The first recommendations don't seem help. So the next step that I suggest is to provide us the test case (C++ might be enough as the next step) so that we can reproduce what you see.

Based on the provided test case I will try to do the numerical analysis of the computations and tell you where the inaccuracy comes from.

By the way, when you use original C++ code and fma() based one do the results deviate significantly?

Regards,

Sergey

utab

Beginner

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

04-19-2012
11:52 AM

116 Views

I checked that MATLAB code is more accurate because I can do some structural engineering checks that some values should not be as highly accurate as they should be. Having that said I know that my C++ vectors are. unfortunately not that accurate, in the sense to satifsy this simple check so in brief MATLAB vectors seem to be more closer. This seems like 99 percent true at the moment.

I can send you a personal short report if you like with some resutls that I get.

And for fma results yes.

I deeply appreciate your efforts.

Best regards,

Umut

Gennady_F_Intel

Moderator

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

04-19-2012
11:21 PM

116 Views

would you please give us, via private thread, the test example and also results you have.

I will pass all of these details to Sergey who will work with this issue.

--Gennady

utab

Beginner

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

04-19-2012
11:28 PM

116 Views

Thanks for the reply.

It is not that easy to send you a test code since that is a combination of many libraries. However, to reproduce the problem, I can send you the vectors in the expression

t = t - alpha*v

where I think the problem starts. I can send the vectors t and v written as binary files, also alpha.

So you can check with these input I think. Where shall send these files?

Best regards,

Umut

Victor_K_Intel1

Employee

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

04-20-2012
02:28 AM

116 Views

Do I understand you correctly that you solve the system of equations Ax=Mq by calling MKL and with help of Matlab. Then, you calculate alpha once again in Matlab and by calling ddot, and last, make vectors subtraction independently. If so, this explains the difference inf in first column numbers where ||x|| is shown.

But also, if the above scenario is correct the problem description you started with is a bit incorrect. The expression t=t-alpha*v is computed with slightly different t, aplha and v.

The last column of your example shows that x and q are almost proportional and this is a reason why orthogonal vector cannot be found reliably. BTW, should not the last equation go like x_hat=x-alpha*Mq?

I beleive the algarithm you implement works under assumption that x_hat!=0 and for the case when this vector is 0 it should switch to special branch or stopping criterion should be applied. Please, don't expect the vector to be exactly equal to 0 - all you see is just influence of floating point errors. And the difference between 10^-13 and 10^-12 does not look catastrofic.

Thanks

Victor

utab

Beginner

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

04-20-2012
04:08 AM

116 Views

Good catch.

I also realized this yesterday by writing the problem again and preparing data for Gennady Fedorov, it took also some time for me to realize because there are two sides of the problem. One is the structural and the other is the fluid part. The problem does not appear for the structural side but the fluid side.

Apparently, solving K_hat x = Mq and getting x with q = unit_vector / sqrt(unit_vector^T*M*unit_vector) (Btw, this is a Lanczos procedure, where q is the first Lanczos subspace vector and K_hat = K+sigma*M, sigma a proper shift for the eigenvalue problem)

results in a x vector which is a multiple of q and this leads to a problem in the orthogonalization because vectors are collinear. So there is nothing to orthogonalize. This is also evident from the norm as you realized that it is small enough.

I am trying to understand now why they somehow become proportional after the first linear equation solution phase.

Best regards,

Umut

- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page

For more complete information about compiler optimizations, see our Optimization Notice.