Long double precision calculation

jambulat · ‎04-08-2011

Hello!

I tried to use Intel C++ compiler because it seems possible to perform calculations with more than double accuracy. But simple test shows that no accuracy rising when long double is used. Of course, my code was compiled with /Qlong_double key. Just look at it:

#include
#include "stdio.h"

template
T f(T x)
{
return x*x*x/3;
}

template
T diff_f(T x)
{
const T eps = 1e-45;
T dx = 1e-4;
T fp_prev, fp_curr;

fp_curr = (f(x+dx) - f(x))/dx;

do{
dx /= 2;
fp_prev = fp_curr;
fp_curr = (f(x+dx) - f(x))/dx;
} while(abs(fp_curr-fp_prev) > eps);

return fp_curr;
}

int main()
{
//float x = 1.0;
double x = 1.0;
//long double x = 1.0;
printf("f'(%.2f)-1=%e\\n",(double)x,(double)(diff_f(x)-1));
printf("Press enter...");
getchar();
return 0;
}

It calculates derivative of x^3/3 function at the point x=1. When used type float difference between calculation and the real answer is 1.36e-3, for double - 4.75e-8, and for long double - also 4.75e-8. Could you please explain me why there is no increasing of accuracy?

mecej4 · ‎04-08-2011

This is a classic question in numerical analysis.

There are two sources of error in computing a derivative using a forward difference approximation:

1. Truncation (of Taylor series) error:

f'(x) = [f(x + h) - f(x)]/h + O(h) . f''(x)

where O is Landau's big-O notation.

and

2. Machine round-off error, which is f(x)O()/h from the above expression, where is the machine epsilon, assuming that function evaluations are performed to full machine precision (not always true).

We may, in fact, insert a factor of 2 since f(x + h) could be rounded up and f(x) rounded down.

Putting the two pieces together, you see that the total error in f'(x) is of the order of

f'' h + f / h

give or take a multiplier of O(1) in the second term.

This result shows a notable property: simply reducing h, or reducing by moving from single precision to double precision, is not necessarily going to improve the numerical approximation. In fact, there exists an optimum h corresponding to the machine epsilon for each precision selected. If h is too large, the truncation error is large. If h is too small, the machine round-off error is large. The optimum h is of the order of ( f / f'') .

If this property is overlooked and h is made too small, you will be left with the absurd result that the derivative is zero.

Your convergence test while(abs(fp_curr-fp_prev) > eps); is, for the reasons listed above, incorrect. You can never obtain convergence to the level of precision that you asked for : 1E-45, even with quadruple precision. Secondly, it can happen that | fp_curr -fp_prev | even becomes zero, but the "converged" value is still inaccurate.