Hi all,
I tried to assign a variable declared as real(8) to another one declared as real(16) simply with
b=a
However, the result is not what I expected. For example, if
a=2.48740685923698
then
b=2.48740685923698290338279548450373
However, I want b to be equal to
2.48740685923698000000000000000000
exactly.
Could anyone tell me how to assign the value of a to b?
Thanks,
Zhanghong Tang
The simple assignment of a REAL(8) to a REAL(16) makes the binary values exactly equal.
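To illustrate the point above, here is a minimal sketch (not from the original posts). It assumes a compiler that provides the `real64` and `real128` kinds from `iso_fortran_env` (Fortran 2008); with an older compiler, `real(8)` and `real(16)` declarations would play the same role. The program name and the variable `c` are made up for the example.

```fortran
program widen_real8
  use, intrinsic :: iso_fortran_env, only: real64, real128
  implicit none
  real(real64)  :: a
  real(real128) :: b, c

  a = 2.48740685923698_real64     ! nearest binary64 value to the decimal literal
  b = a                           ! widening: b holds the same binary value as a
  c = 2.48740685923698_real128    ! nearest binary128 value to the same literal

  print '(a, f36.32)', 'a widened to real(16): ', b
  print '(a, f36.32)', 'real(16) literal:      ', c
  ! Converting back must reproduce a, since nothing was lost when widening.
  print '(a, l1)',     'round trip equals a:   ', real(b, real64) == a
  ! b and c differ because a itself was already rounded to 53 bits.
  print '(a, es12.4)', 'b - c = ', real(b - c, real64)
end program widen_real8
```

The extra digits seen in `b` are not noise added by the assignment. They are the decimal expansion of the binary value that `a` already held, which only becomes visible once more digits are printed.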
Thank you both very much!
The story comes from the following calculation:
I want to solve the linear system A*x=b.
I get two approximate solutions, x and x', which satisfy the following equations:
b-A*x=r, ||r|| < eps1*||b||
r-A*x'=r', ||r'|| < eps2*||r||
Then I think the solution
xnew=x+x'
is the higher-precision solution, since
b-A*(x+x')=r' and
||r'|| < eps1*eps2*||b||
However, when I calculate ||b-A*xnew||, the value is still about eps1*||b||, not the eps1*eps2*||b|| I hoped for. Furthermore, I verified two things:
||b-A*x-A*x'|| is about eps1*eps2*||b||
||b-A*x'-A*x|| is about eps1*||b||
So I suspected some error in calculating ||b-A*x|| with the real(8) data type, and I switched to real(16) to calculate ||b-A*x|| after obtaining the solutions x and x' (which are still calculated in real(8)). But the result did not improve. Then I found that when I assign a real(8) value to a real(16) variable, they are 'not equal'...
Do you think it is possible to get higher precision with the current data type?
Thanks,
Zhanghong Tang
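The scheme described in this post (solve for x, form the residual r, solve A*x'=r, add the correction) is essentially classical iterative refinement. Below is a small self-contained sketch of that idea, under several assumptions that are not from the thread: a hypothetical 4x4 Hilbert test matrix, a plain dense Gaussian elimination standing in for the actual AMG/PCG solver, and a residual accumulated in real(16) via the `real128` kind. One thing worth keeping in mind: if xnew is stored in real(8), it carries only about 16 significant digits, so the residual of the stored xnew can hardly fall much below roughly 1e-16*||A||*||x||, whatever eps1*eps2 suggests.

```fortran
program refine_demo
  use, intrinsic :: iso_fortran_env, only: dp => real64, qp => real128
  implicit none
  integer, parameter :: n = 4
  real(dp) :: a(n,n), b(n), x(n), dx(n), r(n)
  integer  :: i, j, it

  ! Hypothetical test problem: a mildly ill-conditioned Hilbert matrix.
  do j = 1, n
     do i = 1, n
        a(i,j) = 1.0_dp / real(i + j - 1, dp)
     end do
  end do
  b = 1.0_dp

  call solve(a, b, x)                  ! first approximate solution x
  do it = 1, 3
     r = residual(a, x, b)             ! r = b - A*x, accumulated in real(16)
     print '(a, i0, a, es10.3)', 'step ', it, '  max|r| = ', maxval(abs(r))
     call solve(a, r, dx)              ! correction x' from A*x' = r
     x = x + dx                        ! xnew = x + x', rounded to real(8)
  end do

contains

  function residual(a, x, b) result(r)
    real(dp), intent(in) :: a(:,:), x(:), b(:)
    real(dp) :: r(size(b))
    ! Widening is exact, so the products and sums below operate on the
    ! stored real(8) values of a, x and b, only with a longer mantissa.
    r = real(real(b, qp) - matmul(real(a, qp), real(x, qp)), dp)
  end function residual

  subroutine solve(a, b, x)
    ! Plain Gaussian elimination with partial pivoting, all in real(8).
    real(dp), intent(in)  :: a(:,:), b(:)
    real(dp), intent(out) :: x(:)
    real(dp) :: m(size(b), size(b) + 1), row(size(b) + 1)
    integer  :: k, p, i, n
    n = size(b)
    m(:, 1:n) = a
    m(:, n+1) = b
    do k = 1, n
       p = k - 1 + maxloc(abs(m(k:n, k)), dim=1)      ! pivot row
       row = m(k, :);  m(k, :) = m(p, :);  m(p, :) = row
       do i = k + 1, n
          m(i, :) = m(i, :) - (m(i, k) / m(k, k)) * m(k, :)
       end do
    end do
    do k = n, 1, -1                                    ! back substitution
       x(k) = (m(k, n+1) - dot_product(m(k, k+1:n), x(k+1:n))) / m(k, k)
    end do
  end subroutine solve
end program refine_demo
```

For a large sparse system, the `residual` function is the only part that would need the higher precision; the solver itself can stay in real(8).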
Having some noise in your data is very common, and you should try to solve the problem in the way everybody else does: Numerical Analysis. Get an estimate of the condition number, try least squares, perform singular value decomposition, etc... And if all else fails, try some iterative methods (GMRES comes to my mind as a last resort)... But you've already tried all that, haven't you?
John.
Hi John,
Thank you very much for your detailed reply.
On the contrary, for this problem some iterative methods can reach a higher-precision solution, i.e., ||b-A*x|| can be reduced to eps1*eps2*||b||, except that they need thousands of iterations, which is too slow to be acceptable.
Then I tried the AMG method. However, I found that after the residual has reached a small value, such as eps1*||b||, the convergence becomes very slow, or the iteration diverges. So I tried to solve the residual equation again and got another approximate solution, just as I described previously. I think the solution x+x' has the higher precision. But the result is not what I wanted.
Do you have any thoughts on this method? Do you think there is something wrong in my previous formula?
Thanks,
Zhanghong Tang
You said that you obtained two solutions for your problem... Are those solutions being obtained for the same region (e.g., with similar initial guesses)?
Also, combining AMG with another iterative method could help: For example, a few hundred iterations with a preconditioned CG method (e.g., Jacobi) to obtain an initial guess for your MG method.
And, don't try x+x' as your solution only because you think so, unless you have a good explanation for that. The solution might indeed be the linear combination c*x+d*x', but here you're just picking c=d=1 automatically.
John.
It's likely to be more practical to work on improving the accuracy of your original solution. Steps in that direction include:
1) LU factorization using vectorized dot products, depending on improved ordering and batching of sums to increase effective precision as well as optimizing speed
2) LU factorization using x87 extended precision dot products (you must set precision mode to 64 bits, if using a compiler or OS which sets 53-bit precision mode).
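As a rough illustration of why the precision of the dot-product accumulator matters, here is a small sketch. It does not use x87 extended precision; as a portable stand-in it accumulates in real(16) via the `real128` kind, which is an assumption about the compiler. The data and the function names `dot_dp` and `dot_qp` are made up for the example. The exact dot product of the two vectors is 1.

```fortran
program dot_demo
  use, intrinsic :: iso_fortran_env, only: dp => real64, qp => real128
  implicit none
  real(dp) :: x(3), y(3)

  ! Hypothetical data chosen so that the middle term can be lost to rounding.
  x = [1.0e16_dp, 1.0_dp, -1.0e16_dp]
  y = [1.0_dp,    1.0_dp,  1.0_dp]

  print *, 'real(8)  accumulator: ', dot_dp(x, y)
  print *, 'real(16) accumulator: ', dot_qp(x, y)

contains

  function dot_dp(x, y) result(s)
    ! Straightforward loop with a real(8) running sum.
    real(dp), intent(in) :: x(:), y(:)
    real(dp) :: s
    integer  :: i
    s = 0.0_dp
    do i = 1, size(x)
       s = s + x(i) * y(i)
    end do
  end function dot_dp

  function dot_qp(x, y) result(s)
    ! Same loop, but products and running sum are kept in real(16)
    ! and rounded to real(8) only once, at the end.
    real(dp), intent(in) :: x(:), y(:)
    real(dp) :: s
    real(qp) :: acc
    integer  :: i
    acc = 0.0_qp
    do i = 1, size(x)
       acc = acc + real(x(i), qp) * real(y(i), qp)
    end do
    s = real(acc, dp)
  end function dot_qp
end program dot_demo
```

With a strict 53-bit real(8) accumulator the first result may come out as 0 rather than 1, although the exact behaviour depends on the compiler, the optimization level, and whether x87 or SSE arithmetic is used.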
Hi John,
Thank you again for your answer.
1. Both solutions are obtained from the same matrix A, starting from an initial guess of ZERO; only the right-hand sides differ.
2. I also tried combining AMG with PCG, CGS and so on; the results are almost the same. The higher-precision result is obtained from PCG with the ILU(0) preconditioner.
3. Mathematically, c and d are both 1. Do you think it is possible to get a higher-precision solution from x and x', including by a linear combination? And how would the linear combination be explained?
Thanks,
Zhanghong Tang
Hi Tim,
Thank you very much for your reply.
Firstly, I think it's hard to further improve the original solution x, so I tried to find another solution based on the residual equation.
I do not fully understand your suggestion about using vectorized dot products. Currently, most of the computation consists of matrix-vector products and vector-vector dot products, and I use MKL's sparse matrix functions for them. The only code I wrote myself is the matrix-matrix-matrix multiplication, since I have not found a function for it.
Regarding your second suggestion, what do you mean by setting the precision mode? Can I set it directly in the project and run without changing any code? My environment is IVF10+MKL9.1+VS.net2005 on Windows XP.
Thanks,
Zhanghong Tang
64-bit precision mode would enable more accurate double precision dot products, as a less expensive step than real(16). Changing precision mode presumably violates the calling convention for Microsoft library functions. Given the trend away from use of 32-bit compilers and x87 floating point, these possibilities may not be worth consideration.
Hi Tim,
Thank you very much for your kind reply. But could you please tell me where to set the option /pc80? I set the option and the following messages appear:
ifort: command line warning #10006: ignoring unknown option '/pc80'
LINK : warning LNK4044: unrecognized option '/pc80'; ignored
In addition, do you mean that if this option is set, the program will run in 64-bit precision mode without any change to the code? Does the 64-bit precision mode only make the dot products more accurate, or all operations?
Thanks,
Zhanghong Tang
In this case x=2 and x=3 are prime numbers (sort of "linearly independent"). Can you guarantee that your x and x' are linearly independent?
Maybe you should just forget about the idea of combining x and x', and stick to Tim's suggestions.
One more thing: Have you tried with b, instead of ZERO, as your initial guess?
John.
Hi John,
Thanks again for your kind reply.
In my case the problem to be solved is a linear system, so I can say that xnew=x+x'. Do you mean that if the data are near machine precision, the problem could become nonlinear? That would be terrible; I hope it doesn't.
I have tried with b as the initial guess. The solution is of course the same, because everything else is the same: the solver and the inputs A and b.
Thanks,
Zhanghong Tang
Hi Tim,
Thanks for your help! I can build the program with the option /Qpc80 (set in both the .cfg file and the project settings). But it didn't improve the result.
Are there any other special settings? How can I see the difference with and without the option?
Thanks,
Zhanghong Tang
But again, it would be better for you to abandon the idea of trying to combine solutions to different data... And maybe rechecking the statement of your problem (since precision and faster solvers didn't seem to help).
John.
That's not surprising. The processor has a limited number of registers for doing arithmetic in 80 bits, so variables are sent to memory, rounded to double (64 bits), and rounded again when they are refetched. You have little control over this activity. The 2 extra bytes extend the precision and range so as to safeguard against under/overflow in intermediate calculations. As this is done in hardware, it is quite fast relative to multiple-precision arithmetic.
I agree with John's suggestion that you revisit the formulation of the problem, which is presumably physically based, and look at conditioning considerations. If it's badly conditioned, then this isn't about to change by throwing more and more resources at it.
Have you considered using interval or stochastic arithmetic techniques? Also, try posting to the sci.math.numerical-analysis forum for further advice.
Gerry
re: "However, I wish b be equal to 2.48740685923698000000000000000000 exactly."
& "What you appear to be asking is impossible"
At least for IVF10, would something along the lines of the following not achieve the original intent (albeit "non-portable")?
b = 1.Q-14 * QEXT ( INT( a * 1.d14 ) )
Edit: Or perhaps that needs to be
b = 1.Q-14 * QEXT ( INT( a * 1.d14, 8 ) )
David
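A more portable variant of the same idea might look like the sketch below. It is only an illustration under stated assumptions: the `real64`, `real128` and `int64` kinds from `iso_fortran_env` are available, the value has at most 14 digits after the decimal point, and the scaled value fits in a 64-bit integer. It uses `NINT` rather than `INT`, so that a scaled value landing just below the intended integer is rounded rather than truncated, and it divides by `1.0e14` in real(16), which is exactly representable there, instead of multiplying by the inexact `1.Q-14`.

```fortran
program snap_decimal
  use, intrinsic :: iso_fortran_env, only: dp => real64, qp => real128, int64
  implicit none
  real(dp) :: a
  real(qp) :: b

  a = 2.48740685923698_dp

  ! Scale to an integer count of 1e-14 units, round to nearest,
  ! then rescale in real(16).
  b = real(nint(a * 1.0e14_dp, kind=int64), qp) / 1.0e14_qp

  print '(f36.32)', b
end program snap_decimal
```

Even so, `b` is the real(16) value nearest to the decimal number, not the decimal number itself, since 2.48740685923698 has no finite binary expansion.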
Thank you all very much!
I will study my problem again carefully and then report the results to you. My problem is this: for a given large sparse matrix (more than 1 million unknowns), none of the AMG-preconditioned iterative methods can reach a given precision (for example, 1.e-10). But some ILU(0)-preconditioned iterative methods can still reach that precision after thousands of iterations. I don't know what leads to this behavior.
Gerry, you mentioned 'interval or stochastic arithmetic techniques'. Where can I find an introduction to them? Would they help solve my problem? By the way, I can't open the forum you mentioned.
Thanks,
Zhanghong Tang
If you're lucky and Gauss-Seidel requires only one million iterations, then you shouldn't complain if the MG-ILU method requires something between 10,000-100,000 iterations (that's 1-10% of the iterations required by GS).
MG is one of the fastest methods available... But don't expect miracles.
John.
