- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Hi, everyone,
I have a double-precision code, which I am sure it is right. while I upgrade it to quad-precision, the output are wild, What is wrong? Any idea?
I have a double-precision code, which I am sure it is right. while I upgrade it to quad-precision, the output are wild, What is wrong? Any idea?
Link Copied
17 Replies
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Without seeing an example, my first guess is that you have an argument type mismatch when calling a procedure. Try building with "-gen-interface -warn interface".
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Hi, Steve,
I tried, but the output is still bad.
My compile options is
ifort -O0 -g -fp-model strict -real-size 64 -integer-size 16 -mcmodel=medium -shared-intel -gen-interface -warn interface a.f90
Include the interface option, as you advised.
And in terminal, I set
ulimit -s 1024000
export KMP_STACKSIZE='1024000K'
By the way, all the math functions in the quad-precion code, use quad-precision, such as qsqrt, instead of dsqrt.
Quoting - Steve Lionel (Intel)
I tried, but the output is still bad.
My compile options is
ifort -O0 -g -fp-model strict -real-size 64 -integer-size 16 -mcmodel=medium -shared-intel -gen-interface -warn interface a.f90
Include the interface option, as you advised.
And in terminal, I set
ulimit -s 1024000
export KMP_STACKSIZE='1024000K'
By the way, all the math functions in the quad-precion code, use quad-precision, such as qsqrt, instead of dsqrt.
Quoting - Steve Lionel (Intel)
Without seeing an example, my first guess is that you have an argument type mismatch when calling a procedure. Try building with "-gen-interface -warn interface".
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
You should be using the generic sqrt instead. Specific intrinsic names are for those stuck in Fortran 66.
Care to show an example?
Care to show an example?
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
I'd be a little worried about integer-size 16 with real-size 64, although it ought to be OK.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
All the functions specified for quad-precision, were changed to the general ones, but still keep quad-precision specifications for real numbers, such as 1.0Q0
Unfortunately, I am lazy, use
implicit real(kind=16) (a-h,o-z), integer(kind=8) (i-n)
at the beginning of every subroutine. Will that be a cause?
Now compiler with
ifort -O0 -real-size 128 -intger-size 64 -zero -mcmodel=large -shared-intel a.f90 as suggested by Tim.
But Still Not works.
God, it has been two weeks for this bug.
Quoting - Steve Lionel (Intel)
Unfortunately, I am lazy, use
implicit real(kind=16) (a-h,o-z), integer(kind=8) (i-n)
at the beginning of every subroutine. Will that be a cause?
Now compiler with
ifort -O0 -real-size 128 -intger-size 64 -zero -mcmodel=large -shared-intel a.f90 as suggested by Tim.
But Still Not works.
God, it has been two weeks for this bug.
Quoting - Steve Lionel (Intel)
You should be using the generic sqrt instead. Specific intrinsic names are for those stuck in Fortran 66.
Care to show an example?
Care to show an example?
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Please provide us a test case so that we can see what the problem is.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Please provide us a test case so that we can see what the problem is.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Quoting - wsec303
http://software.intel.com/en-us/forums/showannouncement.php?a=79
and upload a.f90 and tell us how to run it, and what you expect the output to be.
OR you can open a problem report at http://premier.intel.com and upload the file there, if you don't want the whole world to see it.
ron
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Thanks, but it is required to keep the code confidential. No offend.
One issue, that just jumps out of my head, is the double precision code made some modifications to avoid overflow.
The modifications are something like when the real number reaches the limit, then give the real number a specific value that below the limit. Will that be the cause?
Or the problem is caused by the accumulation of round-off errors?
How to debug efficiently using idb, to compare the double-precision code running results with the quad-precision code running results, to digg out where the problem is?
The code consists of a lot of subroutines, so if you can tell me how to debug efficiently and speculate the running outputs, maybe help a lot.
Quoting - Ronald Green (Intel)
One issue, that just jumps out of my head, is the double precision code made some modifications to avoid overflow.
The modifications are something like when the real number reaches the limit, then give the real number a specific value that below the limit. Will that be the cause?
Or the problem is caused by the accumulation of round-off errors?
How to debug efficiently using idb, to compare the double-precision code running results with the quad-precision code running results, to digg out where the problem is?
The code consists of a lot of subroutines, so if you can tell me how to debug efficiently and speculate the running outputs, maybe help a lot.
Quoting - Ronald Green (Intel)
Steve has the instructions at the top of of the Forum, it is here:
http://software.intel.com/en-us/forums/showannouncement.php?a=79
and upload a.f90 and tell us how to run it, and what you expect the output to be.
OR you can open a problem report at http://premier.intel.com and upload the file there, if you don't want the whole world to see it.
ron
http://software.intel.com/en-us/forums/showannouncement.php?a=79
and upload a.f90 and tell us how to run it, and what you expect the output to be.
OR you can open a problem report at http://premier.intel.com and upload the file there, if you don't want the whole world to see it.
ron
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
What I usually do is add WRITE statements in various places that write out intermediate values using the same format. I'll also add some sort of tag in the line so I can tell where it's coming from. I then run the program the "old" way and then the "new" way and compare the output files to see where the results start to diverge.
I usually need to refine this - once I've identified a section of code that creates the difference, I start writing out more detail until I find the specific operatiion and values that is the culprit. Then and only then can I determine what the problem is. I find this method easier than using the debugger.
If you can't provide us the code, then you'll have to do this on your own. There is no substitute for analysis.
Note that if you have a license which entitles you to support, your submissions to Intel Premier Support are kept confidential. Whatever happens, please let us know how it turns out.
I usually need to refine this - once I've identified a section of code that creates the difference, I start writing out more detail until I find the specific operatiion and values that is the culprit. Then and only then can I determine what the problem is. I find this method easier than using the debugger.
If you can't provide us the code, then you'll have to do this on your own. There is no substitute for analysis.
Note that if you have a license which entitles you to support, your submissions to Intel Premier Support are kept confidential. Whatever happens, please let us know how it turns out.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Yes, it would be perplexing as to how your code could ever have worked. You imply taking log() of a negative number. If you meant abs() or some more careful treatment, you needed to write that in your source code.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
I agree with Tim - it is a mystery how this code worked before. Did you have it working in quad precision with the other compilers? Does the double-precision version also try to take logs of negative numbers? It could be that you're ending up with NaNs that are propagating or being dropped later.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Sorry. I provided wrong information.
The bug should be in the code below.
q=cmplx(-1.1,0.0)
q2=q**(0.25)
For quad-precison, the output of q2 is NaN, but db-precision, it works.
I really don't know why.
My solution to this bug is rewritting the code as
q=dcmplx(-1.1,0.0)
q2=qcmplx(q**0.25d0)
The bug should be in the code below.
q=cmplx(-1.1,0.0)
q2=q**(0.25)
For quad-precison, the output of q2 is NaN, but db-precision, it works.
I really don't know why.
My solution to this bug is rewritting the code as
q=dcmplx(-1.1,0.0)
q2=qcmplx(q**0.25d0)
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Quoting - wsec303
Sorry. I provided wrong information.
The bug should be in the code below.
q=cmplx(-1.1,0.0)
q2=q**(0.25)
For quad-precison, the output of q2 is NaN, but db-precision, it works.
I really don't know why.
My solution to this bug is rewritting the code as
q=dcmplx(-1.1,0.0)
q2=qcmplx(q**0.25d0)
The bug should be in the code below.
q=cmplx(-1.1,0.0)
q2=q**(0.25)
For quad-precison, the output of q2 is NaN, but db-precision, it works.
I really don't know why.
My solution to this bug is rewritting the code as
q=dcmplx(-1.1,0.0)
q2=qcmplx(q**0.25d0)
This explains why your program has worked in the past -- you weren't really taking a negative number to a real power, just a complex number that looks something like a negative number.
1. If the statements above do what you say in a small (<10 line) program, you can submit such a program in a bug report to Intel without compromising the confidentiality of the original code.
2. Another alternative you could try, either as a workaround or to further study the bug, is
q=cmplx(-1.1,0.0)
q2=exp(0.25*log(q))
This latter expression is what q**(0.25) should translate into, but by writing it explicitly, you may either get the right answer, or be able to determine whether it is the log or the exp that is producing the NaN.
3. If your actual code really looks like these snippets, I should point out that although you are computing in quad or double precision, you are only producing results correct to single precision levels, for a couple of reasons:
a. 1.1 is a single precision constant. Converting it to quad precision later doesn't eliminate the error inherent in its value. If you are using a constant, you need to say something like -1.1q0 or -1.1_quad (provided you define quad to be a constant with the KIND value for quad procision).
b. Unless you specify the optional KIND argument, the cmplx intrinsic only produces a single precision result. Instead of what you have written, you might want to do something like
q=cmplx(-1.1q0,0.0q0,kind=kind(q))
c. I note in passing that nominally you have the same problem with 0.25 as you have with 1.1. In practice, however, 0.25 is exact in binary and thus can be converted between single precision and quad precision without loss of accuracy, unlike 1.1 which can only be approximated in binary and thus has less accuracy in single precision than quad precision.
4. If q really is defined by a constant, I would use a literal complex constant such as (-1.1q0,0.0q0) rather than using the cmplx intrinsic. [I would do this both for better readability and because some compilers might actually be stupid enough to execute the cmplx at run-time instead of at compile-time.]
If it is not constant, but the imaginary part really always is zero, I would omit the second argument rather trying to provide an explicit zero of the appropriate kind. [This would primarily be for readability, but also to avoid the opportunity for errors by me or the compiler in the interpretation of an explicit imaginary part.]
-Kurt
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Quoting - hirchert
q2=exp(0.25*log(q))
-Kurt
As Kurt mentions, it is somewhat curious that you would use single precision constants in quad arithmetic. By invoking several direct conversions from single to quad precision, you may have found something which hasn't been tested adequately.
If the argument were not complex quad, sqrt(sqrt(q)) could be expected to be more accurate and/or faster than q**.25. If it's important, it should be tested for this case.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Quoting - hirchert
This explains why your program has worked in the past -- you weren't really taking a negative number to a real power, just a complex number that looks something like a negative number.
1. If the statements above do what you say in a small (<10 line) program, you can submit such a program in a bug report to Intel without compromising the confidentiality of the original code.
2. Another alternative you could try, either as a workaround or to further study the bug, is
q=cmplx(-1.1,0.0)
q2=exp(0.25*log(q))
This latter expression is what q**(0.25) should translate into, but by writing it explicitly, you may either get the right answer, or be able to determine whether it is the log or the exp that is producing the NaN.
3. If your actual code really looks like these snippets, I should point out that although you are computing in quad or double precision, you are only producing results correct to single precision levels, for a couple of reasons:
a. 1.1 is a single precision constant. Converting it to quad precision later doesn't eliminate the error inherent in its value. If you are using a constant, you need to say something like -1.1q0 or -1.1_quad (provided you define quad to be a constant with the KIND value for quad procision).
b. Unless you specify the optional KIND argument, the cmplx intrinsic only produces a single precision result. Instead of what you have written, you might want to do something like
q=cmplx(-1.1q0,0.0q0,kind=kind(q))
c. I note in passing that nominally you have the same problem with 0.25 as you have with 1.1. In practice, however, 0.25 is exact in binary and thus can be converted between single precision and quad precision without loss of accuracy, unlike 1.1 which can only be approximated in binary and thus has less accuracy in single precision than quad precision.
4. If q really is defined by a constant, I would use a literal complex constant such as (-1.1q0,0.0q0) rather than using the cmplx intrinsic. [I would do this both for better readability and because some compilers might actually be stupid enough to execute the cmplx at run-time instead of at compile-time.]
If it is not constant, but the imaginary part really always is zero, I would omit the second argument rather trying to provide an explicit zero of the appropriate kind. [This would primarily be for readability, but also to avoid the opportunity for errors by me or the compiler in the interpretation of an explicit imaginary part.]
-Kurt
Thanks. You advise really helps.
One quick question: where you learn that A**b, is translated into exp(b*log(A)) in Ifort.
I am not questioning that, just want to know where you find this?
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Quoting - wsec303
Quoting - hirchert
2. Another alternative you could try, either as a workaround or to further study the bug, is
q=cmplx(-1.1,0.0)
q2=exp(0.25*log(q))
This latter expression is what q**(0.25) should translate into, but by writing it explicitly, you may either get the right answer, or be able to determine whether it is the log or the exp that is producing the NaN.
-Kurt
q=cmplx(-1.1,0.0)
q2=exp(0.25*log(q))
This latter expression is what q**(0.25) should translate into, but by writing it explicitly, you may either get the right answer, or be able to determine whether it is the log or the exp that is producing the NaN.
-Kurt
Thanks. You advise really helps.
One quick question: where you learn that A**b, is translated into exp(b*log(A)) in Ifort.
I am not questioning that, just want to know where you find this?
I was making an assertion about mathematics, not ifort specifically, so we're talking about something I learned in high school. 40+ years later, I don't remember exactly where, but I suspect it may have been in my calculus class.
It is my experience that for non-integer b, every Fortran compiler I've ever used computes A**b effectively this way. Some have literally used the log and exp intrinsics; others used slightly tweaked equivalents to improve speed, accuracy, and/or range. [E.g., because the internal arithmetic of the computer is done base 2, the actual computation may be a bit of a cross between exp(b*log(A)) and 2**(b*log2(A)).] I have no specific knowledge of what ifort uses, so my suggestion allowed that this workaround might either give you the right answer (if the bug was entirely in a tweaked implementation of A**b) or it might provide insight into which half of the computation had the bug (if ifort actually uses the exp and log intrinsics or if the code in its A**b implementation is copied the buggy code from one of them).
Reply
Topic Options
- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page