Intel® Fortran Compiler
Build applications that can scale for the future with optimized code designed for Intel® Xeon® and compatible processors.

Changes to floating point operations since version 10.1.30

mostlyAtNight
Beginner
Hello All,

Since moving from version 10.1.30 to version 11.* of the Intel Visual Fortran compiler, I've noticed differences in the way my application behaves with respect to floating-point operations.

My application solves simultaneous equations using matrix techniques. The version of my program compiled with version 11 of the compiler is no longer able to find a solution to a problem that the software solves successfully when compiled with version 10.1.30.

Have any changes been made to the way floating-point operations are handled in version 11 of the compiler?

I've tried playing with a few of the floating point options in the project properties but unfortunately this does not fix the problem.

Any help or suggestions would be much appreciated, as at the moment we are having to make a release of our software using Intel Visual Fortran version 10.

Regards,

Pete
8 Replies
TimP
Honored Contributor III
The default architecture of the 32-bit (ia32) 10.1 compiler, which did not use SSE, is selected in 11.x by /arch:ia32.
The new default /arch:SSE2 (same for ia32 and intel64) should work more reliably if you set /assume:protect_parens /Qprec-div /Qprec-sqrt or (more conservative) /fp:source. If you depended on implicit promotion of single precision expressions to double, you would need to write it in explicitly.
If I didn't make suitable guesses about what you are doing, please be more specific.
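
On the question of implicit promotion, writing it explicitly would look roughly like this (a sketch only; the program and variable names are placeholders, not anything from your code):

program explicit_promotion
  implicit none
  real(4) :: a, b, c
  real(8) :: d
  a = 1.0; b = 2.0; c = 3.0
  ! dble(a)*b is real(8), so the product and the sum are carried in
  ! double precision regardless of the /arch setting
  d = dble(a)*b + c
  print *, d
end program explicit_promotion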
mostlyAtNight
Beginner
Hi tim18,

Thanks for your fast response.

I'll try compiling again with those options and see if it improves things.

You make an interesting point about implicit promotion of single precision to double precision - I'm going to investigate this too.

Thanks again for your help.

Regards,

Pete

mostlyAtNight
Beginner
Hi tim18,

I've just tried those options and /arch:ia32 does the trick.

In this case it seems that I am losing precision due to the compiler taking advantage of SSE instructions.

Is this a general disadvantage of using SSE or could there be particular areas of my code that need adjusting to minimise the loss of precision?

PS. I also tried the other options you mentioned to improve the accuracy of the SSE instructions but unfortunately these were not able to solve my problem.

Kind regards,

Pete

TimP
Honored Contributor III
Quoting - mostlyAtNight

With /arch:ia32, single precision expressions are promoted to double, giving more accuracy. With optimization, the promotion may even persist across assignments. SSE code doesn't do that, except where you specify it explicitly, e.g. dble(A)*B+C. The old option /Op does some of this even with SSE, but it's slow. It's possible that changing critical code sections to declared double precision may be in order.
There are a few cases where specifying a more accurate order of expression evaluation may do away with a requirement for extra precision:
A*(B-C) rather than A*B - A*C (Fortran permits a compiler to do this automatically, but don't count on it)
(a+b)*(a-b) rather than a**2 - b**2
(A-B) + (C-D) (with /assume:protect_parens) rather than A-B+C-D, if all variables have the same sign
Polynomial evaluation by Horner's rule, with a few minor enhancements:
X + X*X*(A2 + X*A3) (polynomial, A1==1, A0==0)
For a sum of several terms, promotion to double precision often is the best way.
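
As a rough sketch (the program and variable names here are placeholders, purely to illustrate the rewrites listed above), the source-level forms look like this:

program rewrite_examples
  implicit none
  real(4) :: a, b, c, d, x, a2, a3, p, q, r, s
  a = 1.0; b = 2.0; c = 3.0; d = 4.0
  x = 0.5; a2 = 0.25; a3 = 0.125
  p = a*(b - c)             ! rather than a*b - a*c
  q = (a + b)*(a - b)       ! rather than a**2 - b**2
  r = (a - b) + (c - d)     ! grouping preserved under /assume:protect_parens
  s = x + x*x*(a2 + x*a3)   ! Horner form, with A1==1 and A0==0 folded in
  print *, p, q, r, s
end program rewrite_examples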
mostlyAtNight
Beginner
Hi tim18,

That explains things nicely - I think we'll now be looking to convert the majority of our code to double precision.

Thanks again for your help.

Regards,

Pete
Ilie__Daniel
Beginner
Quoting - tim18

Does this mean that if I redeclare all my real(4) variables as real(8), I could take advantage of the SSE2 instructions without the loss of precision?
Does /arch:ia32 promote expressions to double precision, even if they are, let's say, a simple sum of two integer(4) variables?
Does it make sense then, to redeclare all integer(4) to integer(8)?

Kind regards,
Daniel.
Steven_L_Intel1
Employee

First of all, there's no effect on integer operations.

/arch:ia32 will cause some single-precision intermediate operations to be done in double precision, which can, as you find, give inconsistent results. Using SSE instructions minimizes those surprises, but you lose the "extra" precision you had with x87 instructions. If your program needs more precision than what you declared, then sure, use double precision instead.
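
For instance (a hypothetical accumulation loop, not anything from the original code), declaring the accumulator double precision keeps the extra precision in the sum whichever /arch setting is used:

program dp_accumulate
  implicit none
  integer :: i
  real(4) :: terms(1000)
  real(8) :: total
  call random_number(terms)          ! stand-in data for the illustration
  total = 0.0d0
  do i = 1, size(terms)
    total = total + dble(terms(i))   ! each term is promoted; the sum is carried in real(8)
  end do
  print *, total
end program dp_accumulate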
TimP
Honored Contributor III

64-bit integers could lose performance in a 32-bit build, without, as Steve said, any direct connection with the other issues you have raised. The only reason I have seen for use of 64-bit integers to go along with promotion of source code from single to double precision is in the case where COMMON alignments demand the same size integers and floats.
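
For instance (a hypothetical COMMON block, purely to illustrate the alignment point): if a 4-byte integer sat ahead of 8-byte reals, the reals would start at offset 4 and be misaligned, whereas widening the integer to integer(8) keeps every member naturally aligned.

program common_alignment
  implicit none
  integer(8) :: n           ! widened so that coeff starts on an 8-byte boundary
  real(8)    :: coeff(100)
  common /solver_data/ n, coeff
  n = 100_8
  coeff = 0.0d0
  print *, n, coeff(1)
end program common_alignment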