I have a CFD code. When I compile it with GFortran, I usually use the flags "-O3 -march=x86-64-v2 -ffpe-trap=invalid,zero,overflow -g -fbacktrace". Floating-point trapping is extremely useful when debugging, and this works great: if there is a divide by zero or a use of Inf or NaN, the program stops and lets me know where the problem is.

For those not familiar with GFortran: by default it sets "-ffp-contract=fast", which allows optimizations that deviate from strict floating-point semantics, such as FMA contraction. However, by default it does not set "-ffast-math", and it does not generate any instructions that break floating-point exception trapping.

Performance matters very much to me, and I do not care if the compiler chooses more efficient but less precise math function implementations, evaluates expressions in a different order, or applies other optimizations that change the last decimals of my answers. But I still want the floating-point exception traps to work!

Now, how do I achieve this in IFX?

I tried to read up on the IFX documentation for -fp-model. It is clear that it sets "-fp-model=fast" by default. If I turn on "-fpe0", I get lots of false-positive floating-point errors, so my program does not work. I tried "-fp-model=strict", which gives me a **huge** performance penalty. I believe the "strict" floating-point mode is way stricter than it needs to be. As previously mentioned, I am fine with additional round-off errors, re-ordering, FMA, etc.

Can anyone give me hints about flags that give the best possible performance, yet still do not generate invalid floating-point numbers in IFX?

Until you get an answer and/or fix....

Object files are interoperable between ifort and ifx.

First possible (interim) workaround: compile the main program (the PROGRAM procedure) with ifort using the -ffpe... option, and compile the remainder using ifx. This presumes that the FP error state is set once, at program initialization.

Second possible workaround, if the FP error state is set (or reset) elsewhere, which would negate the first workaround: compile the potentially problematic files using ifort and the remainder with ifx.

Third possible workaround:

Using ifx alone, make use of IEEE_EXCEPTIONS to manipulate the exception conditions.
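
One possible reading of this third suggestion, as a minimal sketch: switch the divide-by-zero trap off around a region known to raise it spuriously, then clear the flag and restore the previous mode. The program and variable names are illustrative, and the explicit 1.0/sqrt(0.0) only stands in for such a region. Whether the IEEE modules behave this way under ifx's default fast model has not been tested here.

```fortran
program halting_sketch
   use, intrinsic :: ieee_exceptions
   implicit none
   logical :: saved_halt, raised
   real, volatile :: zero, r   ! volatile: keep the arithmetic at run time

   if (.not. ieee_support_halting(ieee_divide_by_zero)) &
      stop 'halting control is not supported on this processor'

   ! Trap on divide by zero in general (similar in spirit to -fpe0).
   call ieee_set_halting_mode(ieee_divide_by_zero, .true.)

   ! Around the troublesome region: remember the mode, then stop trapping.
   call ieee_get_halting_mode(ieee_divide_by_zero, saved_halt)
   call ieee_set_halting_mode(ieee_divide_by_zero, .false.)

   zero = 0.0
   r = 1.0/sqrt(zero)          ! raises the flag, but no longer halts

   ! Read and clear the flag BEFORE re-enabling the trap.
   call ieee_get_flag(ieee_divide_by_zero, raised)
   call ieee_set_flag(ieee_divide_by_zero, .false.)
   call ieee_set_halting_mode(ieee_divide_by_zero, saved_halt)

   if (.not. raised) error stop 'expected the divide-by-zero flag to be set'
   if (.not. saved_halt) error stop 'expected the saved mode to be halting'
   print *, 'flag raised without halting; r =', r
end program halting_sketch
```

The cost is that real errors inside the unguarded region go unnoticed unless the flag is inspected, as the sketch does with ieee_get_flag.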

Jim Dempsey

The problem is not setting "-fpe0" in IFX; the problem is that the code IFX generates has false positives and triggers errors in places where there should be none. However, I believe this is in the "spirit" of "-fp-model=fast".

Consider the following example:

```
DO i = 1, icells
   nxi = area(1, i)
   nyi = area(2, i)
   nzi = area(3, i)
   nn = SQRT(nxi**2 + nyi**2 + nzi**2)
   IF (nn > TINY(1.0)) THEN
      nvecs(1, i) = nxi/nn
      nvecs(2, i) = nyi/nn
      nvecs(3, i) = nzi/nn
   ELSE
      nvecs(1, i) = 0.0
      nvecs(2, i) = 0.0
      nvecs(3, i) = 0.0
   END IF
END DO
```

With "-O3 -xSSE4.2" and nothing else, the square root is compiled into the "rsqrtps" instruction, which computes the reciprocal of the square root, 1/sqrt(x), in one instruction and thereby saves three divisions later. This is of course a very good optimization, and exactly the kind of thing I expect "-fp-model=fast" to do.

The problem occurs when area, and therefore nxi, nyi and nzi, are zero; then the argument to the square root function is zero. This is perfectly valid: the square root of zero is zero. The code that follows is also perfectly valid: it takes into account that a zero might appear and skips the divide by zero in that case. So the code is OK.

However, since ifx aggressively compiles the square root into "rsqrtps", i.e. the reciprocal, we have a 1/0: the result of the rsqrtps is Inf, and the resulting operation raises a floating-point error, which I think is also perfectly valid, since the result actually is Inf...
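
The effect can be illustrated in isolation with a small sketch that records flags instead of halting. It uses an ordinary division as a stand-in for the reciprocal instruction, so it does not reproduce ifx's actual code generation; the program and variable names are made up for illustration. It only shows that sqrt(0) is quiet, while forming the reciprocal and then multiplying a zero component by it raises the divide-by-zero and invalid flags that "-fpe0" would trap on.

```fortran
program speculative_reciprocal
   use, intrinsic :: ieee_exceptions
   use, intrinsic :: ieee_arithmetic, only: ieee_is_nan
   implicit none
   real, volatile :: s, nn, rinv, prod   ! volatile: no compile-time folding
   logical :: div0, inval

   ! Record exceptions in flags rather than halting on them.
   call ieee_set_halting_mode(ieee_divide_by_zero, .false.)
   call ieee_set_halting_mode(ieee_invalid, .false.)
   call ieee_set_flag(ieee_divide_by_zero, .false.)
   call ieee_set_flag(ieee_invalid, .false.)

   s = 0.0                     ! nxi**2 + nyi**2 + nzi**2 for a zero area

   ! As written in the source: sqrt(0) = 0 raises nothing.
   nn = sqrt(s)
   call ieee_get_flag(ieee_divide_by_zero, div0)
   call ieee_get_flag(ieee_invalid, inval)
   if (div0 .or. inval) error stop 'sqrt(0) unexpectedly raised a flag'

   ! Stand-in for the speculated reciprocal: 1/0 = Inf, divide-by-zero flag.
   rinv = 1.0/sqrt(s)
   call ieee_get_flag(ieee_divide_by_zero, div0)
   if (.not. div0) error stop 'expected divide-by-zero from 1/sqrt(0)'

   ! Scaling a zero component by that reciprocal: 0*Inf = NaN, invalid flag.
   prod = s*rinv
   call ieee_get_flag(ieee_invalid, inval)
   if (.not. inval) error stop 'expected invalid from 0*Inf'
   if (.not. ieee_is_nan(prod)) error stop 'expected NaN from 0*Inf'

   print *, 'nn =', nn, ' rinv =', rinv, ' prod =', prod
end program speculative_reciprocal
```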

Consider:

```
DO i = 1, icells
   nxi = area(1, i)
   nyi = area(2, i)
   nzi = area(3, i)
   nn = nxi**2 + nyi**2 + nzi**2
   IF (nn > 0.0) THEN
      nn = SQRT(nn) ! note sqrt(TINY(1.0)) ~= 1.0842022E-19
   ELSE
      nn = HUGE(1.0) ! force n?i / nn to 0.0
   END IF
   nvecs(1, i) = nxi/nn
   nvecs(2, i) = nyi/nn
   nvecs(3, i) = nzi/nn
END DO
```

I suspect the above (untested) code will not generate the rsqrtps (but you should check).
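
A small harness for trying the loop above, as a sketch only: the two-cell test data and the program name are invented for illustration. It checks the arithmetic of the rewrite (a 3-4-5 cell normalizes to 0.6, 0.0, 0.8, a zero cell stays zero, and no divide-by-zero or invalid flag is raised). It says nothing about which instructions a given compiler selects; that still needs a look at the generated assembly.

```fortran
program guarded_normalize
   use, intrinsic :: ieee_exceptions
   implicit none
   integer, parameter :: icells = 2      ! illustrative size
   real :: area(3, icells), nvecs(3, icells)
   real :: nxi, nyi, nzi, nn
   logical :: div0, inval
   integer :: i

   ! Record exceptions in flags rather than halting on them.
   call ieee_set_halting_mode(ieee_divide_by_zero, .false.)
   call ieee_set_halting_mode(ieee_invalid, .false.)
   call ieee_set_flag(ieee_divide_by_zero, .false.)
   call ieee_set_flag(ieee_invalid, .false.)

   area(:, 1) = [3.0, 0.0, 4.0]          ! ordinary cell
   area(:, 2) = [0.0, 0.0, 0.0]          ! degenerate cell

   do i = 1, icells
      nxi = area(1, i)
      nyi = area(2, i)
      nzi = area(3, i)
      nn = nxi**2 + nyi**2 + nzi**2
      if (nn > 0.0) then
         nn = sqrt(nn)
      else
         nn = huge(1.0)                  ! forces each component to 0.0
      end if
      nvecs(1, i) = nxi/nn
      nvecs(2, i) = nyi/nn
      nvecs(3, i) = nzi/nn
   end do

   call ieee_get_flag(ieee_divide_by_zero, div0)
   call ieee_get_flag(ieee_invalid, inval)
   if (div0 .or. inval) error stop 'unexpected floating-point exception flag'
   if (any(abs(nvecs(:, 1) - [0.6, 0.0, 0.8]) > 1.0e-6)) &
      error stop 'cell 1 is not the expected unit vector'
   if (any(nvecs(:, 2) /= 0.0)) error stop 'cell 2 should be all zeros'
   print '(3f8.4)', nvecs
end program guarded_normalize
```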

Jim Dempsey

By the way, instead of producing a unit vector of [0.0, 0.0, 0.0] consider if your results might be better served by producing a random unit vector. For example, a collision of particles will rebound in some arbitrary direction (conserving momentum) as opposed to collecting into a point location (not conserving momentum).

Jim Dempsey
