Detecting Floating Point Exceptions

davidrobb · ‎09-12-2006

We are writing a cross-platform library consisting of a mixture of optimised C routines and IPP math functions.

We wish to be able to detect whether various floating point exceptions have occurred, namely FPE_INVALID, FPE_DIVBYZERO and FPE_OVERFLOW.

Browsing on the net suggests that we should be able to use the services provided by <fenv.h> to do this. This works well in most cases, but we run into problems when we call various IPP routines. These appear to mask any results that we might expect to find via the fetestexcept() calls.

One solution is to interpret the IPP return status and raise the appropriate exception, e.g. raise FPE_INVALID when ippsSqrt returns ippStsSqrtNegArg. This is OK for a number of cases but breaks down with ippsExp_32f_I(), which does not return a code for overflow (example source below).

Checking the fixed precision versions of the same function, I see I can call ippsExp_32f_A24() and get the overflow reported via ippStsOverflow (and, strangely, the FPE_OVERFLOW flag is already set). However, this routine does not appear to have an in-place flavour. Can I call it with src and dest pointing to the same buffer?

Is there a recommended method for achieving what I want?

I have had a brief look at handling the exceptions using sigaction() but was quickly scared off on discovering that doing so would require manual disassembly of the code causing the exception and manipulation of the program counter to recover.

Code snippet below:-

#include <fenv.h>
#include <stdio.h>

#include "ipp.h"

void printFPE()
{
  if( fetestexcept(FE_OVERFLOW))
  {
     printf(" Floating point overflow\n");
  }
  if( fetestexcept(FE_INVALID))
  {
     printf(" Floating point invalid\n");
  }
  if( fetestexcept(FE_DIVBYZERO))
  {
     printf(" Floating point div by zero\n");
  }
}

int main()
{
// #pragma STDC FENV_ACCESS ON

   feclearexcept( FE_ALL_EXCEPT);
   float f = 90.0f;
   const IppStatus st = ippsExp_32f_I( &f, 1);

   printFPE();

   printf(" f is %g.  Ipp returned %d\n", f, st);

// #pragma STDC FENV_ACCESS OFF
}

Vladimir_Dudnik · ‎09-15-2006

Hello

Please take a look at our expert's comment:

IPP does not mask any results. In the case of Exp_32f it is known that all input values above 88.72284 or below -87.33654 lead to overflow and underflow respectively, so there is no reason to do the calculation for these inputs.

Note that processing FP exceptions in the FPU is very slow; you will lose the performance benefits provided by IPP in that case.

Regards,
Vladimir

davidrobb · ‎09-16-2006

Many thanks for the response, but this raises a few further questions:-

Presumably, you only pay the performance penalty if the exceptions do occur, and IPP chooses to threshold the values and explicitly set 0 or inf for values outside the bounds you mention.

99.99% of our data should be within range, so really we just want to flag that at least one item was out of range on the rare occasion that this does happen.

The ippsExp_32f_A24() routine does appear to allow us to do this. However, the documentation does not mention whether we can call it with in-place data, i.e. p_src == p_dest. It appears to work in the one test I did, but can I rely on it continuing to work in the future?

Regards,

David

Vladimir_Dudnik · ‎09-19-2006

Hello,

Here is a comment from our experts:

All fixed accuracy math functions (including ippsExp_32f_A24()) support in-place processing, and during product validation we test in-place operation. Thank you for bringing this documentation issue up. We will try to make the documentation clearer regarding in-place support in the fixed accuracy math functions.

Regards,
Vladimir

davidrobb · ‎09-19-2006

Hi Vladimir,

Thanks for that. That's good news about the in-place functionality. We should have a way forward with detecting the FP exceptions in that case.

As an aside, I'm now puzzled by the behaviour of std::sqrt( float f). This produces a NaN and FP_INVALID for sqrt( inf) but a NaN and no exception for sqrt( -1.0f)!

Also, a 1.0f / 0.0f produces FP_INVALID rather than FP_DIVBYZERO when vectorised!

I need to do some more testing to establish whether it's just with this P4 SSE3 CPU or common to other CPU types.

Questions to go and tackle the compiler writers with rather than here!

Regards

Dave Robb