ippsSqrt_32f_I doesn't like denormals.
In version 5.2 it all worked fine. Starting with 5.3 (and still in the 6 beta), denormals are turned into -Inf, even though the function returns 0, i.e. no error.
We can of course filter out denormals first, but I think the function should be able to handle them for compatibility.
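For reference, the workaround of filtering denormals first could look like the minimal sketch below. It is plain standard C and does not use IPP; the helper name `flush_denormals_32f` is hypothetical, and it assumes FTZ/DAZ are not already hiding the denormals from `fpclassify`.

```c
#include <math.h>
#include <stddef.h>

/* Hypothetical helper, not part of IPP: replace every subnormal (denormal)
   value in buf with a zero of the same sign. Returns how many values
   were flushed. */
static size_t flush_denormals_32f(float *buf, size_t len)
{
    size_t flushed = 0;
    for (size_t i = 0; i < len; ++i) {
        if (fpclassify(buf[i]) == FP_SUBNORMAL) {
            buf[i] = copysignf(0.0f, buf[i]);  /* keep the sign, drop the value */
            ++flushed;
        }
    }
    return flushed;
}
```

The idea would be to run this over the buffer right before passing the same buffer to ippsSqrt_32f_I. It costs an extra pass over the data, which is why a fix inside the function would be preferable.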
This problem raises another, btw.
The thread in which I call this Sqrt normally has the FTZ & DAZ modes set. So if the function returns -Inf for denormals, it can mean one of these:
-the function switches FTZ or DAZ off (I doubt it)
-the function does its computation in integer arithmetic (most likely)
-the function has a special case for denormals (& it's then buggy)
-starting from 5.3, it uses a side thread (in which it doesn't set FTZ or DAZ), even though you tell IPP that you only want 1 thread (so I'd expect it to run in the calling thread). This would be strange, but with the 5.3 threaded version I get a big performance hit, as mentioned in another thread, which seems to show that it really does run in *1* side thread.
Now, in the multithreaded version, does each thread get the SSE flags of the calling thread? That is, if the calling thread has FTZ & DAZ set, does it mean that each of the threads will have them too, OR do I have to call ippsSetFlushToZero to make sure that all of its threads are properly set up? This may explain the performance hit I get using the multithreaded version.
Thanks for your information!
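To make the question concrete: FTZ and DAZ are bits in the MXCSR register, and each thread has its own MXCSR. The sketch below, assuming an x86 compiler that provides the standard SSE intrinsics headers, sets both bits for the calling thread only and reads them back. The two helper names are hypothetical, and whether IPP's internal worker threads inherit these bits is exactly what I don't know.

```c
#include <xmmintrin.h>  /* _mm_getcsr, _MM_SET_FLUSH_ZERO_MODE */
#include <pmmintrin.h>  /* _MM_SET_DENORMALS_ZERO_MODE */

/* Hypothetical helper: enable FTZ and DAZ in the calling thread's MXCSR.
   Other threads are not affected. */
static void enable_ftz_daz_this_thread(void)
{
    _MM_SET_FLUSH_ZERO_MODE(_MM_FLUSH_ZERO_ON);
    _MM_SET_DENORMALS_ZERO_MODE(_MM_DENORMALS_ZERO_ON);
}

/* Hypothetical helper: returns 1 if both the FTZ and DAZ bits are set
   in the calling thread's MXCSR, 0 otherwise. */
static int ftz_daz_enabled_this_thread(void)
{
    unsigned int csr = _mm_getcsr();
    return (csr & _MM_FLUSH_ZERO_MASK) != 0 &&
           (csr & _MM_DENORMALS_ZERO_MASK) != 0;
}
```

Calling `ftz_daz_enabled_this_thread()` from inside a worker thread would show whether that thread actually has the modes set.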
The bug in ippsSqrt_32f_I will be fixed in the next version.
But in any case the function's performance drops if the input data are denormalized.
thanks
But in any case the function's performance drops if the input data are denormalized
normally not if FTZ or DAZ are on, no?
You are right. Usually they help to improve the performance of calculations on denormalized data. But that isn't so in this case.