ippsSqrt_32f_I doesn't like denormals.
In version 5.2 it all worked fine. Starting with 5.3 (and still in the 6 beta), denormals are turned into -Inf, even though the function returns 0 (ippStsNoErr, i.e. no error).
We can of course filter out denormals first, but I think the function should be able to handle them for compatibility.
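For reference, the "filter out denormals first" workaround could look roughly like the sketch below. It is only an illustration: `flush_denormals_32f` is a made-up helper name, not an IPP function, and it assumes `float` is an IEEE-754 binary32. It tests the bit pattern rather than using a floating-point comparison, so the result does not depend on whether DAZ is set in the calling thread.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not part of IPP): replaces every denormal element
   of buf with a zero of the same sign, in place, and returns how many
   elements were flushed. Assumes float is IEEE-754 binary32. */
size_t flush_denormals_32f(float *buf, size_t len)
{
    size_t flushed = 0;
    for (size_t i = 0; i < len; ++i) {
        uint32_t bits;
        memcpy(&bits, &buf[i], sizeof bits);      /* read the raw bit pattern */
        uint32_t exponent = bits & 0x7F800000u;
        uint32_t mantissa = bits & 0x007FFFFFu;
        if (exponent == 0 && mantissa != 0) {     /* denormal: exp 0, mantissa != 0 */
            bits &= 0x80000000u;                  /* keep only the sign bit */
            memcpy(&buf[i], &bits, sizeof bits);
            ++flushed;
        }
    }
    return flushed;
}
```

The buffer would be passed through this helper right before the in-place Sqrt call. It costs an extra pass over the data, which is why I'd rather not need it.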
This problem raises another, btw.
My thread where I call this Sqrt normally has FTZ & DAZ modes set. So if this Sqrt function returns -Inf for them, it can mean one of these:
- the function switches FTZ or DAZ off (I doubt it)
- the function does its computation in integer arithmetic (most likely)
- the function has a special case for denormals (and that case is buggy)
- starting from 5.3, it uses a side thread (in which it doesn't set FTZ or DAZ), even when you tell IPP that you only want 1 thread (in which case I'd expect it to run in the calling thread). This would be strange, but with the 5.3 threaded version I get a big performance hit, as mentioned in another thread, which suggests that it really does run in *1* side thread.
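The first hypothesis can be checked directly by reading MXCSR before and after the call. The sketch below is only an illustration under some assumptions: it targets x86 with SSE, `call_preserves_ftz_daz` and `stand_in_call` are made-up names, and `stand_in_call` is a placeholder to be swapped for a small wrapper around the real Sqrt call. FTZ is bit 15 and DAZ is bit 6 of MXCSR.

```c
#if defined(__SSE__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 1)
#include <xmmintrin.h>            /* _mm_getcsr */
#define HAVE_MXCSR 1
#else
#define HAVE_MXCSR 0              /* no MXCSR on this target */
#endif

#define MXCSR_DAZ 0x0040u         /* bit 6: denormals-are-zero */
#define MXCSR_FTZ 0x8000u         /* bit 15: flush-to-zero */

typedef void (*inplace_fn)(float *buf, int len);

/* Placeholder for the call under test (hypothetical): replace the body
   with the real in-place Sqrt call. Here it just halves each element. */
void stand_in_call(float *buf, int len)
{
    for (int i = 0; i < len; ++i)
        buf[i] *= 0.5f;
}

/* Returns 1 if the FTZ/DAZ bits of the calling thread are the same after
   fn() as before, 0 if they changed, -1 if MXCSR is not available. */
int call_preserves_ftz_daz(inplace_fn fn, float *buf, int len)
{
#if HAVE_MXCSR
    unsigned int before = _mm_getcsr() & (MXCSR_FTZ | MXCSR_DAZ);
    fn(buf, len);
    unsigned int after = _mm_getcsr() & (MXCSR_FTZ | MXCSR_DAZ);
    return before == after;
#else
    fn(buf, len);
    return -1;
#endif
}
```

This only catches a change that is still visible after the call returns. A function that clears the flags internally and restores them before returning would not be detected, and neither would work done in another thread.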
Now, in the multithreaded version, does each thread get the SSE flags of the calling thread? That is, if the calling thread has FTZ & DAZ set, will each of the worker threads have them set too, OR do I have to call ippsSetFlushToZero to make sure that all of its threads are properly set up? This may explain the performance hit I get with the multithreaded version.
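For threads I create myself, the defensive approach is to set the flags explicitly at the top of each thread function, since MXCSR is per-thread state and I don't want to rely on what a new thread inherits. A minimal sketch, assuming x86 with SSE and a CPU that supports DAZ (`enable_ftz_daz_this_thread` is a made-up name, not an IPP function):

```c
#if defined(__SSE__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 1)
#include <xmmintrin.h>            /* _mm_getcsr, _mm_setcsr */
#define HAVE_MXCSR 1
#else
#define HAVE_MXCSR 0              /* no MXCSR on this target */
#endif

#define MXCSR_DAZ 0x0040u         /* bit 6: denormals-are-zero */
#define MXCSR_FTZ 0x8000u         /* bit 15: flush-to-zero */

/* Hypothetical helper: sets FTZ and DAZ in the MXCSR of the calling
   thread only. Returns 1 if both bits read back as set, 0 if not,
   -1 if MXCSR is not available on this target.
   Assumes the CPU supports DAZ; very old SSE CPUs fault on that bit. */
int enable_ftz_daz_this_thread(void)
{
#if HAVE_MXCSR
    const unsigned int want = MXCSR_FTZ | MXCSR_DAZ;
    _mm_setcsr(_mm_getcsr() | want);
    return (_mm_getcsr() & want) == want;
#else
    return -1;
#endif
}
```

That doesn't help with threads IPP creates internally, though, since I have no entry point into them, which is why I'm asking what the library does there.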