Software Tuning, Performance Optimization & Platform Monitoring
Discussion regarding monitoring and software tuning methodologies, Performance Monitoring Unit (PMU) of Intel microprocessors, and platform updating.

What happens if DAZ bit is set but isn't supported?

Rich_Skorski
Beginner

Hello,

I've been profiling some SSE instructions on our target hardware, and have stumbled into the FTZ and DAZ flags. Turning on the FTZ flag greatly increases speed, and turning on DAZ increases it a bit more (for that first instruction that gets denormal input).

This article is awesome, http://software.intel.com/en-us/articles/x87-and-sse-floating-point-assists-in-ia-32-flush-to-zero-ftz-and-denormals-are-zero-daz, and it notes that the DAZ flag was not supported on earlier hardware. There's even a link to a document that tells me how to check for DAZ support. Out of curiosity, I have to ask: what happens if you try to set the DAZ bit on hardware that doesn't support it? Does the MXCSR register change? Was it an unused bit, so that setting it is just ineffective?
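A minimal sketch of that support check, assuming a GCC- or Clang-compatible compiler on x86 with FXSAVE available (it always is on x86-64). Intel's manuals describe bit 6 of the MXCSR_MASK field in the FXSAVE image as the indicator: if it is clear, the DAZ bit is reserved, and loading MXCSR with a reserved bit set is documented to raise a general-protection fault rather than being silently ignored. The helper names read_mxcsr_mask and daz_supported are hypothetical.

```c
#include <stdint.h>
#include <string.h>

#define MXCSR_DAZ 0x0040u  /* bit 6: denormals-are-zero */
#define MXCSR_FTZ 0x8000u  /* bit 15: flush-to-zero */

/* Hypothetical helper: return MXCSR_MASK as reported by FXSAVE.
   Assumes the CPU supports FXSAVE (true for every x86-64 part). */
static uint32_t read_mxcsr_mask(void) {
    /* FXSAVE needs a 512-byte, 16-byte-aligned save area. */
    struct { uint8_t bytes[512]; } __attribute__((aligned(16))) area;
    uint32_t mask;

    memset(&area, 0, sizeof area);
    __asm__ volatile("fxsave %0" : "=m"(area));

    /* MXCSR_MASK lives at byte offset 28 of the save image. */
    memcpy(&mask, area.bytes + 28, sizeof mask);

    /* A mask of zero means "use the default mask", which excludes DAZ. */
    return mask ? mask : 0x0000FFBFu;
}

/* Hypothetical helper: 1 if the DAZ bit may be set in MXCSR, else 0. */
static int daz_supported(void) {
    return (read_mxcsr_mask() & MXCSR_DAZ) != 0;
}
```

A caller would test daz_supported() before OR-ing MXCSR_DAZ into the value it loads into MXCSR.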

TimP
Honored Contributor III
I think I remember CPUs where it was possible to flip the DAZ bit with no effect. As I understand it, the Core i7-2 architecture is supposed to eliminate the effect of the FTZ and DAZ settings on performance in the cases normally encountered.
Rich_Skorski
Beginner
Thanks for the info! My Core i7-2600 does handle denormals the same as normal floats for certain instructions. I don't have an extensive list of how they all perform, but I profiled pairs of addps and mulps instructions over 100,000,000 iterations. Here are my results; they're estimates in milliseconds:

Instruction  Normals  Denormals  FTZ+DAZ  DAZ   FTZ
addps        58.5     58.5       58.5     58.5  58.5
mulps        59       8050       59       59    4120

I can't complain about that; in fact I'm impressed that addps works just as fast with or without denormals. I was tipped off about the difference in denormal handling between certain instructions by research that Bruce Dawson had done, http://www.altdevblogaday.com/2012/05/20/thats-not-normalthe-performance-of-odd-floats/.

I've attached the code that is profiled, for anyone who is curious. Addps and Mulps are the important functions; the rest sets MXCSR with the right flags and copies normal/denormal values into the source.
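The attachment is not reproduced here. As a rough illustration of the flag-setting part only, the following sketch (not the attached code) runs one mulps under a chosen FTZ/DAZ setting and then restores MXCSR. The helper name mul_with_flags is hypothetical, and the sketch assumes an x86 compiler that provides xmmintrin.h and a CPU that supports DAZ.

```c
#include <float.h>
#include <xmmintrin.h>

#define MXCSR_DAZ 0x0040u  /* bit 6: treat denormal inputs as zero */
#define MXCSR_FTZ 0x8000u  /* bit 15: flush denormal results to zero */

/* Hypothetical helper: multiply a by b with mulps while only the given
   FTZ/DAZ flags are set, then restore the caller's MXCSR.
   Assumes DAZ is supported; setting it elsewhere would fault. */
static float mul_with_flags(float a, float b, unsigned flags) {
    /* volatile keeps the compiler from folding the multiply at build
       time or moving it outside the region where the flags are set. */
    volatile float va = a, vb = b, out;
    unsigned saved = _mm_getcsr();

    _mm_setcsr((saved & ~(MXCSR_FTZ | MXCSR_DAZ)) | flags);
    out = _mm_cvtss_f32(_mm_mul_ps(_mm_set1_ps(va), _mm_set1_ps(vb)));
    _mm_setcsr(saved);

    return out;
}
```

With no flags set, FLT_MIN times 0.5f yields a denormal; with MXCSR_FTZ that result is flushed to zero, and with MXCSR_DAZ a denormal input is read as zero before the multiply. A timing harness like the one described above would wrap the multiply in a loop between the two _mm_setcsr calls.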