I was enjoying reading the Intel 17 Fortran documentation when I came across this page about setting the FTZ and DAZ flags: https://software.intel.com/en-us/node/678362
First, this page has a lovely table explaining the FTZ and DAZ compiler flags. So I went to Linux, loaded the module for the Intel 17 compilers, and tested ifort with -ftz and -daz. The -ftz flag works, but the compiler reports that it doesn't recognize the -daz option. So I went back to the documentation and looked for a flag related to DAZ in the alphabetical list of compiler options (https://software.intel.com/en-us/node/677967). I cannot find anything related to DAZ. So what is the discussion on the first page about FTZ and DAZ flags? What is the actual compiler flag related to DAZ that is shown in the table?
Second, the page describing the ftz flag (https://software.intel.com/en-us/node/678362) references a negative form of the flag without specifying its syntax or spelling. I looked for it in the alphabetical list of compiler options (https://software.intel.com/en-us/node/677967) and could not find it there either. Eventually I found the syntax in the Intel 16 Fortran compiler documentation (-no-ftz). Why not make the Intel 17 Fortran compiler documentation as readable as the 16 documentation?
Sorry for the confusion. The discussion in Setting the FTZ/DAZ Flags is not about compiler “flags” (i.e. options or switches, depending on your preference); in this context “flags” is synonymous with “bits” within the MXCSR register. Both FTZ and DAZ are influenced by the -ftz option, as discussed within the topic.
I will notify the tech writers about the missing negative form of the option within the Setting the FTZ/DAZ topic; it would help to include it. It does appear on the -ftz option page in 17.0 (https://software.intel.com/en-us/node/678138). The negative forms of options are not included in the alphabetical listing.
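To make the "flags are bits" point concrete, here is a small sketch that treats an ordinary integer as a stand-in for the MXCSR and tests the two bits. The bit positions (FTZ = bit 15, DAZ = bit 6) and the power-on default value 1F80 hex are taken from the Intel architecture manuals, not from this thread, and the program does not read or write the real register.

```fortran
program mxcsr_bits
  implicit none
  ! Bit positions per the Intel architecture manuals (assumption:
  ! not stated anywhere in this thread).
  integer, parameter :: ftz_bit = 15, daz_bit = 6
  integer :: mxcsr

  ! An ordinary integer standing in for the register. 1F80 hex is the
  ! documented power-on default: exceptions masked, FTZ and DAZ clear.
  mxcsr = int(z'1F80')
  write(*,'(a,z4.4,2(a,l1))') 'MXCSR=', mxcsr, &
       '  FTZ=', btest(mxcsr, ftz_bit), '  DAZ=', btest(mxcsr, daz_bit)

  ! Setting both bits models the state with FTZ and DAZ both enabled.
  mxcsr = ibset(ibset(mxcsr, ftz_bit), daz_bit)
  write(*,'(a,z4.4,2(a,l1))') 'MXCSR=', mxcsr, &
       '  FTZ=', btest(mxcsr, ftz_bit), '  DAZ=', btest(mxcsr, daz_bit)
end program mxcsr_bits
```

The first line printed should show MXCSR=1F80 with both flags F, the second MXCSR=9FC0 with both flags T.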
Hope that helps.
Be careful with -ftz, especially if you use library routines that do not expect flush-to-zero and denormals-are-zero mode to be selected in the MXCSR! Note also that catch-all flags such as -fast can turn FTZ and DAZ on. Misuse of -ftz can introduce bugs that are very hard to catch. See this cautionary tale: https://forums.roguewave.com/showthread.php?1427-Strange-bug-(-)-in-FNL7-routine-IVOAM&p=4077&viewfu... .
Intel compilers apply -ftz (on by default) only in the initialization code compiled into the main program. The option has no effect when compiling subroutines. You can override it by calling the ieee_set_underflow_mode() subroutine from the ieee_arithmetic module (USE ieee_arithmetic), which would be the most legitimate way to change the setting later (besides the SSE intrinsics supported by most C/C++ compilers). You might note that IEEE_arithmetic is not guaranteed to work unless -fp-model strict is set, but other settings will not prevent ieee_set_underflow_mode from taking control.
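A minimal sketch of the ieee_set_underflow_mode() approach mentioned above, using only the standard ieee_arithmetic module. The program name and the choice of tiny(x)*0.5 as the test value are mine, and the VOLATILE attribute is there only to discourage the compiler from folding the multiplication at compile time. What gets printed depends on the compiler, its options, and the processor, so no expected output is claimed.

```fortran
program underflow_mode_demo
  use, intrinsic :: ieee_arithmetic
  implicit none
  real, volatile :: x      ! volatile: keep the multiply at run time
  logical :: gradual

  ! Not every processor/compiler combination lets the program switch modes.
  if (.not. ieee_support_underflow_control(x)) then
    print *, 'underflow control is not supported for default real'
    stop
  end if

  ! Report the mode left in place by the main program's start-up code.
  call ieee_get_underflow_mode(gradual)
  print *, 'gradual underflow at startup: ', gradual

  ! Request gradual underflow: tiny(x)/2 should survive as a denormal.
  call ieee_set_underflow_mode(.true.)
  x = tiny(x)
  x = x*0.5
  print *, 'gradual mode, tiny(x)*0.5 = ', x

  ! Request abrupt underflow: the same product should flush to zero.
  call ieee_set_underflow_mode(.false.)
  x = tiny(x)
  x = x*0.5
  print *, 'abrupt mode,  tiny(x)*0.5 = ', x
end program underflow_mode_demo
```

Because the mode is changed at run time, this works from any routine, not just the main program, which is the point of preferring it over the compile-time option.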
Beginning with Intel Sandy Bridge CPUs, the performance implications of -no-ftz were corrected as far as addition (but not multiplication) is concerned, so there is something to be said for making a practice of using that setting, at least for host CPUs and targets which use glibc or newlib. I used to take care to set abrupt underflow for gcc/gfortran when comparing performance with Intel compilers, but don't find the need for that on recent CPUs.
indicates that MIC KNL (like KNC) is not designed to perform well with setting other than -ftz.
Contrary to the URL mentioned by mecej4, the Microsoft math libraries used by ifort and icl expect abrupt underflow setting and can't be relied upon with sub-normals. It seems a dilemma if IMSL expects a different setting from Microsoft libraries.
In my experience, it is difficult to find a performance implication of DAZ setting (assuming that FTZ is set for a CPU which needs it). It would apply only to data which are initialized by constants or read from a data file or external application, if the current application is set so that it doesn't generate sub-normals.
As far as ifort 17 documentation of DAZ/FTZ is concerned, there is still a page in the html docs entitled "Setting the FTZ and DAZ Flags"
Thanks all for the helpful comments. I was not familiar with calling register settings flags, so I am better enlightened now. I hope the compiler documentation is improved so I am not guessing at the format (no:ftz, ftz:no, no-ftz, ftz-no, . . .). I was not previously aware of (or forgot about) assists for KNL with denormals. I was looking into this for a colleague. I wrote some code that creates denormals; I compiled it with and without ftz and it ran as expected each time. My colleague reports that with his Fortran code he gets a message like the one shown below, which seems to indicate that he is using x87 instructions instead of scalar SIMD instructions. I tried adding -m80387 to the compiler command for my sample code, but I cannot find any way to get Fortran code to produce this message, and he hasn't yet provided his compiler options:
/* An Intel-specific flag that is raised when an operand to a floating-point
arithmetic operation is denormal, or a single- or double-precision denormal
value is loaded on the x87 stack. This flag is not raised by SSE
arithmetic when the DAZ control bit is set. */
#define FE_DENORMALOPERAND 0x0002
Since the Intel 8080, possibly earlier, microprocessors have been provided with a FLAGS register, and special instructions to test individual flags and take short jumps depending on whether a flag was set or not. On the FPU, there are two sets of "flags": one set, the SW (status word) to display the machine state, and another, the CW (control word) to control the mode of operation (round/chop, interrupt on zero-divide, etc.).
There is a detailed write-up on X87 flags (16-bit Control and Status Words) and the SSE MXCSR at https://software.intel.com/en-us/articles/x87-and-sse-floating-point-assists-in-ia-32-flush-to-zero-ftz-and-denormals-are-zero-daz .
This test program (which is not standard-conformant) demonstrates gradual underflow (when compiled with -no-ftz, or /Qftz- on Windows) and abrupt underflow (when compiled with default options):
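program denorm
  real x
  integer i, ix
  ! Overlay ix on x so the bit pattern of x can be printed in hex.
  equivalence (ix, x)
  x = 1e-37
  ! Halve x repeatedly; it drops below the smallest normal value
  ! (about 1.18e-38) after a few iterations.
  do i = 1, 10
    write(*,'(1x,Z8.8)') ix
    x = x*0.5
  end do
end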
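For anyone who prefers to avoid the EQUIVALENCE trick, here is a variant that, as far as I can tell, stays within the standard. It uses TRANSFER to get the bit pattern and ieee_class to label each value. It assumes default REAL and default INTEGER are both 32 bits (true for ifort and gfortran with default options), and the program name and the VOLATILE attribute are my additions, not part of the original test.

```fortran
program denorm_transfer
  use, intrinsic :: ieee_arithmetic
  implicit none
  real, volatile :: x      ! volatile: keep the halving at run time
  integer :: i
  logical :: is_denormal

  x = 1e-37
  do i = 1, 10
    ! ieee_class reports how the value is currently classified.
    is_denormal = (ieee_class(x) == ieee_positive_denormal)
    ! transfer() reinterprets the 32 bits of x as a default integer
    ! (assumes both types occupy 32 bits).
    write(*,'(1x,Z8.8,2x,a,L1)') transfer(x, 0), 'denormal=', is_denormal
    x = x*0.5
  end do
end program denorm_transfer
```

With gradual underflow the later lines should show nonzero bit patterns flagged as denormal; with abrupt underflow they should show all zeros instead.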