Solved: Re: Division-by-zero under false condition

foxtran · 09-16-2024

Hello!

I am getting a division-by-zero (arithmetic exception) with the following code:

      program example
      use, intrinsic :: ieee_exceptions, only: ieee_invalid
      use, intrinsic :: ieee_exceptions, only: ieee_set_halting_mode
      implicit none
      integer, external :: dblalloc
      real(8) :: memory(2**20)
      common /mem/ shift
      integer :: shift
      integer :: i, to_alloc, to_fill
      integer :: ifb, inb

      call ieee_set_halting_mode(ieee_invalid, .true._4)

      memory=0E0_8
      shift = 1

      to_alloc = 15
      to_fill= 4

      ifb=dblalloc(to_alloc)
      inb=dblalloc(to_alloc)
      call fill_n(memory(ifb), to_fill)
      call fill_n(memory(inb), to_fill)
!
        do i = 1, to_alloc
         if (memory(ifb+i-1)*memory(inb+i-1) < 0E0_8) then
           memory(ifb+i-1) = 1E+10_8
         else if(memory(inb+i-1) <= 1E-12_8) then
           memory(ifb+i-1) = 1E+10_8
         else if (memory(ifb+i-1) <= 0E0_8) then
           memory(ifb+i-1) = 1E+10_8
         else
           memory(ifb+i-1) = memory(ifb+i-1) / memory(inb+i-1)
         endif
         print '(f25.10)', memory(ifb+i-1)
        enddo
!
      end program example
      subroutine fill_n(arr, len)
        implicit none
        integer, intent(in) :: len
        real(8) :: arr(len)
        call random_number(arr)
        arr = 0.1_8 + arr
      end subroutine fill_n

dblalloc implementation:

      integer(8) function dblalloc(size)
        implicit none
        common /mem/ shift
        integer(8) :: shift
        integer(8), intent(in) :: size
        dblalloc = shift
        shift = shift + size
      end function dblalloc

Compilation:

ifx -O3 -i8 dblalloc.f example.f -o example.exe

Used compiler version:

$ ifx --version
ifx (IFX) 2024.1.0 20240308
Copyright (C) 1985-2024 Intel Corporation. All rights reserved.

The exception is raised by the divsd instruction. Here is the assembly (`-fasm=intel`) of the compiled loop:

.LBB0_2:
	movsd	xmm0, qword ptr [rbx + 8*r13 + example_$MEMORY-8]
	movsd	xmm1, qword ptr [r14 + 8*r13 + example_$MEMORY-8]
	movapd	xmm2, xmm1
	mulsd	xmm2, xmm0
	xorpd	xmm3, xmm3
	ucomisd	xmm3, xmm2
	movsd	xmm2, qword ptr [rip + .LCPI0_1]
	ja	.LBB0_6
	movapd	xmm2, xmm0
	cmpnlepd	xmm2, xmm3
	divsd	xmm0, xmm1                        ; SIGFPE
	movsd	xmm3, qword ptr [rip + .LCPI0_2]
	cmpnlepd	xmm1, xmm3
	andpd	xmm1, xmm2
	movd	eax, xmm1
	test	al, 1
	jne	.LBB0_5
	movsd	xmm0, qword ptr [rip + .LCPI0_1]
	jmp	.LBB0_5

It seems to me that the issue was introduced in ifx 2024.1.0, since ifx 2024.0.0 generates a different assembly listing, in which divsd comes after the `test` and `je` instructions:

.LBB0_2:
        movsd   xmm0, qword ptr [rbx + 8*r13 + example_$MEMORY-8]
        movsd   xmm1, qword ptr [r14 + 8*r13 + example_$MEMORY-8]
        movapd  xmm2, xmm1
        mulsd   xmm2, xmm0
        xorpd   xmm3, xmm3
        ucomisd xmm3, xmm2
        movsd   xmm2, qword ptr [rip + .LCPI0_1]
        ja      .LBB0_6
        movapd  xmm2, xmm0
        cmpnlepd        xmm2, xmm3
        movapd  xmm3, xmm1
        movsd   xmm4, qword ptr [rip + .LCPI0_2]
        cmpnlepd        xmm3, xmm4
        andpd   xmm3, xmm2
        movd    eax, xmm3
        test    al, 1
        je      .LBB0_5
        divsd   xmm0, xmm1 ; divsd is after test & je
        movapd  xmm2, xmm0
        jmp     .LBB0_6

The full ifx 2024.0.0 assembly is available here: https://godbolt.org/z/bxz7sKMvb

The assembly for ifx 2024.2.1 looks very similar to that of ifx 2024.1.0, so the issue should also occur with ifx 2024.2.1.

@Igor_V_Intel , could you please have a look?

Igor_V_Intel · 09-16-2024

Thank you for reporting it. The bug is still present in the latest compiler version. It looks like the SimplifyCFGPass optimization pass (enabled at O1) causes the problem. I have escalated it to the development team to correct the bug.

Igor_V_Intel · 09-17-2024

After some additional investigation, it looks like there is no bug here. The compiler has started taking more advantage of FP semantics. The default FP model (-fp-model fast) assumes that FP exceptions cannot be raised, so hoisting a floating-point operation above the condition that guards it is always considered safe. The '-fp-model strict' option guarantees that such optimizations are avoided, so your example should run correctly with it.

Something like this will obfuscate the if-else pattern enough to stop the hoisting too:

         integer :: flag
         flag = 0
         if (memory(ifb+i-1)*memory(inb+i-1) < 0E0_8) then
           memory(ifb+i-1) = 1E+10_8
         else if(memory(inb+i-1) <= 1E-12_8) then
           memory(ifb+i-1) = 1E+10_8
         else if (memory(ifb+i-1) <= 0E0_8) then
           memory(ifb+i-1) = 1E+10_8
         else
           flag = 1
         endif
         if (flag .eq. 1 .AND. memory(inb+i-1) .NE. 0) then
           memory(ifb+i-1) = memory(ifb+i-1) / memory(inb+i-1)
         endif
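
Another option that does not rely on hiding the pattern from the optimizer is to make the division harmless even if it is executed speculatively. The following is only a minimal sketch under stated assumptions, not something tested against the affected ifx versions: the program name `safe_denominator`, the arrays `num` and `den`, the `sentinel` constant and the sample values are all made up for illustration, and the file is assumed to be free-form source (for example `.f90`) built without `-i8`. The idea is to replace the denominator with 1.0 whenever the guard is false, so the division never sees a zero operand.

```fortran
program safe_denominator
  use, intrinsic :: ieee_exceptions, only: ieee_invalid, ieee_set_halting_mode
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 6
  real(dp), parameter :: sentinel = 1.0e10_dp   ! hypothetical fallback value
  real(dp) :: num(n), den(n), safe_den
  logical :: ok
  integer :: i

  ! Trap on invalid operations such as 0.0/0.0, as in the original report.
  call ieee_set_halting_mode(ieee_invalid, .true.)

  ! Illustrative data: two valid pairs, two 0/0 pairs, two pairs with a
  ! negative operand.
  num = [0.5_dp, 0.8_dp, 0.0_dp, 0.0_dp, -1.0_dp, 0.3_dp]
  den = [0.25_dp, 0.4_dp, 0.0_dp, 0.0_dp, 2.0_dp, -0.5_dp]

  do i = 1, n
    ! Same acceptance test as the final else branch of the original loop.
    ok = den(i) > 1.0e-12_dp .and. num(i) > 0.0_dp
    ! Use a harmless denominator when the guard is false, so the division
    ! is well defined no matter where the compiler schedules it.
    safe_den = merge(den(i), 1.0_dp, ok)
    num(i) = merge(num(i) / safe_den, sentinel, ok)
    print '(f25.10)', num(i)
  end do
end program safe_denominator
```

With this data the first two elements are divided and the remaining four receive the sentinel. The trade-off is one extra select per element in exchange for not depending on the FP model chosen at compile time.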

foxtran · 09-17-2024

Oh! Thank you!

In practice, the loop can be rewritten as follows:

        do i = 1, to_alloc
         if(memory(inb+i-1) <= 1E-12_8) then
           memory(ifb+i-1) = 1E+10_8
         else if (memory(ifb+i-1) <= 0E0_8) then
           memory(ifb+i-1) = 1E+10_8
         else
           memory(ifb+i-1) = memory(ifb+i-1) / memory(inb+i-1)
         endif
         print '(f25.10)', memory(ifb+i-1)
        enddo

With this change, the exception no longer occurs. The first condition does not need to be checked: the product can be negative only if one of the two values is negative, and those cases are already caught by the other two checks.
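
The redundancy argument can be checked with a small brute-force comparison. This is a sketch for illustration only: the function names `divides_full` and `divides_reduced` and the sample values are hypothetical, with `a` standing for `memory(ifb+i-1)` and `b` for `memory(inb+i-1)`. Each function returns `.true.` exactly when its condition chain would reach the division.

```fortran
program check_redundant_condition
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  ! Sample values on both sides of 0 and of the 1E-12 threshold.
  real(dp), parameter :: samples(7) = [ &
      -2.0_dp, -1.0e-13_dp, 0.0_dp, 1.0e-13_dp, 1.0e-12_dp, 0.5_dp, 3.0_dp ]
  integer :: i, j, mismatches

  mismatches = 0
  do i = 1, size(samples)
    do j = 1, size(samples)
      if (divides_full(samples(i), samples(j)) .neqv. &
          divides_reduced(samples(i), samples(j))) then
        mismatches = mismatches + 1
      end if
    end do
  end do

  print '(a,i0)', 'mismatches: ', mismatches
  if (mismatches /= 0) error stop 'condition chains disagree'

contains

  ! Original three-condition chain: true only if the division branch runs.
  pure logical function divides_full(a, b)
    real(dp), intent(in) :: a, b
    if (a * b < 0.0_dp) then
      divides_full = .false.
    else if (b <= 1.0e-12_dp) then
      divides_full = .false.
    else if (a <= 0.0_dp) then
      divides_full = .false.
    else
      divides_full = .true.
    end if
  end function divides_full

  ! Reduced chain without the product test.
  pure logical function divides_reduced(a, b)
    real(dp), intent(in) :: a, b
    if (b <= 1.0e-12_dp) then
      divides_reduced = .false.
    else if (a <= 0.0_dp) then
      divides_reduced = .false.
    else
      divides_reduced = .true.
    end if
  end function divides_reduced
end program check_redundant_condition
```

For these samples the program prints `mismatches: 0`: whenever the reduced chain reaches the division, both values are positive, so the product test could not have fired.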