matmul can give wrong results when code is compiled with -O3

The following code produces the correct result when compiled with -O2, but the result can be wrong with -O3.

$ ifort  -O2 test_bug.F90 && ./a.out
 size(H,2),n           10          10 T
 HPH   100.0000       100.0000       100.0000       100.0000


$ ifort  -O3 test_bug.F90 && ./a.out
 size(H,2),n           10          10 T
 HPH  0.0000000E+00  0.0000000E+00  0.0000000E+00  0.0000000E+00

I am using ifort 14.0.1 20131008 but a colleague confirms that this error also affects the more current version with the version string 2015.5.223_ilp64.

The code below is a minimal program that reproduces the error. Seemingly unrelated statements affect whether the result is correct or not.

Thank you for your help!

program test_rrsqrt
 implicit none
  integer, parameter :: m = 5
  integer, parameter :: n = 10  
  real :: H(m,n)
  H = 1
  call testing_local_analysis_covar(H)
contains

 subroutine testing_local_analysis_covar(H)
  implicit none
  real, intent(in) :: H(:,:)
  real :: Pf(size(H,2),size(H,2))
  real :: HPH(2,2)
  integer :: mloc
  real, allocatable :: Hloc(:,:)
  real :: A(n,2), B(2,2)

  Pf = 1

  ! bug is not triggered if one of these two lines is commented out
  A = matmul(Pf, transpose(H))
  B = matmul(H,matmul(Pf,transpose(H)))

  mloc = 2
  write(6,*) 'size(H,2),n ',size(H,2), n, size(H,2) == n

  allocate(Hloc(mloc,size(H,2))) ! triggers bug
!  allocate(Hloc(mloc,n))  ! does not trigger bug

  Pf = 1
  Hloc = 1

  HPH = matmul(Hloc,matmul(Pf,transpose(Hloc))) ! -> does not work, unless allocate(Hloc(mloc,n))
!  HPH = matmul(matmul(Hloc,Pf),transpose(Hloc)) ! -> works!
  write(6,*) 'HPH', HPH
  deallocate(Hloc)
 end subroutine testing_local_analysis_covar
end program test_rrsqrt


 


Thanks - I can reproduce this and we'll investigate further.

Retired 12/31/2016

Possible Google Search


Escalated as issue DPD200407800. I will update this thread when I learn more.

Retired 12/31/2016

The developers tell me that one of the optimization phases hits an internal limit and gives up, leaving the internal representation in an unstable state. Until this is fixed, you can use the (undocumented) option -qoverride-limits as a workaround to allow the phase to complete. I tested this and it does work for your example (without taking noticeably more time to compile).
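
Where changing compiler flags is not an option, the original post notes that associating the product the other way round gave the correct result. The following is a minimal sketch of that source-level alternative. The program name hph_reordered_demo and the function name hph_reordered are made up for illustration, and the sketch has not been checked against the affected compiler versions.

```fortran
program hph_reordered_demo   ! hypothetical name, for illustration only
  implicit none
  integer, parameter :: n = 10
  integer, parameter :: mloc = 2
  real :: Pf(n,n)
  real :: HPH(mloc,mloc)
  real, allocatable :: Hloc(:,:)

  allocate(Hloc(mloc,n))
  Pf = 1
  Hloc = 1

  HPH = hph_reordered(Hloc, Pf)
  write(6,*) 'HPH', HPH   ! every element is 100 when computed correctly
  deallocate(Hloc)

contains

  ! Computes Hloc * Pf * transpose(Hloc) as (Hloc * Pf) * transpose(Hloc),
  ! the association reported as working in the original post.
  function hph_reordered(Hloc, Pf) result(res)
    real, intent(in) :: Hloc(:,:)
    real, intent(in) :: Pf(:,:)
    real :: res(size(Hloc,1), size(Hloc,1))

    res = matmul(matmul(Hloc, Pf), transpose(Hloc))
  end function hph_reordered

end program hph_reordered_demo
```

Keeping the product in one small function means the association can be switched back in a single place once a fixed compiler is available.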

Retired 12/31/2016

Thank you very much for your helpful insight! Is there a chance that future versions of ifort will accept this code directly with -O3?


Yes, I certainly hope so! That you get wrong code at -O3 is a bug. I don't know how it will get fixed, but it will get fixed. I just wanted to give you a workaround for now. When I hear more from the developers, I will let you know here.

Retired 12/31/2016

This has been fixed for Update 3, due in May.

Retired 12/31/2016

Great! Thank you very much for resolving this issue!


"-fp-model precise" seems to work around it too.

$ ifort -mkl -O3 -fp-model precise mm.f90 && ./a.out
 size(H,2),n           10          10 T
 HPH   100.0000       100.0000       100.0000       100.0000
$ ifort -mkl -O3 mm.f90 && ./a.out
 size(H,2),n           10          10 T
 HPH  0.0000000E+00  0.0000000E+00  0.0000000E+00  0.0000000E+00
$ ifort -mkl -O2 mm.f90 && ./a.out
 size(H,2),n           10          10 T
 HPH   100.0000       100.0000       100.0000       100.0000
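
When comparing flag combinations like the ones above, it can help to let the program check its own result instead of reading the printed values. The sketch below compares the nested matmul against a reference computed with explicit loops and stops with an error if they differ. It is a simplified variant with hypothetical names (check_hph, check), and it leaves out the seemingly unrelated statements of the original reproducer, so it may not trigger the miscompilation by itself; the same comparison could be added to the original test instead.

```fortran
program check_hph   ! hypothetical name, for illustration only
  implicit none
  integer, parameter :: m = 5
  integer, parameter :: n = 10
  real :: H(m,n)

  H = 1
  call check(H)

contains

  subroutine check(H)
    real, intent(in) :: H(:,:)
    real :: Pf(size(H,2),size(H,2))
    real, allocatable :: Hloc(:,:)
    real :: HPH(2,2), expected(2,2)
    integer :: i, j, k, l

    allocate(Hloc(2,size(H,2)))   ! same allocation form as the reproducer
    Pf = 1
    Hloc = 1

    ! Reference: expected(i,j) = sum over k,l of Hloc(i,k)*Pf(k,l)*Hloc(j,l),
    ! computed without matmul.
    expected = 0
    do j = 1, 2
      do i = 1, 2
        do l = 1, size(H,2)
          do k = 1, size(H,2)
            expected(i,j) = expected(i,j) + Hloc(i,k)*Pf(k,l)*Hloc(j,l)
          end do
        end do
      end do
    end do

    ! Same nested form as in the reproducer.
    HPH = matmul(Hloc, matmul(Pf, transpose(Hloc)))

    write(6,*) 'HPH     ', HPH
    write(6,*) 'expected', expected

    if (any(abs(HPH - expected) > 1.0e-3)) then
      write(6,*) 'FAIL: nested matmul differs from loop reference'
      error stop 1
    end if
    write(6,*) 'OK'
    deallocate(Hloc)
  end subroutine check

end program check_hph
```

With all inputs set to 1 and n = 10, every element of the reference is 100, so a run that prints zeros for HPH ends in the FAIL branch.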

 
