Intel® Fortran Compiler

Unnecessary temporary array creation by Intel Visual Fortran Compiler

DataScientist
Valued Contributor I
2,141 Views

Consider the following Fortran code,

module Matrix_mod
    use iso_fortran_env, only: RK => real64, IK => int32
    implicit none
contains
    pure subroutine getCholeskyFactor(nd,PosDefMat,Diagonal)
        implicit none
        integer(IK), intent(in)    :: nd
        real(RK)   , intent(inout) :: PosDefMat(nd,nd)
        real(RK)   , intent(out)   :: Diagonal(nd)
        real(RK)                   :: summ
        integer(IK)                :: i
        do i=1,nd
            summ = PosDefMat(i,i) - dot_product(PosDefMat(i,1:i-1),PosDefMat(i,1:i-1))
            if (summ <= 0._RK) then
                Diagonal(1) = -1._RK
                return
            end if
            Diagonal(i) = sqrt(summ)
            PosDefMat(i+1:nd,i) = ( PosDefMat(i,i+1:nd) - matmul(PosDefMat(i+1:nd,1:i-1),PosDefMat(i,1:i-1)) ) / Diagonal(i)
        end do
    end subroutine getCholeskyFactor
end module Matrix_mod

use Matrix_mod, only: IK, RK, getCholeskyFactor

implicit none

real(RK), allocatable :: PosDefMat(:,:,:)
integer :: nd

nd = 3

allocate(PosDefMat(nd, 0:nd, 0:nd))

call getCholeskyFactor(nd, PosDefMat(:,1:nd,0), PosDefMat(:,0,0))

end

 

The line,

call getCholeskyFactor(nd, PosDefMat(:,1:nd,0), PosDefMat(:,0,0))

should not require an array temporary. But Intel Visual Fortran does create an array temporary for the second argument. The Intel Linux compiler does not suffer from the same issue. How can this be resolved? And practically, how worrisome is this temporary array creation performance-wise?

On a side note, the new Fortran forum does not recognize Fortran syntax. The user has to choose C or HTML for code highlighting. What a disappointment for a Fortran forum, which I believe is also the largest among all Intel forums, or at least it used to be.

19 Replies
DataScientist
Valued Contributor I
2,133 Views

There is no link to edit old existing posts either in the new forum. So I have to reply to my own post to make an edit. The Visual compiler being used for this test is:

Intel(R) Visual Fortran Intel(R) 64 Compiler for applications running on Intel(R) 64, Version 19.0.4.245 Build 20190417
Copyright (C) 1985-2019 Intel Corporation. All rights reserved.

andrew_4619
Honored Contributor II
2,129 Views

The array chunk you are passing does not represent a contiguous set of memory addresses.

DataScientist
Valued Contributor I
2,125 Views

But isn't a contiguous array one where the elements are not separated by other data objects? Is there a separation between elements in this

PosDefMat(:,1:nd,0)

chunk of memory?

andrew_4619
Honored Contributor II
2,114 Views

Working through your chunk: the second index is 1:nd, but the array is declared 0:nd, so the next value in your array after nd would be the zero index, which is not part of your chunk.

jimdempseyatthecove
Honored Contributor III
2,101 Views

Andrew, (:, 1:nd, 0) is contiguous

Consider (:, :, as the whole of a 3D array. (contiguous)
Then (:, :, 0) would be a (contiguous) plane of this array (0-based array in this example), specifically a face.
Then (:, 0, 0) would be a (contiguous) row of the above plane (starting at the base of the plane)
Then (:, 1:n, 0)  would be the (contiguous) remainder of the above plane missing the first row of the plane.

The compiler is missing an opportunity to avoid temporary creation.

Your statement "1:nd but the array is declared 0:nd so the next value in your array" applies only for an indexing such as (:, 1:n, :), (:, 1:n, 1:2), ... IOW multiple sub-planes.

Jim Dempsey
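
The contiguity argument above can be checked with the Fortran 2008 intrinsic IS_CONTIGUOUS. This is only a minimal sketch, assuming a compiler that supports that intrinsic; the program name and the array name `a` are illustrative and not from the original post. It reports what the language considers contiguous, not whether a particular compiler version actually creates a temporary at a call site.

```fortran
program check_contiguity
    implicit none
    integer, parameter :: nd = 3
    real, allocatable  :: a(:,:,:)

    ! Same bounds as the array in the original post.
    allocate(a(nd, 0:nd, 0:nd))

    ! A full plane of the array.
    print *, "a(:, :, 0)    contiguous? ", is_contiguous(a(:, :, 0))

    ! The first column of that plane.
    print *, "a(:, 0, 0)    contiguous? ", is_contiguous(a(:, 0, 0))

    ! The rest of the plane after its first column: still one
    ! consecutive run of elements in array element order.
    print *, "a(:, 1:nd, 0) contiguous? ", is_contiguous(a(:, 1:nd, 0))

    ! A section spanning several planes while skipping index 0 of the
    ! second dimension: gaps appear between planes, so this is the
    ! case where a copy may genuinely be needed.
    print *, "a(:, 1:nd, :) contiguous? ", is_contiguous(a(:, 1:nd, :))
end program check_contiguity
```

The first three sections satisfy the standard's definition of a contiguous array section. The last one is not covered by that definition, so the result is processor dependent, although it would normally be reported as not contiguous.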

jimdempseyatthecove
Honored Contributor III
2,095 Views

Having issue with editing above to read:

Consider (:, :, ...

 

jimdempseyatthecove
Honored Contributor III
2,094 Views

**bleep** autocorrector

Consider ( :, :, : )

Hope this did not replace colon, right paren with a (missing) emoji

Jim Dempsey

andrew_4619
Honored Contributor II
2,093 Views

D'oh! Yes Jim.

As an aside, there are many cases, it seems, where you set an index range a:b and the compiler just gives up and creates a temp even when it does not actually need to.

Steve_Lionel
Honored Contributor III
2,088 Views

The forum editor loves to replace certain punctuation sequences with emoji. While it doesn't display them, you end up with blanks instead. I thought this had been turned off, but it seems not.

The compiler has some things it looks at regarding avoiding temps, and it has gotten better over the years. I suggest filing a ticket with examples where it could do better.

DataScientist
Valued Contributor I
2,070 Views

Is there possibly a way to guarantee contiguity to the compiler to avoid temporary array creation?

FortranFan
Honored Contributor II
2,025 Views
@A__King wrote:

Is there possibly a way to guarantee contiguity to the compiler to avoid temporary array creation?

 

@DataScientist ,

Guarantee!?  No!  But a workaround that may be acceptable in many circumstances in numerical and engineering code?  Possibly yes.  For this, first keep in mind the old middle-eastern saying that was captured well by Bacon in his Essays: https://www.phrases.org.uk/meanings/if-the-mountain-will-not-come-to-muhammad.html.   Then, bite one's lips and try the ASSOCIATE construct to mimic an array temporary that really isn't one.  Here's a simplified version of your code:

module m
   use, intrinsic :: iso_c_binding, only : c_loc, c_size_t
contains
   subroutine sub( n, a )
      integer, intent(in) :: n
      integer, intent(inout) :: a(n,n)
      integer(c_size_t) :: addr_a11
      addr_a11 = transfer( source=c_loc(a(1,1)), mold=addr_a11 ) !<-- 'a' needs TARGET attribute for this check
      print "(g0,1x,z0)", "In sub: address of a(1,1): ", addr_a11
      a(1,1) = 42
   end subroutine
end module
   use m
   integer, parameter :: n=3
   integer, allocatable, target :: x(:,:,:)
   integer(c_size_t) :: addr_x010
   allocate( x(0:n,0:n,0:n) )
   addr_x010 = transfer( source=c_loc(x(0,1,0)), mold=addr_x010 )
   print "(g0,1x,z0)", "In main: address of x(0,1,0): ", addr_x010
   x = 0
   associate ( y => x(:,1:n,0) )
      call sub( n, y )
   end associate
   print *, "x(0,0,0) = ", x(0,0,0), "; expected is 0."
   print *, "x(0,1,0) = ", x(0,1,0), "; expected is 42."
end

 

Upon execution using Intel Fortran, no array temporary is reported:

C:\Temp>ifort /standard-semantics /check:arg_temp_created p.f90
Intel(R) Visual Fortran Intel(R) 64 Compiler for applications running on Intel(R) 64, Version 19.1.3.311 Build 20201010_000000
Copyright (C) 1985-2020 Intel Corporation.  All rights reserved.

Microsoft (R) Incremental Linker Version 14.26.28806.0
Copyright (C) Microsoft Corporation.  All rights reserved.

-out:p.exe
-subsystem:console
p.obj

C:\Temp>p.exe
In main: address of x(0,1,0):  2922BCFCCA0
In sub: address of a(1,1):  2922BCFCCA0
 x(0,0,0) =  0 ; expected is 0.
 x(0,1,0) =  42 ; expected is 42.

C:\Temp>

 

Note the version without the ASSOCIATE will flag a temporary:

C:\Temp>type p.f90
module m
   use, intrinsic :: iso_c_binding, only : c_loc, c_size_t
contains
   subroutine sub( n, a )
      integer, intent(in) :: n
      integer, intent(inout) :: a(n,n)
      integer(c_size_t) :: addr_a11
      addr_a11 = transfer( source=c_loc(a(1,1)), mold=addr_a11 ) !<-- 'a' needs TARGET attribute for this check
      print "(g0,1x,z0)", "In sub: address of a(1,1): ", addr_a11
      a(1,1) = 42
   end subroutine
end module
   use m
   integer, parameter :: n=3
   integer, allocatable, target :: x(:,:,:)
   integer(c_size_t) :: addr_x010
   allocate( x(0:n,0:n,0:n) )
   addr_x010 = transfer( source=c_loc(x(0,1,0)), mold=addr_x010 )
   print "(g0,1x,z0)", "In main: address of x(0,1,0): ", addr_x010
   x = 0
   call sub( n, x(:,1:n,0) )
   print *, "x(0,0,0) = ", x(0,0,0), "; expected is 0."
   print *, "x(0,1,0) = ", x(0,1,0), "; expected is 42."
end

C:\Temp>ifort /standard-semantics /check:arg_temp_created p.f90
Intel(R) Visual Fortran Intel(R) 64 Compiler for applications running on Intel(R) 64, Version 19.1.3.311 Build 20201010_000000
Copyright (C) 1985-2020 Intel Corporation.  All rights reserved.

Microsoft (R) Incremental Linker Version 14.26.28806.0
Copyright (C) Microsoft Corporation.  All rights reserved.

-out:p.exe
-subsystem:console
p.obj

C:\Temp>p.exe
In main: address of x(0,1,0):  14C7125CCA0
forrtl: warning (406): fort: (1): In call to SUB, an array temporary was created for argument #2

Image              PC                Routine            Line        Source
p.exe              00007FF6127EC6BC  Unknown               Unknown  Unknown
p.exe              00007FF6127E128E  Unknown               Unknown  Unknown
p.exe              00007FF61283F8BE  Unknown               Unknown  Unknown
p.exe              00007FF61283FC40  Unknown               Unknown  Unknown
KERNEL32.DLL       00007FFB39087974  Unknown               Unknown  Unknown
ntdll.dll          00007FFB395DA0B1  Unknown               Unknown  Unknown
In sub: address of a(1,1):  48998FFB70
 x(0,0,0) =  0 ; expected is 0.
 x(0,1,0) =  42 ; expected is 42.

C:\Temp>

 

DataScientist
Valued Contributor I
2,018 Views

@FortranFan This is a neat solution, thanks for sharing! I had thought of a possible solution via pointers but the `associate` never came to my mind. I did some testing on your code and the modified version with `associate`  is around 40% faster in this particular example with `-O3` flag.

I used to believe `associate` to be only syntactical flexibility. But it seems like the compiler really creates a pointer underneath. Is the `target` attribute necessary? The code compiles and runs fine without the target attribute.
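
For comparison, the pointer route mentioned above might look like the following. This is a minimal sketch, assuming a Fortran 2008 compiler; the names `demo_mod`, `sub` and `p` are illustrative only. The CONTIGUOUS attribute on the pointer asserts that its target is contiguous, which is true for this section. Whether a given compiler version then skips the temporary is not guaranteed and is worth confirming with /check:arg_temp_created as in the runs above.

```fortran
module demo_mod
    implicit none
contains
    subroutine sub(n, a)
        integer, intent(in)    :: n
        integer, intent(inout) :: a(n,n)   ! explicit-shape dummy, as in the thread
        a(1,1) = 42
    end subroutine sub
end module demo_mod

program pointer_workaround
    use demo_mod, only: sub
    implicit none
    integer, parameter :: n = 3
    integer, allocatable, target  :: x(:,:,:)   ! TARGET is required for pointer association
    integer, pointer, contiguous  :: p(:,:)     ! hypothetical helper pointer

    allocate(x(n, 0:n, 0:n))
    x = 0

    ! Associate the pointer with the contiguous section, then pass the pointer.
    p => x(:, 1:n, 0)
    call sub(n, p)

    ! The write inside sub must be visible in x whether or not a temporary was used.
    print *, "x(1,1,0) = ", x(1,1,0), "; expected is 42."
end program pointer_workaround
```

Unlike the ASSOCIATE version, this requires the TARGET attribute on the array, and pointing a CONTIGUOUS pointer at a section that is not actually contiguous is not valid code, so it only suits sections known to be contiguous.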

andrew_4619
Honored Contributor II
1,978 Views

In the example above, the TARGET attribute is required by the standard: in C_LOC(var), var must have the POINTER or TARGET attribute. Intel Fortran does not generate an error if it is missing; gfortran does. I raised a ticket on this some days back. To be fair, it creates no problems in ifort if it is missing, but I would prefer standard conformance to be observed if I ask for it.

DataScientist
Valued Contributor I
2,057 Views

@Steve_Lionel I went ahead and submitted this issue as a service request to the Intel Developer Products Support. However, it was almost immediately dismissed solely because of the (educational) license under which I reported this bug/enhancement. I understand that Intel makes money from their compilers, and I appreciate VERY MUCH the generosity of Intel in providing their compilers free of charge to educators and the open-source community. But ignoring bug/enhancement reports solely based on the license will only deteriorate the quality of Intel products in the long run and discourage the community from investing time in reporting such bugs or product weaknesses in the future.

Steve_Lionel
Honored Contributor III
2,036 Views

@DataScientist, I agree 100% and have said so to various folks at Intel many times over the years. When I was an employee, I made sure that ANY bug reported in the forum was escalated to developers, whether or not the user had support. My position was that fixing bugs sooner rather than later was a net benefit to Intel (and its customers). Sadly, current management doesn't agree with me. Don't take it out on the support team; they want bugs fixed as much as you do. I am disappointed that the support team isn't more active in this forum than they are, but I don't know what sort of directives they have been given.

DataScientist
Valued Contributor I
1,982 Views

To be fair, and for those visiting this page in the future, Intel product services and developers did come back to me 1-2 days after I submitted the report about this weakness and they said they were now investigating the issue. It's great to know that bug/enhancement reports are still welcomed at Intel regardless of who reports them.

Devorah_H_Intel
Moderator
1,650 Views

Fixed in the next compiler release 2021.7

DataScientist
Valued Contributor I
1,631 Views

Fantastic, thanks for sharing the update!

John_Campbell
New Contributor II
1,563 Views

Would the following change to the position of Diagonal have the same problem?

allocate(PosDefMat(nd, nd+1, nd))

call getCholeskyFactor(nd, PosDefMat(:,1:nd,1), PosDefMat(:,nd+1,1))
