Community
cancel
Showing results for
Did you mean: Beginner
79 Views

## Vectorizing 3D Arrays Product in Fortran

Considering the product of arrays with size nx*ny*nz in Fortran. The given cases will generate the same result:

#1
```do k=1,nz
do j=1,ny
do i=1,nx
A(i,j,k)=A(i,j,k)*B(i,j,k)*0.5
enddo
enddo
enddo```

#2

```do ijk=1,nx*ny*nz
A(ijk,1,1)=A(ijk,1,1)*B(ijk,1,1)*0.5
enddo```

#3

`A(:,:,:)=A(:,:,:)*B(:,:,:)*0.5`

#4

`A=A*B*0.5`

Question: which case (#1, #2, #3 or #4) is the best in terms of vectorization and performance and why?

Thank you,
Ricardo
Tags (1)
3 Replies Black Belt
79 Views

#4 "should" be best and equivalent to #1

To improve performance you would want to align the arrays (A and B) to cache line address

!DIR\$ ATTRIBUTES ALIGN: 64:: A, B

#2 won't work if array bounds checking is enabled. If array dimensions for aligned allocations do not produce rows (1st index) of multiples of cache line, for the example above #1 can have the loops collapsed (see IVF index)

Jim Dempsey Black Belt
79 Views

At full optimization, ifort would perform an optimization equivalent to #2 if possible.  #3 is prone to generation of temporary arrays.  ifort optimization reports should show when either of those happen. Black Belt
79 Views

FWIW I use a similar technique to #2. In a module I have:

```    subroutine ArrayDivide6x6(a,d)
implicit none
real(i8), intent(inout) :: a(6*6)
real(i8), intent(in) :: d
a = a / d
end subroutine ArrayDivide6x6
...
REAL(i8),DIMENSION(6,6), INTENT(OUT) :: qq_Transpose
...
call ArrayDivide6x6(qq_Transpose, exp(xsi*z))

```

You should be able to use a similar technique with your example code.
YMMV

I noticed in IVF V17u1 an optimization issue with
qq_Transpose = qq_Transpose / exp(xsi*z)

Jim Dempsey 