Optimizing 5d look up table interpolation

Petri__Fabrizio · 01-09-2020

Hi, I have written code that builds a lookup table which, through linear interpolation, returns the real value y as a function of the independent variable vector x(5). In particular, the code performs a linear interpolation in 5D.
I'm trying to optimize the code. The main interpolation operations are carried out in 'lookup_table_5d_mod.f90', subroutine GetColAtLoc, line 203. I'm wondering how to minimize the access time to the table stored in memory during interpolation. In your opinion, would it be useful to prefetch the data used during interpolation into the L2 cache?
You will find comments in the source code. Thank you.

jimdempseyatthecove · 01-09-2020

Too little information is available to provide a correct assessment of what to do. That said, my assumption is...

The computational latencies are associated with:

          !Trilinear interpolation in (/x(1),x(2),x(3),x4Low,x5Low)
          y2d(1,1) = sum(w*This%m_Y(i:i+1,j:j+1,k:k+1,m,l))
          
          !Trilinear interpolation in (/x(1),x(2),x(3),x4High,x5Low)
          y2d(2,1) = sum(w*This%m_Y(i:i+1,j:j+1,k:k+1,m+1,l))
          
          !Trilinear interpolation in (/x(1),x(2),x(3),x4Low,x5High)
          y2d(1,2) = sum(w*This%m_Y(i:i+1,j:j+1,k:k+1,m,l+1))
          
          !Trilinear interpolation in (/x(1),x(2),x(3),x4High,x5High)
          y2d(2,2) = sum(w*This%m_Y(i:i+1,j:j+1,k:k+1,m+1,l+1))

and the latencies (without examination of the disassembly) are likely due to the construction of temporary arrays of shape (2,2,2) from This%m_Y. This happens in four places, once for each of the four expressions ...sum(w*This%m_Y(...

Provided that the contents of This%m_Y are used/reused many times after being defined in LT5d_CreateFromData (NewLT%m_Y => SpaceData)...
Then it may be beneficial (for latency) to redefine m_Y as a linear array of a derived type that holds a double precision a(2,2,2) block, and to construct the linear index from i,j,k,m,l.

type a222
   double precision :: a(2,2,2)
end type a222

!Class
type , extends(LookupTableClass) :: LookupTable5dClass
   ...
!  double precision, pointer     :: m_y(:,:,:,:,:)
   type(a222), allocatable :: m_y(:,:,:) ! first index constructed from i, j, k in GetColAtLoc

While this increases the storage size, it should eliminate both the creation of the temporary and the gather into it.

(This works provided that the original m_Y values are reused a sufficient number of times after creation.)
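
To make the idea concrete, here is a minimal standalone sketch, not taken from the attached source. It assumes the three indices of the repacked array are (packed i-j-k cell, m, l), which is one reading of the comment on m_y above. The grid sizes, the name cells, the helper cell_index and the random weights w are all hypothetical stand-ins. The program repacks a small 5D table once, then checks that the repacked lookup gives the same trilinear sum as the original array section.

```fortran
program a222_layout_sketch
   implicit none
   ! Hypothetical grid sizes, chosen only for this illustration.
   integer, parameter :: n1 = 4, n2 = 5, n3 = 6, n4 = 3, n5 = 3

   type a222
      double precision :: a(2,2,2)
   end type a222

   double precision, allocatable :: y(:,:,:,:,:)   ! original layout, like m_Y
   type(a222), allocatable       :: cells(:,:,:)   ! assumed layout: (packed ijk, m, l)
   double precision :: w(2,2,2), yOld, yNew
   integer :: i, j, k, m, l

   allocate(y(n1, n2, n3, n4, n5))
   allocate(cells((n1-1)*(n2-1)*(n3-1), n4, n5))

   call random_number(y)   ! stand-in table data
   call random_number(w)   ! stand-in trilinear weights

   ! One-time repacking: copy the 8 corner values of every (i,j,k) cell
   ! into one contiguous a222 element. Each table value is stored up to
   ! 8 times, which is the storage increase mentioned above.
   do l = 1, n5
      do m = 1, n4
         do k = 1, n3-1
            do j = 1, n2-1
               do i = 1, n1-1
                  cells(cell_index(i, j, k), m, l)%a = y(i:i+1, j:j+1, k:k+1, m, l)
               end do
            end do
         end do
      end do
   end do

   ! Compare both layouts at one arbitrary location.
   i = 2; j = 3; k = 4; m = 2; l = 1
   yOld = sum(w * y(i:i+1, j:j+1, k:k+1, m, l))        ! strided section of the 5D array
   yNew = sum(w * cells(cell_index(i, j, k), m, l)%a)  ! contiguous 8-value block
   if (abs(yOld - yNew) > 1.0d-12) error stop 'layouts disagree'
   print *, 'layouts agree'

contains

   ! Hypothetical helper: maps a cell (i,j,k) to a 1-based linear index.
   pure integer function cell_index(i, j, k)
      integer, intent(in) :: i, j, k
      cell_index = i + (n1-1)*((j-1) + (n2-1)*(k-1))
   end function cell_index

end program a222_layout_sketch
```

The repacking loop is the up-front cost, so this only pays off when the table is queried many times afterwards. Whether the compiler actually builds temporaries for the original sections is best confirmed from the disassembly or an optimization report before committing to the change.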

Jim Dempsey