Intel® Fortran Compiler
Build applications that can scale for the future with optimized code designed for Intel® Xeon® and compatible processors.
公告
FPGA community forums and blogs on community.intel.com are migrating to the new Altera Community and are read-only. For urgent support needs during this transition, please visit the FPGA Design Resources page or contact an Altera Authorized Distributor.

Reshape inplace

drraug
初学者
1,268 次查看
First, two words about background. Consider we have three-dimensional array A(M,N,K), two-dimensional array B(K,Q) and we need to compute C(M,N,Q) = sum(K) A(M,N,K) * B(K,Q). Of course, we could use cycles like that
[fortran]forall(M=1:MMAX,N=1:NMAX,Q=1:QMAX) C(M,N,Q) = dot_product(A(M,N,:),B(:,Q))[/fortran]
but this is far from being optimal. The faster approach includes Level-3 BLAS routimes call
[fortran]call dgemm('n','n',MMAX*NMAX,QMAX,KMAX,1.d0,A,MMAX*NMAX,B,KMAX,&
0.d0,C,MMAX*NMAX)[/fortran]

but since dgemm is subroutine and since it is not PURE, the resulting function also can not be declared as PURE.

We can rely on MATMUL intrinsic function, that is PURE, but it supposes rank-two arrays on input. We can perform like that

[fortran]double precision,allocatable :: atmp(:,:)
allocate(atmp(MMAX*NMAX,KMAX))
forall(M=1:MMAX,N=1:NMAX,K=1:KMAX)atmp(M+(N-1)*MMAX,K)=A(M,N,K)
C=MATMUL(atmp,B)
deallocate(atmp)[/fortran]

but there we need additional array for reshaping of array A.

We can try also the following simple statement

[fortran]C=MATMUL(RESHAPE(A,[MMAX*NMAX,KMAX]),B)[/fortran]

This looks fine, but I am not sure that compiler can do this without implicit allocation of temporary reshaped array on stack or heap. This could greatly decrease the performance in my programs, since I plan to use such procedure in recursive functions.

Could you give a comment on this? Shortly speaking, I need an inplace RESHAPE for arrays or PURE DGEMM-like function with explicitely given array dimensions. Is it possible to solve this problem or to find some simple walkaround?

0 项奖励
3 回复数
TimP
名誉分销商 III
1,268 次查看
If the case is large enough for DGEMM to be of interest, the time spent by RESHAPE, even with an implicit allocation, could not be significant, although space or cache usage might be. The advantage of DGEMM over MATMUL (as implemented by ifort) is seen where the MKL threading is advantageous, or MATMUL implies a temporary allocation which could be avoided by DGEMM.
0 项奖励
drraug
初学者
1,268 次查看

Thank you, Tim.

But let me clarify. My problem is that I want to have PURE function with matrix multiplication inside. My "matrices" are actually large rank-3 arrays.

I can use DGEMM, but then I lose PURE property.

I can use RESHAPE+MATMUL, but then I probably can not avoid temporary allocation of memory.

Is it possible to combine one with another?

0 项奖励
david_car
新分销商 I
1,267 次查看

Hi drraug,

Here's is one way to reshape in place using a pointer (you can build from this example, i.e. bounds checking, etc):

[fortran]program main
  real, pointer :: v1(:,:,:), v2(:,:,:)
  real, pointer :: v3(:,:), v4(:,:)


  allocate(v1(10, 10, 4))

  v2 => v1
  v2(2,2,2) = 5.d0

  call cast2D(v2, v3, 20, 20)
  v3(10,1) = 10.
  v3(1,2) = 21.

  print *, v3(10,1), v1(10,1,1)
  print *, v3(1,2),v1(1,3,1)

contains

  subroutine cast2D(v1, v3, n, m)
    real, dimension(n,m), target :: v1
    real, dimension(:,:), pointer :: v3
    integer :: n, m

    v3 => v1
  end subroutine cast2D

end program main
[/fortran]

0 项奖励
回复