Intel® Fortran Compiler

Intrinsic function sum creates errors and memory leaks?

Noro

I recently updated my ifort compiler to version 19.1.0.166 20191121 (in Parallel Studio 2020), and when running my series of test cases I get some errors.

1. ERRORS: If I compile with the options

    -gen-interface -warn all -check all -traceback -fpe-all -fp-stack-check -ftrapuv -heap-arrays

then I get the following message:

    forrtl: warning (406): fort: (33): Shape mismatch: The extent of dimension 3 of array DENSITYTEST is 1 and the corresponding extent of array is 2

and the backtrace points to this block:

    do x= xlim_sup, xlim_inf, -1
      do i=0, ilim_sup
        densitytest(x,:,:) = densitytest(x,:,:) +sum( f(x-v(i,1),:,:, i,:)*factor(x,i, :,:,:), dim=3)
      enddo
    enddo

2. MEMORY LEAK: If I do all this summation by hand instead, the program runs but leaks memory. I noticed the leak by watching RAM consumption: my computer started swapping and slowed down shortly after the program started. Note that I get the same leak if I compile without any of the debugging options above, with or without the summation by hand in that block, so I think it is a different problem.

I isolated the memory leak. It was not hard, since it comes from the next block, which also uses the intrinsic function sum. The leak comes from the following line:

    do while ( any(densitytest(x,:,:)>densitymax) .AND. any(densitytest(x,:,:)>sum(f(x,:,:,0,:),dim=3)) )

Again, if I do the summation by hand here, I get no memory leak.

I also compiled my initial code with gfortran: I get no error, no memory leak, and every test gives the expected results. With ifort 18 I get no such problems either.

So, is this a known bug in the new version? Or am I missing something?

Best.

andrew_4619

There is another thread on here about errors with shape checking (this check is new in the latest compiler). You can switch that option off.

Not sure about the memory leak. 

 

Noro
Thanks Andrew_4619 for your answer. Yes, this shape checking seems to be a good lead to investigate. Do you know how to switch off this option?
andrew_4619

The option is /check:shape, so I guess /check:noshape turns it off. I don't have that compiler version installed yet.

Look at the run-time checks on the Fortran property page in VS for your project.

Noro
Yes, on Linux it is '-check noshape'. With this option added to the compilation, and with the summation in the 'do while' loop done by hand, I get no error. So, is there a way to use the intrinsic function 'sum' without violating the shape checking? (And the question about the memory leak remains.)
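For reference, here is a small sketch of how `sum` with `dim=` changes the shape of its argument. The array names and extents are made up for illustration and are not taken from the original program. Reducing a rank-3 array along dimension 3 gives a rank-2 result with the extents of the first two dimensions, so an assignment to a rank-2 target conforms as long as those two extents match.

```fortran
program sum_dim_shape
  implicit none
  ! Hypothetical extents, chosen only for this illustration.
  integer, parameter :: ny = 3, nz = 2, np = 4
  real :: a(ny, nz, np), b(ny, nz, np)
  real :: total(ny, nz), manual(ny, nz)
  integer :: p

  call random_number(a)
  call random_number(b)

  ! Intrinsic reduction along the third dimension: result is (ny, nz).
  total = sum(a * b, dim=3)
  print *, 'shape of sum(a*b, dim=3):', shape(sum(a * b, dim=3))

  ! The same reduction written as an explicit loop over dimension 3.
  manual = 0.0
  do p = 1, np
    manual = manual + a(:, :, p) * b(:, :, p)
  end do

  ! Both forms should agree to within rounding.
  if (maxval(abs(total - manual)) > 1.0e-5) error stop 'sum and loop disagree'
  print *, 'max difference:', maxval(abs(total - manual))
end program sum_dim_shape
```

If a statement of this form conforms and the run-time check still reports a mismatch, that would suggest the check rather than the code is at fault, though this sketch alone cannot confirm that for the original program.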
andrew_4619

The problem with the code sample in #1 is that the sum is using array slices that are not contiguous, so I think the code will be creating temporary arrays on the stack. That can cause a stack overflow if they are large. That aside, I am guessing there could be a leak in the code that makes the temporaries.

What do you mean by 'manually doing'? Do you mean making nested loops to do the summation element by element? Maybe pasting some code would be best.

And finally, I suggest making a small self-contained code example that demonstrates the leak, for others to play with and so it can be sent to Intel support.
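A reproducer along these lines might look like the sketch below. The extents, the iteration count, and the value of `densitymax` are all hypothetical, since the thread does not give the real ones, and whether this actually reproduces the leak with ifort 19.1 is untested. It only exercises the same `any(... > sum(..., dim=3))` pattern repeatedly so that memory use can be watched from outside.

```fortran
program sum_leak_repro
  implicit none
  ! Hypothetical extents; the real program's sizes are not known.
  integer, parameter :: nx = 8, ny = 64, nz = 64, ni = 4, np = 16
  integer, parameter :: niter = 1000
  real, allocatable :: f(:,:,:,:,:), densitytest(:,:,:)
  real :: densitymax
  integer :: x, iter, hits

  ! Fourth index starts at 0 to match the f(x,:,:,0,:) reference in the thread.
  allocate(f(nx, ny, nz, 0:ni, np), densitytest(nx, ny, nz))
  call random_number(f)
  call random_number(densitytest)
  densitymax = 0.5   ! arbitrary threshold for the illustration
  hits = 0

  ! Evaluate the suspect expression many times; if the temporaries
  ! created for the non-contiguous slices leak, memory use should grow.
  do iter = 1, niter
    do x = 1, nx
      if (any(densitytest(x,:,:) > densitymax) .and. &
          any(densitytest(x,:,:) > sum(f(x,:,:,0,:), dim=3))) hits = hits + 1
    end do
  end do

  print *, 'condition true', hits, 'times out of', niter * nx
end program sum_leak_repro
```

Building it once with and once without `-check all` would show whether the run-time checks have any effect on the behaviour.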

 

Noro
By manual summation, I mean doing it with a do-loop like:

    sumtest=0
    do p= 1, np
      sumtest(:,:)= sumtest(:,:) +f(x-v(i,1),:,:, i,p)*factor(x,i,:,:,p)
    enddo
jimdempseyatthecove

In your original code

do x= xlim_sup, xlim_inf, -1
  do i=0, ilim_sup
    densitytest(x,:,:) = densitytest(x,:,:) +sum( f(x-v(i,1),:,:, i,:)*factor(x,i, :,:,:), dim=3)
  enddo
enddo

The content of the sum is a vector expression (array product) that is producing unnecessary products as you are then only extracting the sum of the resultant array's dim3. You should consider producing the temporary array product of the cells that will produce the values to be summed.

Jim Dempsey

Noro
Thanks Jim for your suggestion. However, I do not see what the benefit would be of creating a temporary 3-D array to hold the product f(x-v(i,1),:,:, i,:)*factor(x,i, :,:,:) and then using the function sum on dimension 3, versus a do-loop over the third dimension that builds a temporary 2-D array containing the summed product directly.

To make it clearer, what is the gain of your suggestion (if I understood it correctly):

    do x= xlim_sup, xlim_inf, -1
      do i=0, ilim_sup
        temp3D(:,:,:) = f(x-v(i,1),:,:, i,:)*factor(x,i, :,:,:)
        densitytest(x,:,:) = densitytest(x,:,:) +sum( temp3D, dim=3)
      enddo
    enddo

vs

    do x= xlim_sup, xlim_inf, -1
      do i=0, ilim_sup
        temp2D=0
        do p=1,pmax
          temp2D(:,:) = temp2D(:,:) + f(x-v(i,1),:,:, i,p)*factor(x,i, :,:,p)
        enddo
        densitytest(x,:,:) = densitytest(x,:,:) +temp2D(:,:)
      enddo
    enddo

Also, could you explain a bit why?
jimdempseyatthecove

The original code is illustrating a compiler bug (memory leak as reported).
The original code is also inefficient in two ways:

1) The array expression is generating temporaries (compiler doing this)
2) Due to your preference of index order, the array slices are not contiguous (and thus experience collection into a contiguous temporary)

The reason for the collection into a contiguous temporary is such that the CPU SIMD instructions can be exploited.
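To illustrate the contiguity point, a minimal sketch using the Fortran 2008 intrinsic `is_contiguous` is shown here. The array `f` and its extents are hypothetical stand-ins for the arrays in post #1. Because Fortran stores arrays in column-major order, a slice that fixes the first index picks elements that are spread out in memory, while a slice that keeps the leading dimensions whole is one contiguous block.

```fortran
program slice_contiguity
  implicit none
  ! Hypothetical 5-D array, shaped like f in post #1 but with made-up extents.
  real, allocatable :: f(:,:,:,:,:)

  allocate(f(4, 3, 2, 2, 5))
  f = 0.0

  ! First index fixed: consecutive elements of the slice are separated
  ! in memory by the extent of dimension 1 (4 elements here).
  print *, 'f(1,:,:,1,:) contiguous? ', is_contiguous(f(1, :, :, 1, :))

  ! Leading dimensions taken whole, trailing indices fixed:
  ! the slice occupies one unbroken block of memory.
  print *, 'f(:,:,:,1,1) contiguous? ', is_contiguous(f(:, :, :, 1, 1))
end program slice_contiguity
```

This only reports the memory layout of the slices; whether a given compiler actually creates a temporary for them depends on the compiler and its options.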

You might try the following with your post #1 code

block
  integer :: I1, I2, I3
  real, dimension(:,:,:), allocatable :: Temp3D, Prod
  real, dimension(:,:), allocatable :: Temp2D
  I1 = size(factor, dim=3)
  I2 = size(factor, dim=4)
  I3 = size(factor, dim=5)
  allocate(Temp2D(I1,I2), Temp3D(I1, I2, I3), Prod(I1, I2, I3))
  
  do x= xlim_sup, xlim_inf, -1
    do i=0, ilim_sup
      Temp3D = f(x-v(i,1),:,:, i,:) ! copy slice into contiguous memory
      Prod = factor(x,i, :,:,:)     ! copy slice into contiguous memory
      Prod = Prod * Temp3D
      Temp2D = sum( Prod, dim=3)
      densitytest(x,:,:) = densitytest(x,:,:) + Temp2D
    enddo
  enddo 
end block

I haven't tested the above to see whether it eliminates all expression temporaries.

Note that if you move the x index of densitytest to the last position, then the last statement of the inner loop is vectorizable.

Jim Dempsey
