Uninitialized data access

JohnNichols · ‎11-27-2019

   ! REAL(KIND=DP), allocatable :: XAlloc(:)
   ! REAL(KIND=DP), allocatable :: DXAlloc(:)
   ! REAL(KIND=DP), allocatable :: DDXAlloc(:)
   ! REAL(KIND=DP), allocatable :: H20(:)
  !  REAL(KIND=DP), allocatable :: Freq(:,:)                   ! Frequency Set
  !  REAL(KIND=DP), allocatable :: dxR(:)                      ! Real Velocity vector
     REAL(KIND=DP)  XAlloc(3)
     REAL(KIND=DP)  DXAlloc(3)
     REAL(KIND=DP)  DDXAlloc(250)
     REAL(KIND=DP) H20(mn_8)
     REAL(KIND=DP) Freq(3,mn1)                   ! Frequency Set
     REAL(KIND=DP) dxR(250)                      ! Real Velocity vector

  !          ALLOCATE (XAlloc(ModelA%N), DXAlloc(ModelA%N),DDXAlloc( ModelA%numb), H20(mn_8), ModelA%Freq(ModelA%N,mn_3),dxR( ModelA%numb),stat = err)

  !          write(*,211)err
 !           write(sRB,211)err

201         Format('             Number of equations                           :: ',I4)
211         Format('             Error code on allocation of deflection arrays :: ',i4)

I am having a problem with the ALLOCATE statements. They appeared to work until I tried to include the DISLIN graphing package. As long as the DISLIN calls came before the ALLOCATE statements they worked, but when the DISLIN calls come after them I get a heap error on the call to one of the DISLIN subroutines.

I pulled out all the DISLIN code and implemented a DXF drawer. This worked until I tried to create a second function line on the graph: the allocated data being passed from the data analysis routine to the drawing routine does not transfer correctly.

I was wondering how to debug allocate errors; I have not had this problem before.

The issue is this is an RKN ODE solver, so the number of equations can vary and the input data varies depending on the equations so I was trying to avoid using declared arrays.

These are general ODEs using Lear's algorithm from NASA.

Help appreciated.

John

Steve_Lionel · ‎11-27-2019

Insufficient information, and the sources you attached aren't complete. What kind of "heap error"? What do you mean by "on the call to one of the DISLIN routines"? At the call itself, or during the routine?

Generally, a problem of this kind that appears when you add calls implies a data corruption issue, where something wrote past the end (or start) of an allocatable array. Intel Inspector XE can sometimes detect such errors, but my experience is that it tends to also put out a lot of noise for Fortran code.

JohnNichols · ‎11-27-2019

    read(si,*)numb
    write(*,800)numb
    write(srB,800)numb

    ModelA%numb = numb
800 Format('   Number of time steps :: ',I4)
    
        ALLOCATE (xR(ModelA%numb),stat = err)
        ALLOCATE (DDXAlloc(ModelA%Numb),stat = err)

Dear Steve:

Thank you for the response. You certainly confirmed the issue of how hard it is to solve this type of problem.

I made a mistake. There are two critical numbers: N, the number of equations, and numb, the number of time steps. I allocated the large data arrays that depend on numb before numb was set, so their length was unknown. This was not a problem when I had only one array, DDXAlloc: it worked because the data was read into the array at the "END" of the heap and did not get overwritten. But when I put in DISLIN, it must have written to the heap and collided with DDXAlloc. I tried to solve the problem by using a DXF drawer, but as soon as I wanted two curves I created xR, again before numb was initialized, so it overlapped the real DDXAlloc. This shows up in the plot of the results, where the second curve is a partial copy of the real acceleration curve.

So I followed your ideas and simply added one thing at a time and found the error. The plot then told me what was happening, and solving it only took a short while.

But why does ALLOCATE not throw an error if the length of the array is some unknown number?

Merry thanksgiving

John

andrew_4619 · ‎11-27-2019

Presumably it is a number even if you have not set it, which is maybe why it 'works'. If you have the run-time error check for uninitialised variables enabled in a debug build, then the first time an uninitialised variable is touched it throws an error, so you find such errors very quickly.

jimdempseyatthecove · ‎11-27-2019

What this sounds like to me is an allocation occurs on one side (either Fortran or DISLIN) and the address and size is passed to the other side for use. However the size allocated on the one side is less than the size required on the other side. This could corrupt the heap .OR. trash data that was allocated just following the errant array.

Add full runtime diagnostics (array bounds checking, uninitialized variables, unallocated variables). While this can catch the errors in your Fortran code (except for passing the wrong array size), it won't do anything in the DISLIN code.

If the runtime diagnostics do not point you to the problem, you might be able to "hack" an attempted fix by making your allocations larger than required:

    read(si,*)numb
    write(*,800)numb
    write(srB,800)numb

    ModelA%numb = numb
800 Format('   Number of time steps :: ',I4)
    
        ALLOCATE (xR(ModelA%numb + paddThatWorks),stat = err)
        ALLOCATE (DDXAlloc(ModelA%Numb + paddThatWorks),stat = err)

*** Be careful to specify an array slice as opposed to the whole array:

xR(1:ModelA%numb) = ...

Jim Dempsey

JohnNichols · ‎11-27-2019

I tried all of the combinations I could think of to generate an error using the uninit variable. An exception was thrown, but not at the use of the null variable in the allocate -- it just continued on, stopping at the end with an exception that the heap is corrupted. You need to have two allocation errors to see the results in the output.

Interesting problem

andrew_4619 · ‎11-27-2019

Allocating at zero length is not an error; it is valid. Your description suggests the variable was not uninitialised, it just wasn't properly initialised.
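A minimal sketch of this point, assuming nothing beyond standard Fortran: the names `n`, `a` and `err` are illustrative and not taken from the attached sources. It shows that ALLOCATE with a size of zero succeeds and leaves an allocated, zero-sized array, and then guards against sizes that cannot be right.

```fortran
program zero_length_alloc
  implicit none
  integer :: n, err
  real, allocatable :: a(:)

  n = 0                        ! stands in for a size that was never read in
  allocate (a(n), stat=err)    ! legal: creates a zero-sized array

  print *, 'stat      =', err           ! 0, i.e. success
  print *, 'allocated =', allocated(a)  ! T
  print *, 'size      =', size(a)       ! 0

  ! Any element reference such as a(1) is now out of bounds. Without
  ! bounds checking it may silently touch neighbouring heap memory.

  ! A defensive alternative: refuse sizes that cannot be right.
  deallocate (a)
  if (n < 1) then
    print *, 'refusing to allocate: n =', n
  else
    allocate (a(n), stat=err)
  end if
end program zero_length_alloc
```

Because the allocation itself is valid, stat= cannot flag it; only an explicit test on the size, or bounds checking at the first element access, will.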

JohnNichols · ‎11-27-2019

 Type Model
        Integer             TypeAnal            ! Type 1 is sample and Type 2 is 2 axle truck on simply supported bridge 
        Integer             numb                ! Number of time steps 
        Character*70        Description         ! Job Description
        integer             N                   ! Number of equations 
        REAL (KIND=dp)      h                   ! Tolerance on the results
        REAL (KIND=dp)      XStart              ! Starting point for the analysis
        REAL (KIND=dp)      t0                  ! Starting time - usually zero

is in the module

in the main code is

TYPE (Model), TARGET :: ModelA

CALL cpu_time(t(1))

numb is not assigned a value until after a read statement, but after the type statement it has a value of zero.

Fortran has over many years had some interesting behaviour. I well remember Powerstation causing matrix errors that disappeared years later and had not occurred before with Fortran 3.3 from Microsoft -- bring back the floppies and the COMPAQ Portable.

So you are correct -- the behaviour is acceptable provided that the array is the last thing on the heap and the heap is long.

John
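One way to make this kind of ordering mistake visible is to give the size components a default value that can never be valid. The following is only a sketch: the reduced `Model` type, the sentinel value -1, the helper `alloc_steps` and its error code are all assumptions made for illustration, not part of the posted code.

```fortran
module model_sketch
  implicit none
  type :: Model
    integer :: numb = -1   ! sentinel: number of time steps not yet read
    integer :: N    = -1   ! sentinel: number of equations not yet read
  end type Model
contains
  ! Hypothetical helper: allocate only when the size has been set.
  subroutine alloc_steps(m, a, err)
    type(Model), intent(in)        :: m
    real, allocatable, intent(out) :: a(:)
    integer, intent(out)           :: err
    if (m%numb < 1) then
      err = -1             ! illustrative error code: size was never set
      return
    end if
    allocate (a(m%numb), stat=err)
  end subroutine alloc_steps
end module model_sketch

program demo
  use model_sketch
  implicit none
  type(Model)       :: ModelA
  real, allocatable :: xR(:)
  integer           :: err

  call alloc_steps(ModelA, xR, err)   ! called too early: numb is still -1
  print *, 'before read: err =', err, ' allocated =', allocated(xR)

  ModelA%numb = 250                   ! stands in for read(si,*) numb
  call alloc_steps(ModelA, xR, err)
  print *, 'after read:  err =', err, ' size =', size(xR)
end program demo
```

With default initialization in the type, the component no longer depends on whatever happens to be in memory, and an allocation attempted before the read is rejected rather than silently producing a zero-sized array.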

JohnNichols · ‎11-27-2019

The really interesting challenge is the use of second order general ODE solvers using the RKN methods. In a lot of the papers I have read, people solve the problem but leave out the velocity component, to use Newtonian physics nomenclature. But that does not work in reality.

Thank God for Lear at NASA for the algorithms and J Williams at ERC.

Although J Williams' ODE stuff is no longer findable on his site at degenerateconic.com, the site is worth a look, as is his Twitter feed.

jimdempseyatthecove · ‎11-27-2019

I noticed that you are passing two sizes for your arrays N and numb (RKNC does this).

It is not unusual for a programmer to incorrectly interchange N and numb by mistake.

As a means to catch these errors, in your module procedures replace (N) and (numb) with (:) on the dummy argument declarations:

REAL (KIND=dp) h,xstart,X(:),xf(:),xdf(:),t0,dt,dX(:),xddf(:),lx(:),lxon(:), load(:),loadQ(:), PLoad, QLoad,omega,KT(:),K1,K2,lp, rdot(:),adt, tau, ddx(:),dxR(:)

Likewise elsewhere in modules. Also, should you have non-module procedures that have dummy arguments declared similarly, then perform the same edit *** but also add the -gen-interfaces or /gen-interfaces option, as the case may be.

Note, when using (N) and (numb) on the dummy arguments, the array bounds checking will take you at your word as to the size of the arrays. When using (:), and interfaces, the array descriptor passes the caller's known/assumed size. Then with array bounds checking enabled you will get out-of-bounds indications from the caller's known sizes.

Jim Dempsey

jimdempseyatthecove · ‎11-27-2019

Or with (:,:) for 2D arrays

Jim Dempsey
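The difference between the two kinds of dummy argument can be sketched as follows. This is an illustration only: the module and procedure names are made up, and the values 3 and 250 simply echo the fixed sizes in the first post.

```fortran
module shape_sketch
  implicit none
contains
  subroutine report_explicit(x, n)
    integer, intent(in) :: n
    real, intent(in)    :: x(n)   ! explicit shape: extent taken from n
    print *, 'explicit-shape dummy sees size', size(x)
  end subroutine report_explicit

  subroutine report_assumed(x)
    real, intent(in) :: x(:)      ! assumed shape: extent comes from the caller
    print *, 'assumed-shape dummy sees size ', size(x)
  end subroutine report_assumed
end module shape_sketch

program demo
  use shape_sketch
  implicit none
  integer, parameter :: N = 3, numb = 250
  real :: dxR(numb)

  dxR = 0.0
  call report_explicit(dxR, N)    ! N and numb interchanged: dummy believes 3
  call report_assumed(dxR)        ! reports the real extent, 250
end program demo
```

The explicit-shape routine accepts the wrong count without complaint, while the assumed-shape routine always works from the caller's actual extent, which is what lets bounds checking flag a genuine overrun.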

JohnNichols · ‎11-27-2019

andrew_4619 wrote:
Presumably it is a number even if you have not set it which is maybe why it 'works'. If you in debug have the run time error check set for uninitialsed variables then when an uninit is touched the first time it throws an error so you find such errors very quickly.

As soon as it is created it has a zero value, so the uninit check does not detect the zero. The problem comes when I use the zero-length array and it reads garbage from the heap.

It just takes care.

JohnNichols · ‎11-27-2019

jimdempseyatthecove (Blackbelt) wrote:
I noticed that you are passing two sizes for your arrays N and numb (RKNC does this).
It is not unusual for a programmer to incorrectly interchange N and numb by mistake.
As a means to catch these errors, in your module procedures replace (N) and (numb) with (:) on the dummy argument declarations
REAL (KIND=dp) h,xstart,X(:),xf(:),xdf(:),t0,dt,dX(:),xddf(:),lx(:),lxon(:), load(:),loadQ(:), PLoad, QLoad,omega,KT(:),K1,K2,lp, rdot(:),adt, tau, ddx(:),dxR(:)
Likewise elsewhere in modules. Also, should you have non-module procedures that have dummy variables declared similarly, then perform the same edit *** but also add the -gen-interfaces or /gen-interfaces as the case may be.

Note, when using (N) and (numb) on the dummy arguments, the array bounds checking will take you at your word as to the size of the arrays. When using (:), and interfaces, the array descriptor passes the caller's known/assumed size. Then with array bounds checking enabled you will get out-of-bounds indications from the callers known sizes.

Jim Dempsey

I will try this -- thank you -- they are interesting errors. I did interchange N and numb, and this was the first set of errors. The later errors are more subtle.

If I create a type in a module and place a variable, say int m, in the type, then as soon as I declare the type in the main code m has a value of zero that is not caught by the uninit check. If I use m to allocate the last array, GH(:), as I had been doing, the program reads the correct data. But as soon as I expanded the problem, the new array at the end came in at about GH(3) and overwrote GH -- GH still exists, and if you read it you get some correct and some crap.

I tested it out with some graphs and could demonstrate the offset.

A bit like commons with undeclared variables -- you may as well give up

Thanks again

jimdempseyatthecove · ‎11-28-2019

Uninitialized data access checks will not catch all such errors. It is like a spotting dog... sometimes she misses.

Do not assume uninitialized variables have a value of 0.

>> If I use the m to allocate the last array - GH(:) as I had been doing the program reads the correct data, but as soon as I expanded the problem, the new array at the end came in at about GH(3) and overwrote GH - but GH still exists and if you read it you get some correct and some crap.

I suggest you insert some diagnostic code; I suspect whatever is causing this problem will be completely obvious once you see it.

The diagnostics could look something like this:

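print *,"some identifying text such as subroutine name and line number"
if (allocated(GH)) then
  ! LOC is a compiler extension (Intel, gfortran). C_LOC would need GH to be a
  ! TARGET, its result cannot be printed directly, and GH(1) is invalid when
  ! GH has zero length.
  print *,"GH(",size(GH),") at ", loc(GH)
else
  print *,"GH not allocated"
endif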

Place that not only after allocation, but also in other procedures that use GH.

Note too that your indexing variable into GH could be goofed up.

Jim Dempsey

JohnNichols · ‎11-28-2019

Jim:

I fully agree; this is a nice solution, thanks.

When I have all the equations in and working I will go back and implement this algorithm.

Happy Thanksgiving.

John
