Resize Array

brunocalado · ‎12-06-2009

Is there a way to resize an allocatable array without losing the data already in it?

Sample:

integer :: I

integer, allocatable :: A(:)

ALLOCATE(A(10))

A = (/ (I, I=1,10) /)

! Now I want to add 10 more elements, but I can't lose what is already there

Thank you!

Tim_Gallagher · ‎12-06-2009

Not easily... at least not using Fortran 90/95. You would have to do something like:

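SUBROUTINE ResizeArray(A, newSize)
IMPLICIT NONE

! A must be ALLOCATABLE to be deallocated here; allocatable dummy
! arguments need TR 15581 or Fortran 2003 support and an explicit interface
INTEGER, DIMENSION(:), ALLOCATABLE, INTENT(INOUT) :: A
INTEGER, INTENT(IN) :: newSize   ! assumed to be >= SIZE(A)

INTEGER, DIMENSION(:), ALLOCATABLE :: B

ALLOCATE(B(LBOUND(A,1):UBOUND(A,1)))

B = A

DEALLOCATE(A)

ALLOCATE(A(newSize))

! The new A always starts at index 1
A(1:SIZE(B)) = B

DEALLOCATE(B)
END SUBROUTINE ResizeArray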

Basically you make a temporary copy of your data, destroy the original array, then reallocate it to the new size and move the data back in. I just wrote that routine off the top of my head, no promises it's flawless. For instance, the starting index of A may change this way if you don't start from 1 since it's inside a subroutine.

But, if you do this inline with your code, or if A always starts from 1, it should be okay.
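A minimal, self-contained sketch of the copy, deallocate, reallocate approach, with a driver. The names resize_mod, resize_array, and demo_resize are illustrative and not from the original post. It assumes a compiler that accepts allocatable dummy arguments (TR 15581 or Fortran 2003); placing the routine in a module gives the caller the explicit interface that such an argument requires.

```fortran
module resize_mod
  implicit none
contains
  subroutine resize_array(a, new_size)
    integer, allocatable, intent(inout) :: a(:)
    integer, intent(in) :: new_size
    integer, allocatable :: b(:)
    integer :: n

    n = min(size(a), new_size)            ! number of elements that survive
    allocate(b(n))
    b = a(lbound(a,1):lbound(a,1)+n-1)    ! temporary copy of the old data
    deallocate(a)
    allocate(a(new_size))                 ! new array always starts at index 1
    a(1:n) = b                            ! copy the old data back
    deallocate(b)
  end subroutine resize_array
end module resize_mod

program demo_resize
  use resize_mod
  implicit none
  integer, allocatable :: a(:)
  integer :: i

  allocate(a(10))
  a = (/ (i, i = 1, 10) /)
  call resize_array(a, 20)
  a(11:20) = 0                            ! new elements are undefined until assigned
  print *, size(a), a(1:10)
end program demo_resize
```

Because the dummy argument is itself allocatable, the caller's array is the one that gets deallocated and reallocated, so the new allocation is visible on return.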

In languages like Matlab that "automatically" do this for you, I have a feeling (unproven) that they just do the above steps internally. That's why arrays that grow inside a DO loop in Matlab get really slow really fast if the loop is big.

Tim

rreis · ‎12-07-2009

Hmmm... if Fortran passes by reference (memory address), don't you lose the reference to A when you deallocate it? Does the caller then know that A on return is the newly allocated array? Wouldn't it be better to use pointers, with the growing array being just a pointer that you repoint at the ever-growing array?

real, dimension(:), allocatable, target :: cur, temp
real, dimension(:), pointer :: A

allocate(cur(10))

A => cur

! ... do things ...

allocate(temp(10))
temp = A
deallocate(cur)
allocate(cur(new_dim))

cur(1:10) = temp(:)
deallocate(temp)

A => cur

or something like it. I also think there's a penalty for repeatedly deallocating and allocating arrays; it is better to allocate a big chunk and just write to it (and then, if necessary, increase it).

Hope that makes sense...

Hirchert__Kurt_W · ‎12-07-2009

Tim is right about Fortran 90/95, but ifort supports the Fortran 2003 intrinsic subroutine MOVE_ALLOC which makes things somewhat less obnoxious:

integer, allocatable :: TEMP(:)
...
ALLOCATE(TEMP(LBOUND(A,1):UBOUND(A,1)+newspace)) ! or whatever new bounds you want
TEMP(LBOUND(A,1):UBOUND(A,1)) = A ! or wherever you want the old data to go in the new array
CALL MOVE_ALLOC(FROM=TEMP,TO=A)

Note that the call to MOVE_ALLOC effectively deallocates A, moves the new allocation to A, and leaves TEMP deallocated.

This paradigm gives you great flexibility: You control the new size and bounds of the array in the first statement. You control what part of the data you wish to keep and where you want it in the second statement.

I wrote my example with some generality. If, for example, you know your lower bound will always be 1, you can simplify this code by eliminating the calls to LBOUND. If you have variables that already contain the old or new sizes of A, you may be able to eliminate one or both of the calls to UBOUND. The essential elements of this paradigm are the following:

1. the declaration of a temporary allocatable array with the same rank, type, and type parameters as the array whose size you are changing (If you are resizing several arrays with the same rank, type, and type parameters, you can reuse the same temporary array for all of them)

2. an ALLOCATE statement to allocate the temporary array to the new size you want

3. one or more statements to transfer the data you wish to keep from the old array to the new

4. a call to MOVE_ALLOC to make the new allocation now be the one referred to by your "permanent" allocatable array
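The four steps above can be sketched as a complete program. This is only an illustration under stated assumptions: the program name, the initial size of 10, and the value of newspace are made up for the example, and a compiler supporting the Fortran 2003 MOVE_ALLOC intrinsic is assumed.

```fortran
program demo_move_alloc
  implicit none
  ! Step 1: a temporary with the same rank and type as a
  integer, allocatable :: a(:), temp(:)
  integer :: i, newspace

  allocate(a(10))
  a = (/ (i, i = 1, 10) /)
  newspace = 10

  ! Step 2: allocate the temporary with the new bounds
  allocate(temp(lbound(a,1):ubound(a,1)+newspace))
  ! Step 3: copy the data to keep
  temp(lbound(a,1):ubound(a,1)) = a
  ! Step 4: a takes over temp's allocation; temp is left deallocated
  call move_alloc(from=temp, to=a)

  a(11:20) = (/ (i, i = 11, 20) /)
  print *, size(a), allocated(temp)
  print *, a
end program demo_move_alloc
```

After the call, a should report 20 elements and temp should no longer be allocated. The old data is copied only once, whereas the Fortran 90/95 approach copies it twice.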

Does this sound like something you can live with?

-Kurt

Tim_Gallagher · ‎12-07-2009

I haven't ever done anything in F2003, so maybe this isn't correct, but isn't there another way also? I believe, unless I'm reading it wrong, F2003 will automagically resize arrays for you during assignment if the right-hand side is larger than the left. See section 7.1.4.3.

So that makes this a little simpler still because you would just need:

A = (/ (stuff(I), I=1,10) /)

to set up the first size. Then when you need to add to it:

A = (/ (A(I), I=1,SIZE(A)), newStuff1, newStuff2, ... /)

which should copy the first SIZE(A) elements back to A, then add however many new things you need. So it could all be done in one line.

But, let's remember that even this method will be very slow when arrays get large because internally it is still deallocating then allocating stuff. You can't just tack on extra space into arrays because there is no guarantee the memory after the end of the array isn't already claimed somewhere in the code or by another process.

The best bet is to always allocate to the size you think you need, and only expand as a last resort. If somebody knows F2003 and can comment if that will work, let me know. Like I said, I've never used it but it seems reasonable.
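One common way to follow the "allocate generously, expand rarely" advice is to keep some spare capacity and double it only when it runs out. The sketch below is a hypothetical illustration, not code from this thread: the names growable_mod and append, the starting capacity of 8, and the doubling factor are all assumptions, and it relies on the Fortran 2003 MOVE_ALLOC intrinsic.

```fortran
module growable_mod
  implicit none
contains
  ! Append val to buf(1:n), doubling the capacity when buf is full.
  subroutine append(buf, n, val)
    integer, allocatable, intent(inout) :: buf(:)
    integer, intent(inout) :: n          ! elements currently in use
    integer, intent(in) :: val
    integer, allocatable :: temp(:)

    if (.not. allocated(buf)) allocate(buf(8))
    if (n == size(buf)) then
      allocate(temp(2*size(buf)))
      temp(1:n) = buf(1:n)
      call move_alloc(from=temp, to=buf)
    end if
    n = n + 1
    buf(n) = val
  end subroutine append
end module growable_mod

program demo_growable
  use growable_mod
  implicit none
  integer, allocatable :: buf(:)
  integer :: n, i

  n = 0
  do i = 1, 1000
    call append(buf, n, i)
  end do
  print *, n, size(buf)    ! used elements vs. allocated capacity
end program demo_growable
```

With doubling, appending 1000 elements triggers only a handful of reallocations instead of one per element, at the cost of some unused space at the end of the buffer.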

Tim

Steven_L_Intel1 · ‎12-08-2009

To get the automatic reallocation that Tim mentions, you must compile with "-assume realloc_lhs".
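A small sketch of what reallocation on assignment looks like in practice. The program name is made up, and it assumes the compiler is applying Fortran 2003 semantics (for the ifort discussed here, that means the -assume realloc_lhs option mentioned above).

```fortran
program demo_realloc_lhs
  implicit none
  integer, allocatable :: a(:)
  integer :: i

  a = (/ (i, i = 1, 10) /)                     ! a is allocated by the assignment
  print *, size(a)
  a = (/ (a(i), i = 1, size(a)), 11, 12 /)     ! a is reallocated to hold 12 elements
  print *, size(a)
end program demo_realloc_lhs
```

Under those semantics the two sizes printed should be 10 and 12. Without them, assigning to an unallocated or differently shaped array is an error that the compiler may not diagnose.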

Hirchert__Kurt_W · ‎12-08-2009

Some miscellaneous comments on Tim's post:

1. This reallocation when the LHS is allocatable takes place any time the shape is different, whether it is bigger or smaller.

2. There is no section 7.1.4.3 in my copy of F2003. I think you meant 7.4.1.3.

3. To simplify the notation even more, you don't need to use an implied-DO to reference A when adding stuff:

A = (/ A, newstuff1, newstuff2, ... /)

4. This approach is, of course, dependent on knowing what values you want for newstuff1, newstuff2, .... This often isn't true. You can pick an arbitrary value (like zero) to put here, but such initialization is at least marginally more expensive than leaving some of the new space uninitialized.

5. This is notationally simpler for 1-dimensional arrays. As soon as you go to 2-dimensional arrays or higher, the notation for the RHS begins to get complicated. It's still possible to write everything, but it may be easier and clearer to write assignment statements as in the paradigm I presented.

6. I would be concerned about the possible cost of using this notation. A naive implementation of this notation might do twice as many ALLOCATEs, DEALLOCATEs, and data copies as are done using the paradigm I presented. The real question is whether the compiler is clever enough to recognize that it has an allocated temporary on the RHS and internally do the equivalent of MOVE_ALLOC to reallocate and assign A rather than doing a separate DEALLOCATE, ALLOCATE, and assign to the LHS. Can anyone comment on whether ifort is clever enough to do this?

If concise notation is your primary concern (and you are working with 1-dimensional arrays), I would go with Tim's suggestion (with my small modification in point number 3), but if speed is your primary concern, I would stick with the explicit MOVE_ALLOC paradigm unless you know that all the compilers you will be using do the optimization to convert Tim's notation into the equivalent of a MOVE_ALLOC. Thank you, Tim, for pointing out this alternative.

-Kurt

Tim_Gallagher · ‎12-08-2009

Kurt,

Thanks for correcting those things. As I said, I haven't actually tried it (or anything in F2003) so I'm not sure on the speed/implementation issues you pointed out. Sorry for the typo on the section number...

To prevent it from deallocating/reallocating if RHS is smaller, you would have to use specific bounds on the LHS, correct? Something like:

A(1:5) = RHS

where SIZE(RHS)=5?

But a statement like:

A = 0.0

would still set all values in A to be 0.0 without changing the size of A (assuming it is already allocated) since 0.0 is a scalar and not an array? It could be explicitly forced to do it by doing:

A(:) = 0.0

if it does misbehave and try to make SIZE(A) = 1. But I don't think it would change the size...

I do agree, your approach is much cleaner in cases except very simple ones like the example.

I imagine that the practice of reallocating the LHS based on the size of the RHS is discouraged (hence the compiler flag to enable it)? It could obviously create some very sloppy practices and really difficult to find bugs... I wonder why it was added to the standard, especially when the MOVE_ALLOC method seems to provide a general, and safe, way to do it.

Tim

Steven_L_Intel1 · ‎12-09-2009

A(:) = whatever

does not do the automatic (re)allocation.
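The three cases discussed above can be put side by side. This is a sketch assuming Fortran 2003 reallocation semantics are enabled; the program name and values are illustrative only.

```fortran
program demo_realloc_cases
  implicit none
  integer, allocatable :: a(:)

  allocate(a(10))

  a = 0                          ! scalar RHS: broadcast, no reallocation
  print *, size(a)

  a(1:5) = (/ 1, 2, 3, 4, 5 /)   ! array section on the LHS: no reallocation
  print *, size(a)

  a(:) = 7                       ! a(:) is also a section: no reallocation
  print *, size(a)

  a = (/ 1, 2, 3 /)              ! whole-array LHS with a different shape: reallocated
  print *, size(a)
end program demo_realloc_cases
```

The first three assignments should leave the size at 10; only the last one should change it, to 3. Note that with a section on the left, the shapes must already conform, because nothing will be resized to make them match.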

Hirchert__Kurt_W · ‎12-09-2009

Tim, you are right in pointing out that there is no reallocation if the RHS is scalar. All the cases where reallocation is done are cases that were not legal in F90/F95 because of shape mismatches.

MOVE_ALLOC was added to the F2003 draft very late in the process. Unless my memory is playing tricks on me, "magic" reallocation of the LHS was already in the draft at that time. From the point of view of the committee, they were intended to address different problems. In the case we have been talking about, we know in advance what size we want to make the array, so we ALLOCATE it to the right size before assigning to it. "Magic" LHS reallocation was intended to address the cases where the size of the object needed is discovered in the process of computing the object. This kind of frequent reallocation is potentially expensive, but if your problem really is this dynamic in size, it is also potentially invaluable.

[For a number of reasons, including some of those you cite, I would have preferred this functionality be associated with a different assignment operator (say ":="), so a compiler could readily tell the difference between assignments where shape mismatches are a programming error and those where this reallocation functionality is desired, but the committee did not wish to invent a new assignment operator and went this way instead. Presumably, when Intel supports all of F2003 in ifort, they will need to change the default to -assume realloc_lhs.]

-Kurt

Steven_L_Intel1 · ‎12-09-2009

There is automatic reallocation if the left side is a deferred-length allocatable character variable. We do not plan to change the default for realloc_lhs. We do intend to have a single option you can throw which says "give me F2003 semantics for everything", rather than the half-dozen or more separate options you need to specify now for that.
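For illustration, a deferred-length allocatable character variable behaves like this. The sketch assumes a compiler that supports this Fortran 2003 feature; the program name and strings are made up.

```fortran
program demo_deferred_char
  implicit none
  character(len=:), allocatable :: s

  s = 'resize'                 ! allocated by the assignment, length 6
  print *, len(s)
  s = s // ' array'            ! reallocated to the new length, 12
  print *, len(s)
end program demo_deferred_char
```

Here the length of s follows the length of the right-hand side on each assignment, so no explicit ALLOCATE or MOVE_ALLOC is needed.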