I would like to ask whether it is possible to create a two-dimensional array in Fortran with a variable number of columns per row, as is possible in other languages.
Is there an alternative to using a derived type for this purpose? For example:
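program test_ragged
  implicit none
  type VarSizedArray
    integer, allocatable :: col(:)
  end type VarSizedArray
  type(VarSizedArray), allocatable :: row(:)

  allocate(row(2))
  allocate(row(1)%col(1))   ! row 1 holds one value
  allocate(row(2)%col(2))   ! row 2 holds two values
  row(1)%col(1) = 11        ! % is the standard component selector;
  row(2)%col(1) = 21        ! the . form is a compiler extension
  row(2)%col(2) = 22
  print*, row(1)%col        ! print the component, not the derived-type object
  print*, row(2)%col
end program test_ragged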
Thank you very much Arjen.
I just did not want to miss the chance of a shorter way. Sometimes performance depends on how you implement certain parts of the code. I have always had the notion that the more primitive a data type is, the better the performance (this might be a false belief). The final reason is that the people I work with come up with alternative solutions over time, and I am then surprised not to have been aware of them, because once I had made my way with one solution I assumed it was the only way.
In your example, you are using an array of arrays. Perhaps a better and more efficient way of doing this would be to use only 2 arrays. The first one stores all the values, and the second one indicates where the values for each row start.
program test2d
  implicit none
  integer,allocatable:: val(:)
  integer,allocatable:: rowptr(:)
  integer nnz, nrow, i, j1,j2

  nrow = 2 ! total number of rows
  nnz = 3 ! total number of values for all rows
  allocate( val(nnz), rowptr(nrow+1) )
  val = [ 11, 21, 22 ]
  rowptr = [ 1, 2, 4 ]
  do i = 1, nrow
    j1 = rowptr(i)
    j2 = rowptr(i+1) - 1
    print*, val( j1 : j2 )
  end do
  stop
end program test2d
Thank you, Roman. I can't free my mind from the physics of the problem when implementing it in a program. This shows me that, once the physics is defined, every problem needs further thought about internal storage and computational performance.
Your suggestion opens up some other possibilities for my problem. I will share some performance results as soon as I find time to prepare a preliminary implementation.
Another alternative is to use an ordinary two-dimensional array whose second dimension is equal to the maximum row length you need, together with a parallel array holding the actual length of each row. Something like (inspired by Roman's example):
program test2db
  implicit none
  integer,allocatable:: val(:,:)
  integer,allocatable:: rowlen(:)
  integer maxrowlength, nrow, i, jmax

  nrow = 2 ! total number of rows
  maxrowlength = 2 ! maximum length of the row
  allocate( val(nrow,maxrowlength), rowlen(nrow) )
  ! reshape fills column by column: row 1 = (11, 22), row 2 = (21, 0)
  val = reshape( [ 11, 21, 22, 0 ], [nrow, maxrowlength] )
  rowlen = [ 2, 1 ] ! number of values actually used in each row
  do i = 1, nrow
    jmax = rowlen(i)
    print*, val( i, 1 : jmax )
  end do
  stop
end program test2db
You waste some memory of course, but addressing the rows is slightly simpler. It might be faster as well, as the section of the array you need is more predictable. But whether that is an advantage in practice is an aspect that needs to be measured, not divined.
Both Roman's and Arjen's suggestions may run into issues with large datasets.
Arjen is using a rectangular array to hold sparse data. This may cause memory capacity issues for very large datasets. Roman's method is not subject to this over-allocation issue.
Both solutions require knowing the entire array size prior to allocation, whereas in the original proposal (#1) you only have to determine the size of the next row. Before selecting a method, you need to consider memory capacity and initial allocation.
Both methods can be aided by helper functions that return an element reference and a row reference. A column reference could return an array of integers holding the linear indexes in Roman's suggestion, but for sparse arrays you would need to specify the value to use for missing elements (0.0, NaN, -n, ...).
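The helper functions mentioned above could look roughly like the following for Roman's two-array layout. This is only a sketch: the module name ragged_helpers, the function names get_row, get_element and get_column, and the missing argument are all made up for illustration, and it returns values rather than linear indexes. It also assumes a compiler with Fortran 2003 automatic allocation on assignment enabled.

```fortran
module ragged_helpers
  implicit none
contains

  ! Copy of row i as a 1D array (hypothetical helper).
  pure function get_row(val, rowptr, i) result(row)
    integer, intent(in) :: val(:), rowptr(:), i
    integer, allocatable :: row(:)
    row = val(rowptr(i) : rowptr(i+1) - 1)
  end function get_row

  ! Element (i,j), or "missing" when row i holds fewer than j values.
  pure integer function get_element(val, rowptr, i, j, missing)
    integer, intent(in) :: val(:), rowptr(:), i, j, missing
    if (j >= 1 .and. j <= rowptr(i+1) - rowptr(i)) then
      get_element = val(rowptr(i) + j - 1)
    else
      get_element = missing
    end if
  end function get_element

  ! Column j over all rows, padded with "missing" where a row is too short.
  pure function get_column(val, rowptr, j, missing) result(col)
    integer, intent(in) :: val(:), rowptr(:), j, missing
    integer :: col(size(rowptr) - 1)
    integer :: i
    do i = 1, size(col)
      col(i) = get_element(val, rowptr, i, j, missing)
    end do
  end function get_column

end module ragged_helpers

program test_helpers
  use ragged_helpers
  implicit none
  integer, allocatable :: val(:), rowptr(:)

  val    = [ 11, 21, 22 ]
  rowptr = [ 1, 2, 4 ]
  print*, get_row(val, rowptr, 2)          ! values 21 22
  print*, get_element(val, rowptr, 2, 2, -1) ! value 22
  print*, get_column(val, rowptr, 2, -1)   ! values -1 22 (row 1 has no 2nd value)
end program test_helpers
```

Which value to pass as missing depends on the data. Here -1 is just a placeholder that cannot be confused with the sample values.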
Arjen, Jim, thanks for your contributions.
Arjen, I once used your suggestion in a 2D solver. I liked it since it was close to the physical definition of the problem. In exchange for avoiding some helper functions and further thought about the computer implementation, I struggled a lot with indices while keeping the physics of the problem in mind.
I am going to stick with Roman's suggestion, and I have decided to order the data in a way that lets me avoid the second dimension for most of the solutions.
Jim, I do not think it matters whether the size of the data is known beforehand; it is always possible to insert or append.
Also, if the data is ordered, the original derived-type solution has a better pointer-based alternative, although it will not be better than Roman's suggestion, which reduces to a 1D array.
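As an illustration of the point about appending, a row can be added to the two-array layout without knowing the final size in advance. The sketch below is one possible way, not a tuned implementation: the module name ragged_append and the subroutine append_row are invented for this example, and it relies on Fortran 2003 automatic reallocation on assignment (some older compilers need an option to enable it).

```fortran
module ragged_append
  implicit none
contains

  ! Append one row to the (val, rowptr) pair (hypothetical helper).
  subroutine append_row(val, rowptr, newrow)
    integer, allocatable, intent(inout) :: val(:), rowptr(:)
    integer, intent(in) :: newrow(:)
    ! Each assignment builds a temporary and reallocates the left-hand side.
    val    = [ val, newrow ]
    rowptr = [ rowptr, rowptr(size(rowptr)) + size(newrow) ]
  end subroutine append_row

end module ragged_append

program test_append
  use ragged_append
  implicit none
  integer, allocatable :: val(:), rowptr(:)
  integer :: i

  allocate( val(0) )   ! no values yet
  rowptr = [ 1 ]       ! zero rows: rowptr always has nrow+1 entries

  call append_row(val, rowptr, [ 11 ])
  call append_row(val, rowptr, [ 21, 22 ])

  ! rowptr is now [1, 2, 4], the same layout as in Roman's example
  do i = 1, size(rowptr) - 1
    print*, val( rowptr(i) : rowptr(i+1) - 1 )
  end do
end program test_append
```

Every call copies the whole of val and rowptr, so appending many rows one at a time is likely to be slow. Growing the arrays in larger chunks would probably be cheaper, but that would need to be measured. Inserting a row in the middle would additionally require shifting the later values and adjusting the later rowptr entries.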