Solved: assumed-size and reshape

hdankowski · ‎07-12-2010

Hi,

I was wondering when to use assumed-size or assumed-shape arrays in Fortran 90.

First question: Is the assumed-size feature an obsolescent feature in F2003 or later?

Next point: How to correctly pass arrays of varying shape in Fortran 95 and later? There are some algorithms which can be used for any shape of arrays like for example sorting routines. This was in F77 no problem with the assumed-size feature. In F95, using assumed-shape or explicit shape statements, one needs to duplicate a lot of code for different array ranks.

I thought a good replacement would be the 'reshape' or 'pack' function but these two will generate a complete new array, at least thats what my debugger tells if you compare the array adress of the first element. To illustrate this, an example is attached to this post. The subroutine test_reshape1 expects a 1-dim array. To use it with a 2-dim array, I use the reshape routine, which generates a new array!

On the other hand, using assumed-size arrays like in test_reshape2 no new array is generated. I also compared the performance of the two methods and the assumed-size one was almost three times faster than the explicit shape declaration (with the -fast option)! (see subroutine test_bounds.f90)

But the problem with the assumed-size method occurs with generic interfaces fpr different types. This is given in the module 'm_test1' of the given example.

Calling it with
call test_assumedsize(n,a)
will throw the following error at compile time:
test_reshape.f90(20): error #6285: There is no matching specific subroutine for this generic subroutine call. [TEST_ASSUMEDSIZE]
call test_assumedsize(n,a)

What is the preferred way, if one likes to use one algorithm for different array shapes?

Regards,
Hendrik Dankowski

clabra · ‎07-12-2010

Hi,

It is possible just consider the change of shape directly passing the first element or a verctor starting in the first element?
In that case the interface test_assumedsize() working well:

[fortran]call test_assumedsize(2*n,a(:,1))
[/fortran]

With higher rank the procedure it is the same (At less you don't have a copy, like with reshape).
For a rank=3 with the array a(n1,n2,n3) just consider the first dimension:

[fortran]call test_assumedsize(n1*n2*n3,a(:,1,1))
[/fortran]

Carlos

View solution in original post

Steven_L_Intel1 · ‎07-12-2010

Assumed-size arrays are not obsolescent.

Fortran, to date, does not have an easy way of dealing with array arguments of varying rank (number of dimensions). There is a proposal as part of a "Technical Report" on enhanced C interoperability that would add DIMENSION(..) as "assmed-rank". Such a dummy argument would match any rank. (This TR also defines TYPE(*) as assumed-type.)

Assumed-size arrays will match only arrays of the same rank, which is why you get the error. In today's Fortran, you'd have to declare a separate specific procedure for each possible rank of the array A. This obviously gets out of hand when you have multiple arguments and/or types of arguments, not to mention that Fortran 2008 raises the maximum rank to 15!

If you are going to write a routine that does not care what the shape is, you might consider declaring that argument as type C_PTR from ISO_C_BINDING and passing C_LOC(arg) to it. That will eliminate all type and rank information. You can then use C_F_POINTER in the called routine to "cast" it as a variable of whatever type is convenient. Note that you'll lose the ability to distinguish by type this way.

clabra · ‎07-12-2010

Hi,

It is possible just consider the change of shape directly passing the first element or a verctor starting in the first element?
In that case the interface test_assumedsize() working well:

[fortran]call test_assumedsize(2*n,a(:,1))
[/fortran]

With higher rank the procedure it is the same (At less you don't have a copy, like with reshape).
For a rank=3 with the array a(n1,n2,n3) just consider the first dimension:

[fortran]call test_assumedsize(n1*n2*n3,a(:,1,1))
[/fortran]

Carlos

joseph-krahn · ‎07-12-2010

There are some sequence-association rules that are valid for external procedures, but not module procedures, such as passing a COMPLEX array for a REAL dummy argument. For module procedures, you can always pass an array element for an array dummey argument of the same type. So, if you are importing an existing routine into a module, you have to pass "A(1,1)" or "A(1,1,1)" instead of just "A".

If you want to use assumed shape with dimensions passed by a fortran data structure, you will have to create a specific procedure for each rank (and type) you want to support.

joseph_battelle · ‎07-12-2010

1) My understanding is that sequence association occurs during argument association of explicit shape dummies for module procedures (F2003, 16.4.1.1 and 12.4.1.2)--i.e., you can use any rank actual if you pass the overall size in the OP's test_bounds1 procedure below. See below which compiles and runs fine; am I missing something? I guess the compiler would produce a copy if the array wasn't contiguous but otherwise it should do the bounds remapping efficiently.

[fortran]module M15
contains
    subroutine test_bounds1(n,a)
      integer, intent(in   ) :: n
      real,    intent(inout) :: a(n)
      integer  :: i
      
      print *, 'size=', n, size(a)

      do i=2,n-1
        a(i)= a(i)-1
        a(i)= a(i-1)+a(i)-a(i+1)
        a(i)= a(i)+1
      enddo

    end subroutine

    subroutine test
    real, allocatable :: w(:,:,:)
    real :: v(64,64)
        
        allocate( w(64,64,64) )
        forall (i=1:64, j=1:64) v(i,j) = real(i*j)
        forall (k=1:64) w(:,:,k) = v
        call test_bounds1(size(v), v)     
        call test_bounds1(size(w), w)        
    end subroutine test
end module M15

program C15
    use M15
    call test
end program C15[/fortran]

2) Also, FWIW, if one takes Steve's suggestion of using the 2003 c interop features you might consider wrapping the C pointer in a derived type (not class) and use generics to create an interface minimizing the coupling to this approach in your codebase.

3) Lastly pointer bounds remapping (not supported yet I think) could provide yet another approach. Your native arrays could be rank 1 and you could have derived type facades that provide a bounds-remapped rank N pointer to play with when desired.

hdankowski · ‎07-12-2010

Hi,

thanks to all of you, great work! The suggested version passing a(:,1) works perfect for me, everything else would be a lot more complicated I guess.

One last question: I also added these performance measure routines (test_bounds[1-5]), these are my results:

with -fast:

-0.5028266 0.8567867
1 time (explicit n) : 0.698
-0.5028266 0.8567867
2 time (l/ubound) : 1.603
-0.5028266 0.8567867
3 time (size) : 1.598
-0.5028266 0.8567867
4 time (l/ubound var): 1.614
-0.5028266 0.8567867
5 time (assumed size): 0.264

without optimization (-O0):

-0.5028266 0.8567867
1 time (explicit n) : 3.101
-0.5028266 0.8567867
2 time (l/ubound) : 4.964
-0.5028266 0.8567867
3 time (size) : 4.980
-0.5028266 0.8567867
4 time (l/ubound var): 4.978
-0.5028266 0.8567867
5 time (assumed size): 1.950

cat /proc/cpuinfo gives:
...
model name : Intel Core i5 CPU 650 @ 3.20GHz
cache size : 4096 KB
cpu cores : 2
...

First, due to the simplicity of the problem, this gives great potential for optimization (impressing!).

But I'm also wondering about the great differences for the different ways of passing arrays. The lbound/ubound and size routines seem to have great impact on the performance, the assumed-size way is probably the simplest one for the compiler. Only the first adress is passed, then work on the next n elements in memory.

Does anyone have further explanation for these great differences in performance?

Regards,
Hendrik

joseph_battelle · ‎07-12-2010

FWIW, in my example above with a rank(2) v, the code the compiler generates for undecorated vrelying on sequence association for bounds-remapping to a rank(1) array and v(:,1) at the call site is exactly the same. See below. I don't really get why you want to litter the code with (:,1)'s.

[fortran]        call test_bounds1(size(v), v)
004020FA  add         esp,0FFFFFFF8h 
004020FD  mov         dword ptr [esp],offset ___xt_z+19Ch (4FB41Ch) 
00402104  lea         eax, 
0040210A  mov         dword ptr [esp+4],eax 
0040210E  call        M15_mp_TEST_BOUNDS1 (401D8Ah) 
00402113  add         esp,8 
        call test_bounds1(size(v), v(:,1))     
00402116  add         esp,0FFFFFFF8h 
00402119  mov         dword ptr [esp],offset ___xt_z+19Ch (4FB41Ch) 
00402120  lea         eax, 
00402126  mov         dword ptr [esp+4],eax 
0040212A  call        M15_mp_TEST_BOUNDS1 (401D8Ah) 
0040212F  add         esp,8[/fortran]

hdankowski · ‎07-12-2010

Hi,
your version works if you call the routine directly (as it does with an assumed-size argument) but does not work with the mentioned generic interfaces. See my attached example:

test_reshape1.f90

The error I get, if you uncomment line 46 and 47:
test_reshape1.f90(46): error #6285: There is no matching specific subroutine for this generic subroutine call. [TEST_BOUNDS1]
call test_bounds1(size(v), v)
-------------^
test_reshape1.f90(47): error #6285: There is no matching specific subroutine for this generic subroutine call. [TEST_BOUNDS1]
call test_bounds1(size(w), w)

Regards,
Hendrik

joseph_battelle · ‎07-12-2010

Got it. Thanks-missed that. It's been an enlightening thread. Thanks for starting it. I guess one thing to point out is that for the v(:,1) approach the passed array must be contiguous and the coderelies on the compiler not making a copy of the actual argument (because the copy would be only what is passed, the first row.) The approach relies on a guarantee that the standard doesn't provide--my understanding is that processors are free to copy-in/out as they please as long as the argument association rules are respected.

clabra · ‎07-13-2010

I understand that the contiguous memory should be warrantied if you use static or allocatable arrays.

In any case, just need to make some test using compiler option "-warnarg_temp_created" in order to be sure about the copy.

Carlos

hdankowski · ‎07-13-2010

Thanks for your hint! Remark: the compiler option is not a warning but a check (-check arg_temp_created) and it gives no message in my example
Hendrik

Hirchert__Kurt_W · ‎07-13-2010

Quoting joseph.battelle

Got it. Thanks-missed that. It's been an enlightening thread. Thanks for starting it. I guess one thing to point out is that for the v(:,1) approach the passed array must be contiguous and the coderelies on the compiler not making a copy of the actual argument (because the copy would be only what is passed, the first row.) The approach relies on a guarantee that the standard doesn't provide--my understanding is that processors are free to copy-in/out as they please as long as the argument association rules are respected.

If you pass v(:,1) [which, incidentally, is normally considered the first column, not the first row], that first column is all the standard allows the dummy argument to be associated with, but if you pass v(1,1) [that is, the first element of that first column], the dummy argument can be associated with every element of the array from that element to the last element of the array.

clabra · ‎07-13-2010

Yes, my mistake. Is a run-time check.

Carlos

joseph_battelle · ‎07-13-2010

Quoting hirchert

If you pass v(:,1) [which, incidentally, is normally considered the first column, not the first row], that first column is all the standard allows the dummy argument to be associated with, but if you pass v(1,1) [that is, the first element of that first column], the dummy argument can be associated with every element of the array from that element to the last element of the array.

As for row vs column I'm afraid my C/C++ background betrays me--thanks for the reminder.

Based on my code earlier in the thread v(1,1) and v are equivalent from a sequence association standpoint. From a generic dispatch standpoint v has rank(2) and v(1,1) is a scalar--that is they both mess up generic dispatch to a rank(1) array and don't meet the OP's requirement. I prefer v unless you really need an offset for specific call sites.

Having said this I still wouldn't publish code that relies on the v(:,1) trick and I'd look into one of the templating engines for fortan if I didn't want to proliferate source code for all the generics. Thanks all.