
It is common knowledge that Fortran stores arrays in column-major order, but the following code gives an unexpected result: the row-major traversal (i as the outer loop, j as the inner loop) is faster.

Tested version : ifort version 18.0.1

[Results]

do i= ; do j= ; A(i,j) = something … ==> 1.532 s

do j= ; do i= ; A(i,j) = something … ==> 5.095 s

Thanks in advance.

      program test_simple_calculation
      parameter (IMAX=10000,JMAX=100)
      integer i,j,k
      real s1,s2
      real a(IMAX,JMAX),b(IMAX,JMAX),c(IMAX,JMAX)
      real d(IMAX,JMAX)

      write(*,*) 'The for sum :'
      a=1.0e-1;b=1.0e-1;c=1.0e-1;
      call time_check(s1)
      do k=1,100
        do i=1,IMAX
          do j=1,JMAX
            c(i,j)=a(i,j)+b(i,j)
          enddo
        enddo
      enddo
      call time_check(s2)
      d=c
      write(*,*) 'total time in sec. ',s2-s1
      write(*,*) '================================='

      write(*,*) 'The for sum :'
      a=1.0e-1;b=1.0e-1;c=1.0e-1;
      call time_check(s1)
      do k=1,100
        do j=1,JMAX
          do i=1,IMAX
            c(i,j)=a(i,j)+b(i,j)
          enddo
        enddo
      enddo
      call time_check(s2)
      d=c
      write(*,*) 'total time in sec. ',s2-s1
      write(*,*) '================================='
      stop
      end

      subroutine time_check(s)
      integer values(1:8)
      real s
      call DATE_AND_TIME(values=values)
      write(*,*) values(5),values(6),values(7),values(8)
      s=real(values(6))*60.+real(values(7))+real(values(8))*0.001
      return
      end

Compiling with the equivalent of -O3 here, I see minimal difference in timing (two milliseconds for the first set of loops, one millisecond for the second set). Inspecting the assembly shows that the second set of loops is collapsed into a single loop, while the loops of the first set have been reordered. Both sets drop the outer loop.

More generally - note that the value of d is not used in the program. This means that the value of c is not used, which means that there is no point calculating a + b, which means that there is no point executing any of the loops. Sometimes the optimiser will be smart enough to figure this sort of stuff out (at least here it appears to have figured out the k loop is not required), so you need to design your tests appropriately and always check assembly or similar to see what it is that you are actually testing.
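To illustrate the point about test design, here is a minimal sketch of a variant in which the work cannot be discarded so easily. It is not the original poster's code: the program name, the `nrep` constant, the random inputs, the `k`-dependent update and the checksum comparison are all assumptions added for illustration. Each pass over `k` accumulates into `c`, and a checksum of `c` is printed, so the result of every loop is actually used. It also times with `SYSTEM_CLOCK` rather than `DATE_AND_TIME`, because the original `time_check` ignores the hour field and would give a wrong difference if a run crossed an hour boundary. The compiler is still free to interchange the loops, so the optimisation report or the assembly should still be checked.

```fortran
program loop_order_benchmark
  use, intrinsic :: iso_fortran_env, only: int64, real64
  implicit none
  ! Array sizes match the original post; nrep is the repeat count (k loop).
  integer, parameter :: imax = 10000, jmax = 100, nrep = 100
  real, allocatable :: a(:,:), b(:,:), c(:,:)
  integer(int64) :: t0, t1, rate
  real(real64) :: elapsed, sum_i_outer, sum_j_outer
  integer :: i, j, k

  allocate(a(imax,jmax), b(imax,jmax), c(imax,jmax))
  ! Run-time inputs, so nothing can be folded at compile time.
  call random_number(a)
  call random_number(b)

  ! i outer, j inner: consecutive inner iterations are imax elements apart.
  c = 0.0
  call system_clock(t0, rate)
  do k = 1, nrep
    do i = 1, imax
      do j = 1, jmax
        ! Depends on k and on the previous c, so the k loop is real work.
        c(i,j) = c(i,j) + a(i,j)*real(k) + b(i,j)
      end do
    end do
  end do
  call system_clock(t1)
  elapsed = real(t1 - t0, real64) / real(rate, real64)
  sum_i_outer = sum(real(c, real64))
  print '(a,f8.3,a,es16.8)', 'i outer, j inner: ', elapsed, &
        ' s, checksum = ', sum_i_outer

  ! j outer, i inner: consecutive inner iterations are adjacent in memory.
  c = 0.0
  call system_clock(t0, rate)
  do k = 1, nrep
    do j = 1, jmax
      do i = 1, imax
        c(i,j) = c(i,j) + a(i,j)*real(k) + b(i,j)
      end do
    end do
  end do
  call system_clock(t1)
  elapsed = real(t1 - t0, real64) / real(rate, real64)
  sum_j_outer = sum(real(c, real64))
  print '(a,f8.3,a,es16.8)', 'j outer, i inner: ', elapsed, &
        ' s, checksum = ', sum_j_outer

  ! Both orders do the same arithmetic, so the checksums should agree
  ! up to rounding differences introduced by the optimiser.
  if (abs(sum_i_outer - sum_j_outer) > 1.0e-5_real64 * abs(sum_j_outer)) then
    error stop 'checksum mismatch between loop orders'
  end if
end program loop_order_benchmark
```

Printing the checksum is what forces the computation to be kept. Comparing the two checksums is only a sanity check that both loop orders did the same work.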
