Intel® Fortran Compiler
Build applications that can scale for the future with optimized code designed for Intel® Xeon® and compatible processors.

Are 1D arrays faster than 2D arrays ?

inkant
Novice
2,442 Views
Dear Users,

Compiler: ifort version 12.0.2
System: RHEL v6, x86_64, Intel quad-core Xeon
I just changed code that used an array of shape A(i) to use A(i,j) in one of the most calculation-intensive parts.
I noticed this performance degradation (measured with the Unix built-in time command):
With a 1 dimension array
1st run
real 0m1.425s
user 0m1.373s
sys 0m0.006s
2nd run
real 0m1.393s
user 0m1.385s
sys 0m0.008s
With a 2 dimension array
1st run
real 0m2.649s
user 0m2.642s
sys 0m0.008s
2nd run
real 0m2.648s
user 0m2.640s
sys 0m0.007s
Is this performance degradation caused by using a 2D array (probably because of array indexing)?
Best Regards,
Inkant
5 Replies
jimdempseyatthecove
Honored Contributor III
In Fortran, a 2D array has the best memory access pattern when the leftmost array index varies in the innermost loop (in C/C++ it is the other way around).

do J=1,nJ
  do I=1,nI
    A(I,J) = B(I,J) ...
    ...
  end do
end do

You may need to look at and rework your loops.
For 3D, A(I,J,K), make K the outermost loop, J the middle loop, and I the innermost loop.
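
A minimal sketch of the loop-order point above, not the original poster's code. The array sizes, the names `loop_order`, `a`, `b`, and the arithmetic in the loop body are all assumptions chosen for illustration. It times the same update with the leftmost index in the inner loop (stride 1) and then in the outer loop (stride `ni`).

```fortran
program loop_order
  implicit none
  integer, parameter :: dp = selected_real_kind(15)
  integer, parameter :: i8 = selected_int_kind(18)
  integer, parameter :: ni = 2000, nj = 2000   ! hypothetical sizes
  real(dp), allocatable :: a(:,:), b(:,:)
  integer :: i, j
  integer(i8) :: t0, t1, t2, rate

  allocate(a(ni,nj), b(ni,nj))
  b = 1.0_dp
  a = 0.0_dp            ! touch both arrays before any timing starts

  call system_clock(t0, rate)
  ! Column-major friendly: leftmost index I in the inner loop (stride 1)
  do j = 1, nj
    do i = 1, ni
      a(i,j) = a(i,j) + 2.0_dp*b(i,j)
    end do
  end do
  call system_clock(t1)
  ! Strided: leftmost index I in the outer loop (stride ni between accesses)
  do i = 1, ni
    do j = 1, nj
      a(i,j) = a(i,j) + 2.0_dp*b(i,j)
    end do
  end do
  call system_clock(t2)

  print *, 'I inner (stride 1)  [s]:', real(t1-t0, dp)/real(rate, dp)
  print *, 'J inner (stride ni) [s]:', real(t2-t1, dp)/real(rate, dp)
  print *, 'checksum:', sum(a)   ! use the result so the loops are not removed
end program loop_order
```

At higher optimization levels the compiler may interchange the second loop nest on its own, so the two timings can come out similar; comparing at a lower optimization level usually makes the stride effect visible.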

Jim Dempsey
inkant
Novice
Yes Jim,
The 2D array indices were already varied in the way you suggested.
In addition, the 1D array subroutine had a few conditionals to evaluate, while the 2D array version was free of conditionals (which made the performance degradation surprising to me).
The only difference between the two subroutines was that the 1D array version had a single loop (with conditionals), while the 2D array version had two nested loops.
Inkant
jimdempseyatthecove
Honored Contributor III
Can you post the code?

In a nested loop, the compiler can usually registerize the outer loop's base index into the array. However, in Debug mode this would not necessarily be the case (especially if index out-of-bounds checking is enabled).

Also, are you passing the double-subscripted arrays as arguments to a subroutine/function? If so, then how you declare the arguments (with or without an interface) can affect performance.
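
To illustrate the point about dummy-argument declarations, here is a hedged sketch contrasting an explicit-shape dummy with an assumed-shape dummy. The module name `scale_mod` and the routines `scale_explicit` and `scale_assumed` are hypothetical and do not come from the original poster's code.

```fortran
module scale_mod
  implicit none
  integer, parameter :: dp = selected_real_kind(15)
contains
  ! Explicit-shape dummy: extents are passed as arguments and the
  ! routine sees contiguous storage.
  subroutine scale_explicit(a, ni, nj, s)
    integer, intent(in) :: ni, nj
    real(dp), intent(inout) :: a(ni, nj)
    real(dp), intent(in) :: s
    integer :: i, j
    do j = 1, nj
      do i = 1, ni
        a(i,j) = s*a(i,j)
      end do
    end do
  end subroutine scale_explicit

  ! Assumed-shape dummy: needs an explicit interface (provided here by
  ! the module) and may receive a non-contiguous array section.
  subroutine scale_assumed(a, s)
    real(dp), intent(inout) :: a(:,:)
    real(dp), intent(in) :: s
    integer :: i, j
    do j = 1, size(a, 2)
      do i = 1, size(a, 1)
        a(i,j) = s*a(i,j)
      end do
    end do
  end subroutine scale_assumed
end module scale_mod

program demo_args
  use scale_mod
  implicit none
  real(dp) :: a(4,3)
  a = 1.0_dp
  call scale_explicit(a, 4, 3, 2.0_dp)
  call scale_assumed(a, 2.0_dp)
  ! A strided section passed to the explicit-shape routine typically
  ! forces the compiler to make a temporary copy in and out.
  call scale_explicit(a(1:4:2, :), 2, 3, 2.0_dp)
  print *, a(:,1)
end program demo_args
```

Whether either form is faster depends on the compiler and on how the routine is called; the hidden copy for non-contiguous sections is the cost most often overlooked.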

Jim Dempsey
inkant
Novice
Jim,
The repeatability of the timings is not good. I am trying to figure out why, after which I will post the code.
Inkant
Ron_Green
Moderator
Repeatability: what are you doing about Linux lazy page allocation, also called "demand paging" or "first-time paging effects"? In other words, you DO touch each element before starting a timing loop, yes? Or do you run enough iterations to cancel out the first-time effects? If this sounds foreign to you, google "demand paging" or "lazy page".
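
A small sketch of the first-touch effect described above, under assumed names and sizes (`first_touch`, `n`, and the loop bodies are illustrative only). The first pass over a freshly allocated array typically includes the cost of mapping its pages; the second pass does not.

```fortran
program first_touch
  implicit none
  integer, parameter :: dp = selected_real_kind(15)
  integer, parameter :: i8 = selected_int_kind(18)
  integer, parameter :: n = 10000000        ! hypothetical size, about 80 MB
  real(dp), allocatable :: a(:)
  integer(i8) :: t0, t1, t2, rate
  integer :: i

  allocate(a(n))          ! pages are typically not mapped yet

  call system_clock(t0, rate)
  do i = 1, n
    a(i) = real(i, dp)    ! first pass: touches every page for the first time
  end do
  call system_clock(t1)
  do i = 1, n
    a(i) = a(i) + 1.0_dp  ! second pass: pages are already mapped
  end do
  call system_clock(t2)

  print *, 'first pass  [s]:', real(t1-t0, dp)/real(rate, dp)
  print *, 'second pass [s]:', real(t2-t1, dp)/real(rate, dp)
  print *, 'last element:', a(n)   ! use the data so the loops are kept
end program first_touch
```

The two passes do slightly different work (a store versus a load and a store), so this only indicates the size of the effect; the practical takeaway is to initialize all data before the timed region.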

But then you have to worry about things like vector intrinsic library routines being substituted for initialization or simple element movement. Did you add the option -nolib-inline to keep the compiler from replacing your code with a library equivalent? And if you have manually coded a matrix multiply, in 12.0 the -opt-matmul option may replace your code with an MKL library call.

What have you done to align the data on 16-byte boundaries? You are using ALLOCATABLE data, yes?

Are your timers accurate, or are you using cpu_time() or gettimeofday() or equivalent? And the code is running for a minute or more so you are not looking at clock jitter, yes?

You may search this forum for other array performance questions, typically "is array syntax as fast as hand-coded loops?", etc. (By the way, the answer is "most of the time they are equivalent unless you are doing something silly.") These are very frustrating studies, because an optimizing compiler can perform numerous transformations that you would not anticipate, and toy examples often oversimplify a real application. I'd recommend working with a real application rather than trying to draw conclusions from overly simplified loop structures. BUT if you have a real-world solver that you're trying to optimize, there are a number of us on the forum interested in studying it.
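
One way to check timer behavior is to record both wall-clock and CPU time over several repetitions and look at the spread. The sketch below uses only the standard intrinsics `system_clock` and `cpu_time`; the kernel, the repetition count `nrep`, and the array size `n` are assumptions for illustration.

```fortran
program timers
  implicit none
  integer, parameter :: dp = selected_real_kind(15)
  integer, parameter :: i8 = selected_int_kind(18)
  integer, parameter :: nrep = 5, n = 5000000   ! hypothetical values
  real(dp), allocatable :: x(:)
  real(dp) :: c0, c1, wall, best
  integer(i8) :: t0, t1, rate
  integer :: i, rep

  allocate(x(n))
  x = 1.0_dp                        ! touch the data before timing
  call system_clock(count_rate=rate)
  best = huge(best)

  do rep = 1, nrep
    call cpu_time(c0)
    call system_clock(t0)
    do i = 1, n
      x(i) = sqrt(x(i)) + 1.0_dp    ! stand-in for the real kernel
    end do
    call system_clock(t1)
    call cpu_time(c1)
    wall = real(t1-t0, dp)/real(rate, dp)
    best = min(best, wall)
    print *, 'rep', rep, ' wall [s]:', wall, ' cpu [s]:', c1-c0
  end do

  print *, 'best wall [s]:', best
  print *, 'checksum:', sum(x)      ! use the result so the loop is kept
end program timers
```

If the per-repetition times vary widely, or the kernel runs for only a few milliseconds, the measurement is dominated by noise and the problem size or repetition count should be increased.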

