variable size array at compilation time without allocate

jpedregosa · ‎07-20-2011

Hi, I am having trouble with a simple declaration, and out of desperation I decided to post in this forum.

My problem is that I want an array whose size is determined at compilation time. I do not want to use allocate (speed issues). I have used the technique in the following code for years:

program main1
implicit none
integer, parameter :: i1_max = 2000
integer, parameter :: i2_max = 2000
call main2(i1_max,i2_max)
end program main1

subroutine main2(i1_max,i2_max)
implicit none
integer, intent(in) :: i1_max, i2_max
integer, dimension(i1_max,i2_max) :: N_1
N_1(1,1) = 1;
end subroutine main2

With the above code, I have no error at compilation time.

However, at execution time I get a segmentation fault if i1_max = i2_max = 2000, but no error if i1_max = i2_max = 20. (These are just test values; I did not bother to try more.)

It is not a problem of memory, the machine has plenty of it (12 GB).

On top of that, if I declare the size explicitly in the subroutine:

integer, dimension(2000,2000):: N_1

Then there is no problem either!!! This shows again that it is not a memory problem.

Just for information, I am using Ubuntu 11.04 on an Intel Xeon W3690, with ifort 2011.
Final note: if I compile the same code with gfortran I have no problems. However, I would like to stay with ifort as it generates faster code...

I really have no idea what is going on. If somebody could help me out I would really appreciate it.

Jofre

Ron_Green · ‎07-20-2011

you need to read http://software.intel.com/en-us/articles/determining-root-cause-of-sigsegv-or-sigbus-errors/

But your statement that you're doing this because of a speed advantage over ALLOCATE may only hold water if your data is pointer based. A plain, allocatable array without the pointer attribute will have zero performance penalty over what you're doing here.

ron

Ron_Green · ‎07-20-2011

oh, and a couple of other notes:

I don't see you using interfaces or modules. Your other subroutines and functions are in modules or have interface statements, correct? The reason I ask is you seem very performance conscious. With old F77 external procedures you can get a lot of data copies for arguments. Creating interfaces or putting procedures in modules that are USEd can help the compiler with optimizations across call boundaries and with data disambiguation analysis.
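
As an illustration of the module suggestion, here is a minimal sketch of the original test program with main2 moved into a module. The module name work_mod is hypothetical and not from the original post, and the sizes are kept at 20 so that the example runs regardless of the stack limit.

```fortran
! Sketch only: work_mod is a hypothetical module name chosen for this example.
module work_mod
  implicit none
contains
  subroutine main2(i1_max, i2_max)
    integer, intent(in) :: i1_max, i2_max
    integer, dimension(i1_max, i2_max) :: N_1   ! still an automatic array
    N_1(1,1) = 1
    print *, 'N_1(1,1) =', N_1(1,1)
  end subroutine main2
end module work_mod

program main1
  use work_mod, only: main2   ! USE gives the compiler an explicit interface
  implicit none
  integer, parameter :: i1_max = 20
  integer, parameter :: i2_max = 20
  call main2(i1_max, i2_max)
end program main1
```

Note that this only changes how the compiler sees the call. N_1 is still an automatic array, so a module by itself would not be expected to change the behavior at 2000 x 2000.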

TimP · ‎07-20-2011

N_1() is an automatic array. If you let it be allocated on the stack (the fastest way), you will incur a failure if the stack size limit isn't large enough.
As Ron said, ALLOCATE in principle is the same, except that you could check explicitly for errors, which is advisable for an allocation this large.
Automatic arrays were not standard until Fortran 90, although certain f77 compilers supported them as an extension.

mecej4 · ‎07-20-2011

> However at execution time, I get a segmentation fault if i1_max = i2_max = 2000...
> It is not a problem of memory, the machine has plenty of it (12Gb).
> On top of that, if I declare the size explicitly in the subroutine:
> integer, dimension(2000,2000):: N_1
> Then there is no problem either!!! Showing again that is no problem of memory.

Together, these statements establish that you do not have an adequate comprehension of the different ways in which memory is allocated and shared between programs on a modern multitasking operating system. You need to do some reading to understand the issues and make an intelligent selection that suits your purposes.

By trying to avoid dynamic allocation without thinking about the consequences you have invited failure.

A local array, as in your program, is usually allocated on the stack. Stack allocation for local variables, subprogram arguments and small arrays is convenient and efficient. That is why it is so commonly used. However, your array is not small.

The default stack size is operating system dependent, but is usually set to be a small fraction of the total virtual memory available, so that there is adequate stack for all the running processes under OS control. Your array requires 16 megabytes, which may be more than the default stack size on your OS.
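
The 16 megabyte figure can be checked with a few lines of Fortran. This is only a sketch: the program name array_bytes and the variable names are made up for the example, and storage_size is a Fortran 2008 intrinsic that an older compiler may not provide.

```fortran
! Sketch only: computes the storage needed by an array shaped like N_1.
program array_bytes
  implicit none
  integer, parameter :: i1_max = 2000, i2_max = 2000
  integer :: probe            ! default integer, same kind as the elements of N_1
  integer :: bytes_per_elem

  bytes_per_elem = storage_size(probe) / 8   ! storage_size returns bits
  print *, 'bytes per element:', bytes_per_elem
  print *, 'total bytes      :', i1_max * i2_max * bytes_per_elem
end program array_bytes
```

With a 4-byte default integer this comes to 2000 * 2000 * 4 = 16,000,000 bytes, whereas the 20 x 20 test case needs only 1,600 bytes. Many Linux systems default to a stack limit of about 8 MB, although whether that applies to the machine in question is an assumption.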

A static declaration such as

integer, dimension(2000,2000):: N_1

allocates the array for the duration of your program, so other subroutines and other tasks do not "own" and possibly cannot access that memory until your task completes. Such tying up of a resource is often undesirable.

Heap allocation, by using ALLOCATE, is the best fit for your requirements to the extent that your description defines them. It does require a few more lines of code but, as TimP has pointed out, you can check for successful allocation before attempting to use the array.
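
A minimal sketch of what that could look like for the test program from the first post. The names heap_demo, n1, n2 and ierr are chosen for this example only, and main2 is made an internal procedure simply to keep the example in one program unit.

```fortran
! Sketch only: the original main2 rewritten to use ALLOCATE with an error check.
program heap_demo
  implicit none
  integer, parameter :: i1_max = 2000, i2_max = 2000
  call main2(i1_max, i2_max)
contains
  subroutine main2(n1, n2)
    integer, intent(in) :: n1, n2
    integer, allocatable :: N_1(:,:)
    integer :: ierr

    allocate(N_1(n1, n2), stat=ierr)   ! stat= returns nonzero instead of aborting
    if (ierr /= 0) then
      print *, 'allocation of N_1 failed, stat =', ierr
      stop 1
    end if

    N_1(1,1) = 1
    print *, 'N_1(1,1) =', N_1(1,1)

    deallocate(N_1)   ! optional here: an unsaved local allocatable is freed on return
  end subroutine main2
end program heap_demo
```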

jpedregosa · ‎07-21-2011

I am impressed!!! Thanks for taking the trouble to answer. OK, I am an experimental physicist who does a bit of molecular simulation (not really a computer science person), so I tend to get large matrices and vectors. When I finally decided to switch to Fortran 90 (I used Fortran 77) five years ago, I found many posts in forums about how ALLOCATABLE arrays affected performance. OK, forums are not God's word, but what the hell, we are in one, and I trust that the answers are given in an honest manner!

Now that you guys have given me the key words, "heap allocation", I will try to find more information about it, and I will run some speed tests on large allocatable arrays for my particular case.

Also, do you know how a change in the stack size can affect a Linux system?

However, there is still the mystery that gfortran produced code that executed without problems...

Thanks again for your help,
Jofre

Steven_L_Intel1 · ‎07-21-2011

If you want the same behavior as gfortran, add the switch -heap-arrays to the compile. gfortran always uses heap allocation for automatic arrays.

mecej4 · ‎07-22-2011

> I found many posts in forums about how ALLOCATABLE arrays affected performance

Judiciously used, ALLOCATE/DEALLOCATE will have an effect on performance that is probably too small to measure easily.

That is, these statements serve their intended purpose in the language, and quite efficiently so, provided that the number of executions of ALLOCATE/DEALLOCATE statements (and, in Fortran 2003, automatic allocation and deallocation of local allocatable variables) is kept down to a reasonable level. For example, do not deallocate a variable/array prematurely before a subprogram RETURN when it is known that the next call to the subroutine will result in the same variable/array being reallocated.
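
One way to follow that advice is to keep the array allocated between calls and reallocate only when the requested shape changes. The sketch below assumes a work array shared through a module; the names scratch_mod, work, step and reuse_demo are hypothetical.

```fortran
! Sketch only: reuse one heap allocation across many calls.
module scratch_mod
  implicit none
  integer, allocatable, save :: work(:,:)   ! persists between calls to step
contains
  subroutine step(n1, n2)
    integer, intent(in) :: n1, n2
    integer :: ierr

    ! Drop the old array only if its shape no longer matches the request.
    if (allocated(work)) then
      if (size(work, 1) /= n1 .or. size(work, 2) /= n2) deallocate(work)
    end if

    ! Allocate on the first call, or after a shape change.
    if (.not. allocated(work)) then
      allocate(work(n1, n2), stat=ierr)
      if (ierr /= 0) stop 'could not allocate work'
    end if

    work(1,1) = 1
  end subroutine step
end module scratch_mod

program reuse_demo
  use scratch_mod, only: step
  implicit none
  integer :: i
  do i = 1, 1000
    call step(2000, 2000)   ! allocates on the first call only
  end do
  print *, 'done'
end program reuse_demo
```

The trade-off is that the memory stays reserved between calls, which is the same resource tie-up described earlier in the thread for static arrays.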

As with other newly introduced language features, ALLOCATE/DEALLOCATE may have been somewhat inefficient when they were first implemented. Perhaps the reports that colored your opinion were based on those early compiler versions.

> Also, do you know how a change in the stack size can affect a Linux system?

Try man ulimit.

John4 · ‎07-22-2011

Depending on the OS's age and the user's shell, man ulimit might point somewhere else (e.g., the unimplemented or the ulimit.h man page). If bash is the user's default shell, ulimit is probably a built-in (like cd or pwd), so info --index-search=ulimit bash is a better option.

mecej4 · ‎07-23-2011

Good point. I have no experience with Ubuntu, so I gave a tentative pointer.

However, I have a vague recollection of at least some releases of Ubuntu using not bash but dash as the default shell.

A user of any version of Linux/Unix will need to be able to access the documentation for the shell and OS commands.

John4 · ‎07-24-2011

On Ubuntu, dash is the default for sh (so it affects any script for which the shebang line is #!/bin/sh), but scripts that use #!/bin/bash explicitly are not affected, and the default shell for user accounts is still bash.

dash is supposed to be lighter and stricter than bash, so installing software is much faster. One drawback is that system() is usually implemented in terms of /bin/sh, so Fortran's EXECUTE_COMMAND_LINE will probably be affected by it as well.
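
To see which stack limit a Fortran program actually runs under, something like the following sketch could be used. It assumes a compiler that supports the Fortran 2008 EXECUTE_COMMAND_LINE intrinsic (the ifort version used in this thread may not), and that the command is passed to a shell whose ulimit built-in accepts -s, as both bash and dash do. The program name show_stack_limit is made up for the example.

```fortran
! Sketch only: print the stack limit inherited by a child shell of this program.
program show_stack_limit
  implicit none
  integer :: estat, cstat

  estat = 0   ! exitstat is INTENT(INOUT), so give it a defined value first
  cstat = 0

  ! The command goes to the system command interpreter (often /bin/sh), which
  ! inherits its resource limits from this process.
  call execute_command_line('ulimit -s', exitstat=estat, cmdstat=cstat)

  if (cstat /= 0) then
    print *, 'command could not be run, cmdstat =', cstat
  else if (estat /= 0) then
    print *, 'ulimit returned exit status', estat
  end if
end program show_stack_limit
```

The value is normally reported in kilobytes, or as "unlimited".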