I use the latest Intel Fortran compiler and I have a problem with the allocation of large arrays.
I’m developing code for computational fluid dynamics. There is a “main-program”, which calls several subroutines. The “main-program” is itself a subroutine, called from a GUI written with Winteracter.
I store all data in two large arrays, one real*8, one integer *4, size of several GB. The arrays are defined in a module. The problem is, where to allocate the two arrays.
When I allocate them in the “main-program”, communication with the subroutines it calls does not work, and whether it fails depends on the choice of compiler options such as /O1 or /O2 and AVX2, SSE or /QxHost.
When I allocate the arrays in the calling GUI everything works.
I’m not sure whether I am making a mistake, so I do not want to rely on the second way. I allocate the arrays before using them, so it should not make a difference where I allocate them.
Does anyone know this problem?
Many thanks and best regards,
Are you intending the arrays to be preserved between calls?
If "yes", they have to be either allocatable arrays in a module .OR. declared in some source file with the SAVE attribute.
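A minimal sketch of the module variant, assuming the module name FDef and the arrays FR and FI from this thread. The helper ensure_storage, the routine solver_main and the sizes are hypothetical and only illustrate the idea: the arrays live in the module, are allocated once, and keep their contents between calls.

```fortran
! Sketch only: ensure_storage, solver_main and the sizes are hypothetical.
module FDef
  implicit none
  integer(kind=4) :: SIZEFR = 0, SIZEFI = 0
  real(kind=8),    allocatable :: FR(:)
  integer(kind=4), allocatable :: FI(:)
contains
  ! Allocate the storage arrays once; later calls leave them untouched.
  subroutine ensure_storage(nr, ni)
    integer, intent(in) :: nr, ni
    integer :: ierr
    if (.not. allocated(FR)) then
      allocate(FR(nr), stat=ierr)
      if (ierr /= 0) stop 'allocation of FR failed'
      FR = 0.0d0
      SIZEFR = nr
    end if
    if (.not. allocated(FI)) then
      allocate(FI(ni), stat=ierr)
      if (ierr /= 0) stop 'allocation of FI failed'
      FI = 0
      SIZEFI = ni
    end if
  end subroutine ensure_storage
end module FDef

! Stands in for the "main-program" that the GUI calls as a subroutine.
subroutine solver_main()
  use FDef
  implicit none
  call ensure_storage(1000, 1000)
  FR(1) = FR(1) + 1.0d0      ! accumulates across calls because FR persists
end subroutine solver_main

program gui_stub
  use FDef
  implicit none
  call solver_main()
  call solver_main()
  print *, 'FR(1) after two calls: ', FR(1)
end program gui_stub
```

For arrays of several GB the element counts would need a 64-bit integer kind instead of the default integers used here.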
Copy your GUI application project folder to a temporary folder and reduce the code down to a simple reproducer. The reproducer should be buildable in both the configuration where the allocation works and the configuration where it fails.
Generally you keep the allocations, insert signature data, remove computations, and assert on missing signatures. The result may be a few tens of lines of code.
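Such a reproducer could look roughly like the sketch below. It assumes the FDef module from this thread; check_signatures, the array size and the signature values are made up for illustration. The computations are gone, only the allocation, the signatures and the assertion remain.

```fortran
! Sketch of a reduced reproducer: names other than FDef, FR and FI are hypothetical.
module FDef
  implicit none
  real(kind=8),    allocatable :: FR(:)
  integer(kind=4), allocatable :: FI(:)
end module FDef

! Stand-in for one of the subroutines that fails to see the data.
subroutine check_signatures(n)
  use FDef
  implicit none
  integer, intent(in) :: n
  integer :: i
  if (.not. allocated(FR)) stop 'FR is not allocated'
  if (.not. allocated(FI)) stop 'FI is not allocated'
  if (size(FR) /= n .or. size(FI) /= n) stop 'unexpected array size'
  do i = 1, n
    ! Assert that the signature written by the caller is still there.
    if (FR(i) /= real(i, kind=8) .or. FI(i) /= -i) then
      print *, 'signature missing at index ', i
      stop 1
    end if
  end do
  print *, 'all signatures present'
end subroutine check_signatures

program reproducer
  use FDef
  implicit none
  integer, parameter :: n = 1000000
  integer :: i
  allocate(FR(n), FI(n))
  ! Signature data: each element encodes its own index.
  do i = 1, n
    FR(i) = real(i, kind=8)
    FI(i) = -i
  end do
  call check_signatures(n)
end program reproducer
```

Building this once with the options that work and once with the options that fail shows whether the problem survives the reduction.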
In the process of doing this you may discover an error in coding. If you cannot discover the error, at least then you can post the code here for others to see.
Please post your compiler version.
thank you, I will try that.
My compiler is
Intel parallel studio XE update 2 Composer edition, 2017.0.2.046
This is the module:
MODULE FDef
!!! Definitions of the storage arrays
INTEGER (KIND=4) :: IKurzesFormat,SIZEFR,SIZEFI
REAL (KIND=8), allocatable, Dimension(:) :: FR
INTEGER (KIND=4), allocatable, Dimension(:) :: FI
REAL (KIND=8) :: FEL(0:200)
END MODULE FDef
I store nearly everything in these two arrays, like element numbers, node numbers, coordinates, large sparse matrices and so on. And I use OpenMP on these arrays with the shared option, mainly for dot products.
I use the compiler options /O2 and /QxCORE-AVX2 for all routines. With these, computation speed is best.
I found out: I must not use these for the "main-program". When I compile it with /O1 and /QxSSE2 (both are necessary) it works, no more problems.
Could that be a compiler bug?
>> Could that be a compiler bug?
It could be that you have bugs in your code that corrupt data in memory. Changing compiler options changes the memory layout, so the bug may now corrupt something less important and stay hidden.
Enabling all compiler checks, such as interface checking, is a good starting point.
Still sounds like a programmer error.
a) are you using IMPLICIT NONE in all PROGRAM, FUNCTION and SUBROUTINE procedures?
b) Do you incorporate "USE FDef" in all procedures referencing FR and FI directly (as opposed to via DUMMY argument)?
Note, lack of IMPLICIT NONE combined with lack of USE FDef could cause the error to not show up until runtime.
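To illustrate points a) and b), here is a small sketch that assumes the FDef module from this thread. The routine scale_fr and the demo program are hypothetical. The comments mark what each of the two lines protects against.

```fortran
! Sketch only: scale_fr and demo are hypothetical names.
module FDef
  implicit none
  integer(kind=4) :: SIZEFR = 0
  real(kind=8), allocatable :: FR(:)
end module FDef

subroutine scale_fr(factor)
  use FDef        ! makes SIZEFR and FR refer to the module variables
  implicit none   ! if USE FDef were missing, SIZEFR and FR would now be
                  ! reported as undeclared at compile time instead of
                  ! SIZEFR being implicitly typed as a new, undefined local
  real(kind=8), intent(in) :: factor
  integer :: i
  do i = 1, SIZEFR
    FR(i) = factor * FR(i)
  end do
end subroutine scale_fr

program demo
  use FDef
  implicit none
  SIZEFR = 3
  allocate(FR(SIZEFR))
  FR = 1.0d0
  call scale_fr(2.0d0)
  print *, FR
end program demo
```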
You were right, I had a bug in the code. I had an uninitialized integer variable, and depending on the compiler options it happened to be zero or not. So sometimes the code worked, sometimes not.
Instead of implicit none I use /warn:declarations.
Is there a way to find uninitialized variables? I can't find that in the documentation.
Thank you for your effort and your patience.
>> So sometimes the code worked, sometimes not.
Where I come from, this is also known as "Walking on thin ice".
Bear in mind that "-check uninit" (/check:uninit on Windows) will catch most, but not necessarily all, instances of uninitialized variables.
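As a rough illustration of the kind of bug discussed here, the sketch below uses hypothetical names (uninit_demo, offset, fixed_offset). The variable offset is deliberately left unassigned, so its value depends on whatever the chosen compiler options happen to leave in that location. Building with the uninit check enabled is meant to report the reference at run time, although whether a particular case is caught depends on the compiler version and the kind of variable.

```fortran
! Sketch only: deliberately contains an uninitialized variable.
! Possible builds (the file name is an assumption):
!   Windows:  ifort /check:uninit /traceback uninit_demo.f90
!   Linux:    ifort -check uninit -traceback uninit_demo.f90
program uninit_demo
  implicit none
  integer :: offset         ! never assigned: its value is undefined
  integer :: fixed_offset
  fixed_offset = 0          ! explicit initialization, independent of compiler options
  print *, 'fixed_offset = ', fixed_offset
  ! The next line references an undefined value; this is the bug pattern.
  print *, 'offset       = ', offset
end program uninit_demo
```

Assigning every variable before its first use, as done for fixed_offset, is what removes the dependence on the optimization level.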
Going back to 1992-ish, Borland Turbo C (and later C++) had a feature that could catch all instances of uninitialized variables. This did have a runtime penalty but was acceptable when shaking out code. The feature used the paging hardware to trap access to "data" sections of the program. When a location was referenced, a trap would occur, and then the runtime system could consult/set flags regarding the state of the memory locations. Due to the overhead, this technique went out of favor, but it was nice to have when you needed it.
I think this could be re-implemented using a new yet-to-be developed tool derived from one of Intel's CPU simulator programs. This could be tied into the Intel Parallel Studio Advisor. Just a suggestion.