I am porting a Fortran program from Linux/CentOS-7 to Windows-10.
On Linux with Intel 2019, it compiles and runs just fine.
On Windows with Intel oneAPI 2021.3, it compiles, but immediately reports stack overflow as it attempts to run.
Can't even tell where the problem is.
> program.exe input.txt
forrtl: severe (170): Program Exception - stack overflow
Image              PC                Routine            Line        Source
program.exe        00007FF62757ACD8  Unknown               Unknown  Unknown
program.exe        00007FF6272953F3  MAIN__                     17  program.F
program.exe        00007FF62757A4BE  Unknown               Unknown  Unknown
program.exe        00007FF62757AEE4  Unknown               Unknown  Unknown
KERNEL32.DLL       00007FFC87CB7034  Unknown               Unknown  Unknown
ntdll.dll          00007FFC88102651  Unknown               Unknown  Unknown
Compiler options used: -fpp /convert:big_endian /check:bounds /warn:noalignments -traceback /fpe:0 /Qinit:zero
Added /heap-arrays:0 /check:stack, nothing changed.
Added /debug:all, nothing changed, same limited hint above.
Added /F50000000, nothing changed.
Any other pointers?
The traceback shows line numbers only for one routine.
Is your program spread over several source files, and were they all compiled with the debug options that you listed? Were any libraries devoid of traceback information linked to produce the EXE?
It would help to view at least the first 17 lines of program.F.
Are you using the same Fortran version on both?
The openAPI is up and running.
Have you tried running Fiddler to check whether you can test the openAPI manually?
The program consists of over 170 individual source files. A couple of dozen are first put into a handful of libraries; the rest are just part of what becomes the "main" executable. They are all built with the same compiler options.
Line 17 is the very very very first line, the one that says "program <program>" in file program.F; the previous 16 lines are just comments and descriptions, etc.
That's just it...I can't get traceback to tell me more.
As mentioned above, the Fortran versions are not the same: Intel 2019 on Linux, oneAPI 2021.3 on Windows.
I don't know what fiddler is...some kind of debugger, I presume.
I did look and that is where I learned about additional compiler options like heap-arrays
...I guess I should probably learn how to use a debugger
OneAPI provides two different Fortran compilers: the "classic" Ifort and the new Ifx. I hope that you are using only Ifort for all your compilations, since Ifx is not quite suited for general production use yet.
I cannot think of any easy ways of diagnosing the problem. Here are a couple of suggestions to try.
Use /Od /traceback as the only compiler options and rebuild the libraries and the EXE. Does running the EXE still produce stack overflow?
If yes, place a STOP statement as the first executable statement in the main program. Try again. If the stack overflow is still present, examine the program listing and the linker map to find out which local variables cause the problem. You can also remove the rest of the executable statements in the main program and all the subroutines that are no longer needed (any CALL that comes after the STOP, in logical order, is not needed). After a few iterations of this, you may have a "bug reproducer" that you are able to provide for Intel to examine and act upon.
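As a rough illustration of the STOP probe described above, here is a minimal sketch. The program and subroutine names are made up for the example and are not taken from the actual code. If the stack overflow still appears with the probe in place, no executable statement has run yet, so the cause has to be in the declarations or in how the image is set up at startup.

```fortran
! Sketch only: bisect_demo, setup and compute are hypothetical names.
program bisect_demo
  implicit none
  ! Temporary probe: nothing below this line executes while it is in place.
  stop 'bisect: reached first executable statement'
  call setup()
  call compute()
contains
  subroutine setup()
    write(*,*) 'setup'
  end subroutine setup
  subroutine compute()
    write(*,*) 'compute'
  end subroutine compute
end program bisect_demo
```

Once the probe is reached cleanly, move it down past a few statements or calls at a time until the overflow returns.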
I wonder if l4t3nc1 wrote "openAPI" in place of "oneAPI".
If line 17 is PROGRAM, then this does tell you something useful. Did you start out with that /F or did you add it later? Very large values of stack reserve can trigger other problems. On stack overflow, the first traceback line will be in an error reporting routine - it isn't important.
One thing I would try is to boot Windows into Safe Mode and try running the program - does it still fail?
How many of those compile options can you remove before the stack error goes away? (You can keep -fpp I suppose.)
I don't know what Fiddler/web-traffic has to do with my Fortran program; or, if it is a web-based debugger, sorry, but I am not about to upload company sources. Oh, just googled, I guess you mean Fiddle ? It looks like an online debugger, maybe.
Anyway, thanks for the other pointers, I will try and report later.
As these posts seem to show, you have a hard row to hoe. These problems are never easy. I think the main idea from the real experts, and I am not one, is to strip out everything that you do not need and slowly build the program back up.
I did that today with an old F90 code from '91; it had a lot of quirks and would not compile. Pare it down to a few lines and work slowly outwards. It has taken all day, and it was spread across about 8 files, but it is now working. Sometimes you can only add one line or one function at a time.
When they ask for a reproducer, they do not want your million lines of code; they want, say, 10 lines that show the problem.
While I note that your program is built as x64 on Windows.....
You should be aware that the static data area (variables with and/or without initialization) CANNOT exceed 2GB.
This is a limitation of the linker object file format. To confuse you even more, while most of the time the linker will report this situation as an error, sometimes you get no warning at all. And this then results in befuddling errors prior to program startup. This appears to be symptomatic of your situation.
The correction for this (in Fortran) is to make the (very) large uninitialized arrays ALLOCATABLE, then allocate them at the start of the program (e.g. in a subroutine possibly named InitArrays). For (very) large initialized arrays (DATA and/or =[...]) you may need to read the values in from a file.
"very large" == 100's of MB.
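A minimal sketch of that conversion, assuming a single large work array. The module name big_data, the array name work and the sizes are hypothetical; only the subroutine name InitArrays comes from the suggestion above.

```fortran
! Sketch only: big_data, work and the sizes are hypothetical.
module big_data
  use, intrinsic :: iso_fortran_env, only: real64
  implicit none
  ! Before: real(real64) :: work(20000, 20000)   ! about 3.2 GB of static data
  real(real64), allocatable :: work(:,:)         ! after: storage comes from the heap
contains
  subroutine InitArrays(n)
    integer, intent(in) :: n
    integer :: ierr
    allocate(work(n, n), stat=ierr)
    if (ierr /= 0) then
      write(*,*) 'InitArrays: allocation failed, stat = ', ierr
      stop 1
    end if
    work = 0.0_real64
  end subroutine InitArrays
end module big_data

program demo
  use big_data
  implicit none
  call InitArrays(1000)          ! small size, just to exercise the sketch
  write(*,*) 'allocated elements: ', size(work)
end program demo
```

Checking stat= on the allocate gives a readable message if the memory is not available, instead of a failure before the program has started.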
Thank you very much.
Please correct me if I am wrong below.
In the linker I usually enable largeaddressaware, and I have run programs with arrays much bigger than 2GB and it seems to work fine; see below.
The makefile using Intel Fortran is below. Note that in the linker options, I enabled largeaddressaware.
EXEC = test.exe
FC = ifort.exe
LINKER = /link
FFLAGS=/nologo /MP /O3 /QxHost /assume:buffered_io /heap-arrays:0 /Qipo /libs:static /threads /Qmkl:cluster
F77FLAGS=$(FFLAGS) -fdefault-real-8 -fdefault-double-8 # gfortran only.
.SUFFIXES: .obj .f .f90
$(FC) $(FFLAGS) /c $<
$(FC) /exe:$(EXEC) $(OBJECTS) $(LIBS) $(LINKER) /out:$(EXEC) $(LDFLAGS)
@del /q /f $(EXEC) *.mod *.obj *~ > nul 2> nul
# Note that on Windows rm -f does not work, so use del instead.
# "> nul 2> nul" just suppresses some redundant messages.
EM_mix.obj: ran.obj samplers.obj EM_mix.f90
	$(FC) $(FFLAGS) /c EM_mix.f90
ran.obj: ran.f90
	$(FC) $(FFLAGS) /c ran.f90
samplers.obj: ran.obj samplers.f90
	$(FC) $(FFLAGS) /c samplers.f90
Well, my program seems to have originated back in 1996 and it is a combination of fixed- and free-format files, common blocks and modules...a bit of a mutt.
Because the stack overflow message was not very telling even when using debug options, I ended up doing as mentioned above: remove everything, then bring back a few lines at a time.
Aside from the more than 150 individual files, thankfully, the main program was only 1200 lines; so, after bringing back a few hundred lines at a time, the stack overflow message went from line 1 in MAIN to line 1 in SOME_SUBROUTINE...where I quickly noticed some strings were set to 240,000 characters long, even though they were meant to store data like user, date, time, hostname, command line and options. Who knows what prompted somebody to use such lengths. Anyway, reducing those numbers to some sensible ones solved my problem and I don't even need to use the "heap-arrays" option.
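For anyone hitting the same thing, here is a rough sketch of the kind of change involved. The variable names are invented for the example and the "before" declaration is only my paraphrase of the pattern, not the real source. Fixed short lengths are enough for date and time, and a deferred-length string can be sized at run time for something like the command line.

```fortran
! Sketch only: all names here are hypothetical.
program string_demo
  implicit none
  ! Before (pattern only): character(len=240000) :: cmdline, run_date, run_time
  character(len=8)  :: run_date              ! DATE_AND_TIME returns CCYYMMDD
  character(len=10) :: run_time              ! and hhmmss.sss
  character(len=:), allocatable :: cmdline   ! deferred length, sized at run time
  integer :: n, ierr

  call date_and_time(date=run_date, time=run_time)

  ! First ask for the length of the command line, then allocate exactly that much.
  call get_command(length=n, status=ierr)
  allocate(character(len=n) :: cmdline)
  call get_command(command=cmdline, status=ierr)

  write(*,'(a)') 'date: ' // run_date // '  time: ' // run_time
  write(*,'(a)') 'command line: ' // cmdline
end program string_demo
```

A handful of 240,000-character locals adds up to a megabyte or more per routine, which is already around the default stack reserve on Windows, so a few such routines in the call chain are enough to overflow it.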
Thanks, everybody, for all the hints and comments.
Large Address Aware provides for a maximum of 3GB of (user program) address space on a 32-bit system (the remainder is left for the O/S within the user VM). There was still a linker limit of 2GB for any "segment" (.text, .data, .bss, ...).
Make your large arrays allocatable.