After switching part of my code to use 64-bit reals as opposed to 32-bit reals, I have been experiencing crashes.
The traceback starts as follows:
forrtl: severe (157): Program Exception - access violation
Image PC Routine Line Source
libifcoremdd.dll 000007FEDE2F6AA6 Unknown Unknown Unknown
libifcoremdd.dll 000007FEDE3331DD Unknown Unknown Unknown
It then refers to a routine that is modelled by the one below:
SUBROUTINE ReadAFile(cFile)
  CHARACTER(*), INTENT(IN) :: cFile
  CHARACTER(2048) :: cRemain
  INTEGER :: iStat
  OPEN(9, FILE = cFile, ACTION = 'READ', STATUS = 'OLD', IOSTAT = iStat)
  IF (iStat .NE. 0) RETURN
  DO
    READ(9, '(A)', IOSTAT = iStat) cRemain ! Traceback refers to this line of code
    IF (iStat .EQ. -1) EXIT
    IF (iStat .NE. 0) RETURN
    ! PRINT *, TRIM(cRemain)
    ! Do stuff with cRemain
  END DO
  CLOSE(9, IOSTAT = iStat)
  IF (iStat .NE. 0) RETURN
END SUBROUTINE ReadAFile
The crash only happens on the 64th call to this routine. All runtime checks are selected but no other error occurs. While investigating this I noticed that READ does not return an IOSTAT of -1 at the end of the file when the crash occurs. Printing cRemain after the READ gives "0" for the blank line at the end of the file, then gibberish for some extra lines (how many depends on the compilation options: 1 more line for the 32-bit binary, 5 lines for the 64-bit binary). I'm guessing that the stack has somehow been corrupted, but I can't see why this would be the case.

The program crashes when compiled under Intel Visual Fortran Compiler 188.8.131.52 and Intel Visual Fortran Compiler XE 184.108.40.206; it does not crash when compiled with gfortran. I tried putting all arrays on the heap and boosting the stack size in the linker options, but this did not help. Does anyone have any suggestions for how this could be further debugged?
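One way to get more information out of the read loop described above is to test for end-of-file with the standard IS_IOSTAT_END intrinsic instead of comparing against -1, and to capture the runtime's message with IOMSG. The following is only a minimal sketch under stated assumptions: the routine name CountLines and the scratch file readloop_demo.txt are hypothetical, the unit number comes from NEWUNIT (Fortran 2008) rather than the fixed unit 9, and the program writes its own three-line input so it can run on its own. It is a diagnostic aid and would not by itself cure memory corruption elsewhere in a program.

```fortran
PROGRAM ReadLoopSketch
   IMPLICIT NONE
   ! Hypothetical scratch file, created here so the sketch is self-contained.
   CHARACTER(*), PARAMETER :: cDemo = 'readloop_demo.txt'
   INTEGER :: iUnit, iStat, nLines

   ! Write three records; the last one is a blank line.
   OPEN(NEWUNIT = iUnit, FILE = cDemo, ACTION = 'WRITE', STATUS = 'REPLACE')
   WRITE(iUnit, '(A)') 'first line'
   WRITE(iUnit, '(A)') 'second line'
   WRITE(iUnit, '(A)') ''
   CLOSE(iUnit)

   CALL CountLines(cDemo, nLines, iStat)
   PRINT *, 'lines read: ', nLines, ' final IOSTAT: ', iStat

CONTAINS

   ! Hypothetical helper: reads cFile line by line and counts the records.
   SUBROUTINE CountLines(cFile, nLines, iStat)
      CHARACTER(*), INTENT(IN) :: cFile
      INTEGER, INTENT(OUT) :: nLines, iStat
      CHARACTER(2048) :: cRemain
      CHARACTER(256) :: cMsg
      INTEGER :: iUnit

      nLines = 0
      OPEN(NEWUNIT = iUnit, FILE = cFile, ACTION = 'READ', STATUS = 'OLD', &
           IOSTAT = iStat, IOMSG = cMsg)
      IF (iStat /= 0) THEN
         PRINT *, 'OPEN failed: ', TRIM(cMsg)
         RETURN
      END IF

      DO
         READ(iUnit, '(A)', IOSTAT = iStat, IOMSG = cMsg) cRemain
         ! Portable end-of-file test; does not assume the value -1.
         IF (IS_IOSTAT_END(iStat)) EXIT
         IF (iStat /= 0) THEN
            ! Any other non-zero status is reported rather than silently dropped.
            PRINT *, 'READ failed after ', nLines, ' lines: ', TRIM(cMsg)
            EXIT
         END IF
         nLines = nLines + 1
      END DO

      ! The unit is closed on every path that opened it successfully.
      CLOSE(iUnit)
   END SUBROUTINE CountLines
END PROGRAM ReadLoopSketch
```

Logging the line count and the final IOSTAT for each call would show whether the loop runs past the true end of the file on the failing call, which is the symptom reported here.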
I'd guess that the problem is actually elsewhere in your program, where it is overwriting data that doesn't belong to it. Have you tried a more recent version of the compiler (18.0.2 is current)?
I am using Intel Visual Fortran Compiler 220.127.116.11, which is the latest. My command line is:
/nologo /debug:full /Od /heap-arrays0 /fpp /warn:declarations /warn:unused /warn:ignore_loc /warn:truncated_source /warn:uncalled /warn:interfaces /assume:byterecl /Qtrapuv /module:"x64\Debug\\" /object:"x64\Debug\\" /Fd"x64\Debug\vc140.pdb" /traceback /check:pointer /check:bounds /check:uninit /check:format /check:output_conversion /check:arg_temp_created /check:stack /libs:dll /threads /dbglibs /c
No warnings are reported. Everything is in modules, so interfaces are enforced. It is strange, as the application had been working without any issue before I moved to 64-bit REALs in one module.
You have noted twice that the problems followed your changing 32-bit reals in your program to 64-bit reals, yet the code snippet that you presented has no reals of any size. Does it not seem reasonable, then, as Steve Lionel suggested, that the root cause is in some parts of the code that you have not shown -- parts that do something with real variables, rather than character strings as in the code snippet above?
Have you tried running with different options and/or building a 32-bit EXE instead?
>> It is strange as before moving to 64-bit REALs in one module the application had been working without any issue.
Does your code contain named commons where different compilation units, using the same named common, map the data differently? If so, changing the real type from 32-bit to 64-bit will change the byte offsets from the start of the named common, and may therefore clobber data in other units that use the same named common.
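To illustrate the kind of clobbering meant here, the following is a deliberately broken sketch, not code from the program under discussion. All names (Shared, WriteAsDouble, iMarker, iPad) are hypothetical, and it assumes 4-byte default REAL and INTEGER and 8-byte DOUBLE PRECISION. The two program units declare the same named common with the same total size but different layouts, so a store to a 64-bit real in one unit lands on an integer in the other.

```fortran
! Illustrative only: deliberately mismatched views of one named COMMON block.
PROGRAM CommonMismatchSketch
   IMPLICIT NONE
   REAL    :: rA, rB
   INTEGER :: iMarker, iPad(3)
   ! View 1: two 32-bit reals, so iMarker starts at byte offset 8.
   COMMON /Shared/ rA, rB, iMarker, iPad

   rA = 0.0
   rB = 0.0
   iMarker = 12345
   iPad = 0
   PRINT *, 'iMarker before call: ', iMarker
   CALL WriteAsDouble()
   PRINT *, 'iMarker after call:  ', iMarker
END PROGRAM CommonMismatchSketch

SUBROUTINE WriteAsDouble()
   IMPLICIT NONE
   DOUBLE PRECISION :: dA, dB
   INTEGER :: iMarker, iPad
   ! View 2: two 64-bit reals, so dB occupies byte offsets 8 to 15
   ! and overlaps what the main program calls iMarker.
   COMMON /Shared/ dA, dB, iMarker, iPad

   dB = 1.0D0
END SUBROUTINE WriteAsDouble
```

Under those size assumptions the second PRINT would be expected to show something other than 12345, with no compiler diagnostic and no runtime check firing, because the store to dB is perfectly legal from the subroutine's point of view. Reading iMarker back in this way is not standard-conforming, so the exact value printed is not guaranteed.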
I have tried a 32-bit executable (it still breaks, with slightly different garbage lines) and dropping AVX targeting. I have no COMMON blocks in my code. I'm trying to reduce the code while still reproducing the problem. Just calling the routine where it crashes 64 times is not sufficient; it needs to be embedded in a wider application. Thank you for all the suggestions so far.
After further testing it seems that the switch from 32-bit to 64-bit reals caused Release builds to break using Intel Visual Fortran Compiler XE 18.104.22.168. Even without this change Release builds using Intel Visual Fortran Compiler 22.214.171.124 and Debug builds using either compiler break (with particular inputs). So I still need to narrow this down to work out why this behaviour is happening.
I have further reduced the code and now have a 64-line program that reproduces the error:
PROGRAM CrashTest
  IMPLICIT NONE
  CHARACTER(*), PARAMETER :: cFile1 = &
    'Path omitted'
  CHARACTER(*), PARAMETER :: cFile2 = &
    'Path omitted'
  CALL ReadAFile(cFile1)
  CALL ReadAFile(cFile2)
CONTAINS
  SUBROUTINE ReadAFile(cFile)
    Character(*), INTENT(IN) :: cFile
    REAL :: r1, r2
    Integer :: iStat, iUnit, i, iPos
    LOGICAL :: lProcessed
    Character(256) :: cValue
    Character(2048) :: cRemain
    lProcessed = .FALSE.
    cRemain = ''
    iUnit = 9
    Open(iUnit, FILE = cFile, ACTION = 'READ', STATUS = 'OLD', IOSTAT = iStat)
    If (iStat .NE. 0) RETURN
    Read(iUnit, '(A)', IOSTAT = iStat) cRemain
    IF (iStat .NE. 0) THEN
      CLOSE(iUnit, IOSTAT = iStat)
      RETURN
    END IF
    Do While (.TRUE.)
      cRemain = ADJUSTL(cRemain)
      iPos = INDEX(cRemain, ' ')
      cValue = ADJUSTL(cRemain(1:iPos-1))
      cRemain = cRemain(iPos+1:)
      If (cValue(1:LEN('VERTICAL')).EQ.'VERTICAL') Then
        IF (TRIM(cFile) .EQ. cFile2) Then
          DO i = 1, 4
            READ(iUnit,*, IOSTAT = iStat) r1, r2
            IF (iStat .NE. 0) RETURN
          END DO
          lProcessed = .TRUE.
        END IF
      Endif
      Read(iUnit, '(A)', IOSTAT = iStat)cRemain
      If (iStat .NE. 0) EXIT
      IF (lProcessed) PRINT *, "Near End of file: ", TRIM(cRemain)
    Enddo
    Close(iUnit, IOSTAT = iStat)
  END SUBROUTINE ReadAFile
END PROGRAM CrashTest
I'm investigating what is special about the input files. The content of the file referenced by cFile2 is important; that of cFile1 is less so (any file of the same format works for cFile1, except the file referred to by cFile2 itself or a copy of it under a different name). I will work on reducing the input files so that I can share something here.
I have reduced the two files, mostly by redacting the content by overwriting it with "A"s, as the file length seems important. So the code now has:
CHARACTER(*), PARAMETER :: cFile1 = &
  'C:\Test\crashfile1.txt'
CHARACTER(*), PARAMETER :: cFile2 = &
  'C:\Test\crashfile2.txt'
I attach the two files that reproduce the crash. It would be good if someone else could test this. I assume they are not munged upon uploading; to check, here are the md5sums:
@mecej4: Thank you for reproducing this.
@andrew_4619: I've not tried it as an external subroutine; originally it lived in a separate module, though. I've submitted it to the Intel service centre as request 03393555.