The code I am working on runs fine on Windows. When it is ported and built on Linux, it fails on one of our unit tests with an error ("cannot write a file to working directory") in a directory that is open for writing.
Running the same Linux executable with strace, I get "too many open files" error trying to write the file.
After changing all buffered I/O OPEN statements from BUFFERCOUNT=4 to BUFFERCOUNT=1, it runs with no errors. Is there any known issue with buffered I/O on Linux as opposed to Windows?
The number of physical buffers and "too many open files" and/or "cannot write to a working directory" don’t seem related to me. Maybe there’s a shell limit being exhausted, leading to both the cannot-open-file and the too-many-open-files errors.
What version of the compiler are you using?
How many files are opened at one time? Are files also being closed during execution?
Can you provide a complete runnable reproducer?
I am using ifort 17.1 on Windows and on Linux, and the Intel C compiler, same version. The code is almost all Fortran except for a C XML parser. Only a handful of files are open at once. I put in a check that any logical unit opened had been closed. I tried check=all and the undefined-initialization debug options. The "too many open files" message shows up with strace; without strace, the code fails to open (in C) a file to write to in the working directory (which is writable).
Is this application Fortran (has PROGRAM entrypoint) or C (has main entrypoint)?
Is threading involved?
If yes to C main, and threading, then does the C main instantiate threads that call the Fortran subroutines (that are further parallelized)?
Don't you really love those bugs that come and go when you add a printf... OK, so it's not buffered IO in Fortran. Intel C compiler though, with -O0 optimization.
You've got I/O going on in both Fortran and C, correct? You need to be careful not to access the same file and/or the same units from both. The runtime libraries for the two languages do not share data, and in particular, don't share I/O "infrastructure" details.
Yes, there is I/O in both Fortran and C, but only to distinct sets of files/units. The C code manages all I/O to XML files only. Fortran never touches them (XML) directly. Logical units or file handles are never shared between Fortran and C. At least for 2 tries each, I can get the test case to run all by itself, but when I run a set of 30 cases, the same test case always (2 times) fails.
In some of the situations where I have seen a crash go away after inserting PRINT or other statements to locate the error (also known as a Heisenbug), the cause was usually the program writing beyond array bounds or through an uninitialized reference. Have you run the full compile-time and run-time diagnostics to check for calls with an incorrect interface and/or subscript-out-of-bounds errors? That is a good first step (before adding diagnostic PRINT statements).
Thanks, yes, I have run check=all and all the undefined initialization (so glad to see this feature since the CRAY-1 had it!!) on ~600 test cases on both Windows and Linux. Wish Rational Software Purify still supported Fortran. Heisenberg would be proud, as I consistently get the code to "pass" if I run a single job, but it consistently fails when I run ~30 different jobs (test cases) on multiple Linux servers all directed to the same file system. Am I having fun yet?
A bright young guy found the problem. I was calling GETFILEINFOQQ with a fixed string (name of one file, no wildcards) to get the file timestamp. I neglected to continue calling GETFILEINFOQQ until I got the last file handle, since I just would see one file. In the documentation fine print:
GETFILEINFOQQ must be called with the handle until GETFILEINFOQQ sets handle to FILE$LAST, or system resources may be lost.
Symptoms were too many files open, or unable to open a file to write to, or unable to open a readable file. The API for this function is not the best, not that I didn't screw up. It would be a kindness to have a clean function to get file timestamps, or to have GETFILEINFOQQ release its resources when called with a fixed-string filename, since you can't have more than one file with the same name.
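For anyone who hits the same thing, here is a minimal sketch of a single-file timestamp lookup that keeps calling GETFILEINFOQQ until the handle comes back as FILE$LAST (or FILE$ERROR), as the documentation requires. It assumes Intel Fortran with the IFPORT module; the module name file_stamp_sketch and the helper file_lastwrite are hypothetical names made up for illustration, not part of IFPORT.

```fortran
module file_stamp_sketch
  use ifport          ! Intel portability module: GETFILEINFOQQ, FILE$INFO, FILE$FIRST, ...
  implicit none
contains
  ! Hypothetical helper (not part of IFPORT): returns .true. and the packed
  ! last-write time of the first match, or .false. if nothing matched.
  logical function file_lastwrite(filename, stamp)
    character(len=*), intent(in)  :: filename
    integer(4),       intent(out) :: stamp
    type(file$info)               :: info
    integer(kind=int_ptr_kind())  :: handle   ! local, so each call has its own
    integer(4)                    :: length

    file_lastwrite = .false.
    stamp  = 0
    handle = file$first                        ! must be set before the first call
    do
      length = getfileinfoqq(filename, info, handle)
      ! Keep calling until the library reports the end of the search,
      ! even when only one file can possibly match.
      if (handle == file$last .or. handle == file$error) exit
      if (.not. file_lastwrite) then
        stamp = info%lastwrite                 ! packed time; decode with UNPACKTIMEQQ
        file_lastwrite = .true.
      end if
    end do
  end function file_lastwrite
end module file_stamp_sketch
```

The point of the loop is only that the search is always run to completion; for a fixed filename it should normally take two calls, one returning the match and one setting the handle to FILE$LAST.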
Question for Intel Fortran developers:
Can GETFILEINFOQQ be used in a recursive function? Is there a difference in its behavior on Windows versus Linux?
The goal is to write a Fortran function to recursively delete directories with a possible wildcard filename at the top level.
My code seems to be running on Windows, but only deletes 1 of 3 subdirectories at a certain subdirectory level.
A subdirectory will not (cannot) be deleted if a file in the directory is in use. Sometimes this also applies if a Windows Explorer or file browse window (dialog) is looking at the directory. Additionally, you may have a hidden file in there.
Does an error status return when deleting the directory?
The recursive delete I have at this point works on Windows but not on Linux. A recursive GETFILEINFOQQ call at a lower recursion level seems to corrupt the GETFILEINFOQQ state at the higher recursion level (on Linux but not on Windows). No files should be open, since this process is run as part of our test suite that creates new directories and then runs the test, all in batch. DELDIRQQ does not seem to fail to delete the directory. It seems to forget that at the upper recursion level GETFILEINFOQQ found 3 files to delete, and it only deleted the first.
BTW, Excel is notorious for locking a file it has open on Windows, like .csv files.
Make sure that you have declared your routine recursive and that you're using a local variable for the "handle" argument. It ought to work.
ifort compiler options: -noauto
export FFLAGS = -c -I ../Includes -DLINUX -assume nounderscore -names lowercase -noauto \
-fp-model=precise -fp-speculation=strict -fpconstant -traceback -O0 -check all -debug full
recursive logical function ac_del(filename)
character(len=*), intent(in) :: filename
! local variables
integer :: ierr, result
integer(kind=2) :: count
character(len=SIZEOFFILENAME) :: filestr, subdir
TYPE (FILE$INFO) :: info
INTEGER(KIND=INT_PTR_KIND( )) :: handle
character(len=8) :: rout = 'ac_del: '
logical :: local_debug = .true., file_exists, delresult,wildcard
Are info and handle automatic or static, on both Windows and Linux, with ifort 17.1?
Scalar variables of intrinsic types INTEGER, REAL, COMPLEX, and LOGICAL are allocated to the run-time stack. Note that the default changes to auto if one of the following options are specified:
Does the RECURSIVE keyword in the function definition make local variables automatic instead of static?
Without recursive (or other options that implicitly set recursive)
character(len=SIZEOFFILENAME) :: filestr, subdir
may default to SAVE (similar behavior to arrays).
Using the implicitly SAVEd filestr and subdir would be safe as long as, upon popping back up a recursion level, you clear out any leftover tail-end text appended to the buffer:
lenBefore = LEN_TRIM(subdir)   ! remember the length before recursing
delresult = ac_del(subdir)
if(lenBefore < LEN(subdir)) subdir(lenBefore+1:) = ' '   ! blank anything appended at deeper levels
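The general hazard being discussed, one statically allocated local shared by every recursion level, can be seen in a small standard-Fortran sketch. This is only an illustration of the mechanism, not a diagnosis of ac_del; the names save_vs_auto, walk, shared_state and own_state are made up for the example. Note also that, in standard Fortran, a local that is initialized in its declaration (as local_debug is in the declarations quoted earlier) acquires the SAVE attribute implicitly.

```fortran
program save_vs_auto
  implicit none
  call walk(1)
contains
  recursive subroutine walk(depth)
    integer, intent(in) :: depth
    integer, save :: shared_state   ! one copy shared by every recursion level
    integer       :: own_state      ! no SAVE: a separate copy per call by default

    shared_state = depth
    own_state    = depth
    if (depth < 3) call walk(depth + 1)

    ! After the deeper calls return, shared_state holds whatever the
    ! deepest level stored, while own_state is still this level's value.
    print '(a,i0,a,i0,a,i0)', 'depth ', depth, &
          ': shared_state = ', shared_state, ', own_state = ', own_state
  end subroutine walk
end program save_vs_auto
```

With default compiler options, every level should report the deepest level's value in shared_state while own_state keeps its own depth. If a search handle or FILE$INFO buffer behaved like shared_state, an outer level would resume its search with the inner level's state, which would match the "forgets the remaining subdirectories" symptom. Whether -noauto actually has that effect on a RECURSIVE routine in ifort 17.1 is an open question here, not something this sketch establishes.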
Adding the AUTOMATIC attribute to the local variables in the recursive delete function did not change the function's behavior. It works on Windows, but on Linux it fails to delete two of the three subdirectories in a directory selected for recursive delete. I could provide the function if Intel would like to test it.
Yes, if you can share a small but complete reproducer with us that will help us investigate if we have an issue on Linux with this routine.