I have problems with a Fortran program reading from a named pipe in Linux. The problem can be reproduced, for example, using awk:
    ifort -O test_fifo.f90 -o test_fifo
    rm -f file.fifo
    mkfifo file.fifo
    awk 'BEGIN { line=0; \
      while (1) { line = line + 1; \
        printf "%d01", line; \
        for (i=2;i<=17;i++) printf " %d%02d", line, i; \
        for (i=1;i<= 7;i++) printf " %.2f", line+i/100.0; \
        print ""; } }' > file.fifo &
    ./test_fifo
Here the named pipe file.fifo is created, and awk writes lines of 17 integer and 7 float columns whose values encode the line and column numbers, so that a corrupted read can be detected. The program test_fifo.f90 simply reads the lines and checks the first and the last columns:
    program test_fifo
      implicit none
      integer, parameter :: int_small = selected_int_kind(9)
      integer, parameter :: real_low = selected_real_kind(p=6, r=30)
      integer(kind=int_small) :: scols(17)
      real(kind=real_low) :: fcols(7)
      character(len=*), parameter :: filename = 'file.fifo'
      integer :: unit = 11, line, io

      print*,'Opening file ', filename
      open(unit, file=filename, action='read', form='formatted', iostat=io)
      if (io /= 0) then
        print*,'Unable to open file ', filename
        stop
      end if

      line = 0
      do while (.TRUE.)
        line = line + 1
        ! Read 17 integer columns and 7 float columns:
        scols = 0
        fcols = 0.0
        read(unit,*,iostat=io) scols, fcols
        ! Check the first integer and the last float column:
        if ((scols(1) /= line*100 + 1) .OR. (fcols(7) /= line + 0.07)) then
          print*,'Mismatch on line ', line, ':'
          print*, scols, fcols
        end if
        ! Check for io errors (and eof):
        if (io /= 0) then
          print*,'Read returned ', io, ' on line ', line
          exit
        end if
      end do
    end program test_fifo
For the majority of the input lines this works well, but once in a while a read is corrupted:
    Mismatch on line 67 :
    6701 6702 6703 6704 6705 6 706 6707 6708 6709 6710 6711 6712 6713 6714 6715 6716 6717.000 67.01000 67.02000 67.03000 67.04000 67.05000 67.06000
    Mismatch on line 99 :
    9901 9902 9903 9904 9905 9906 9907 9908 9909 9910 9911 991 2 9913 9914 9915 9916 9917.000 99.01000 99.02000 99.03000 99.04000 99.05000 99.06000
    Mismatch on line 208 :
    20801 2 802 20803 20804 20805 20806 20807 20808 20809 20810 20811 20812 20813 20814 20815 20816 20817.00 208.0100 208.0200 208.0300 208.0400 208.0500 208.0600
    ...
where the contents of a column are clearly split into two separate numbers. This happens in three environments:
- CentOS 5.9, ifort 12.1.1.256 Build 20111011
- Ubuntu 12.04, ifort 13.1.2.183 Build 20130514
- CentOS 6.5, ifort 14.0.1.106 Build 20131008
With GNU Fortran (gfortran) the example works correctly.
The problem seems to be related to the output buffering mode of awk. If the output is flushed after each line:
print ""; fflush(); } }' > file.fifo &
or if the buffering mode is changed with stdbuf to line-buffered:
stdbuf -oL awk 'BEGIN { line=0; \
there is no corruption. Unfortunately, the program writing to the named pipe is supplied by the user, and stdbuf is not available in some of the target environments, so a Fortran-side solution is desirable.
Is ifort working properly here/for you? Is there a compile time option or open/read parameter that can change the behavior?
Thanks,
Matti
I reproduced the described behavior (the first mismatch appeared at iteration 452 for me). I also confirmed that changing the buffering avoids it and that gfortran is not affected. I was unable to understand the difference in behavior and will consult with our I/O developers for assistance.
I submitted this to our I/O developers (the internal tracking id is noted below) for further analysis. One other note: I see the same mismatches you showed when running on a local disk. The first mismatch at iteration 452 that I mentioned earlier occurred when running on NFS. I will update again after I learn more.
(Internal tracking id: DPD200254427)
Well, I think the issue is that you're reading from a FIFO, and by default Fortran I/O is record based. What I think is happening is that the Fortran read statement exhausts the contents of the FIFO before the writer has printed a newline character, which serves as the record terminator for formatted files. On a Mac I see the same behaviour as you do, but if I add the access='stream' keyword to the open statement the problem appears to go away. (At least it takes MUCH longer to encounter the issue with ifort, and I think the next issue is due to the record length being exceeded.) In general, stream access makes your I/O behave in a more C-like fashion.
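As a rough sketch of the change suggested above, the reader from the original post could open the FIFO with access='stream'. This is only an illustration of that suggestion, not a confirmed fix: the thread reports that it makes the corruption much rarer with ifort, not that it removes it. The program name test_fifo_stream and the max_lines cap are hypothetical additions so that the sketch terminates, and the float check is rewritten with nint to avoid an exact floating-point comparison.

```fortran
program test_fifo_stream
  implicit none
  character(len=*), parameter :: filename = 'file.fifo'
  integer, parameter :: unit = 11
  integer, parameter :: max_lines = 10000   ! hypothetical cap so the sketch terminates
  integer :: scols(17), line, io
  real :: fcols(7)

  ! Same open as in the original post, plus access='stream' (formatted stream access).
  open(unit, file=filename, action='read', form='formatted', &
       access='stream', iostat=io)
  if (io /= 0) then
    print *, 'Unable to open file ', filename
    stop
  end if

  do line = 1, max_lines
    scols = 0
    fcols = 0.0
    read(unit, *, iostat=io) scols, fcols
    ! Check the read status before looking at the data.
    if (io /= 0) then
      print *, 'Read returned ', io, ' on line ', line
      exit
    end if
    ! First integer column should be <line>01, last float column <line>.07.
    if (scols(1) /= line*100 + 1 .or. nint(100.0*(fcols(7) - line)) /= 7) then
      print *, 'Mismatch on line ', line, ':'
      print *, scols, fcols
    end if
  end do
  close(unit)
end program test_fifo_stream
```

It is meant to be run against the same mkfifo/awk setup shown in the first post.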
I also wonder about read(lun,*), because the * format (list-directed input) is, *I think*, processor (compiler/machine/etc.) dependent in some of its details, but I could be wrong.
Also, notice that the number of characters per line/record and per integer/float increases as the awk program runs. This might cause issues with respect to the recl= specifier of the open statement. It specifies the maximum record length for a sequential formatted file and is optional; if omitted, it takes a processor-dependent default value. If one performs an inquire(unit,recl=rec) on your original code (without stream access), it appears that ifort uses a default maximum record length of 132 characters for formatted sequential access files. In gfortran this is reported as -1, which presumably means there is no maximum record length, but it seems weird to report a value of -1 here and I wonder whether that is standard conforming.
If stream access is specified on the open statement, the inquire(unit,recl=rec) returns a value of -1 for ifort, and 1 for gfortran. In my mind, positive 1 makes sense for the value here, because stream access will read in the stream one character at a time until it has finished performing the IO requested in the read statement. -1 as a recl for both ifort and gfortran seems strange to me, but I haven't checked it against the standard.
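The recl query described above can be tried with a small sketch like the following. It uses a scratch file instead of the FIFO so that it runs on its own (an assumption made for the demo), and the value 4096 is an arbitrary illustration. Whether raising recl has any effect on the FIFO corruption is not established in this thread. The stream case is left out because the value returned by inquire for a stream connection is not something a program should rely on.

```fortran
program inquire_recl_sketch
  implicit none
  integer, parameter :: unit = 11
  integer :: io, rec

  ! Formatted sequential connection with the processor's default record length.
  open(unit, status='scratch', form='formatted', access='sequential', &
       action='readwrite', iostat=io)
  if (io /= 0) stop 'open failed'
  inquire(unit=unit, recl=rec)
  print *, 'default maximum record length: ', rec
  close(unit)

  ! Same connection, but with an explicit maximum record length
  ! (4096 is an arbitrary value chosen for illustration).
  open(unit, status='scratch', form='formatted', access='sequential', &
       action='readwrite', recl=4096, iostat=io)
  if (io /= 0) stop 'open failed'
  inquire(unit=unit, recl=rec)
  print *, 'maximum record length with recl=4096: ', rec
  close(unit)
end program inquire_recl_sketch
```

The printed default is compiler dependent, which is the point of the comparison made above between ifort and gfortran.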
Zaak - Thank you for taking the time to investigate this issue, and for the very insightful findings! I forwarded them to our developers.
The negative values for recl returned by both ifort and gfortran seem a bit wacky to me, but they may very well be standard compliant. I'm not sure without looking it up.
I added some logic to look at the value of iostat=io after the read statement. I think that, if an end of record is indeed encountered because the Fortran program is trying to read a full record (line) before it is available, then the standard dictates that the value of io should be set to iostat_end or iostat_eor from iso_fortran_env. Examining iostat after the read indicates that neither condition is ever signaled, which means either: 1) this is a bug in ifort, or 2) something other than my diagnosis is the root of the problem.
Stream access does appear to at least improve the robustness of reading from a fifo, if not provide a complete fix.
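The kind of iostat check described above might look like the sketch below. It reads from a scratch file holding a single record rather than from the FIFO, purely so that it is self-contained (an assumption for the demo); in the real reader the same if/else chain would follow the read of scols and fcols. As far as I know, the end-of-record condition is defined only for nonadvancing input, so an ordinary advancing list-directed read would not be expected to return iostat_eor.

```fortran
program classify_iostat
  use, intrinsic :: iso_fortran_env, only: iostat_end, iostat_eor
  implicit none
  integer, parameter :: unit = 11
  integer :: io, value

  ! A scratch file with one record stands in for the FIFO (demo assumption).
  open(unit, status='scratch', form='formatted', action='readwrite')
  write(unit, '(i0)') 42
  rewind(unit)

  do
    read(unit, *, iostat=io) value
    if (io == 0) then
      print *, 'read value ', value
    else if (io == iostat_end) then
      print *, 'end of file'
      exit
    else if (io == iostat_eor) then
      print *, 'end of record'
      exit
    else
      print *, 'other I/O error, iostat = ', io
      exit
    end if
  end do
  close(unit)
end program classify_iostat
```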
Here's what the standard says:
9.10.2.26 RECL= specifier in the INQUIRE statement
1 The scalar-int-variable in the RECL= specifier is assigned the value of the record length of a connection for direct access, or the value of the maximum record length of a connection for sequential access. If the connection is for formatted input/output, the length is the number of characters for all records that contain only characters of default kind. If the connection is for unformatted input/output, the length is measured in file storage units. If there is no connection, or if the connection is for stream access, the scalar-int-variable becomes undefined.
In the case of "no maximum", I think HUGE(0) is probably a better choice than -1, but I can see the logic behind -1 since this is the value the language often requires for "don't know".
Yeah, that's basically what MFE says. Strange, so it seems that gfortran is in violation regarding recl with sequential access here. I wonder if gfortran has an unsigned vs. two's complement issue here: -1 in two's complement has the same bit pattern as the largest unsigned integer.
I am much more likely to think that one of the contributors decided that -1 meant "unknown" given that the standard lacks explicit wording. Come to think of it, this would make a reasonable interpretation request. I will see if it has come up before, and if not, propose one.