I have been cleaning up an old program written in Fortran 77, spanning several source files and containing 51 subroutines, 10 functions and the main program.
I used some utility programs to replace DO nnn i= type loops with DO i= ... END DO type loops, IF(expr)n1,n2,n3 with IF...ENDIF, etc., to make the program easier to follow and debug. I chose to have the utility output free-form Fortran source files. I moved some subroutines from one source file into another, and that caused a lot of trouble and puzzlement.
I created makefiles for use with several different Fortran compilers. With some compilers, the build went fine and EXEs were built. With Ifort and Ifx, a strange thing happened. MAKE would decide that OBJ files were current, and issue a link command. The link command would fail with messages about unsatisfied externals, even though the missing subprograms were in the source files and the OBJs of those source files were processed by the linker. I tried deleting all OBJ files and running MAKE again, but no luck.
Then, I noticed that some of the OBJ files were suspiciously small. Poking into the symbols in those files revealed that only some of the subprograms in the pertinent source files had been compiled. After several trips down blind alleys, I found that these source files contained a CONTROL-Z character at the end of a comment line somewhere within the source file, rather than at the very end of the source file.
I thought we were done with CONTROL-Z serving as an EOF marker when we moved from CP/M-86 to MS-DOS, but that is not so.
Here is a reproducer -- in the attached Zip file, you will find a single source file with 13 lines and 2 subroutines. In between the two subroutines is a comment line with a CONTROL-Z just before the LF at the end of line 7. Some editors can display "non-visible" characters such as CONTROL-Z, so I include a screenshot -- please see the "SUB" on line 7.
Intel Fortran (Ifort as well as Ifx) ignores the source lines that come after the CONTROL-Z, putting only the code for ASUB into the OBJ file. So do FPS-4, CVF6.6C, and NAG. On the other hand, Gfortran, Lahey-Fujitsu LF 7.1, Silverfrost FTN95 and Absoft ignore the CONTROL-Z, and compile both subroutines ASUB and BSUB.
I hope that the Intel compiler designers will consider this issue. If they continue to have the Intel Fortran compilers treat CONTROL-Z as an end-of-source marker, as a user I should appreciate a warning that source scanning was terminated prematurely because of the CONTROL-Z.
P.S.: Another oddity: if the '!' in Line 7, Column 1 is changed to 'c', 'C' or '*', a carryover from the indication of a comment line from fixed format Fortran, Ifort issues the following error message 30 times before abandoning the compilation. Each repetition has the position marker shifted one column to the right from the preceding error message.
ctrlz.f90(7): error #5078: Unrecognized token '|' skipped
c ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
--^
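For anyone who wants to check their own sources for this, here is a minimal sketch in C of a scan that reports the line and column of the first CONTROL-Z in a file's raw bytes. The function name find_ctrl_z is made up for illustration, and the caller is assumed to have already read the file in binary mode into a buffer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of any compiler or tool mentioned here):
   reports where the first CONTROL-Z (0x1A) sits in the raw bytes of a file.
   Returns 1 and fills *line and *col (both 1-based) if one is found,
   otherwise returns 0 and leaves *line and *col untouched. */
int find_ctrl_z(const unsigned char *buf, size_t len, size_t *line, size_t *col)
{
    size_t cur_line = 1, cur_col = 1;

    assert(buf != NULL && line != NULL && col != NULL);
    for (size_t i = 0; i < len; i++) {
        if (buf[i] == 0x1A) {
            *line = cur_line;
            *col = cur_col;
            return 1;
        }
        if (buf[i] == '\n') {   /* LF ends a line in both LF and CRLF files */
            cur_line++;
            cur_col = 1;
        } else {
            cur_col++;
        }
    }
    return 0;
}
```

A hit that is not on the last line of the file is the situation described above, where everything after it may be silently skipped by some compilers.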
Huh. I know that Intel Fortran still treats Ctrl-Z as an EOF marker in unformatted files but was unaware of it being recognized in source files. I wonder if this is something in the underlying C I/O system - I find it hard to believe that there is explicit code in the compiler for this!
FWIW, I agree with Steve, this may be a Windows CRTL issue:
C:\test\ctrlz>dir
Volume in drive C has no label.
Volume Serial Number is F8CE-A4A4
Directory of C:\test\ctrlz
03/24/2021 04:39 PM <DIR> .
03/24/2021 04:39 PM <DIR> ..
03/24/2021 06:52 AM 249 ctrlz.f90
1 File(s) 249 bytes
2 Dir(s) 258,555,604,992 bytes free
C:\test\ctrlz>type ctrlz.f90
subroutine Asub(a,b,c)
implicit none
integer a,b,c
c = a+b
return
end subroutine
! ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
C:\test\ctrlz>
Thanks, Jim.
Yes, type chops off the part of the file after the ^Z, but more shows the whole file, and the file size displayed by dir is for the complete file, as well!
In the section of code in the Fortran compiler that opens the source file, perhaps the second argument to the fopen CRTL STDIO routine, i.e., the flags argument, should have been "rb" rather than "r". On the other hand, I do not know if fopen is used rather than CreateFile.
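To make the "r" versus "rb" point concrete, here is a small sketch in portable C that mimics what a reader sees when input stops at the first CTRL-Z, which is how the Microsoft C runtime documents text-mode input. The function name is hypothetical, and this is only a model of the behaviour; whether the Intel compiler actually opens its source files this way is not known from this thread.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: given the raw bytes of a file (as a binary-mode
   read would deliver them), return how many bytes a reader would see if
   it treated the first CTRL-Z (0x1A) as end-of-file. */
size_t visible_length_with_ctrl_z_eof(const unsigned char *buf, size_t len)
{
    assert(buf != NULL);
    for (size_t i = 0; i < len; i++) {
        if (buf[i] == 0x1A)
            return i;   /* everything from here on is silently dropped */
    }
    return len;         /* no CTRL-Z: the whole file is visible */
}
```

If the returned value is smaller than the file size, the file is one that type would chop off while dir still reports the full size.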
@mecej4 - this is a great lesson. We often do not know the insides and accept the black boxes as correct.
Well done. But then again your stuff is always good.
>> perhaps the second argument to the fopen CRTL STDIO routine, i.e., the flags argument, should have been "rb" rather than "r".
Perhaps...
However, some users may use ^Z as a means to have a "fat source" file containing both program and data. And changing the behavior would cause issues for them. In the "old days", a card deck could contain program source cards followed by data cards.
There are other implementation issues as to what to do with
0 NUL Null
1 SOH Start of Header
2 STX Start of Text
3 ETX End of Text
4 EOT End of Transmission
5 ENQ Enquiry
6 ACK Acknowledge
7 BEL Bell
8 BS Backspace
9 TAB Horizontal Tab
10 LF Linefeed (handled/system dependent)
11 VT Vertical Tab
12 FF Form Feed
13 CR Carriage Return (handled/system dependent)
14 SO Shift Out
15 SI Shift In
16 DLE Data Link Escape
17 DC1 Device Control 1
18 DC2 Device Control 2
19 DC3 Device Control 3
20 DC4 Device Control 4
21 NAK Negative Acknowledge
22 SYN Synchronous Idle
23 ETB End of Transmission Block
24 CAN Cancel
25 EM End of medium
26 SUB Substitute
27 ESC Escape
28 FS File Separator
29 GS Group Separator
30 RS Record Separator
31 US Unit Separator
Then ?? 128:255 ???
All of the above are implementation defined.
Steve may have some input as to whether CR and/or LF are implementation defined
Jim Dempsey
The Fortran standard talks about the "Processor character set" - characters that may appear in source statements. (In standard-speak, "processor" is the "thing" that interprets your source code - substitute "compiler" generally, though it also encompasses the underlying OS and hardware.)
"Each character in a processor character set is either a control character or a graphic character. The set of graphic characters is further divided into letters (6.1.2), digits (6.1.3), underscore (6.1.4), special characters (6.1.5), and other characters (6.1.6)."
"Special characters" are things such as parentheses, equal sign, etc. The standard then goes on to say, for "Other characters", "Additional characters may be representable in the processor, but shall appear only in comments (6.3.2.3, 6.3.3.2), character constants (7.4.4), input/output records (12.2.2), and character string edit descriptors (13.3.2)."
Oddly, the only mention of "control characters" is in that first quote - it is never defined! (Something I will have to ask about.)
All this is to say that the standard is silent about what any non-graphic character should mean or where it is permitted to appear. The standard is also silent on just how source lines are delivered to the processor, other than some handwaving in the description of INCLUDE.
Characters such as CR and LF, in some implementations, are used as line delimiters, and as such would be interpreted as separating source lines. (Not all platforms use control characters for this purpose.)
Keep in mind that the standard describes a standard-conforming program, and what the processor must do with it. If your source contains anything not given an interpretation by the standard, a processor is allowed to do anything it likes with it. In the end, you should not assume that a CTRL-Z in a source line is interpreted in any specific way, and it is best to not have such characters in your source file.
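If you would rather clean a suspect file than rely on what a given compiler does with it, something along these lines could work. It is only a sketch in C: the function name is invented, and the choice to keep TAB, LF and CR while blanking every other control character (including form feed and CTRL-Z) is an assumption you may want to change.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: replaces stray control characters in a buffer with
   blanks, in place, and returns how many bytes were replaced.
   Assumption: TAB, LF and CR are kept; all other bytes below 0x20, and
   DEL (0x7F), are blanked. Bytes 128..255 are left alone. */
size_t blank_out_control_chars(unsigned char *buf, size_t len)
{
    size_t replaced = 0;

    assert(buf != NULL);
    for (size_t i = 0; i < len; i++) {
        unsigned char c = buf[i];
        int is_control = (c < 0x20) || (c == 0x7F);
        int keep = (c == '\t') || (c == '\n') || (c == '\r');

        if (is_control && !keep) {
            buf[i] = ' ';
            replaced++;
        }
    }
    return replaced;
}
```

Replacing with a blank rather than deleting keeps the file length and column positions unchanged, which matters for fixed-form source.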
Regarding "control characters", I am informed that these are defined in other standards referenced by the Fortran standard (ISO 10646 and ISO 646).
Here is what the DEC Fortran-77 manual says about control characters in source code. Note that special treatment is given to the Ctrl-L and Ctrl-Z characters.
The second sentence of the second paragraph puzzles me. Is it suggesting that a Fortran program is being used to write a new Fortran source file, using the ENDFILE statement in the writer program?
----------------------------------
Nonprintable Characters
The form-feed character (0C hex) is treated as a blank without causing a diagnostic message to be issued. In addition, a source record of length 1 containing a form-feed character causes the compilation source listing to begin a new page. A source record of length 1 containing a Ctrl-Z character (1A hex) is treated as a blank line. Such a record is created by the ENDFILE statement, if the command line option -vms is specified. All other control characters are valid, except 00(hex) and 01(hex).
That's a very interesting manual reference. The compiler it describes is not in the "lineage" of current Intel Fortran and it isn't related to VAX FORTRAN-77 either. I'm not sure exactly where it came from, but my memories of the RISC ULTRIX days are dim (I didn't work on that platform.) I faintly recall that in the early Alpha days that there was indeed an F77 compiler.
I do remember the single FF causing a new listing page. That text about ^Z and ENDFILE is odd, I agree, but it was not unusual to have programs writing other programs (it still happens.) Yes, on VMS at least, an ENDFILE record was a one-byte ^Z.
- Many editors (such as Notepad++) permit you to display control characters using graphic icons that cannot be mistaken for normal characters.
- There are binary editors such as BVI and HXD.
- Command line utilities exist for the purpose. Look for programs with names such as "hexdump".
Cygwin includes a utility called hexdump. If I run hexdump on one of the problematic files, TIME3.FOR, I see at the end the following lines:
*
00000680 2a 0d 0a 0d 0a 20 20 20 20 20 20 72 65 61 6c 20 |*.... real |
00000690 66 75 6e 63 74 69 6f 6e 20 46 71 68 28 47 57 4c |function Fqh(GWL|
000006a0 2c 41 71 68 2c 42 71 68 29 0d 0a 20 20 20 20 20 |,Aqh,Bqh).. |
000006b0 20 46 71 68 3d 2d 41 71 68 2a 65 78 70 28 42 71 | Fqh=-Aqh*exp(Bq|
000006c0 68 2a 61 62 73 28 47 57 4c 29 29 0d 0a 20 20 20 |h*abs(GWL)).. |
000006d0 20 20 20 72 65 74 75 72 6e 0d 0a 20 20 20 20 20 | return.. |
000006e0 20 65 6e 64 0d 0a 0d 0a 2a 20 7c 7c 7c 7c 7c 7c | end....* |||||||
000006f0 7c 7c 7c 7c 7c 7c 7c 7c 7c 7c 7c 7c 7c 7c 7c 7c ||||||||||||||||||
*
00000730 1a |.|
00000731
If you already know that you want to look for a specific character such as CTRL-Z, you can use the following command
hexdump -C TIME3.FOR | grep -i " 1a"
and you will see the output
00000730 1a |.|
The optimization solver GAMS includes a hexdump utility which has a nice feature: it provides not only a hex dump, but also statistics on the contents of the file:
Characters read = 1841
Control Characters Used (0-31)
Dec Hex Cnt Description
10 0A 58 LF Line Feed
13 0D 58 CR Carriage Return
26 1A 1 SUB Substitute ^Z
Alternatively, you can use the following specific-purpose C program on a suspect file.
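#include <stdio.h>
#include <stdlib.h>
/* Count the control characters (0x00-0x1F and 0x7F) in a file */
int main(int argc, char *argv[]){
   int c,i, char_cnt[128]; FILE *fil;
   if(argc != 2){
      fprintf(stderr,"Usage: hexcnt <filename>\n");
      exit(1);
   }
   for(i=0; i<128; i++)char_cnt[i]=0;
   fil = fopen(argv[1],"rb");   /* binary mode, so ^Z is not taken as EOF */
   if(fil == NULL){
      perror(argv[1]);
      exit(1);
   }
   while((c=fgetc(fil)) != EOF){
      if(c < 128)char_cnt[c]++;
   }
   fclose(fil);
   for(i=0; i<0x20; i++)
      if(char_cnt[i] > 0)printf("%02X %8d\n",i,char_cnt[i]);
   /* DEL (0x7F) is also a control character; print its own code, not i */
   if(char_cnt[0x7F] > 0)printf("%02X %8d\n",0x7F,char_cnt[0x7F]);
   return 0;
}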
This C program, when compiled and run on the suspect file TIME3.FOR, outputs
S:\SWMS3D\dmp>hexcnt TIME3.FOR
0A 58
0D 58
1A 1
You can do it in Visual Studio too, though Microsoft makes you hunt for it.
- File > Open > File...
- Select your file
- Click the triangle to the right of Open, select Open With...
- Scroll the list towards the bottom and select "Binary Editor"
You can even change the values here. A few years ago, MS hinted that they wanted to drop this, but too many people complained.
I have spent an interesting day playing with file formats. The SENSOR program puts out CSV files in a specific format. I have a Fortran program that takes them apart, mostly thanks to Jim et al., but occasionally I make a mistake and open the file with Excel. That rather interesting MS program changes the time output by rounding, which means the file can no longer be read.
My 14-year-old daughter wants to know why "read" is not spelled "red". I said that, like Fortran, the English language is hard to follow:
Case in point a sample Fortran program from
! ------------------------------------------------------
! Compute the area of a triangle using Heron's formula
! ------------------------------------------------------
PROGRAM HeronFormula
IMPLICIT NONE
REAL :: a, b, c ! three sides
REAL :: s ! half of perimeter
REAL :: Area ! triangle area
LOGICAL :: Cond_1, Cond_2 ! two logical conditions
READ(*,*) a, b, c
WRITE(*,*) "a = ", a
WRITE(*,*) "b = ", b
WRITE(*,*) "c = ", c
WRITE(*,*)
Cond_1 = (a > 0.) .AND. (b > 0.) .AND. (c > 0.0)
Cond_2 = (a + b > c) .AND. (a + c > b) .AND. (b + c > a)
IF (Cond_1 .AND. Cond_2) THEN
s = (a + b + c) / 2.0
Area = SQRT(s * (s - a) * (s - b) * (s - c))
WRITE(*,*) "Triangle area = ", Area
ELSE
WRITE(*,*) "ERROR: this is not a triangle!"
END IF
END PROGRAM HeronFormula
https://ourcodingclub.github.io/tutorials/fortran-intro/
This would be hard to follow for a new programmer.
Humans are strange - they go out to collect the data files. All of the Fortran drawing programs are set up to handle 51 FFTs of 16384 time steps, or about 8 minutes. OK, 8 minutes of standing is a long time.
You get files ranging from 2 minutes (with the comment "I know you can fix it", read: "I know you are a bit impatient"), to 4 minutes, which is statistically just OK, to an hour ("Sorry, I forgot to switch it off", read: "I hope Starbucks was nice").
In order to open the large files, there is only one text editor I have seen that will do it -- VEDIT. It comes from Canada, from about 1988. It also has great block editing features, but it adds a line to the end of a text file, so you end up with a blank line. I have been too lazy to trap the blank lines in a new program, so I have to open the 8 minute file with Notepad and check for the blank line -- it is not reliable in VEDIT.
This shows the file after Excel edits the time, even when saving as CSV.
This shows the original.
This shows the end of the file in VEDIT - no line numbers, no idea.
This shows Notepad++ - you can see the spare line.
Caveat emptor!
It shows that MS Excel is not a text editor - it is very very keen on interpreting your data. That is a good thing, sometimes, but not always. And to illustrate a problem with that eagerness: "03-10-2021" may mean the 10th of March, 2021, but it is ambiguous, because in my part of the world it should be interpreted as the third of October, 2021. And I leave the possibilities for "03-10-07" as an exercise.
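The ambiguity is easy to reproduce in code. Here is a rough sketch in C, with a made-up function name and struct, that reads the same text under either convention; the caller has to say which one applies, because the text alone cannot.

```c
#include <assert.h>
#include <stdio.h>

struct ymd { int year, month, day; };   /* illustrative type, not from any library */

/* Hypothetical helper: parses text of the form "NN-NN-YYYY".
   If day_first is nonzero the text is read as DD-MM-YYYY, otherwise as
   MM-DD-YYYY. Returns 1 on success; returns 0 if the text does not match
   or the month/day are out of range (in that case *out is not meaningful).
   Only a coarse range check is done (day 1..31 for every month). */
int parse_dashed_date(const char *text, int day_first, struct ymd *out)
{
    int first, second, year;

    assert(text != NULL && out != NULL);
    if (sscanf(text, "%2d-%2d-%4d", &first, &second, &year) != 3)
        return 0;
    out->year  = year;
    out->day   = day_first ? first : second;
    out->month = day_first ? second : first;
    if (out->month < 1 || out->month > 12 || out->day < 1 || out->day > 31)
        return 0;
    return 1;
}
```

With "03-10-2021", day_first set gives 3 October and day_first clear gives 10 March; both calls succeed, so nothing in the data flags the wrong reading.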
As for the extra line in notepad++: some text editors seem to consider the end-of-line characters LF/CR (in whatever combination or selection) as line separators, rather than as line endings. It has cost me many hours of grieve over the years ...
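The separator-versus-ending distinction comes down to an off-by-one in how lines are counted. A minimal sketch in C, with invented function names and looking only at LF bytes (so CRLF files count the same way):

```c
#include <assert.h>
#include <stddef.h>

/* "Line ending" view: every LF terminates one line, so the line count is
   the number of LF bytes. Simplification: a final fragment with no LF
   after it is not counted here. */
size_t count_lines_as_terminated(const char *buf, size_t len)
{
    size_t n = 0;

    assert(buf != NULL);
    for (size_t i = 0; i < len; i++)
        if (buf[i] == '\n')
            n++;
    return n;
}

/* "Line separator" view: LF sits between lines, so there is always one
   more line than there are LF bytes; the last one may be empty. */
size_t count_lines_as_separated(const char *buf, size_t len)
{
    return count_lines_as_terminated(buf, len) + 1;
}
```

For a two-line file that ends with a newline, the first view reports 2 lines and the second reports 3, the third being the empty "spare" line some editors display.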
I am assuming that grieve is a misspelling for grief, but it only proves a point - not sure what the point is but, thanks for the comment. Dates are a beast.
PS: If you have never read the book The Mad Scientist Club published by Purple House - you will love it - give it to any 12 year old for xmas and see them learn science the easy way.
Oops, indeed, I meant "grief". I tried to make a few points, actually, but the common theme is that expectations may differ and will differ more the smarter the software tries to be.
