
OP1 · ‎09-23-2016

This simple code

PROGRAM P
IMPLICIT NONE
INTEGER :: I,A(2),B(2),C(2)
CHARACTER(LEN=15) :: E(2)
CHARACTER(LEN=:),ALLOCATABLE :: STRING
STRING = '1 2 3 ABCD' // ACHAR(13) // ACHAR(10) // '4 5 6 DEFG' // ACHAR(13) // ACHAR(10)
READ(STRING,*) (A(I),B(I),C(I),E(I),I=1,2)
WRITE(*,*) A
WRITE(*,*) B
WRITE(*,*) C
WRITE(*,*) E
END PROGRAM P

produces this output:

           1           4
           2           5
           3           6
          DEFG
Press any key to continue . . .

I am a bit at a loss to explain what happens to E(1) ...

Steven_L_Intel1 · ‎09-23-2016

You don't use CR-LF to do new records in internal reads. Instead you use a character array - each element is one record.

What happened is that E(1) is still there, but has CR at the end. When you write this to the terminal, the ABCD gets overwritten with the value of E(2). I guess CR is not considered a separator but LF is, in our implementation.
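As an illustration of the character-array form of an internal file, here is a minimal sketch based on the original program. The array name `records` and its length are assumptions made for this example; each element is treated as one record, so no CR or LF characters are needed.

```fortran
program internal_array_read
    implicit none
    integer :: i, a(2), b(2), c(2)
    character(len=15) :: e(2)
    character(len=10) :: records(2)   ! hypothetical name: one element per record

    ! No CR/LF here: the array element boundary is the record boundary.
    records(1) = '1 2 3 ABCD'
    records(2) = '4 5 6 DEFG'

    ! List-directed input moves on to the next element when a record is exhausted.
    read(records, *) (a(i), b(i), c(i), e(i), i = 1, 2)

    write(*, *) a
    write(*, *) b
    write(*, *) c
    write(*, *) e
end program internal_array_read
```

With this form E(1) should contain only ABCD followed by blanks, so nothing gets overwritten when it is written to the terminal.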

OP1 · ‎09-23-2016

Thanks Steve.

The reason for the CR+LF characters is that this string corresponds to an external file that was loaded as a single string of characters (using stream access). I am not so sure about how to implement the character array in this case - since each line of data (of the external file) may have different length. Each line contains the same fields (same data, same width), with the exception of the last one (which may have trailing blanks, or contain character variables of various length).

Would you have an example you could point me to?

Steven_L_Intel1 · ‎09-23-2016

Do you have to use stream access to read the file? It would be so much simpler if you read it normally. Otherwise, since you know the "record" length, you could do the internal read from a substring whose starting and ending positions advance with each "record". You'd need special case code for the last one.
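The substring idea can be sketched roughly as follows. This assumes every "record" has the same known data width (`data_len`, 10 characters in the sample string) followed by a two-character CR-LF terminator; both constants are assumptions for this example, and a real file would need the special handling of the last field mentioned above.

```fortran
program substring_read
    implicit none
    integer, parameter :: data_len = 10   ! assumed width of the data part of a record
    integer, parameter :: eol_len  = 2    ! assumed terminator length (CR + LF)
    integer :: i, first, a(2), b(2), c(2)
    character(len=15) :: e(2)
    character(len=:), allocatable :: string

    string = '1 2 3 ABCD' // achar(13) // achar(10) // &
             '4 5 6 DEFG' // achar(13) // achar(10)

    first = 1
    do i = 1, 2
        ! The internal file is a substring that excludes the CR-LF pair.
        read(string(first:first + data_len - 1), *) a(i), b(i), c(i), e(i)
        ! Advance the start position past the data and the terminator.
        first = first + data_len + eol_len
    end do

    write(*, *) a
    write(*, *) b
    write(*, *) c
    write(*, *) e
end program substring_read
```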

andrew_4619 · ‎09-23-2016

Given that your read actually works, and it is only outputting strings with embedded control chars that is causing you a problem, why not just strip them out? E.g.

module fred
    implicit none
    contains
    ! Replace embedded LF (10) and CR (13) characters with blanks
    ! in every element of the character array A.
    subroutine string_striper(A)
        character(len=*), intent(inout) :: A(:)
        integer                         :: l1, l2
        do l1 = 1, size(A)
            do l2 = 1, len(A)
                if ( A(l1)(l2:l2)==achar(10) .or. A(l1)(l2:l2)==achar(13) ) then
                   A(l1)(l2:l2) = ' '
                endif
            enddo
        enddo
    end subroutine string_striper
end module fred
PROGRAM P
    use fred
    IMPLICIT NONE
    INTEGER :: I,A(2),B(2),C(2)
    CHARACTER(LEN=15) :: E(2)
    CHARACTER(LEN=:),ALLOCATABLE :: STRING
    STRING = '1 2 3 ABCD' // ACHAR(13) // ACHAR(10) // '4 5 6 DEFG' // ACHAR(13) // ACHAR(10)
    READ(STRING,*) (A(I),B(I),C(I),E(I),I=1,2)
    call string_striper(E)
    WRITE(*,*) A
    WRITE(*,*) B
    WRITE(*,*) C
    WRITE(*,*) E
END PROGRAM P

IanH · ‎09-23-2016

OP wrote:
The reason for the CR+LF characters is that this string corresponds to an external file that was loaded as a single string of characters (using stream access). I am not so sure about how to implement the character array in this case - since each line of data (of the external file) may have different length. Each line contains the same fields (same data, same width), with the exception of the last one (which may have trailing blanks, or contain character variables of various length).

I don't quite follow the problem, but note that for sane Fortran compilers on Windows, formatted stream input (which is only applicable to input from an external file) uses CR-LF as a record separator - that sequence in a formatted file should logically be treated as a single NEW_LINE character, and the NEW_LINE character delimits records in formatted stream i/o.

If I was reading multiple lines of different length from a file for later processing, I would read it into an array of a derived type like:

TYPE :: String
  CHARACTER(:), ALLOCATABLE :: Item
END TYPE String

where each element in the array represented a line (record). For the later line-by-line processing, just loop over the elements in the array.
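A minimal sketch of the line-by-line processing with that derived type might look like the following. The lines are filled in from literals purely for illustration (how they get read from the file is not shown), and the array name `lines` and the sample data are assumptions.

```fortran
program lines_as_derived_type
    implicit none

    type :: String
        character(:), allocatable :: Item
    end type String

    type(String), allocatable :: lines(:)   ! hypothetical name: one element per line
    integer :: i, a(2), b(2), c(2)
    character(len=15) :: e(2)

    ! Stand-in for lines read from a file; each Item can have its own length.
    allocate(lines(2))
    lines(1)%Item = '1 2 3 ABCD'
    lines(2)%Item = '4 5 6 DEFGHIJK'

    ! Each element is parsed on its own, so no record separators are involved.
    do i = 1, size(lines)
        read(lines(i)%Item, *) a(i), b(i), c(i), e(i)
    end do

    do i = 1, size(lines)
        write(*, '(3(I0,1X),A)') a(i), b(i), c(i), trim(e(i))
    end do
end program lines_as_derived_type
```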

List directed formatting always makes me nervous.

OP1 · ‎09-26-2016

Thanks to all for the answers.

The files to be read are huge (millions of lines), and there are many of them (thousands), so performance is an issue (the 'normal' use of the code is to use binary input data - orders of magnitude faster - BUT we also need a reasonably fast capability for formatted files as well).

The current strategy is to load each file in one go (stream access), which is faster than a traditional READ statement for formatted data; then use multiple OpenMP threads to parse the fields on each line (the lines being delimited by CR+LF or LF characters, depending on the platform) with internal READ statements, after breaking up each 'line' into its component fields.
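For readers unfamiliar with the loading step, a sketch of reading a whole file into one string with unformatted stream access could look like this. The file name `stream_demo.txt` is hypothetical, and the program writes a small sample file first only so that the example is self-contained.

```fortran
program load_whole_file
    use, intrinsic :: iso_fortran_env, only: int64
    implicit none
    character(len=*), parameter :: fname = 'stream_demo.txt'   ! hypothetical file name
    character(len=:), allocatable :: buffer
    integer(int64) :: nbytes
    integer :: u, ios

    ! Create a small sample file so the example can run on its own.
    open(newunit=u, file=fname, access='stream', form='unformatted', &
         status='replace', action='write')
    write(u) '1 2 3 ABCD' // achar(13) // achar(10) // &
             '4 5 6 DEFG' // achar(13) // achar(10)
    close(u)

    ! Load the file as one string of characters.
    open(newunit=u, file=fname, access='stream', form='unformatted', &
         status='old', action='read', iostat=ios)
    if (ios /= 0) stop 'could not open the file'

    ! SIZE= reports file storage units, which are normally bytes.
    inquire(unit=u, size=nbytes)
    allocate(character(len=nbytes) :: buffer)
    read(u, iostat=ios) buffer
    close(u, status='delete')
    if (ios /= 0) stop 'could not read the file'

    write(*, '(A,I0)') 'characters loaded: ', len(buffer)
end program load_whole_file
```

The CR and LF characters end up in `buffer` unchanged, which is why the later parsing step has to deal with them explicitly.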

For each line of data, the fields have a fixed width, except for the last field if it contains character data (in which case this field has a maximum allowable width, although in practice the data is usually much shorter than the maximum).
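A rough sketch of the splitting and parsing step, under several assumptions: the string `buffer` stands in for a file already loaded with stream access, lines end in LF with an optional preceding CR, and list-directed internal reads are used for brevity instead of the fixed-width field extraction described above. The names `first`, `last` and `ok` are hypothetical. Whether concurrent internal reads are safe depends on the compiler's run-time library being thread-safe when OpenMP is enabled, so treat the parallel loop as something to verify rather than a given.

```fortran
program split_and_parse
    implicit none
    character(len=:), allocatable :: buffer
    integer, allocatable :: first(:), last(:), a(:), b(:), c(:)
    character(len=15), allocatable :: e(:)
    logical, allocatable :: ok(:)
    integer :: nlines, i, pos, nl, ios

    ! Stand-in for a file loaded in one go with stream access.
    buffer = '1 2 3 ABCD' // achar(13) // achar(10) // &
             '4 5 6 DEFGHIJK' // achar(13) // achar(10)

    ! Pass 1: count the lines (one per LF, plus an unterminated last line).
    nlines = 0
    do i = 1, len(buffer)
        if (buffer(i:i) == achar(10)) nlines = nlines + 1
    end do
    if (len(buffer) > 0) then
        if (buffer(len(buffer):len(buffer)) /= achar(10)) nlines = nlines + 1
    end if
    allocate(first(nlines), last(nlines), a(nlines), b(nlines), c(nlines))
    allocate(e(nlines), ok(nlines))

    ! Pass 2: record where each line starts and ends, excluding CR and LF.
    pos = 1
    do i = 1, nlines
        first(i) = pos
        nl = index(buffer(pos:), achar(10))
        if (nl == 0) then
            last(i) = len(buffer)          ! last line has no terminator
        else
            last(i) = pos + nl - 2         ! character just before the LF
        end if
        if (last(i) >= first(i)) then
            if (buffer(last(i):last(i)) == achar(13)) last(i) = last(i) - 1
        end if
        pos = pos + nl
    end do

    ! Pass 3: parse each line independently; the directives are ignored
    ! as comments when the program is compiled without OpenMP.
    !$omp parallel do private(ios)
    do i = 1, nlines
        read(buffer(first(i):last(i)), *, iostat=ios) a(i), b(i), c(i), e(i)
        ok(i) = (ios == 0)
    end do
    !$omp end parallel do

    do i = 1, nlines
        if (ok(i)) write(*, '(3(I0,1X),A)') a(i), b(i), c(i), trim(e(i))
    end do
end program split_and_parse
```

Because each internal read sees only the characters between `first(i)` and `last(i)`, the variable length of the last field needs no special case and no CR ends up in the character data.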

Puzzling internal read behavior