I want to read the contents of a file into an array that stores each line for subsequent processing. To minimize memory use, I am trying to use an allocatable array of deferred-length character variables. I get an internal compiler error in the part of the code doing the processing. The workaround is fairly easy, but is there a better approach?
A stripped-down example follows.
implicit none
integer :: n=0, m=0, i, istat
character (len=512) :: string
character (len=:), allocatable, dimension(:) :: string_array
open(1,file='input.txt')
do
read(1,'(A)',end=99) string
n = MAX(n,LEN_TRIM(string))
m = m+1
enddo
99 continue
allocate(character(n)::string_array(m), stat=istat)
rewind(1)
do i=1,m
read(1,'(A)') string
string_array(i) = string
enddo
do i=1,m
! compiler error on this line
! if (string_array(i)(1:1) == '|') print *, string_array(i)
! workaround
string = string_array(i)
if (string(1:1) /= '|') print *, string_array(i)
enddo
stop
end
11 Replies
I assume you would really like each element of the string to be the actual length of the line, rather than all the maximum length. If so, you need a derived type to hold each string, like this:
[plain]
implicit none
integer :: n=0, m=0, i, istat
character (len=512) :: string
type string_array_type
character (len=:), allocatable :: string
end type string_array_type
type(string_array_type), allocatable, dimension(:) :: string_array
open(1,file='input.txt')
do
read(1,'(A)',end=99) string
m = m+1
enddo
99 continue
allocate(string_array(m), stat=istat)
rewind(1)
do i=1,m
read(1,'(Q,A)') n,string
string_array(i)%string = string(1:n)
enddo
do i=1,m
! compiler error on this line
if (string_array(i)%string(1:1) == '|') print *, string_array(i)%string
enddo
stop
end
[/plain]
This compiles. The internal compiler error is already fixed for a release later this year.
Thanks, Steve.
Yes, the derived type would be much better. It means I have to replace all occurrences of "string_array(...)" with "string_array(...)%string", but I think I can handle that.
I did have a problem compiling the code, though. I'm not sure what's wrong, since the variable is declared on line 8.
Compiling with Intel Visual Fortran Compiler XE 12.1.1.258 [IA-32]...
Source1.for
Source1.for(19): error #6404: This name does not have a type, and must have an explicit type. [STRING_ARRAY]
Source1.for(24): error #6458: This name must be the name of a variable with a derived type (structure type) [STRING_ARRAY]
Source1.for(24): error #6303: The assignment operation or the binary expression operation is invalid for the data types of the two operands. [STRING]
Source1.for(30): error #6837: The leftmost part-ref in a data-ref can not be a function reference. [STRING_ARRAY]
Source1.for(30): error #6158: The structure-name is invalid or is missing. [STRING_ARRAY]
I compiled the code I posted successfully with 12.1.3 (Update 9).
Your errors are collectively a good example of why fixed form source is evil.
Doh! My source is free-form - .f90.
I took care of the problem with the line length in the fixed format.
Two more questions:
1. Is the memory allocated on the heap? I don't want to run into stack overflows for really big input files.
2. Is the 'Q' edit descriptor an Intel Fortran extension or a new 2003 Fortran feature?
The memory is on the heap. Q is an extension. If this matters, I could show you how to do it with standard Fortran. Q is easier.
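For reference, one standard-conforming stand-in for the Q descriptor is to read each record into a fixed buffer and keep only the trimmed part. This is only a sketch under two assumptions: trailing blanks in a line are not significant, and no line is longer than the buffer. The unit number `lun`, the program name, and the scratch-file contents are made up for illustration.

```fortran
program len_trim_instead_of_q
  implicit none
  integer, parameter :: lun = 10          ! arbitrary unit number (assumption)
  character(len=512) :: buffer            ! longer lines would be truncated
  character(len=:), allocatable :: line
  integer :: ios

  ! Scratch file standing in for input.txt so the sketch is self-contained.
  open(lun, status='scratch', action='readwrite')
  write(lun,'(A)') '| first'
  write(lun,'(A)') 'second line'
  rewind(lun)

  do
    read(lun,'(A)',iostat=ios) buffer
    if (ios /= 0) exit
    ! Standard replacement for read '(Q,A)' followed by string(1:n):
    ! assignment reallocates line to the trimmed length.
    line = trim(buffer)
    print '(I0,": ",A)', len(line), line
  end do
  close(lun)
end program len_trim_instead_of_q
```

Unlike Q, this cannot tell a line that ends in blanks from one that does not, which is why Q (or a non-advancing read with SIZE=) is the better choice when that matters.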
Steve - your code does not address the problem of a file of arbitrary length and arbitrary width - it is limited to line widths of 512 characters or less. Reading arbitrary width lines is quite awkward to do in standard Fortran but the Q edit descriptor helps a lot. So you can use the initial loop to get both the number of lines and the maximum line width, and then re-read appropriately:
[bash] implicit none
integer :: n=0, m=0, i, istat, maxwidth=0
character (len=:), allocatable :: line
type string_array_type
character (len=:), allocatable :: string
end type string_array_type
type(string_array_type), allocatable, dimension(:) :: string_array
open(1,file='input.txt')
do
read(1,'(Q)',end=99) n
m = m+1
maxwidth = MAX(maxwidth,n)
enddo
99 continue
print *, "m = ", m
print *, "maxwidth = ", maxwidth
allocate(string_array(m), stat=istat)
allocate(character(len=maxwidth):: line, stat=istat)
rewind(1)
do i=1,m
read(1,'(Q,A)') n,line
string_array(i)%string = line(1:n)
enddo
do i=1,m
! compiler error on this line
!! if (string_array(i)%string(1:1) == '|') print *, string_array(i)%string
write (*,'(i5,": ",A)') i, trim(string_array(i)%string)
enddo
stop
end[/bash]
Incidentally, I find the lack of automatic reallocation of deferred-length allocatable scalars in I/O statements a considerable inconvenience, both for ordinary reads and for internal reads and writes. I know this has been brought up a number of times here and elsewhere. I assume that there is some gotcha lurking there somewhere that dissuaded the standards committee from implementing this.
To the best of my knowledge, the notion never came up before the committee. It does seem obvious in hindsight. Adding it would change the meaning of existing programs, but F2003 did that already with automatic reallocation in assignment.
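To make the contrast concrete, here is a small sketch of the behaviour being discussed: assignment to a deferred-length allocatable scalar reallocates it, while an internal WRITE leaves its length alone. The variable names and values are invented for illustration.

```fortran
program realloc_on_assignment
  implicit none
  character(len=:), allocatable :: s
  character(len=32) :: buffer

  s = 'abc'                  ! allocated with length 3
  print '(A,I0)', 'len after first assignment:   ', len(s)
  s = 'a longer value'       ! reallocated to the length of the new value
  print '(A,I0)', 'len after second assignment:  ', len(s)

  ! An internal WRITE does not reallocate s; the result is blank padded
  ! to the existing length instead.
  write(s,'(I0)') 42
  print '(A,I0)', 'len after internal write:     ', len(s)

  ! The usual workaround: write into a fixed buffer, then assign the
  ! trimmed result so that assignment does the reallocation.
  write(buffer,'(I0)') 42
  s = trim(buffer)
  print '(A,I0)', 'len after trimmed assignment: ', len(s)
end program realloc_on_assignment
```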
Quoting David Kinniburgh
Steve - your code does not address the problem of a file of arbitrary length and arbitrary width - it is limited to line widths of 512 characters or less. Reading arbitrary width lines is quite awkward to do in standard Fortran...
Did you want to read the lines of the file into a rectangular buffer (in which case there's no need for the derived type) or into a series of buffers that each have the length of the relevant line?
If it is the former, then yes - you should probably use two passes, one to scan the file to work out the maximum length and number of lines, the second to do the actual read.
If it is the latter, then because IO is (typically) slow a little bit of dynamic memory reallocation isn't going to hurt. To read a single line, use something like the following as a module procedure:
[fortran]
!*****************************************************************************
!!
!> Reads a complete line (end-of-record terminated) from a file.
!!
!! @param[in]     unit              Logical unit connected for formatted
!! input to the file.
!!
!! @param[out]    line              The line read.
!!
!! @param[out]    stat              Error code, positive on error,
!! IOSTAT_END on end of file.

SUBROUTINE get_line(unit, line, stat)

  USE, INTRINSIC :: ISO_FORTRAN_ENV, ONLY: IOSTAT_EOR

  !---------------------------------------------------------------------------
  ! Arguments

  INTEGER, INTENT(IN) :: unit
  CHARACTER(:), INTENT(OUT), ALLOCATABLE :: line
  INTEGER, INTENT(OUT) :: stat

  !---------------------------------------------------------------------------
  ! Local variables

  ! Buffer to read the line (or partial line).
  CHARACTER(256) :: buffer

  INTEGER :: size     ! Number of characters read from the file.

  !***************************************************************************

  line = ''
  DO
    READ (unit, "(A)", ADVANCE='NO', IOSTAT=stat, SIZE=size) buffer
    IF (stat > 0) RETURN
    line = line // buffer(:size)
    IF (stat < 0) THEN
      IF (stat == IOSTAT_EOR) stat = 0
      RETURN
    END IF
  END DO

END SUBROUTINE get_line
[/fortran]
You only have to write the above once (for each character kind). If you know that your line lengths are often going to be greater than 256 then you can increase the buffer size to reduce the number of reallocations.
You can use a similar dynamic reallocation approach to handle the number of lines, though I'd suggest a doubling buffer type of approach because the amount of data that will otherwise be copied around might become excessive (initially allocate the TYPE(string_array_type) array to some decent size, start reading in lines, tracking how many elements are in use; when the array is full, allocate a new array that is double the size of the old one, copy over the existing elements, and carry on reading; when you are done, either do a final reallocation and copy to chop the buffer down to size(*), or just work with some spare elements at the end of the array).
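The doubling scheme described above might look roughly like the following. It is a sketch, not a tested library routine: the module name `line_buffer`, the routine `append_line`, and the initial capacity of 4 are all hypothetical, and it uses MOVE_ALLOC to hand each string over to the new array rather than the plain element copy mentioned in the text. The demo appends generated strings instead of reading a file so that it stands alone.

```fortran
module line_buffer
  implicit none
  type string_array_type
    character(len=:), allocatable :: string
  end type string_array_type
contains
  ! Append text after lines(1:used), doubling the capacity when full.
  subroutine append_line(lines, used, text)
    type(string_array_type), allocatable, intent(inout) :: lines(:)
    integer, intent(inout) :: used
    character(len=*), intent(in) :: text
    type(string_array_type), allocatable :: tmp(:)
    integer :: i

    if (.not. allocated(lines)) allocate(lines(4))   ! arbitrary initial size
    if (used == size(lines)) then
      allocate(tmp(2*size(lines)))
      do i = 1, used
        ! Transfer the allocation itself; the characters are not copied.
        call move_alloc(lines(i)%string, tmp(i)%string)
      end do
      call move_alloc(tmp, lines)
    end if
    used = used + 1
    lines(used)%string = text
  end subroutine append_line
end module line_buffer

program demo_doubling
  use line_buffer
  implicit none
  type(string_array_type), allocatable :: lines(:)
  character(len=16) :: text
  integer :: used, i

  used = 0
  do i = 1, 10
    write(text,'("line ",I0)') i
    call append_line(lines, used, trim(text))
  end do
  print '(A,I0,A,I0)', 'in use: ', used, ', capacity: ', size(lines)
  print '(A)', lines(used)%string
end program demo_doubling
```

In a real reader, the loop body would call something like get_line and pass the result to append_line until end of file is reported.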
Alternatively, do a pass through the file just counting lines ( READ(unit, "()") ) and then allocate the size of the array.
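A counting pass of that kind could be sketched as follows. The unit number `lun`, the program name, and the scratch-file contents are assumptions made so the example is self-contained; with a real file you would open it by name instead.

```fortran
program count_then_allocate
  implicit none
  type string_array_type
    character(len=:), allocatable :: string
  end type string_array_type
  type(string_array_type), allocatable :: string_array(:)
  integer, parameter :: lun = 10   ! arbitrary unit number (assumption)
  integer :: m, ios

  ! Scratch file standing in for the real input.
  open(lun, status='scratch', action='readwrite')
  write(lun,'(A)') 'alpha'
  write(lun,'(A)') 'beta'
  write(lun,'(A)') 'gamma'
  rewind(lun)

  ! Each READ with an empty format skips one record without
  ! transferring any data, so no line buffer is needed here.
  m = 0
  do
    read(lun,'()',iostat=ios)
    if (ios /= 0) exit
    m = m + 1
  end do
  rewind(lun)

  allocate(string_array(m))
  print '(A,I0)', 'lines counted: ', m
  close(lun)
end program count_then_allocate
```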
Either way, these approaches are similar to what would be used in many other languages, I don't think standard Fortran is at any particular disadvantage here.
(*) array = array(:elements_in_use) - it would be nice if the compiler recognised this pattern and optimised the assignment to be a simple update of the descriptor...
Thanks Ian. That's nice and clear.