Intel® Fortran Compiler

Is there a way to store the current position in a sequential, formatted file, for future retrieval?

OP1
New Contributor II

I am afraid the answer is no... but I'd love to be proven wrong!

Assume a sequential, formatted file has been opened.
A series of READ statements is performed, to search for a particular string on one of the lines of the file. Each READ statement reads a full record (line), and the cumulative number of READ statements is tracked in a counter (starting at 0 at the beginning of the file).

When the searched string is found, is it possible to 'store' the corresponding record (line) information, so that if one needs to go back to that location in the file at a later time, it can be done instantaneously (as opposed to reading the file from the beginning again and counting the number of records (lines) required to reach that record (line))?

Surely this information (position within the file) is known - at the very least at the IO runtime procedure level. But can it be used, and if so, how?

Ron_Green
Moderator

The answer is no, for sequential files.

A file handle object is owned by a process, in the proc struct. Once the process exits, that struct is freed by the OS.

For this usage, the ability to jump directly to a position in a file is provided by DIRECT access files. One way to save the last record position is to store it in the first record of the direct file before you close it. The first record then holds this 'last record' number, and the data starts at record 2.
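
A minimal sketch of the direct-access layout described above, assuming fixed-length formatted records. The file name, the record length of 32 characters, and the variable names are illustrative choices, not something from the original post.

```fortran
program direct_bookmark
  implicit none
  integer, parameter :: reclen = 32      ! assumed fixed record length (characters)
  integer :: u, i, last_rec
  character(len=reclen) :: rec

  ! Write three data records, starting at record 2.
  open(newunit=u, file='direct_demo.dat', status='replace', &
       access='direct', form='formatted', recl=reclen)
  do i = 1, 3
    write(u, '(A,I0)', rec=i+1) 'data line ', i
  end do
  ! Record 1 holds the "bookmark": here, the number of the record of interest.
  write(u, '(I10)', rec=1) 3
  close(u)

  ! Later (possibly in another run): read the bookmark, then jump straight to it.
  open(newunit=u, file='direct_demo.dat', status='old', &
       access='direct', form='formatted', recl=reclen)
  read(u, '(I10)', rec=1) last_rec
  read(u, '(A)', rec=last_rec) rec
  print '(A)', trim(rec)
  close(u)
end program direct_bookmark
```

The trade-off is that every record must have the same length, so this does not apply directly to an ordinary text file with variable-length lines.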

OP1
New Contributor II

Thanks Ron -

Actually, I just started looking into the behavior of files opened with ACCESS = 'STREAM', FORM = 'FORMATTED'. This seems to answer my question, since the INQUIRE(..., POS = ) and READ(..., POS = ) statements are allowed for such files.

Now I am trying to figure out what the other behavioral differences between STREAM + FORMATTED and SEQUENTIAL + FORMATTED are. It seems this was introduced fairly recently. So far I do not see anything obvious, which makes me wonder whether simply replacing SEQUENTIAL with STREAM in my OPEN statements would have any adverse consequences.
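
As a rough illustration of the INQUIRE(POS=) / READ(POS=) idea, here is a small sketch. The file name, the search string 'needle', and the variable names are made up for the example. Note that for formatted stream access, the value given to POS= should be one previously returned by INQUIRE(POS=) for that file (or 1).

```fortran
program bookmark_demo
  implicit none
  integer :: u, ios, line_pos, saved_pos
  character(len=256) :: line
  logical :: found

  ! Create a small test file.
  open(newunit=u, file='bookmark_demo.txt', status='replace', &
       access='stream', form='formatted')
  write(u, '(A)') 'first line'
  write(u, '(A)') 'needle here'
  write(u, '(A)') 'last line'
  close(u)

  open(newunit=u, file='bookmark_demo.txt', status='old', action='read', &
       access='stream', form='formatted')

  ! Search line by line, remembering where each record starts.
  found = .false.
  do
    inquire(unit=u, pos=line_pos)      ! position of the record about to be read
    read(u, '(A)', iostat=ios) line
    if (ios /= 0) exit
    if (index(line, 'needle') > 0) then
      saved_pos = line_pos
      found = .true.
      exit
    end if
  end do

  if (found) then
    read(u, '(A)', iostat=ios) line    ! keep reading past the match ...
    read(u, '(A)', pos=saved_pos) line ! ... then jump straight back to it
    print '(A)', trim(line)
  end if
  close(u)
end program bookmark_demo
```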

OP1
New Contributor II

The following code produces the exact same file using SEQUENTIAL + FORMATTED and STREAM + FORMATTED options (a byte-by-byte comparison shows they are indeed identical, including the end of lines marked with the 0d 0a CRLF bytes).

PROGRAM MAIN
IMPLICIT NONE (TYPE, EXTERNAL)
INTEGER :: UNIT

! Write a SEQUENTIAL, FORMATTED file.
OPEN(NEWUNIT = UNIT, FILE = 'test.sequential.formatted.txt', STATUS = 'REPLACE', ACCESS = 'SEQUENTIAL', FORM = 'FORMATTED')
WRITE(UNIT, FMT = '(A)') 'Hi!'
WRITE(UNIT, FMT = '(A)') 'This is a test'
WRITE(UNIT, FMT = '(I0)') 1245
WRITE(UNIT, FMT = '(G0)') 9842313.0D0
WRITE(UNIT, FMT = '(A)') 'One more line'
CLOSE(UNIT)

! Write a STREAM, FORMATTED file.
OPEN(NEWUNIT = UNIT, FILE = 'test.stream.formatted.txt', STATUS = 'REPLACE', ACCESS = 'STREAM', FORM = 'FORMATTED')
WRITE(UNIT, FMT = '(A)') 'Hi!'
WRITE(UNIT, FMT = '(A)') 'This is a test'
WRITE(UNIT, FMT = '(I0)') 1245
WRITE(UNIT, FMT = '(G0)') 9842313.0D0
WRITE(UNIT, FMT = '(A)') 'One more line'
CLOSE(UNIT)

END PROGRAM MAIN

I am confused by this paragraph in the Intel documentation (found in "Forms for Stream WRITE Statements"), which states:

You can impose a record structure on a formatted, sequential stream by using a new-line character as a record terminator (see intrinsic function NEW_LINE).

In the example above this was clearly not necessary. What am I missing here?

IanH
Honored Contributor II

You can create records in a formatted stream file using advancing output statements, as you have done in your example, and you can create records by writing NEW_LINE characters to the file, as per the quoted documentation. One pathway doesn't exclude the other; "can" does not mean "it is necessary to".

The behaviour of writing a NEW_LINE to a formatted stream file is specified by the standard. There is no such specification for formatted sequential files, where the behaviour is processor dependent.
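
To make the second pathway concrete, here is a small sketch in which a single WRITE produces two records because of an embedded NEW_LINE character. The file name and the strings are arbitrary.

```fortran
program newline_demo
  implicit none
  integer :: u
  character(len=64) :: line

  open(newunit=u, file='newline_demo.txt', status='replace', &
       access='stream', form='formatted')
  ! One WRITE statement; the embedded new-line character ends the first
  ! record, and the advancing WRITE itself ends the second.
  write(u, '(A)') 'alpha' // new_line('a') // 'beta'
  close(u)

  ! Reading back with ordinary advancing READs sees two separate records.
  open(newunit=u, file='newline_demo.txt', status='old', action='read', &
       access='stream', form='formatted')
  read(u, '(A)') line
  print '(A)', trim(line)
  read(u, '(A)') line
  print '(A)', trim(line)
  close(u)
end program newline_demo
```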

jimdempseyatthecove
Honored Contributor III

If performance is of concern...

You can use UNFORMATTED STREAM and read into a large character variable

character(len=1000000) :: buffer

Then parse for your new_lines.

Note, just after OPEN, do an inquire to get the file size in bytes.

So your first read may have to be into

buffer(1:min(remainingFileSize,sizeof(buffer)))

Then, after parsing "records", you will likely end up with a partial record/line at the end of the buffer.

Copy this to the front of the buffer, noting the size of the leftover, and your next reads are into (with remainingFileSize counting the bytes not yet read)

buffer(leftover+1:leftover+min(remainingFileSize,sizeof(buffer)-leftover))

Please take care that the file starts at POS=1.
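
One possible way to put these steps together is sketched below. It is not Jim's code: the file name, the tiny 64-byte buffer (chosen so that several refills happen), and all variable names are assumptions for illustration, and the loop only counts lines rather than processing them.

```fortran
program chunked_lines
  implicit none
  integer, parameter :: buflen = 64            ! tiny for the demo; use something large in practice
  character(len=buflen) :: buffer
  character(len=1), parameter :: lf = achar(10)
  integer :: u, i, file_size, remaining, leftover, nread, filled, start, k, nlines

  ! Build a small test file.
  open(newunit=u, file='chunk_demo.txt', status='replace', &
       access='stream', form='formatted')
  do i = 1, 20
    write(u, '(A,I0)') 'line number ', i
  end do
  close(u)

  open(newunit=u, file='chunk_demo.txt', status='old', action='read', &
       access='stream', form='unformatted')
  inquire(unit=u, size=file_size)              ! file size in bytes (default integer: small files only)
  remaining = file_size
  leftover = 0
  nlines = 0
  do while (remaining > 0)
    ! Fill the buffer behind whatever partial line was carried over.
    nread = min(remaining, buflen - leftover)
    read(u) buffer(leftover+1:leftover+nread)
    remaining = remaining - nread
    filled = leftover + nread

    ! Scan for complete lines; each one is buffer(start:start+k-2).
    ! On Windows (CRLF) that substring still ends with a CR character.
    start = 1
    do
      k = index(buffer(start:filled), lf)
      if (k == 0) exit
      nlines = nlines + 1
      start = start + k
    end do

    ! Move the trailing partial line to the front of the buffer.
    leftover = filled - start + 1
    if (leftover == buflen) error stop 'line longer than buffer'
    if (leftover > 0) buffer(1:leftover) = buffer(start:filled)
  end do
  if (leftover > 0) nlines = nlines + 1        ! last line had no terminator
  print '(A,I0)', 'lines found: ', nlines
  close(u)
end program chunked_lines
```

Since the stream position of the start of each chunk is known, the byte offset of any line can be recorded along the way and reused later with READ(..., POS=).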

Jim Dempsey

OP1
New Contributor II

Thanks IanH and Jim - I started experimenting with the combination of STREAM and FORMATTED, and so far I don't see any downsides, which is great (since the POS I/O specifier can then be used in READ statements). This must be a fairly recent feature - or I completely missed it in the past.

The suggestion to read the whole file as a string is of course an option, but a last resort for now.

Thanks again!

Steve_Lionel
Honored Contributor III

It's a Fortran 2003 feature - ifort has supported it for many years.
