Intel® Fortran Compiler

Direct access file

Le_Callet__Morgan_M

Dear all, I have the following questions:

1. Under certain circumstances I need to read records that happen to be consecutive. Is there any speed benefit to coding the reads so that they are done sequentially?

2. If a file has been written with a given record length, is it safe to reopen it and read it with a record length that is a multiple of the original record length?

3. Can I write a user-defined type (structure) to a file, and how do I know in which order the data are written? Say I write 3 records, each with an int and a real array of fixed size. Can I create a structure made of one int and a real array of fixed size and read the file using my structure 3 times, or open it using a multiple of my structure size and do one read only?

Many thanks.

IanH
Honored Contributor II

1. Data for nearby records is more likely to be in one of the various levels of cache for the file than data for remote records, so I would expect a small performance gain for reads of consecutive records.

2. It depends.  In terms of the standard, if the record length is different from that used to initially create the file, then it may not be in the processor dependent set of allowed record lengths for the file (F2008 9.5.6.15).  For typical implementations of direct access files there won't be a problem: typically there is no record length information or other record structure information recorded in the file.  Exceptions exist though, for example ifort's behaviour under various VMS compatibility options.

The semantics of unformatted stream access may be of interest to you.
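
As a minimal sketch of that idea (not a definitive recipe), the program below writes three fixed-length records with direct access, then reopens the same file with `access='stream'` and jumps straight to record 2 using `pos=`. The file name `stream_demo.bin` and all variable names are made up for illustration. It assumes the implementation stores direct access records back to back with no record markers, which is typical but processor dependent, as noted above.

```fortran
program stream_reread
    use, intrinsic :: iso_fortran_env, only: int32, real64, file_storage_size
    implicit none
    integer, parameter :: nrec = 3
    real(real64), parameter :: factors(4) = [2.0_real64, 3.0_real64, 4.0_real64, 5.0_real64]
    integer(int32) :: id
    real(real64) :: vals(4)
    integer :: u, k, rl, rec_fsu

    ! Ask the compiler for the record length in its own RECL units.
    inquire(iolength=rl) id, vals

    ! Size of one record in file storage units (usually bytes),
    ! computed independently of the RECL units.
    rec_fsu = (storage_size(id) + size(vals)*storage_size(vals)) / file_storage_size

    open(newunit=u, file='stream_demo.bin', status='replace', access='direct', &
         form='unformatted', recl=rl)
    do k = 1, nrec
        id = k
        vals = real(k, real64) * factors
        write(u, rec=k) id, vals
    end do
    close(u)

    ! Reopen as an unformatted stream. POS= is a 1-based position
    ! measured in file storage units.
    ! ASSUMPTION: records are contiguous, with no record markers.
    open(newunit=u, file='stream_demo.bin', status='old', access='stream', &
         form='unformatted')
    k = 2
    read(u, pos=1 + (k - 1)*rec_fsu) id, vals
    close(u, status='delete')

    if (id /= k .or. any(vals /= real(k, real64)*factors)) error stop 'unexpected data'
    print *, 'record', k, 'read via stream access:', id, vals
end program stream_reread
```

With stream access a single read can also span several records' worth of data, since the file is treated as one sequence of file storage units rather than as fixed-length records.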

3. You can write an object of user defined type to a file.  If the derived type does not have any allocatable or pointer subobjects, you can do so without there being an accessible defined input/output procedure. 

For unformatted input/output of a derived type in the absence of defined input/output for the object or one of its components, the order in which things are written to the file is not specified.  In this case an object of derived type is treated as a single processor dependent blob.  This permits the processor to keep the layout in the file the same as the layout in memory, which may not be in component order and/or might include padding bytes to maintain memory alignment for components.  (So, to be clear, in this case the in-file representation may not be consistent with the in-file representation you would get if you manually wrote out the components in component order.  Or it may: it is processor dependent.)
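
One way to see what a particular compiler does is to compare the record length it reports for the whole object with the length it reports for the components listed individually. The sketch below does only that; the type `sample_t` is a hypothetical stand-in for a structure holding one integer and a fixed-size real array, and the result is processor dependent, so no particular output should be expected.

```fortran
program derived_blob
    use, intrinsic :: iso_fortran_env, only: int32, real64
    implicit none

    ! Hypothetical type: one integer followed by a fixed-size real array.
    type :: sample_t
        integer(int32) :: id
        real(real64)   :: vals(4)
    end type sample_t

    type(sample_t) :: s
    integer :: len_whole, len_parts

    ! Record length (in the compiler's RECL units) for the object as a whole.
    inquire(iolength=len_whole) s
    ! Record length when the components are listed one by one.
    inquire(iolength=len_parts) s%id, s%vals

    print *, 'iolength of whole object     :', len_whole
    print *, 'iolength of listed components:', len_parts
    if (len_whole /= len_parts) then
        print *, 'The two layouts differ, probably because of padding.'
    else
        print *, 'The two lengths agree on this compiler.'
    end if
end program derived_blob
```

Equal lengths do not prove that the two in-file layouts are identical, only that they occupy the same amount of space. Writing and reading with the same form of I/O list on both sides avoids depending on this.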

If there is a relevant defined input/output procedure accessible for the object of derived type, then the input/output is controlled by that procedure.  If there is a relevant defined input/output procedure accessible for a subobject of an object of derived type, then the object is broken down into its constituent components for input/output.

Elements of an array can be written or read as individual scalars, or some combination of arrays and scalars, so long as the total number of objects is consistent between writes and reads.
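
A small sketch of that last point: the record below is written as two scalars plus an array section and read back as one whole array. The file name `array_demo.bin` and the variable names are illustrative only.

```fortran
program array_as_scalars
    use, intrinsic :: iso_fortran_env, only: real64
    implicit none
    real(real64) :: a(5), b(5)
    integer :: u, rl, k

    a = [(real(k, real64), k = 1, 5)]

    ! Record length for five reals, in the compiler's RECL units.
    inquire(iolength=rl) a

    open(newunit=u, file='array_demo.bin', status='replace', access='direct', &
         form='unformatted', action='readwrite', recl=rl)

    ! Write two scalars and an array section: five values in total.
    write(u, rec=1) a(1), a(2), a(3:5)
    ! Read the same five values back as a single array.
    read(u, rec=1) b

    close(u, status='delete')

    if (any(a /= b)) error stop 'round trip failed'
    print *, 'read back:', b
end program array_as_scalars
```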

 

Le_Callet__Morgan_M

Many thanks. I have seen some marginal improvement by reading in order. Does buffered I/O improve the reading speed?

I am basically trying to read blobs of data into arrays to speed up some processing.

I have not looked at stream access yet, but for standard unformatted file I/O I am not sure I understand. Do you expect the following to work?

    module readstructure
    
    type :: testData
        integer(4) :: m_i
        real(8) :: m_r(4)
    end type testData
    
    type :: justnumber
        real(8) :: m_d(5)
    end type justnumber
    
    integer(4) :: funit
    contains
    
    subroutine test()
    ! Variables
    integer(4) :: io,i,j
    type(testData) :: dataread
    real(8) :: r1,r2,r3,r4
    ! Body of Openfile
    open(newunit=funit, file='test%data',status='replace',access='direct', form='binary',recl = 4 + 4*8,iostat=io)
    if(io == 0) then
        do i=1 , 10
            write(funit,rec=i,err = 1,iostat = io) i,2.0d0*i,3.0d0*i,4.0d0*i,5.0d0*i
        end do
1       continue
        close(funit)
        if(io == 0) then
            open(newunit=funit, file='test%data',status='old',access='direct', recl = 5*4,iostat=io)
            if(io == 0) then
                do i = 1 , 10
                    read(funit,rec=i,err=2,iostat=io) dataread
                    !read(funit,rec=i,err=2,iostat=io) dataread%m_i,dataread%m_r
                    !read(funit,rec = i,err=2,iostat=io) j,r1,r2,r3,r4
                    !test
                    if(.not. TestReadData(i,dataread)) then
                        print * , 'error'
                        exit
                    endif
                enddo
2               continue
                if(io /= 0) then
                    print * , 'error reading file'
                endif
                close(funit)
            endif
        endif
        
        
    endif
    end subroutine test

    function TestReadData(i,datatested) result(r)
    logical :: r
    ! Variables
    integer(4),intent(in) :: i 
    type(testData),intent(in)::datatested
    ! Body of TestReadData
    r = .false.
    if(i == datatested%m_i) then
        if(datatested%m_r(1) == 2.0d0*i) then
            if(datatested%m_r(2) == 3.0d0*i) then
                if(datatested%m_r(3) == 4.0d0*i) then
                    if(datatested%m_r(4) == 5.0d0*i) then
                        r = .true.
                    endif
                endif
            endif
        endif
    endif
        
    end function TestReadData
    end module readstructure

 

jimdempseyatthecove
Honored Contributor III

>>open(newunit=funit, file='test%data',status='replace',access='direct', form='binary',recl = 4 + 4*8,iostat=io)

OPEN: RECL Specifier

The RECL specifier indicates the length of each record in a file connected for direct access, or the maximum length of a record in a file connected for sequential access.

The RECL specifier takes the following form:

RECL = rl

rl

Is a positive numeric expression indicating the length of records in the file. If necessary, the value is converted to integer data type before use.

If the file is connected for formatted data transfer, the value must be expressed in bytes (characters). Otherwise, the value is expressed in 4-byte units (longwords). If the file is connected for unformatted data transfer, the value can be expressed in bytes if compiler option assume byterecl is specified.
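
One way to avoid hard-coding the units is to let the compiler compute the record length with `inquire(iolength=...)`, which returns a value in whatever units RECL expects for unformatted files under the current compiler options. The following is only a sketch under assumptions: it uses the standard `form='unformatted'` rather than `form='binary'`, the file name `recl_demo.bin` is made up, and it reads into separate variables rather than a derived type.

```fortran
program recl_units
    use, intrinsic :: iso_fortran_env, only: int32, real64
    implicit none
    real(real64), parameter :: factors(4) = [2.0_real64, 3.0_real64, 4.0_real64, 5.0_real64]
    integer(int32) :: i, j
    real(real64) :: r(4)
    integer :: u, rl, ios

    ! Length of one record (an integer plus four reals) in RECL units.
    inquire(iolength=rl) j, r
    print *, 'RECL for this record, in compiler units:', rl

    open(newunit=u, file='recl_demo.bin', status='replace', access='direct', &
         form='unformatted', action='readwrite', recl=rl, iostat=ios)
    if (ios /= 0) error stop 'open failed'

    ! Same record contents as in the quoted test: i, 2i, 3i, 4i, 5i.
    do i = 1, 10
        write(u, rec=i) i, 2.0_real64*i, 3.0_real64*i, 4.0_real64*i, 5.0_real64*i
    end do

    ! Read every record back with the same record length and check it.
    do i = 1, 10
        read(u, rec=i) j, r
        if (j /= i .or. any(r /= i*factors)) error stop 'mismatch'
    end do

    close(u, status='delete')
    print *, 'all 10 records read back correctly'
end program recl_units
```

The same `rl` value should be used in every OPEN of the file, so the write side and the read side cannot disagree about units.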

Jim Dempsey
