Unformatted DTIO failure with >2GiB records

jbenda · ‎07-25-2025

I encountered issues when using derived-type I/O (DTIO) with unformatted records larger than ~2 GiB. The program attached below reproduces the problem. It uses DTIO to write a 4 GiB record consisting of an array of 1024³ default integers, and then attempts to read the same record using standard unformatted I/O. The read fails ("input statement requires too much data"), presumably because the leading binary bookmark of the record is wrong (overflowed) and the data are not split into the supported ~2 GiB chunks. I observe this behavior with IFX 2025.2. When the code is compiled with GNU Fortran 15.1, it works well: the record is properly split into smaller internal chunks that the subsequent READ statement then reads seamlessly.

module procs

    implicit none

    type mytype
        integer :: n
    contains
        procedure :: writer
        generic :: write(unformatted) => writer
    end type mytype

contains

    subroutine writer(this, lu, io_stat, io_message)

        class(mytype), intent(in)    :: this
        integer,       intent(in)    :: lu
        integer,       intent(out)   :: io_stat
        character(*),  intent(inout) :: io_message

        integer, allocatable :: huge_data(:)
        integer :: i

        allocate (huge_data(this%n))

        do i = 1, this%n
            huge_data(i) = i
        end do

        write (lu, iostat=io_stat, iomsg=io_message) huge_data

    end subroutine writer

    
    subroutine program_body

        integer :: fd, n, i
        integer, allocatable :: huge_data(:)

        type(mytype) :: myobj

        n = 1024**3
        myobj%n = n
        allocate (huge_data(n))

        open (newunit = fd, file = 'test-dtio.bin', action = 'write', form = 'unformatted')
        write (fd) myobj
        close (fd)

        open (newunit = fd, file = 'test-dtio.bin', action = 'read', form = 'unformatted')
        read (fd) huge_data  !<-- FAILS with ifort/ifx
        close (fd)

        do i = 1, n
            if (huge_data(i) /= i) then
                print '(2(a,i0))', 'Mismatch: huge_data(', i, ') = ', huge_data(i)
                error stop
            end if
        end do

    end subroutine program_body

end module procs


program test_dtio

    use procs, only: program_body

    call program_body

end program test_dtio
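
For reference, here is a minimal Python sketch of the subrecord layout I would expect for this record. It is only an illustration of the splitting convention, not the code of either runtime library. The helper `split_record` and the constant `MAX_SUBRECORD` are hypothetical names; the cap of 2147483639 bytes is an assumption taken from the file dump shown later in this thread, and the record size assumes 4-byte default integers.

```python
# Assumed maximum payload of one subrecord, in bytes (value seen in the
# file dump later in this thread; not taken from any compiler source).
MAX_SUBRECORD = 2147483639


def split_record(nbytes, max_chunk=MAX_SUBRECORD):
    """Return the (leading, trailing) marker pair of every subrecord."""
    markers = []
    remaining = nbytes
    first = True
    while True:
        chunk = min(remaining, max_chunk)
        remaining -= chunk
        # Negative leading marker: the record continues in another subrecord.
        leading = -chunk if remaining > 0 else chunk
        # Negative trailing marker: another subrecord precedes this one.
        trailing = chunk if first else -chunk
        markers.append((leading, trailing))
        first = False
        if remaining == 0:
            break
    return markers


record_bytes = 1024**3 * 4  # 1024^3 default integers, assuming 4 bytes each
for leading, trailing in split_record(record_bytes):
    print(f'Subrecord begin {leading} end {trailing}')
```

A record that fits into a single subrecord gets two equal positive markers, e.g. `split_record(100)` gives `[(100, 100)]`.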

Ron_Green · ‎07-25-2025

Yes, it looks like a 32-bit pointer is being used for the buffer in the DTIO code path. I'll open a bug report.

garraleta_fortran · ‎07-25-2025

1. The subroutine WRITE is not called.
2. The variable FD is undefined (DEFAULT=0).
3. The file TEST-DTIO.BIN contains only one number, 1024**3.
This is simpler (no MODULES, no CONTAINS, ...):

INTEGER :: FD, N, I
INTEGER, ALLOCATABLE :: HUGE_DATA(:)

N = 1024**3
ALLOCATE (HUGE_DATA(N))
FD=1
DO I = 1,N
HUGE_DATA(I) = I
END DO

OPEN (NEWUNIT = FD, FILE = 'TEST-DTIO.BIN', ACTION = 'WRITE', FORM = 'UNFORMATTED')
WRITE (FD) HUGE_DATA
CLOSE (FD)

OPEN (NEWUNIT = FD, FILE = 'TEST-DTIO.BIN', ACTION = 'READ', FORM = 'UNFORMATTED')
READ (FD) HUGE_DATA !<-- FAILS WITH IFORT/IFX
CLOSE (FD)

DO I = 1, N
IF (HUGE_DATA(I) /= I) THEN
PRINT '(2(A,I0))', 'MISMATCH: HUGE_DATA(', I, ') = ', HUGE_DATA(I)
ERROR STOP
END IF
END DO
WRITE(*,FMT='(A)')'OK'

END

jbenda · ‎07-25-2025

Thank you for your follow-up. However, I am not sure this reproduces the problem that I reported. The record written by the standard unformatted output in your code works well. As far as I understand, it is the unformatted user-defined derived-type I/O that produces wrong records, not plain unformatted writing of large data. When I run your code with IFX 2025.2.0, I get "OK". With this attached Python script

import struct
import sys

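# Walk the subrecords of the first unformatted record in the file and
# print the 4-byte length bookmarks that bracket each of them.
with open(sys.argv[1], 'rb') as f:
    begin = -1

    # A negative leading bookmark means the record continues in another subrecord.
    while begin < 0:

        begin, = struct.unpack('i', f.read(4))
        f.seek(abs(begin), 1)  # skip the payload instead of loading it into memory
        end, = struct.unpack('i', f.read(4))

        print(f'Subrecord begin {begin} end {end}')

        if abs(begin) != abs(end):
            raise Exception(f'Bookmark mismatch: {begin} vs {end}')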

one can also verify that the binary structure of the file is correct:

$ python3 read.py TEST-DTIO.BIN 
Subrecord begin -2147483639 end 2147483639
Subrecord begin -2147483639 end -2147483639
Subrecord begin 18 end -18
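
As a quick sanity check on the dump above, the absolute values of the leading bookmarks should add up to the full payload. The snippet below is just that arithmetic, assuming 4-byte default integers; the variable names are mine.

```python
# (begin, end) bookmark pairs copied from the dump above
markers = [(-2147483639, 2147483639),
           (-2147483639, -2147483639),
           (18, -18)]

payload = sum(abs(begin) for begin, _ in markers)
expected = 1024**3 * 4  # number of integers times an assumed 4 bytes each

print(payload, expected, payload == expected)
```

The three subrecords together hold exactly 4 GiB, so the plain unformatted write produced a complete record.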

garraleta_fortran · ‎07-26-2025

Using your original code with 2 corrections works fine.

Ron_Green · ‎07-28-2025

@garraleta_fortran thank you for your modified code. But what @jbenda is reporting uses a completely different code path in the Fortran runtime library. DTIO uses its own paths for I/O. It is this DTIO code path that has a bug; probably we are using 32-bit integer offsets instead of the needed 64-bit offsets there. We'll get it fixed.
The non-DTIO path is well exercised and behaves as expected.
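
To illustrate why a 32-bit quantity cannot describe this record, here is a small Python sketch of the arithmetic only. It does not reproduce the runtime library's code, which is not shown in this thread; `to_int32` is a hypothetical helper that mimics two's-complement truncation to a signed 32-bit integer, and the record size assumes 4-byte default integers.

```python
def to_int32(n):
    """Truncate n to a signed 32-bit integer (two's-complement wraparound)."""
    return (n + 2**31) % 2**32 - 2**31


record_bytes = 1024**3 * 4   # the 4 GiB record from the reproducer

print(to_int32(record_bytes))   # length as seen through a 32-bit variable
print(to_int32(2**31 - 1))      # largest value that survives unchanged
print(to_int32(2**31))          # one byte more already wraps negative
```

Under this assumption the 4 GiB length wraps to 0, and any length from 2 GiB upward is misrepresented, which would be consistent with the wrong leading bookmark reported above.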

Ron_Green · ‎07-29-2025

The bug ID is CMPLRLIBS-35389.