Solved: Problems writing/reading unformatted files with Intel fortran and gfortran

Polk__Jay · ‎04-12-2020

Hi, I am trying to use a code on both Windows with an Intel compiler and on Mac OS with gfortran 6.5.0, and I can't get the gfortran version to read parts of an unformatted file. I wrote a test code containing the problematic section and verified that each compiler can write the file and read back its own output. However, if I comment out the write portion and use the gfortran version to read the file written on the Windows side of the machine (Parallels), some of the data are not read properly. I'm including the test code below and attaching the file. The "cel(1), f_ei,f_en,f_egyro,f_wall,f_ex,f_wallBC =" data are not read correctly (the data before that are read in properly, however). Is there some incompatibility in unformatted files?

PROGRAM Testo
IMPLICIT NONE
INTEGER, PARAMETER :: ncelMAX=6800
REAL*8 :: dt
LOGICAL :: MagON,BFmesh
INTEGER :: nslice_Ver, nslice_verzr, nslice_Edg, nslice_edgzr
INTEGER :: nFluid, nChrge, nCel, nBCs, nEdg, nVer , nCelzr, nBCszr, nEdgzr, &
       nVerzr, MinSliceCel, MaxSliceCel, MinSliceCelzr, MaxSliceCelzr
TYPE cell_obj
     INTEGER*4 verNo(4),edgNo(4)
     INTEGER*4 Bln   !BField Mesh
     REAL*8 f_ei,f_en,f_egyro,f_wall,f_ex, f_wallBC
END TYPE cell_obj
TYPE(cell_obj) cel(ncelMAX)

dt = 2.0E-8
MagON = .TRUE.
BFmesh = .TRUE.
nslice_Ver = 1
nslice_verzr = 2
nslice_Edg = 3
nslice_edgzr = 4
nFluid = 3
nChrge = 3
nCel = 3000
nBCs = 4000
nEdg = 5000
nVer = 6000
nCelzr = 7000
nBCszr = 8000
nEdgzr = 9000
nVerzr = 10000
MinSliceCel = 100
MaxSliceCel = 200
MinSliceCelzr = 300
MaxSliceCelzr = 400
cel(1)%verNo(1) = 1
cel(1)%verNo(2) = 2
cel(1)%verNo(3) = 3
cel(1)%verNo(4) = 4
cel(1)%edgNo(1) = 1
cel(1)%edgNo(2) = 2
cel(1)%edgNo(3) = 3
cel(1)%edgNo(4) = 4
cel(1)%Bln = 1000
cel(1)%f_ei = 2.0E6
cel(1)%f_en = 2.0E7
cel(1)%f_egyro = 2.0E8
cel(1)%f_wall = 3.0E6
cel(1)%f_ex = 3.0E7
cel(1)%f_wallBC = 3.0E8

OPEN(UNIT=10,FILE='restartData_TEST',FORM='UNFORMATTED')
REWIND(10)
WRITE(10) dt, MagON, nslice_Ver, nslice_Edg, nslice_verzr, nslice_edgzr, BFmesh
WRITE(10) nFluid, nChrge, nCel, nBCs, nEdg, nVer , nCelzr, nBCszr, nEdgzr, nVerzr, MinSliceCel, MaxSliceCel,MinSliceCelzr,MaxSliceCelzr
WRITE(10) cel
CLOSE(10)

OPEN(UNIT=10,FILE='restartData_TEST',FORM='UNFORMATTED')

REWIND(10)
READ(10) dt,MagON,nslice_Ver,nslice_Edg, nslice_verzr, nslice_edgzr, BFmesh
PRINT *, "First line = ", dt,MagON,nslice_Ver,nslice_Edg, nslice_verzr, nslice_edgzr, BFmesh
READ(10) nFluid, nChrge, nCel, nBCs, nEdg, nVer , nCelzr, nBCszr, nEdgzr, nVerzr, MinSliceCel, MaxSliceCel,MinSliceCelzr,MaxSliceCelzr
PRINT *, "Second line = ", nFluid, nChrge, nCel, nBCs, nEdg, nVer , nCelzr, nBCszr, nEdgzr, nVerzr, MinSliceCel, MaxSliceCel,MinSliceCelzr,MaxSliceCelzr
READ(10) cel
PRINT *, "cel(1), verNo(4),edgNo(4) = ",cel(1)%verNo(1),cel(1)%verNo(2),cel(1)%verNo(3),cel(1)%verNo(4),cel(1)%edgNo(1),cel(1)%edgNo(2),cel(1)%edgNo(3),cel(1)%edgNo(4)
PRINT *, "cel(1), Bln = ",cel(1)%Bln
PRINT *, "cel(1), f_ei,f_en,f_egyro,f_wall,f_ex,f_wallBC = ",cel(1)%f_ei,cel(1)%f_en,cel(1)%f_egyro,cel(1)%f_wall,cel(1)%f_ex,cel(1)%f_wallBC

PAUSE 'DONE'
END PROGRAM Testo

John_Campbell · ‎04-12-2020

Unformatted binary files use a record size header and trailer, which can be different between different compilers or versions.

You may want to review the gFortran option "-frecord-marker=length". Setting the length to 4 may help, although this still requires gFortran and ifort to use the same convention for the header and trailer formats.

Alternatively, unformatted direct access binary files with fixed length records do not have a header or trailer and are easier to share between Fortran programs compiled with different compilers. They offer limited flexibility if the records are not simple array dumps of the same size. In my experience this is a viable solution.

However, depending on the operating system that is being used, there may still be an "endian" byte order issue for the binary numbers that are stored. This can also be addressed via the gFortran option "-fconvert=conversion", or by manually reordering the bytes through an integer*1 array of 4 or 8 elements.

I am sure ifort provides similar options if required.
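One way to check the record-marker convention directly is to write a sequential unformatted record and then reopen the same file with stream access, which exposes the raw bytes. The following is a minimal sketch, assuming 4-byte record markers and native byte order; the program name and the file name marker_demo.bin are made up for illustration.

```fortran
PROGRAM show_record_marker
  USE, INTRINSIC :: iso_fortran_env, ONLY: int32, real64
  IMPLICIT NONE
  INTEGER(int32) :: head, tail     ! assumes 4-byte record markers
  REAL(real64)   :: x(3), y(3)
  INTEGER        :: u

  x = [1.0_real64, 2.0_real64, 3.0_real64]

  ! Write one sequential unformatted record (3 x 8 = 24 payload bytes).
  OPEN(NEWUNIT=u, FILE='marker_demo.bin', FORM='UNFORMATTED', &
       ACCESS='SEQUENTIAL', STATUS='REPLACE')
  WRITE(u) x
  CLOSE(u)

  ! Reopen the same file as a raw byte stream and read
  ! leading marker, payload, trailing marker in that order.
  OPEN(NEWUNIT=u, FILE='marker_demo.bin', FORM='UNFORMATTED', &
       ACCESS='STREAM', STATUS='OLD')
  READ(u) head, y, tail
  CLOSE(u, STATUS='DELETE')

  PRINT *, 'leading marker  = ', head
  PRINT *, 'trailing marker = ', tail
  PRINT *, 'payload         = ', y
END PROGRAM show_record_marker
```

If both markers report the payload length in bytes and the payload reads back unchanged, the file follows the 4-byte marker convention assumed here. Running the stream-reading half against a file written by the other compiler shows whether the two agree.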

Polk__Jay · ‎04-13-2020

Hi John,

Thanks for the quick response. I tried "-frecord-marker=4" and "-fconvert=little-endian" and neither fixed the problem. By trial and error I was able to get it to work if the variables in the structure under the TYPE declaration are grouped by INTEGER, REAL, and LOGICAL (which I added). The order of the groups doesn't seem to matter, but if I mix the types, as in the original code, it does not work. Could that be an extra constraint from gfortran?

mecej4 · ‎04-13-2020

You have some source lines that are over 132 characters long. Those lines happen to contain the I/O lists of unformatted file I/O statements, and a Fortran compiler may truncate such lines to 132 characters. Use compiler options that allow source lines longer than 132 characters to be compiled, or edit the source file and break up the long lines.

Unformatted files on Windows, Linux and OSX, produced by programs compiled with Intel Fortran and Gfortran, are compatible, with one or two exceptions. One is the RECL= clause in file OPEN statements, which your example does not use. Another is that the binary representations of .TRUE. and .FALSE. may differ from one Fortran compiler to another. I/O list values obtained from floating-point calculations are subject to precision loss.
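The long I/O lists can be split with free-form continuation lines (a trailing &), which keeps every source line well under 132 characters. The sketch below reuses the variable names from the second record of the test code; the program name and the scratch file name continuation_demo.bin are made up for illustration.

```fortran
PROGRAM continued_io_list
  IMPLICIT NONE
  INTEGER :: nFluid, nChrge, nCel, nBCs, nEdg, nVer, nCelzr, nBCszr, &
             nEdgzr, nVerzr, MinSliceCel, MaxSliceCel, MinSliceCelzr, &
             MaxSliceCelzr
  INTEGER :: u

  nFluid = 3; nChrge = 3; nCel = 3000; nBCs = 4000; nEdg = 5000
  nVer = 6000; nCelzr = 7000; nBCszr = 8000; nEdgzr = 9000
  nVerzr = 10000; MinSliceCel = 100; MaxSliceCel = 200
  MinSliceCelzr = 300; MaxSliceCelzr = 400

  OPEN(NEWUNIT=u, FILE='continuation_demo.bin', FORM='UNFORMATTED', &
       STATUS='REPLACE')

  ! The I/O list is continued with & so no line exceeds 132 characters.
  WRITE(u) nFluid, nChrge, nCel, nBCs, nEdg, nVer, nCelzr, nBCszr, &
           nEdgzr, nVerzr, MinSliceCel, MaxSliceCel, MinSliceCelzr, &
           MaxSliceCelzr

  ! Clear the last item so the read-back below is meaningful.
  MaxSliceCelzr = 0

  REWIND(u)
  READ(u)  nFluid, nChrge, nCel, nBCs, nEdg, nVer, nCelzr, nBCszr, &
           nEdgzr, nVerzr, MinSliceCel, MaxSliceCel, MinSliceCelzr, &
           MaxSliceCelzr
  CLOSE(u, STATUS='DELETE')

  PRINT *, 'last item read back = ', MaxSliceCelzr
END PROGRAM continued_io_list
```

With gfortran, the option -ffree-line-length-none lifts the line-length limit instead, but breaking the lines keeps the source portable across compilers.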

Steve_Lionel · ‎04-13-2020

As mecej4 says, ifort and gfortran use the same on-disk layout for unformatted files, even those with records longer than 1GB. You are using a quite old gfortran version, but it should still work.

I don't see that you attached the data file in question, nor did you provide details about what didn't "read properly" and how it was wrong.

Polk__Jay · ‎05-03-2020

I'm attaching two data files--one stored with the code on Windows with ifort (VS2010) and the other stored on the Mac with gfortran. The code on the Mac reads the Mac file properly, but when it reads the Windows file the reals f_ei,f_en,f_egyro,f_wall,f_ex, f_wallBC are not correct (all read as 0 or xxxe-315). I'm in the process of trying to convert the file storage to HDF5, but it's a lot of work. If there's an easier solution, that would be great!

Steve_Lionel · ‎05-03-2020

The layouts of these files are the same. One difference I can see relates to the LOGICAL values, as the default representations of true are different, but this should have no practical effect.

These files were not written by the code you show in the first post. Please show the code you used on both systems.

The second record is completely wrong in both.

Polk__Jay · ‎05-03-2020

The code is essentially the same--it looks like I changed the order of some of the items that are written and read. I only assign values to the first element of the cel array, so that may be why the second record looks wrong. I'm attaching both codes. Thanks!

John_Campbell · ‎05-04-2020

There is no guarantee that both compilers will use the same storage layout when writing cel.
You could try replacing "WRITE(10) cel" with:

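WRITE(10) (cel(i)%verNo, i = 1, ncelMAX)   ! implied DO with INTEGER i: cel%verNo is not valid when cel and verNo are both arrays
WRITE(10) (cel(i)%edgNo, i = 1, ncelMAX)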
WRITE(10) cel%Bln
WRITE(10) cel%f_ei
WRITE(10) cel%f_en
WRITE(10) cel%f_egyro
WRITE(10) cel%f_wall
WRITE(10) cel%f_ex
WRITE(10) cel%f_wallBC

A simpler change that may address possible storage order issues between gFortran and iFort could be:
TYPE cell_obj
SEQUENCE
REAL*8 f_ei,f_en,f_egyro,f_wall,f_ex, f_wallBC
INTEGER*4 verNo(4),edgNo(4)
INTEGER*4 Bln, BField ! Mesh
END TYPE cell_obj

This changes each type record to a multiple of 8 bytes, but I'd expect the first change to work.
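A round trip with component-wise records might look like the sketch below. It assumes a shortened copy of cell_obj (only two of the REAL*8 components) and a small hypothetical array size n; the program name and the scratch file name are also made up. Because the standard does not allow a reference such as cel%verNo in which both cel and verNo are arrays, the array components go through an implied DO, while scalar components such as cel%Bln can be written directly.

```fortran
PROGRAM componentwise_io
  IMPLICIT NONE
  INTEGER, PARAMETER :: n = 3       ! hypothetical small size for the demo
  TYPE cell_obj                     ! shortened copy of the type in the thread
     INTEGER*4 verNo(4), edgNo(4)
     INTEGER*4 Bln
     REAL*8 f_ei, f_en
  END TYPE cell_obj
  TYPE(cell_obj) :: cel(n), chk(n)
  INTEGER :: i, u
  LOGICAL :: same

  DO i = 1, n
     cel(i)%verNo = i               ! scalar broadcast to all 4 elements
     cel(i)%edgNo = 10*i
     cel(i)%Bln   = 1000 + i
     cel(i)%f_ei  = 2.0D6*i
     cel(i)%f_en  = 2.0D7*i
  END DO

  OPEN(NEWUNIT=u, FILE='cel_components.bin', FORM='UNFORMATTED', &
       STATUS='REPLACE')

  ! One record per component: no padding between components reaches the file.
  WRITE(u) (cel(i)%verNo, i = 1, n)
  WRITE(u) (cel(i)%edgNo, i = 1, n)
  WRITE(u) cel%Bln
  WRITE(u) cel%f_ei
  WRITE(u) cel%f_en

  ! Read back in exactly the same order.
  REWIND(u)
  READ(u) (chk(i)%verNo, i = 1, n)
  READ(u) (chk(i)%edgNo, i = 1, n)
  READ(u) chk%Bln
  READ(u) chk%f_ei
  READ(u) chk%f_en
  CLOSE(u, STATUS='DELETE')

  same = ALL(chk%Bln == cel%Bln) .AND. ALL(chk%f_ei == cel%f_ei) &
         .AND. ALL(chk%f_en == cel%f_en)
  DO i = 1, n
     same = same .AND. ALL(chk(i)%verNo == cel(i)%verNo) &
                 .AND. ALL(chk(i)%edgNo == cel(i)%edgNo)
  END DO
  PRINT *, 'round trip identical: ', same
END PROGRAM componentwise_io
```

The remaining REAL*8 components of the full type would follow the same pattern as f_ei and f_en. Each record then holds a plain array of one intrinsic type, so the file no longer depends on how a compiler lays out the derived type in memory.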

Steve_Lionel · ‎05-04-2020

You sure did rearrange the writes - no wonder I was confused.

As John suggests, the two compilers differ in how or whether they pad misaligned components in a derived type. In addition to reordering the components so that all are naturally aligned, you could probably get it to work by giving the type the BIND(C) attribute. This will make both compilers order and pad the same way their "companion C processor" would, and I am pretty sure gcc and MSVC do this the same way. But reordering would be my preference, as that eliminates a possible variable.
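To see whether a given compiler pads the type, one can compare its storage size with the 84 bytes of actual data (9 four-byte integers plus 6 eight-byte reals). The sketch below is only illustrative: the type names cell_mixed and cell_c are made up, and the sizes it prints depend on the compiler and its alignment options.

```fortran
PROGRAM type_layout
  USE, INTRINSIC :: iso_c_binding, ONLY: c_int32_t, c_double, c_sizeof
  IMPLICIT NONE

  ! Original component order: 36 bytes of integers precede the reals,
  ! so the first REAL*8 is not on an 8-byte boundary unless padding is added.
  TYPE cell_mixed
     INTEGER*4 verNo(4), edgNo(4)
     INTEGER*4 Bln
     REAL*8 f_ei, f_en, f_egyro, f_wall, f_ex, f_wallBC
  END TYPE cell_mixed

  ! Same components with BIND(C): laid out as the companion C compiler would.
  TYPE, BIND(C) :: cell_c
     INTEGER(c_int32_t) :: verNo(4), edgNo(4)
     INTEGER(c_int32_t) :: Bln
     REAL(c_double)     :: f_ei, f_en, f_egyro, f_wall, f_ex, f_wallBC
  END TYPE cell_c

  TYPE(cell_mixed) :: a
  TYPE(cell_c)     :: b

  ! STORAGE_SIZE returns bits, so divide by 8 to get bytes.
  PRINT *, 'bytes per element, original order : ', STORAGE_SIZE(a)/8
  PRINT *, 'bytes per element, BIND(C)        : ', c_sizeof(b)
  PRINT *, 'bytes of actual data              : ', 9*4 + 6*8
END PROGRAM type_layout
```

A reported size larger than 84 means the compiler inserted padding. If the two compilers report different sizes for the original order, a whole-array WRITE(10) cel from one cannot be read back element by element by the other.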

Polk__Jay · ‎05-04-2020

Thank you both for the quick responses! Both approaches (writing individual components or adding SEQUENCE) appear to work in my test code. I will add the SEQUENCE keyword to the structures in the big code and see if that solves the problem. If so, then you have spared me a lot of work rewriting the code to store in HDF5 format!

Steve_Lionel · ‎05-04-2020

SEQUENCE is not guaranteed to eliminate the problem, but if it works for you, fine. Just be aware that it isn't a perfect solution.

Polk__Jay · ‎05-04-2020

Yes, I think I have to go with reading each component of the derived type anyway. Even if SEQUENCE worked, I couldn't read older restart files.