Cross Platform Reading of Unformatted Files

mohanmuthu
New Contributor I

Experts!

I have run into a strange problem reading an unformatted file across platforms: a file created on Linux cannot be read on Windows, and vice versa. The file is read properly on the platform where it was created. I simplified the problem to the following code: one program to write the binary file and another to read it. Strangely, the string is read, but not the REAL*4.

I found a discussion, https://software.intel.com/en-us/forums/intel-fortran-compiler/topic/807408 , that sounds somewhat related, but I could not work out a solution from it.

Any suggestion to fix it will be of great help!

PROGRAM writebin
    IMPLICIT NONE
    CHARACTER*16 :: str1
    REAL*4       :: r1
    str1 = 'Hello!'
    r1 = 10.1
    OPEN(11,FILE='bin.bin',FORM='UNFORMATTED',ACTION='WRITE')
    WRITE(11) str1
    WRITE(11) r1
    CLOSE(11)
END PROGRAM

PROGRAM readbin
    IMPLICIT NONE
    CHARACTER*16 :: str1
    REAL*4       :: r1
    OPEN(11,FILE='bin.bin',FORM='UNFORMATTED',ACTION='READ')
    READ(11) str1
    WRITE(*,*) str1
    READ(11) r1
    WRITE(*,*) r1
    CLOSE(11)
END PROGRAM

PS: Used Intel(R) Compiler 17.0 Update 4 (package 210) on Windows 10 x64 and Version 19.0.1.144 Build 20181018 on Linux 3.10.0 x64

mecej4
Honored Contributor III

Please describe how you move or copy the file BIN.BIN from one platform to the other. Name the compiler used on each platform and also the compiler options. Unformatted files are not necessarily compatible when you use different compilers on the same platform. They are not necessarily compatible between different versions of a single vendor's compiler on a given platform. There may also be differences in "endianness", and in the internal representations of real numbers.

What do you mean by "Strangely, the string is read, but not the REAL*4"? How did you determine that the second value was not read?

mohanmuthu
New Contributor I

I used Intel(R) Compiler 17.0 Update 4 (package 210) on Windows 10 x64 and Intel(R) Compiler Version 19.0.1.144 Build 20181018 on Linux 3.10.0 x64. I also tested 17.0.4.196 Build 20170411 on Linux, with the same result. No options were specified explicitly when compiling: ifort writebin.f90 -o writebin.exe (same on Linux, without .exe).

I am transferring the files over SFTP, using WinSCP.

mecej4
Honored Contributor III

Please attach the BIN.BIN file that you generated on the Linux system to your reply. Make sure that you use binary mode and not text (or ascii) mode when using SFTP.

Steve_Lionel
Honored Contributor III

Intel Fortran uses an identical on-disk layout for unformatted files across its platforms. I side with mecej4 who suspects data corruption during the transfer.

mecej4
Honored Contributor III

Here is something that you can do to check if the file bin.bin gets modified in the course of being transferred from one system to another. Use a binary file editor such as HXD or dump the contents of the file. On the Linux system, compile, run and dump the file contents.

~/LANG$ ifort -O0 wrbin.f90
~/LANG$ ./a.out
~/LANG$ od -t x1 bin.bin
0000000 10 00 00 00 48 65 6c 6c 6f 21 20 20 20 20 20 20
0000020 20 20 20 20 10 00 00 00 04 00 00 00 9a 99 21 41
0000040 04 00 00 00

Now, compile and run on Windows.

I happen to have Cygwin on my Windows 10 PC, so I can do the dump on the PC, as well, using the same od command. I find that the files are identical. No file transfer was performed.
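
If neither od nor a hex editor is available, the same check can be sketched in Fortran itself. This is only a minimal sketch, assuming a compiler that supports Fortran 2003 stream access; the program name dumpbin and unit number 12 are arbitrary choices for illustration and do not come from the posts above.

```fortran
PROGRAM dumpbin
    USE, INTRINSIC :: ISO_FORTRAN_ENV, ONLY: INT8
    IMPLICIT NONE
    INTEGER :: n, ios
    INTEGER(INT8), ALLOCATABLE :: bytes(:)

    ! Ask for the file size in bytes; SIZE= returns -1 if it cannot be determined.
    INQUIRE(FILE='bin.bin', SIZE=n)
    IF (n <= 0) STOP 'bin.bin is missing or empty'
    ALLOCATE(bytes(n))

    ! Stream access exposes the raw bytes, record-length markers included.
    OPEN(12, FILE='bin.bin', ACCESS='STREAM', FORM='UNFORMATTED', &
         ACTION='READ', STATUS='OLD', IOSTAT=ios)
    IF (ios /= 0) STOP 'cannot open bin.bin'
    READ(12) bytes
    CLOSE(12)

    ! Print 16 bytes per line in hexadecimal, like the od listing.
    WRITE(*,'(16(Z2.2,1X))') bytes
END PROGRAM dumpbin
```

The first four bytes printed are the leading record-length marker of the first record, so they are the ones to compare between the two platforms.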

mohanmuthu
New Contributor I

I can't attach any file due to the company's security policy. However, I could get the dump of file contents. This time I used version 17 on both Windows and Linux.

bin.bin created in Windows [stays the same even after SFTP transfer to Linux]:

0000000 10 00 00 00 48 65 6c 6c 6f 21 20 20 20 20 20 20
0000020 20 20 20 20 10 00 00 00 04 00 00 00 9a 99 21 41
0000040 04 00 00 00
0000044

bin.bin created in Linux:

0000000 00 00 00 10 48 65 6c 6c 6f 21 20 20 20 20 20 20
0000020 20 20 20 20 00 00 00 10 00 00 00 04 41 21 99 9a
0000040 00 00 00 04
0000044

Is there an option in the OPEN statement to make the two formats the same?

mecej4
Honored Contributor III

On the Linux host, something is telling the runtime to write unformatted files as big-endian: a compiler option, a compiler default-options file such as "ifort.cfg", or an environment variable (F_UFMTENDIAN). See the 00 00 00 10 at the beginning of the Linux file? Those four bytes are the record-length marker, and the Windows EXE, which expects little-endian data, reads them as Z'10000000' instead of Z'00000010'. No wonder the file could not be read on your Windows PC: there are not that many bytes in the entire file.

Similarly, the length of the second record, which is just 4 bytes for a single-precision real, is read as Z'04000000' instead of Z'00000004'.
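
To make the arithmetic concrete, here is a small sketch that interprets the four marker bytes both ways. It assumes it runs on a little-endian host (such as x86), where TRANSFER treats the first byte as the low-order byte; the program and variable names are made up for illustration.

```fortran
PROGRAM marker
    USE, INTRINSIC :: ISO_FORTRAN_ENV, ONLY: INT8, INT32
    IMPLICIT NONE
    INTEGER(INT8)  :: b(4)
    INTEGER(INT32) :: as_read, swapped

    ! First four bytes of the Linux-generated file, in file order.
    b = [0_INT8, 0_INT8, 0_INT8, 16_INT8]

    ! Assumption: little-endian host, so b(1) becomes the low-order byte.
    ! This is how a little-endian reader sees the marker.
    as_read = TRANSFER(b, as_read)

    ! Reversing the byte order gives the length the writer intended.
    swapped = TRANSFER(b(4:1:-1), swapped)

    WRITE(*,'(A,Z8.8,A,I0)') 'as read: Z''', as_read, ''' = ', as_read
    WRITE(*,'(A,Z8.8,A,I0)') 'swapped: Z''', swapped, ''' = ', swapped
END PROGRAM marker
```

On such a host the first line reports 268435456 bytes (Z'10000000') and the second reports 16 (Z'00000010'), the actual length of the CHARACTER*16 record.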

Check your Ifort installation on the Linux machine. Do you need the big-endian data format at all? If not, make whatever changes are necessary to remove the big-endian option once and for all.

If your company policy does not allow changes to be made to the Linux Ifort configuration, you will need to tell the PC program that the file is to be treated as Big-Endian. You can either do this for all unformatted files processed by your program, as described at https://software.intel.com/en-us/fortran-compiler-developer-guide-and-reference-convert , or you can add the appropriate clause in the OPEN statement for the specific file concerned in your program. 

In Windows 10, I typed into a hex editor the bytes that you gave as the content of the bin.bin file on Linux, compiled rdbin.f90 with /convert:big_endian, and the resulting EXE read the Linux file correctly.
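
For the per-file route, a minimal sketch of the reading program with the clause added to OPEN might look like the following. CONVERT= is a compiler extension rather than standard Fortran, so this assumes a compiler that accepts it, such as Intel Fortran; the program name readbin_be is hypothetical.

```fortran
PROGRAM readbin_be
    IMPLICIT NONE
    CHARACTER*16 :: str1
    REAL*4       :: r1
    ! CONVERT='BIG_ENDIAN' asks the runtime to byte-swap the record
    ! markers and numeric data of this one file while reading.
    OPEN(11,FILE='bin.bin',FORM='UNFORMATTED',ACTION='READ', &
         CONVERT='BIG_ENDIAN')
    READ(11) str1
    WRITE(*,*) str1
    READ(11) r1
    WRITE(*,*) r1
    CLOSE(11)
END PROGRAM readbin_be
```

This affects only unit 11, whereas compiling with /convert:big_endian applies to every unformatted file the program opens.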

mohanmuthu
New Contributor I

Thanks a lot mecej4 for helping me understand the cause.

I checked with the same convert option on both platforms and it works. Though it's easy to switch to big_endian from now on, is there a way to make the Linux build use the equivalent of native on Windows? That would help maintain backward compatibility with all the files already generated on Windows.

mecej4
Honored Contributor III

As you can see in the documentation for /convert (link given in #8), you can specify the option -convert little_endian or -convert native on the Linux machine, but that is the default.

I urge you to look at the source files and build commands on the Linux machine, track down why big-endian unformatted files are being generated, and fix the problem at the source. By default, unformatted files generated by Intel Fortran on Linux are little-endian, since Intel Fortran does not run on big-endian Linux (such as Linux on OpenPower or IBM/Motorola hardware). Therefore, someone must have taken deliberate action to make the files big-endian. Find out what was done and undo it.
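
One of the places to look is the environment variable mentioned earlier. As a rough sketch, the following program reports whether F_UFMTENDIAN is visible to a Fortran executable; it uses only the standard GET_ENVIRONMENT_VARIABLE intrinsic, and the program name checkenv is made up. It says nothing about compiler options or "ifort.cfg", which have to be checked separately.

```fortran
PROGRAM checkenv
    IMPLICIT NONE
    CHARACTER(LEN=256) :: val
    INTEGER :: stat

    ! STATUS is 0 if the variable exists, 1 if it does not,
    ! and -1 if the value was too long for val and got truncated.
    CALL GET_ENVIRONMENT_VARIABLE('F_UFMTENDIAN', val, STATUS=stat)
    SELECT CASE (stat)
    CASE (0)
        WRITE(*,*) 'F_UFMTENDIAN is set to: ', TRIM(val)
    CASE (-1)
        WRITE(*,*) 'F_UFMTENDIAN is set (value truncated): ', TRIM(val)
    CASE (1)
        WRITE(*,*) 'F_UFMTENDIAN is not set'
    CASE DEFAULT
        WRITE(*,*) 'could not query the environment, status = ', stat
    END SELECT
END PROGRAM checkenv
```

Run it from the same shell or job script that runs the writing program, since that is the environment the runtime library sees.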
