I have a most perplexing problem with some code that we just released. We have a file that was written using the following code:
K = 9
OPEN ( UNIT=98, FILE='my.data', FORM='UNFORMATTED', STATUS='NEW')
WRITE ( 98 ) ( ( AN_ARRAY(I,J), I=0,K ), J=-1,15 )
CLOSE (UNIT=98, STATUS='KEEP')
...and read using the following code:
K = 9
OPEN ( UNIT=119, FILE='my.data', FORM='UNFORMATTED', STATUS='OLD', READONLY )
READ ( 119 ) ( ( AN_ARRAY(I,J), I=0,K ), J=-1,15 )
Relatively simple, and it has worked for us for years. However, one of our customers reported that they get a read error on their system when the read operation executes:
forrtl: severe (22): input record too long, unit 119, file /some_path/my.data
The confusing part is that the same program and the same binary file (residing on an NFS mount) run fine on another system in the same lab, as well as on our machines here at the development lab.
Some configuration details:
Machine that fails:
- RHEL 3, kernel 2.6.14.5, libc 2.3.2, gcc 3.2.3 20030502, Intel P4 (family 15, model 2, stepping 9) 2.80 GHz, ifort 8.1.029 20050702
Machine that works:
- RHEL 4, kernel 2.6.9-42.EL, libc 2.3.4, gcc 3.4.6 20060404, Intel P4 (family 15, model 2, stepping 9) 3.00 GHz
It should be noted that this same code has previously worked on RHEL 3 -- I'm still trying to track down what the difference is. The code is compiled on the machine that fails -- the machine that works does not have the compiler installed.
Compile flags: -132 -nbs -align dcommons -static-libcxa -nus -zero -save -xN -axN -fp_port -c -O0 -prec_div -no_cpprt -fpstkchk -ccdefault fortran -fpe0 -convert native
Link flags: -static-libcxa -Wl,-d -Wl,--sort-common
We also tried compiling with the "-convert native" flag removed, with no effect.
Thanks in advance,
Eric
Two things to try. First, look at the environment variables to make sure that the user has not set any of those that change Fortran unformatted data conversion. Second, ask the user to do an "od -t x4" of the file and compare it to what you see with the same file.
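Both checks can be scripted so that they run identically on the failing and the working machine. This is a minimal sketch, assuming a POSIX shell with `od` available. The variable names `F_UFMTENDIAN` and `FORT_CONVERT*` are the Intel Fortran conversion settings as I recall them, so verify them against the documentation for the installed compiler version. The helper names `show_convert_env` and `dump_head` are made up for illustration.

```shell
#!/bin/sh
# List environment variables that may alter unformatted data conversion.
# Assumption: F_UFMTENDIAN and FORT_CONVERT* are the relevant names for this ifort version.
show_convert_env() {
    env | grep -E '^(F_UFMTENDIAN|FORT_CONVERT)' || echo "no conversion variables set"
}

# Dump the first 32 bytes of a file as 4-byte hex words.
# For a sequential unformatted file the first word is normally the record-length marker.
dump_head() {
    od -A d -t x4 -N 32 "$1"
}

show_convert_env
# Only dump the data file if it is present in the current directory.
if [ -f my.data ]; then
    dump_head my.data
fi
```

Running the same script on both machines and diffing the output should show quickly whether the environment or the first words of the file differ.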
I was wondering if the file was created and read on the same computer, or if the file had been moved between systems?
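If the file had crossed between machines with different byte order, the record-length marker at the start of the file would show it. The sketch below prints that marker under both byte orders next to the payload size implied by the file size. It assumes a file holding a single record with a 4-byte length marker before and after it, which is the usual sequential unformatted layout but is an assumption here. The function name `check_marker` is hypothetical.

```shell
#!/bin/sh
# Print the leading record-length marker of a single-record unformatted file
# under both byte orders, next to the payload size implied by the file size.
# Assumption: one record, with a 4-byte length marker before and after it.
check_marker() {
    file=$1
    size=$(wc -c < "$file" | tr -d ' ')
    if [ "$size" -lt 8 ]; then
        echo "file too short to hold two record markers" >&2
        return 1
    fi
    payload=$((size - 8))
    # Read the first four bytes as unsigned decimals, independent of host byte order.
    set -- $(od -A n -t u1 -N 4 "$file")
    le=$(( $1 + ($2 << 8) + ($3 << 16) + ($4 << 24) ))
    be=$(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
    echo "payload=$payload little_endian=$le big_endian=$be"
}
```

For the file in this thread, `check_marker my.data` should report a little-endian marker equal to the payload on an x86 system. The I/O list covers 10 x 17 = 170 elements, so the payload should be 170 times the element size (680 bytes if AN_ARRAY holds 4-byte values, which the thread does not state).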
And while it could potentially be an NFS problem, one of the first things I did was an "md5sum my.data" and compared the hashes (and found them to be identical) on both systems. Even when the file is copied to a local path, it still doesn't like it for whatever reason. I'm ready to resign myself to the fact that it's something system-related (that I unfortunately can't do anything about) rather than code or compiler-related, but I figured I'd check in here first.
Another note, as I'm seeing another reply was posted -- The file is created on our development machines here, and then distributed as a binary data file to the client. The md5sum of the file here and both machines there match.
Thanks again,
Eric
I tested on a 'close' EL3, kernel 2.4.21-37 with same gcc as user - I believe this is RHEL3U6.
I didn't have the exact 8.1, so I used an older one and a newer one: 8.1.022 and 8.1.036. I can't reproduce the error. The code fragment runs just fine, as expected, with your options, save for -xN -axN, which my older compiler didn't like.
So like you, I am suspicious of the RHEL3. I don't have any older RHEL3 systems of that era to test against.
ron
One parting shot - I was stewing over this issue last night. We are sure the file is the same and uncorrupted. The error the user is seeing indicates that the read is attempting to fetch too much data. I think you mentioned that you release source code and allow the users to compile on their target platform. Well if it isn't the data file, it must be the source code OR the system. Of the 2, I suspect the source code. Ask the user to md5sum his sources.
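Comparing sources can be done the same way the data file was compared. This is a rough sketch, assuming GNU `md5sum` and `find` on both machines, and assuming the sources use the extensions `.f`, `.for`, `.f90`, or `.inc` (the thread does not say which). The function names and the paths in the usage note are placeholders.

```shell
#!/bin/sh
# Write a checksum manifest for the Fortran sources under directory $1.
# Assumption: sources use the .f, .for, .f90, or .inc extensions.
make_manifest() {
    ( cd "$1" && find . -type f \
        \( -name '*.f' -o -name '*.for' -o -name '*.f90' -o -name '*.inc' \) \
        -exec md5sum {} + | LC_ALL=C sort -k 2 )
}

# Verify the source tree under $1 against manifest $2.
# $2 must be an absolute path, because the check runs from inside $1.
# Returns non-zero if any listed file differs or is missing.
check_manifest() {
    ( cd "$1" && md5sum -c "$2" )
}
```

On the development machine, `make_manifest /path/to/src > /tmp/src.md5` produces the manifest. On the user's machine, `check_manifest /path/to/src /tmp/src.md5` lists each file as OK or FAILED.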
Speaking as someone who works in support: time and time again we see customers swearing that the code ran on X but doesn't run on Y, and that it's "the same code". After much probing we often find that the customer changed something small in the code, claiming "it should not affect the results". So I'd be politely suspicious of your user. You might put a few print statements before the read and have him run that code.
good hunting
ron
Apparently another case of the left hand not knowing what the right hand is doing.
Thanks again.
