Intel® Fortran Compiler

asynchronous bug #1: unit numbers

Izaak_Beekman
New Contributor II
I have finally written some reproducer code to illustrate a bug I am seeing. Please find the code below:

[cpp]PROGRAM unit_number
  IMPLICIT NONE
  INTEGER, PARAMETER :: unit_num = 124
  OPEN(unit_num, file='kplane', asynchronous='yes')
  WRITE(*,*) 'File is open.'
  CLOSE(unit_num)
END PROGRAM unit_number
[/cpp]

On my system, for unit numbers 124 and greater, I cannot open files for asynchronous reading or writing. The file can be formatted or unformatted, or opened with various other specifiers (position=, status=, etc.), with the same result: the code hangs at the OPEN statement. Below is a stack trace. Note that I have to press C-c twice for some reason (this might be because asynchronous I/O can spawn an additional thread).

[shell]08:40 PM (1) ~ $ fg
./a.out
^Cforrtl: error (69): process interrupted (SIGINT)
Image              PC        Routine            Line        Source             
a.out              0806A4A4  Unknown               Unknown  Unknown
a.out              08050A0F  Unknown               Unknown  Unknown
a.out              08050D70  Unknown               Unknown  Unknown
a.out              080525BF  Unknown               Unknown  Unknown
a.out              08049D43  MAIN__                      4  unit_number.f90
a.out              08049CB1  Unknown               Unknown  Unknown
libc.so.6          4007B775  Unknown               Unknown  Unknown
a.out              08049BC1  Unknown               Unknown  Unknown
^Cforrtl: error (69): process interrupted (SIGINT)
Image              PC        Routine            Line        Source             
a.out              08094A9D  Unknown               Unknown  Unknown
a.out              08094015  Unknown               Unknown  Unknown
a.out              08067088  Unknown               Unknown  Unknown
a.out              0804BF63  Unknown               Unknown  Unknown
a.out              0804DD79  Unknown               Unknown  Unknown

Stack trace terminated abnormally.
[/shell]


I have also attached a core dump.

My machine is a Lenovo ThinkPad T60 2007-63u. I am currently running Ubuntu Jaunty (9.04), but had the same issue with Intrepid (8.10). My current version of ifort is:

Intel Fortran Compiler Professional for applications running on IA-32, Version 11.0 Build 20090131 Package ID: l_cprof_p_11.0.081
Copyright (C) 1985-2009 Intel Corporation. All rights reserved.
FOR NON-COMMERCIAL USE ONLY

I have not tested this with the latest 083 minor version release. Note this is my personal laptop, not a machine for conducting any sort of commercial activity.

[shell]08:26 PM (0) ~ $ uname -a
Linux ***********-laptop 2.6.28-11-generic #42-Ubuntu SMP Fri Apr 17 01:57:59 UTC 2009 i686 GNU/Linux
[/shell]

I have also tested this on a second machine running:

Intel Fortran Compiler for applications running on IA-32, Version 10.1 Build 20080801 Package ID:
Copyright (C) 1985-2008 Intel Corporation. All rights reserved.

This machine is probably running RHEL or some derivative:

[shell][********@***** ~]$ uname -a
Linux ******* 2.6.18-128.1.6.el5 #1 SMP Wed Apr 1 11:25:32 EDT 2009 x86_64 x86_64 x86_64 GNU/Linux
[/shell]
Steve Lionel mentioned a unit number bug affecting 10.1 but I think this is different:
"There was an issue related to the implementation of asynchronous I/O that caused "collision" between units with a certain spacing between them, but that has been fixed for a while now.
....
Version 10.1 was affected. It was fixed in an update to 10.1 in July 2008. The issue affected two units whose numbers were separated by 521."

I may try to get one of the guys at work to file a bug report through premier.intel.com (or whatever the link is), but I cannot guarantee this will happen anytime soon. If one of the Intel people could look at this and, if they agree it is a bug, file the report themselves, that might be best.

I also believe there is an issue with implied DO loops in asynchronous READ (and possibly WRITE) statements, and I will be working on reproducer code for that in the near future.
6 Replies
TimP
Honored Contributor III
Look at what? The attached empty file?
Izaak_Beekman
New Contributor II
Quoting - tim18
Look at what? The attached empty file?

Yes, thank you Tim, there seems to have been an issue uploading the core dump. I have re-uploaded and reattached it, but I'm not sure whether it will work now. Either way, the very simple program at the top of my original post hangs. Essentially you only need one line of executable code to reproduce this issue, and I am seeing the problem on multiple machines. If you would prefer to download a text file of the source rather than copy it from the first code block of the original post, please let me know and I would be happy to attach it. Here is the core dump; I cannot guarantee that it uploaded correctly this time: core.8005

EDIT: It seems that perhaps you cannot upload binary files here. The file size upon download is still 0 bytes. If anyone wants this core dump, let me know and I can email it to you, but hopefully it is superfluous: I think there is plenty of information to go on in the original post without the core dump.
Ron_Green
Moderator
I have reproduced this bug on both Ubuntu and a variety of RedHat systems. I will start a bug report.

Now, if you want to track this bug, you should probably open a Premier issue. Wait to do this until I post the bug tracking number, then open the Premier issue and post a note along with the reproducer. In the note add something like "Please associate this bug with User Forum issue 65272 and CQ issue xxxxx", where I'll supply xxxxx in a few minutes.

Thanks for sending this in. What is interesting is that unit numbers 124 and some above show this behavior. I am trying to isolate the range of bad unit numbers or find a pattern.

ron
Ron_Green
Moderator
The bug report ID is DPD200121180.

For those of you who like to look for patterns, this one is obvious. Here are the unit numbers that will cause a hang:

124,125,126,127,128
133,134

252,253,254,255,256
261,262

380,381,382,383,384
389,390

508,509,510,511,512
517,518

636,637,638,639,640
645,646

etc - see the pattern?
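One way to state the pattern in the list above, as a minimal Python sketch: the groups repeat every 128 units starting at 124, and within each group the offsets 0 to 4, 9 and 10 hang. The names `PERIOD`, `OFFSETS` and `matches_pattern` are made up for this illustration, and the rule is only inferred from the numbers listed here; whether it holds above 646 is an assumption, not something confirmed in this thread.

```python
# Unit numbers reported above as hanging on OPEN with asynchronous='yes'.
REPORTED = [
    124, 125, 126, 127, 128, 133, 134,
    252, 253, 254, 255, 256, 261, 262,
    380, 381, 382, 383, 384, 389, 390,
    508, 509, 510, 511, 512, 517, 518,
    636, 637, 638, 639, 640, 645, 646,
]

PERIOD = 128                      # spacing between successive groups
OFFSETS = {0, 1, 2, 3, 4, 9, 10}  # offsets from each group start (124, 252, ...)


def matches_pattern(unit: int) -> bool:
    """True if `unit` fits the rule inferred from the reported list."""
    return unit >= 124 and (unit - 124) % PERIOD in OFFSETS


# The inferred rule reproduces exactly the reported units up to 646.
predicted = [u for u in range(647) if matches_pattern(u)]
assert predicted == REPORTED
```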

I used a csh script to generate test cases automatically, incrementing the unit number each time, and stopped once the pattern became obvious.
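The csh script itself is not shown in this thread. As a rough illustration of the same idea, here is a Python sketch that writes one copy of the original reproducer per unit number. The function names, file naming scheme and output directory are hypothetical, not taken from the actual script.

```python
from pathlib import Path

# The reproducer from the first post, with the unit number left as a placeholder.
TEMPLATE = """PROGRAM unit_number
  IMPLICIT NONE
  INTEGER, PARAMETER :: unit_num = {unit}
  OPEN(unit_num, file='kplane', asynchronous='yes')
  WRITE(*,*) 'File is open.'
  CLOSE(unit_num)
END PROGRAM unit_number
"""


def reproducer_source(unit: int) -> str:
    """Return the Fortran source of the reproducer for one unit number."""
    return TEMPLATE.format(unit=unit)


def write_testcases(first: int, last: int, directory: str) -> list:
    """Write unit_<n>.f90 for every unit in [first, last]; return the paths."""
    out = Path(directory)
    out.mkdir(parents=True, exist_ok=True)
    paths = []
    for unit in range(first, last + 1):
        path = out / f"unit_{unit:04d}.f90"  # hypothetical naming scheme
        path.write_text(reproducer_source(unit))
        paths.append(path)
    return paths
```

Each generated file would then be compiled and run under a timeout, so that a hanging OPEN is recorded instead of stalling the whole sweep.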

Thanks for sending this in. I will keep this thread posted on progress.

ron
Izaak_Beekman
New Contributor II
Any word on whether this was fixed in 11.1? I am not an admin and cannot track this issue using premier.
Ron_Green
Moderator
Quoting - zbeekman
Any word on whether this was fixed in 11.1? I am not an admin and cannot track this issue using premier.

There is a fix but it has not yet made it into the source tree. Thus, it is not currently in 11.1.038.

I will keep you posted.

ron