We recently added a Dell PowerEdge R420 to our compute server mix. Because of the newer hardware we had to install Windows Server 2012 Standard, to which we added the HPC add-on to make it similar to our other compute servers running Windows 2003 Compute Cluster Edition.
Our existing software, built with Intel Visual Compiler XE 13.1.2.190, runs without problems on the older setup. On the new configuration, however, we get I/O errors at what seem to be random times (possibly load dependent). If the application writes a very large file, closes it, and then opens it again, we get IOSTAT=30. The writes go to a local RAID 1 disk behind a controller with 1 GB of cache, which is similar to our older compute servers.
Is there a compatibility issue between Visual Fortran and Windows Server 2012?
This tells you that it is Windows holding on to the handle after the close, not anything that Fortran is doing.
What can we do about it?
To the best of my knowledge, adding the SLEEP call after the CLOSE has worked for everyone else who has reported this. If you find that adding another CLOSE helps, that is certainly not harmful, but I am skeptical that it is doing anything. I do recommend putting a SLEEP call after the initial CLOSE rather than waiting for the OPEN to fail.
But do you really need to close and reopen the file immediately? Why not use REWIND?
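For anyone wanting to try the SLEEP suggestion, here is a minimal sketch. The file name (big_output.tmp), the record count, and the one-second pause are illustrative assumptions, not values reported in this thread. SLEEP takes whole seconds and is a compiler extension, not standard Fortran: Intel supplies it through its portability library (USE IFPORT gives an explicit interface) and gfortran offers it as an intrinsic.

```fortran
program close_then_pause
  implicit none
  integer :: u, ios, i
  ! Hypothetical file name, used only for this illustration.
  character(len=*), parameter :: fname = 'big_output.tmp'

  ! Write the file and close it.
  open(newunit=u, file=fname, status='replace', action='write', iostat=ios)
  if (ios /= 0) stop 'initial open failed'
  do i = 1, 1000
     write(u, '(i0)') i
  end do
  close(u)

  ! Pause right after the CLOSE, before any attempt to reopen, so the
  ! operating system has time to finish releasing the file.
  ! The 1-second value is a guess; tune it for your system.
  call sleep(1)

  ! Reopen the same file and report the status instead of aborting.
  open(newunit=u, file=fname, status='old', action='read', iostat=ios)
  if (ios /= 0) then
     print *, 'reopen failed, IOSTAT =', ios
  else
     close(u)
  end if
end program close_then_pause
```

The point of the layout is that the pause sits between the CLOSE and the next OPEN unconditionally, rather than being triggered only after an OPEN has already failed.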
>> Why not use REWIND?
If the same app is run multiple times (e.g. from batch) using the same temp file name, then REWIND will not help. However, you might consider STATUS='SCRATCH', which presumably works out name collisions with temporary file names.
Jim Dempsey
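As a rough illustration of the STATUS='SCRATCH' idea combined with REWIND, the sketch below writes a few records to a scratch file and reads them back without ever closing and reopening by name. The record contents and count are made up for the example. A scratch file takes no FILE= specifier and is deleted when it is closed, so it only suits data that the same process consumes.

```fortran
program scratch_demo
  implicit none
  integer :: u, ios, i, n, total

  ! No FILE= here: the runtime chooses a temporary file for the unit
  ! and removes it when the unit is closed.
  open(newunit=u, status='scratch', form='unformatted', &
       action='readwrite', iostat=ios)
  if (ios /= 0) stop 'could not open scratch file'

  ! Write some illustrative records.
  do i = 1, 5
     write(u) i
  end do

  ! Reposition to the start instead of doing CLOSE followed by OPEN.
  rewind(u)

  ! Read the records back through the same unit.
  total = 0
  do i = 1, 5
     read(u) n
     total = total + n
  end do
  print *, 'sum of records read back =', total

  close(u)   ! the scratch file is deleted here
end program scratch_demo
```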
We are not closing and reopening immediately. Let's see if I can explain the application better. The main app is a form of scheduler which takes a set of working directories (5 in my test case) and sets of template files, one of which is a primary file that uses includes for the other files. The app processes the template files, writing a set into each directory with different parameters substituted in. It then spawns a control app that runs a reservoir simulator in that directory using the files. When the simulation finishes, the control app notifies the main app that the directory is free, and the main app repeats the process of building new files from template files and parameters, followed by another run. The primary file name usually changes each time, but the other include file names may be the same between runs; just the data is different. The spawned app opens and closes the files just fine and does not show any leftover handles. It's only the template files and the same-named output files that seem to build up leftover handles. It's as if using the same name in the same directory frequently causes the problem.
The problem is probably related to the Windows OS, as you mentioned the growing count of orphaned handles. For more advanced troubleshooting you can use Application Verifier with windbg and issue the !htrace command (an extension). Also, please enable the Application Verifier option that tracks invalid handle usage.
This suggestion is a programming change specifically to work around this problem.
In the main app (control app), do not close the output files (nor the input files that will be reopened on the next trip through the loop).
In the main app's CreateProcess call, set bInheritHandles to TRUE and, instead of passing file names in lpCommandLine, pass each file's handle as hexadecimal text.
In the spawned process, change the file open code so that if a file name begins with "0x", it converts the hex text that sits in place of the file name into a handle and uses that instead of performing an open (CreateFile). Your Fortran code can do this by modifying the OPEN(...) to add USEROPEN=YourOpenRoutine.
Jim Dempsey
Will try the Application Verifier suggestions. We can't change the spawned program; it's third party, so we are unable to try that.
>>Can't change the spawned program it's third party so unable to try that.
Then the next possibility to ponder is to create a new folder/directory each time you run the spawned program. Still delete the files, then remove the temporary folders/directories in a later cleanup pass.
Hmm....
I do not know if this applies to Windows Server 2012, but ensure that file indexing is turned off for the folder in which you perform your writes.
You also might look at FlushFileBuffers (http://msdn.microsoft.com/en-us/library/windows/desktop/aa364439(v=vs.85).aspx)
Jim Dempsey
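A rough sketch of the new-folder-per-run idea, assuming Intel Fortran and its IFPORT portability module (MAKEDIRQQ is Intel-specific, not standard Fortran). The run_NNNN naming scheme and the loop count are hypothetical and only for illustration; the file writing and the spawn of the simulator are left as a comment because they depend on your app.

```fortran
program fresh_run_dir
  use ifport            ! Intel portability module, provides MAKEDIRQQ
  implicit none
  integer :: run
  logical(4) :: ok
  character(len=64) :: dirname

  do run = 1, 3
     ! Hypothetical naming scheme: run_0001, run_0002, ...
     write(dirname, '(a,i4.4)') 'run_', run

     ! MAKEDIRQQ returns .TRUE. on success; it fails if the
     ! directory already exists, so stale names are not reused.
     ok = makedirqq(trim(dirname))
     if (.not. ok) then
        print *, 'could not create ', trim(dirname)
        cycle
     end if

     ! ... write this run's files into trim(dirname) and start the
     !     simulator there; remove the directory in a later cleanup pass ...
  end do
end program fresh_run_dir
```

Because every run gets a directory name that has never been used, no file name in it can collide with a handle still lingering from an earlier run.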
I am having major problems installing Application Verifier on Server 2012. The older standalone one, the one in the Windows 7 SDK, and the one in the Windows 8 SDK will not install. I will keep trying but am running short of ideas.
I did not have any problems installing App Verifier on Win 8, but I must admit that I have never used that program under Win 8 or Win Server 2012.
Do you get an error message during the installation, or does it simply hang?
My schedule finally let me get back to this problem.
I followed the suggestion about using USEROPEN and found that if I captured the handle from the open and used it after the normal Fortran CLOSE, I got rid of the leftover handles. It is not very elegant or general purpose, but it lets us run this app on Server 2012.
The problem exists on all our Server 2012 boxes with all the recent versions of Fortran. Using handle.exe on Server 2003 and Server 2012 with the same app and data shows that Fortran has had issues with handles for a long time. On Server 2003 it leaves handles to folders around, and on Server 2012 it leaves handles to folders and files around. I have uploaded the results of running handle.exe on Server 2003 and Server 2012 to show the problem. In both cases the app ran for multiple days, opening and closing files in five work directories. The Server 2012 version had the USEROPEN patch so it would not crash, so the only files showing are those that used normal opens and closes.
Since we often run multiple versions of this app on a server, the large number of handles that accumulate may explain why some of the servers need reboots: they become unresponsive after a week or so of work.
>>>Since we many times run multiple versions of this app on a server, the large number of handles that accumulate may account for why some of the servers need reboots because they get unresponsive after a week or so of working.>>>
Do you mean globally unresponsive (the whole system is frozen)?
Another suggestion is to use windbg and its handle-related metacommands. That way you can at least investigate what is leaking the handles.
>>Do you mean globally unresponsive (whole system is frozen)?
RDP logins take forever and in Windows Explorer clicking on Network goes off forever and shows nothing but the activity ring.
>>Another suggestion is to use windbg and its handle-related metacommands. You can at least investigate what is leaking the handles.
handle.exe and the Resource Monitor in Server 2012 work fine and show that the low-level code used by Fortran has not kept up with OS changes and is not always deleting handles as it should on close. We have other programs written in C#, including one that runs as a service for days, that do very similar things, and they always hold only a minimal number of handles.
I can't figure out what you think needs "keeping up with" - a CloseHandle is a CloseHandle.
Whatever Fortran is doing, it is not always closing the handle when CLOSE is called. Also, even on Server 2003 there are multiple handles to folders left around, even though all the program does is set the current directory to the folder, i.e. no opens etc. for folders. I'm sure it's something Microsoft changed, particularly in Server 2012, that the code in Fortran was never updated for.
I don't think that's a correct interpretation. Fortran closes the handle, but Windows doesn't actually do the close until sometime later. We will run some experiments and see what we can find out.
Thanks
Do you need any config information for hardware or software?
Well, it would really help if you could provide a self-contained test program. The snippets you provided earlier are missing some context.
OK, that will take a day or so.
 
					
				
				
			
		
