ernie1001
Beginner

Is Compaq FORTRAN IO runtime library thread safe?

Does anyone know if Compaq's FORTRAN IO runtime library is thread safe?
I have encountered an intermittent problem where a program with two threads fails in a FORTRAN CLOSE(file) statement about 10% of the time. The other thread seems innocuous; it performs no file IO whatsoever. The failing thread is where the mathematical computations are performed; the other thread manages the Windows message queue.
The problem occurs on dual-processor Xeon machines and on a P4 HT box. It never fails on a single CPU box, or when HT is turned off, or when affinity is used to confine execution to one CPU.
I have seen the error on an OPEN statement and on a WRITE statement, although at a much lower rate. I have seen no error anywhere else in my code.
Best Regards,
Ernie Harrison
ClayB
Black Belt

Welcome, Ernie!
Off the top of my head, I don't know if the Compaq FORTRAN IO routines are thread safe or not. From your evidence, though, I would feel very safe in guessing that they are NOT thread safe. In my experience, I don't think I've seen an IO library that was thread safe (unless designed for threads), plus, Fortran has rarely been considered a language that is threaded.
The tests you've done are good indicators that there is a threading problem. That is, when running on HT or dual-CPU systems you can generally see the errors, but when running on a single processor (even with multiple threads) you never or rarely see any problems. On a single-CPU system, function calls are more likely to complete before threads are swapped out of the CPU, while HT and multi-processor systems have multiple threads executing code at the same time (and within the IO library at the same time). The Intel Thread Checker may be another way to tell if there are potential threading problems, but I doubt you'd get many meaningful diagnostics beyond confirmation that there is a problem.
Is there a multithreading compiler switch that can be used? With the Microsoft C/C++ compiler, there is the /MT flag and several variations. This will link in the thread safe C library. These versions have added synchronization that will impact performance, but they allow multiple threads to call any one or a combination of functions and assure proper execution.
If there is no thread safe library available, you may need to resort to putting in your own synchronization around all the IO calls that you make. OpenMP has locks already defined as part of the API, but for explicit threading from Fortran, this may mean that you'll need to create an INTERFACE to the C routine that you want to use. You'd best stick with one sync object for all the IO, even though certain calling combinations may be safe together.
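To make the "one sync object for all the IO" idea concrete, here is a minimal sketch in C. It assumes POSIX threads purely for illustration; on Windows the same pattern would use a CRITICAL_SECTION with EnterCriticalSection and LeaveCriticalSection. The names io_lock, locked_append, and run_locked_io_demo are hypothetical, not part of any Compaq or Intel library. Note that C stdio is already thread safe on POSIX systems, so this only shows the shape of the wrapper you would put around unsafe IO calls; it is not proof that the lock is required.

```c
#include <assert.h>
#include <pthread.h>
#include <stdio.h>

#define WRITES_PER_THREAD 100
#define MAX_THREADS 16

/* One lock shared by ALL file IO, as suggested above (hypothetical name). */
static pthread_mutex_t io_lock = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical wrapper: open, append one line, and close, all under the lock.
   Returns 1 on success, 0 on failure. */
static int locked_append(const char *path, const char *line)
{
    int ok = 0;
    pthread_mutex_lock(&io_lock);
    FILE *f = fopen(path, "a");
    if (f != NULL) {
        ok = (fputs(line, f) >= 0);
        if (fclose(f) != 0)
            ok = 0;
    }
    pthread_mutex_unlock(&io_lock);
    return ok;
}

struct worker_arg {
    const char *path;
    int failures;
};

static void *worker(void *p)
{
    struct worker_arg *arg = p;
    for (int i = 0; i < WRITES_PER_THREAD; i++)
        if (!locked_append(arg->path, "x\n"))
            arg->failures++;
    return NULL;
}

/* Starts n_threads workers appending to one file, then counts the lines.
   Returns the line count, or -1 if any thread or IO call failed. */
int run_locked_io_demo(const char *path, int n_threads)
{
    pthread_t tid[MAX_THREADS];
    struct worker_arg args[MAX_THREADS];
    int started = 0, failures = 0;

    assert(n_threads > 0 && n_threads <= MAX_THREADS);
    remove(path); /* start from an empty file; an error here is harmless */

    for (int i = 0; i < n_threads; i++) {
        args[i].path = path;
        args[i].failures = 0;
        if (pthread_create(&tid[i], NULL, worker, &args[i]) != 0)
            break;
        started++;
    }
    for (int i = 0; i < started; i++) {
        pthread_join(tid[i], NULL);
        failures += args[i].failures;
    }
    if (started != n_threads || failures != 0)
        return -1;

    FILE *f = fopen(path, "r");
    if (f == NULL)
        return -1;
    int lines = 0, c;
    while ((c = fgetc(f)) != EOF)
        if (c == '\n')
            lines++;
    fclose(f);
    return lines;
}
```

Calling run_locked_io_demo("locked_io_demo.txt", 4) should return 400 if every open, write, and close succeeded. From Fortran, the lock and unlock calls would be reached through an INTERFACE block and placed around each OPEN, WRITE, and CLOSE statement, always using the same lock.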
Please write back and let us know what you find out, what you try, and how it all turns out. Your experiences could be valuable for others that can run into this or similar issues.
-- clay
ernie1001
Beginner

Hi Clay,

Thanks for the tip. I had linked a single-threaded library, and that was part of my problem. Changing to a multi-threaded library dropped the error rate by about a factor of 20. The setting in Compaq Visual Fortran is found at: Project --> Settings. Go to the FORTRAN tab and set the category to Libraries. Then you can set the library type to multi-threaded.

I ran my program about 200 times and saw an error once, in a file OPEN statement. Tracing into the code, I found that the EBP register was pointing to nonallocated memory. The code looked quite innocuous; there was only one subroutine call between a successful use of EBP and the one that caused the exception, and all of the instructions looked to be thread safe. This suggests that the problem was in the subroutine.

The routine was for memory allocation. The value in EBP when the failure occurred was not random; it looked like it was perhaps 32k below where it had been before the routine was entered. The memory pointed to was not allocated, though.

I haven't had a chance to trace further into the code. I will report back if I find anything.

Regards,

Ernie
