I am experimenting with Asynchronous I/O and am having some issues. This may be a misunderstanding on my part.
Some experiments work and some do not.
Using a single thread managing a file (on a single unit) works fine.
The issue comes in when using multiple threads to a single file (same unit).
Configurations:
1) write, shared ID, not enclosed in a critical section; single wait (some time) after all threads issue a write
2) write, private ID, not enclosed in a critical section; wait per thread on its private ID (some time) after each thread issues a write
3) write, shared ID, enclosed in a critical section; single wait (some time) after all threads issue a write
4) write, private ID, enclosed in a critical section; wait per thread on its private ID (some time) after each thread issues a write
Note, only one I/O is pending per private ID.
With a shared ID, it is presumed there is a count of pending I/O requests.
Am I misunderstanding in thinking that one should be able to have multiple I/Os pending to the same unit?
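For concreteness, a minimal sketch of configuration 4 above (private ID, write enclosed in a critical section, wait per thread on its own ID). The file name, buffer, and variable names are placeholders, and the unit is assumed to be opened with ASYNCHRONOUS='YES'; whether the writes actually proceed concurrently is processor dependent.

program async_private_id
   use omp_lib
   implicit none
   integer :: unit, myID, tid
   character(80), asynchronous :: buf

   open(newunit=unit, file='async_demo.txt', asynchronous='YES', &
        action='WRITE', status='REPLACE')

   !$omp parallel private(myID, tid, buf)
   tid = omp_get_thread_num()
   write(buf, '(A,I0)') 'record from thread ', tid

   !$omp critical
   write(unit, '(A)', asynchronous='YES', id=myID) buf   ! enqueue this thread's record
   !$omp end critical

   ! ... other work while this thread's write is (potentially) still pending ...

   wait(unit, id=myID)   ! wait only on this thread's private ID
   !$omp end parallel

   close(unit)
end program async_private_id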
Jim Dempsey
I'm not understanding what "doesn't work" means here and how it doesn't do what you expect.
Certainly, it's possible to have multiple operations in flight for a single unit, but there's no requirement in the standard that this happens (it would be conforming for subsequent operations to wait for the previous one to complete). I'd be a bit more nervous about sharing IDs across threads, though; I'd think this risks doing operations out of order.
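For concreteness, a single-threaded sketch with two requests in flight on one unit (the file name and variable names are placeholders); whether the two writes truly overlap is processor dependent, and completing each one immediately would still be conforming.

program two_pending
   implicit none
   integer :: unit, id1, id2

   open(newunit=unit, file='demo.txt', asynchronous='YES', &
        action='WRITE', status='REPLACE')

   write(unit, '(A)', asynchronous='YES', id=id1) 'first record'
   write(unit, '(A)', asynchronous='YES', id=id2) 'second record'

   ! ... useful work while both requests are (potentially) pending ...

   wait(unit, id=id1)
   wait(unit, id=id2)
   close(unit)
end program two_pending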
I would expect the ID to act as an enqueue counter and the wait to block until the counter expires.
In this manner a single ID can handle multiple enqueues (by one thread or any number of threads),
as well as multiple IDs handling single or multiple enqueues (by one thread or any number of threads),
as well as asynchronous I/O without an ID handling multiple enqueues (by one thread or any number of threads).
And the wait without an ID would act upon the pending I/Os issued without an ID,
and the wait with an ID would act upon the pending I/O issued to the designated ID.
At least, that is how I would expect it to work (and how I've programmed this in many runtime systems and operating systems).
The concept of asynchronous I/O is to permit it to be, well, asynchronous... eh.
enqueue, work, enqueue, work, enqueue, work, wait
as opposed to
enqueue, work, wait, enqueue, work, wait, enqueue, work, wait
(enqueue could be read or write)
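A sketch of the first pattern, assuming the unit was opened with ASYNCHRONOUS='YES' (unit and nSlices are placeholders); a WAIT with no ID= specifier acts on all pending data transfers for that unit.

do i = 1, nSlices
   write(unit, '(A,I0)', asynchronous='YES') 'slice ', i   ! enqueue (could be a read or a write)
   ! ... overlap computation here ...
end do
wait(unit)   ! one wait drains every pending request on the unit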
Jim Dempsey
If I understand what you are doing: on Windows, in CLR C++ and C++, I used locking of a range in a file, with I/O exception error handling, to get multiple threads and processes to write to a file. The spinwait was something else, maybe shared memory.
I would say that your expectation of what the ID value should be isn't the only possibility. It's been a while since I looked at it, but it's not a simple counter.
Fortran 2003 (draft) C.6.3:
"The standard allows a user to issue a large number of asynchronous input/output requests, without waiting for any of them to complete, and then wait for any or all of them."
It would be appreciated if each vendor would clarify the extent (if any) to which they have implemented asynchronous I/O.
Commentary/observations using IVF 2020 and oneAPI 2022
It appears that an asynchronous I/O statement enqueues an unbounded OpenMP task (a thread that has no team, so to say). In my specific case, my write statement includes UDT I/O, where I am passing an array of UDTs. I have since replaced this with CALL statements due to issues.
In the asynchronous write (failing) model, I was using DT 'buffer' as a keyword to have the UDT's write function pack formatted writes into an internal buffer, to produce compressed CSV files (removing extraneous spaces and trailing 0's). This works great using a single thread with an asynchronous write. Note, DT 'buffer' performs no I/O. Subsequent to the buffer write, I can then issue a DT 'flush', which performs the asynchronous write with an ID. You may ask: when using DT 'buffer' there is no actual I/O, so what is asynchronous about that?
That is a good question. The (asynchronous) WRITE is actually an OpenMP task that performs the internal WRITEs and CSV packing of, say, 1000 or so UDTs. And while that is going on, the main thread is free to issue another WRITE(unit,"(DT'buffer')",asynchronous='YES',...) array(sliceFrom:sliceTo). And it can do so without the clutter of !$omp directives and the more complex code needed to construct the parallel pipeline. Here is some sketch code:
subroutine TwoStageParallelPipeline
   ! unit, ObjectArray, stride, ID1, ID2 are assumed to be host- or module-scoped
   integer :: i, iBegin, iEnd
   do i = 1, size(ObjectArray), stride*2
      iBegin = i
      iEnd = min(iBegin+stride-1, size(ObjectArray))
      write(unit,"(DT 'flush1')",asynchronous='YES',ID=ID1) ObjectArray(1) ! supply UDT signature
      write(unit,"(DT 'buffer1')",asynchronous='YES',ID=ID1) ObjectArray(iBegin:iEnd)
      iBegin = iEnd + 1
      if(iBegin <= size(ObjectArray)) then
         iEnd = min(iBegin+stride-1, size(ObjectArray))
         write(unit,"(DT 'flush2')",asynchronous='YES',ID=ID2) ObjectArray(1) ! supply UDT signature
         write(unit,"(DT 'buffer2')",asynchronous='YES',ID=ID2) ObjectArray(iBegin:iEnd)
      endif
   end do
   write(unit,"(DT 'flush1')",asynchronous='YES',ID=ID1) ObjectArray(1) ! supply UDT signature
   write(unit,"(DT 'flush2')",asynchronous='YES',ID=ID2) ObjectArray(1) ! supply UDT signature
end subroutine TwoStageParallelPipeline
The actual code will be a bit more complex, but that should provide an idea of how to simplify a parallel pipeline.
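For readers unfamiliar with DT edit descriptors: the 'buffer'/'flush' strings arrive in the iotype argument of the type's formatted-write procedure (as 'DTbuffer1', 'DTflush1', and so on). A minimal sketch of such a binding follows; the type name Object_t and its components are invented for illustration, and the case bodies are only placeholders for the packing and flushing described above.

module object_m
   implicit none
   type :: Object_t
      real :: x = 0.0, y = 0.0
   contains
      procedure :: write_formatted
      generic :: write(formatted) => write_formatted
   end type Object_t
contains
   subroutine write_formatted(dtv, unit, iotype, v_list, iostat, iomsg)
      class(Object_t), intent(in) :: dtv
      integer, intent(in)         :: unit
      character(*), intent(in)    :: iotype   ! e.g. 'DTbuffer1' or 'DTflush1'
      integer, intent(in)         :: v_list(:)
      integer, intent(out)        :: iostat
      character(*), intent(inout) :: iomsg
      select case (iotype)
      case ('DTbuffer1', 'DTbuffer2')
         ! pack dtv's components into a thread-private character buffer; no I/O on unit
      case ('DTflush1', 'DTflush2')
         ! write the accumulated buffer to unit (inside a critical section when
         ! several threads share the unit), then reset the buffer
      end select
      iostat = 0
   end subroutine write_formatted
end module object_m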
Jim Dempsey
Would other threads and/or processes on one computer call TwoStageParallelPipeline?
The routine can run either way: each thread with its own unique unit, or all threads writing to the same unit (where the DT 'flush' performs the actual WRITE enclosed within an !$omp critical region). In the multi-thread variant writing to the same unit, each thread has its own buffer and ID.
Note, a UDT object may have numerous member variables, so formatting an array of such objects into a buffer can be compute intensive (and thus warrants parallelizing different slices of the array into different buffers), whereas the I/O-performing WRITE has relatively low computational overhead with potentially high latency, hence the desire for asynchronous I/O using separate IDs.
The code under test used an OpenMP ordered loop performing the non-I/O DT 'buffer' together with an ordered section performing the flush (with the internal write inside a critical section). It was noted that while the OpenMP ordered loop with schedule(static,1) sliced in thread order, the ordered region did not (and would also occasionally hang). A rough sketch of that arrangement is below.
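The following is only a rough sketch of the described test loop, not the actual code; unit, ObjectArray, and stride are assumed host-scoped as in the earlier pipeline sketch. The DT 'buffer' formatting runs concurrently across threads, while the DT 'flush' (which performs the file write) is serialized in iteration order by the ordered region.

!$omp parallel do ordered schedule(static,1) private(iBegin, iEnd, myID)
do i = 1, size(ObjectArray), stride
   iBegin = i
   iEnd   = min(iBegin + stride - 1, size(ObjectArray))
   write(unit, "(DT 'buffer')", asynchronous='YES', id=myID) ObjectArray(iBegin:iEnd)   ! packs only, no file I/O
   !$omp ordered
   write(unit, "(DT 'flush')", asynchronous='YES', id=myID) ObjectArray(1)   ! actual write; critical section inside the UDT procedure
   !$omp end ordered
end do
!$omp end parallel do
wait(unit)   ! drain any still-pending asynchronous requests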
Jim Dempsey
