What is the expected behavior of the Intel compiler when multiple coarray images read and write the same file asynchronously, in a direct-access pattern, and the different images always access different records? Sample code is at the bottom; I compiled it with ifx 2023 on Ubuntu 22.04 using:
ifx -debug -threads -coarray=shared -coarray-num-images=8 -o my_caf_prog ./basic_newunit.f90
I started a related discussion here, but I would like to know the Intel compiler specifics. Some related comments and questions:
- I ran the code below 20 times in a row and got the expected (hoped-for) output every time.
- I noted that by default, SHARED is true. Does this guarantee that the below read/writes will not result in data corruption?
- Does this access pattern yield any I/O speedup? The idea is that if the different images access different records, they can hopefully execute independently.
- In practice, the underlying hardware (CPU or storage device) or filesystem may not support such parallel I/O operations. Are there current (easy) software-level solutions to this, or is this a "we need to wait for future hardware that might support this" kind of thing?
- Does using coarray=shared vs. coarray=distributed change any of the answers above?
- Does using a single machine (with multiple processors) vs. a cluster change any of the answers above?
program main
  implicit none
  integer, parameter :: blocks_per_image = 2**16
  integer, parameter :: block_size = 2**10
  real, dimension(block_size) :: x, y
  integer :: in_circle[*], unit[*] ! coarrays: each image has its own local copy
  integer :: i, n_circle, n_total, rec_len, io_id
  real :: step, xfrom

  n_total = blocks_per_image * block_size * num_images()
  step = 1./real(num_images())
  xfrom = (this_image() - 1) * step

  ! Record length needed to hold the two integers written per image
  inquire(iolength=rec_len) in_circle, n_total
  open(newunit=unit, file='output.txt', form='UNFORMATTED', access='DIRECT', &
       recl=rec_len, asynchronous='yes')

  in_circle = 0
  do i = 1, blocks_per_image
    call random_number(x)
    call random_number(y)
    in_circle = in_circle + count((xfrom + step * x)**2 + y**2 < 1.)
  end do

  ! Each image writes only its own record
  write(unit, rec=this_image(), asynchronous='yes') in_circle, n_total
  sync all
  close(unit) ! pending async operations finish before the unit closes

  ! Reset in_circle, n_total to make sure we really read the values back
  in_circle = 10
  n_total = 10
  open(newunit=unit, file='output.txt', form='UNFORMATTED', access='DIRECT', &
       action='READ', recl=rec_len, status='OLD', asynchronous='yes')
  read(unit, rec=this_image(), asynchronous='yes', id=io_id) in_circle, n_total
  ! can in principle do computations here, so long as they don't need in_circle, n_total
  ! Wait for the asynchronous read to complete before printing:
  ! unit selects the file unit, id selects the particular I/O operation.
  wait(unit=unit, id=io_id)
  write(*,*) this_image(), " reads in_circle and n_total: ", in_circle, n_total
  sync all
  close(unit)
end program main
Perhaps this reference in the Intel Fortran Developer Guide will help. The SHARE specifier on the OPEN statement is an Intel extension.
Indeed, the reference you shared seems to make it clear that multiple processes handling the same file is an expected use case, with various flags to control it. That addresses one of my questions,
- I noted that by default, SHARED is true. Does this guarantee that the below read/writes will not result in data corruption?
though I would like to make the answer more precise. Regarding this passage in the documentation,
The Fortran runtime does not coordinate file entry updates during cooperative access. The user needs to coordinate access times among cooperating processes to handle the possibility of simultaneous WRITE and REWRITE statements on the same record positions.
To be specific about the wording:
- "does not coordinate file entry updates during cooperative access": does this mean one has to close and reopen the file to see "updated" records?
- "on the same record positions": does "position" refer to the record number, with records guaranteed to be in different write/storage sectors? (In that case, would specifying rec=this_image(), or otherwise guaranteeing different record numbers between different processes, always have a deterministic outcome?) Or does "position" refer to storage sectors, so that two or more records with size < blocksize may share the same write sector, and a simultaneous WRITE may corrupt the data or have a non-deterministic outcome?
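One way to probe the second question empirically is to check how records are laid out on disk. The following is only a minimal serial sketch, not a statement of what the Intel runtime guarantees: it writes a few small direct-access records and compares the file size with the raw payload size. The file name layout_probe.bin and the record count are made up for illustration. If the two sizes match, the records are packed back to back with no padding, so several records smaller than a storage sector would necessarily share one. Note that, as far as I know, Intel Fortran measures RECL for unformatted files in 4-byte units by default (bytes with -assume byterecl), which is why the sketch takes the record length from inquire(iolength=...) rather than hard-coding it.

```fortran
program record_layout
  use, intrinsic :: iso_fortran_env, only: int64
  implicit none
  integer, parameter :: n_records = 8   ! hypothetical record count
  integer :: unit, rec_len, i, a, b
  integer(int64) :: file_bytes, payload_bytes

  ! Record length for two default integers, in the compiler's RECL units
  inquire(iolength=rec_len) a, b

  ! 'layout_probe.bin' is a made-up scratch file name
  open(newunit=unit, file='layout_probe.bin', form='unformatted', &
       access='direct', recl=rec_len, status='replace')
  do i = 1, n_records
    write(unit, rec=i) i, 10*i
  end do
  close(unit)

  ! File size in file storage units (bytes on common platforms)
  inquire(file='layout_probe.bin', size=file_bytes)

  ! Raw payload: two default integers per record, storage_size is in bits
  payload_bytes = int(n_records, int64) * (storage_size(a) + storage_size(b)) / 8

  print *, 'file size (bytes):    ', file_bytes
  print *, 'payload size (bytes): ', payload_bytes
  if (file_bytes == payload_bytes) then
    print *, 'records appear to be packed contiguously, with no padding'
  else
    print *, 'file contains padding or record markers'
  end if
end program record_layout
```

This only shows the on-disk layout; it says nothing about whether concurrent writes to neighboring records are safe, which depends on the runtime's buffering and on the filesystem.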
forrtl: severe (47): write to READONLY file, unit -129, file B:\Users\macne\Documents\Visual Studio 2017\Projects\Program120 - ST\Console3\Console3\output.txt
In coarray image 1
Image PC Routine Line Source
Console3.exe 00007FF7537DF3A2 Unknown Unknown Unknown
Console3.exe 00007FF7537DC177 Unknown Unknown Unknown
KERNEL32.DLL 00007FFF0AC2163D Unknown Unknown Unknown
ntdll.dll 00007FFF0C2BD6F8 Unknown Unknown Unknown
Press any key to continue . . .
Using your settings in Visual Studio on Windows, it throws the error above.
I see, so the Windows case yields an error. Did you compile it differently, and do you have enough processors for 8 images? For reference, on my Pop!_OS 22.04 machine (a close variant of Ubuntu) with a 12700K, I get the expected (hoped-for) output 20 out of 20 times:
./my_caf_prog
2 reads in_circle and n_total: 65871670 536870912
3 reads in_circle and n_total: 63695869 536870912
5 reads in_circle and n_total: 55407149 536870912
6 reads in_circle and n_total: 48613368 536870912
7 reads in_circle and n_total: 38896892 536870912
1 reads in_circle and n_total: 66933288 536870912
4 reads in_circle and n_total: 60285902 536870912
8 reads in_circle and n_total: 21944055 536870912
Intel i7 with 16 threads.
I tried to make sure the settings matched your settings on compiler and linker.
