<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Coarray async I/O in Intel® Fortran Compiler</title>
    <link>https://community.intel.com/t5/Intel-Fortran-Compiler/Coarray-async-I-O/m-p/1454651#M164981</link>
    <description>&lt;P&gt;What is the expected behavior with the Intel compiler when multiple coarray images read/write the same file asynchronously, in a direct-access pattern, and the different images always access different records? Sample code is attached at the bottom; I compiled it with ifx 2023 on Ubuntu 22.04 using:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;LI-CODE lang="bash"&gt;ifx -debug -threads -coarray=shared -coarray-num-images=8 -o my_caf_prog ./basic_newunit.f90&lt;/LI-CODE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;A relevant discussion I started is &lt;A href="https://fortran-lang.discourse.group/t/parallel-asynchronous-fortran-i-o/5150/7" target="_self"&gt;here&lt;/A&gt;, but I would like to know the Intel compiler specifics. Some related comments:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;I ran the code below 20 times in a row and got the expected (hoped-for) output every time.&lt;/LI&gt;
&lt;LI&gt;I noted that by default, SHARED is true. Does this &lt;EM&gt;guarantee &lt;/EM&gt;that the reads/writes below will not result in data corruption?&lt;/LI&gt;
&lt;LI&gt;Does this access pattern yield any I/O speedup? The idea is that if different images access different records, they can hopefully execute independently.&lt;BR /&gt;
&lt;UL&gt;
&lt;LI&gt;In practice, the underlying hardware (CPU or storage device) or filesystem may not support such parallel I/O operations. Are there current (easy) software-level solutions to this, or is this a "we need to wait for future hardware that might support this" kind of thing?&lt;/LI&gt;
&lt;/UL&gt;
&lt;/LI&gt;
&lt;LI&gt;Does using coarray=shared vs coarray=distributed change any of the answers above?&lt;/LI&gt;
&lt;LI&gt;Does using a single machine (with multiple processors) vs using a cluster change any of the above answers?&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;LI-CODE lang="fortran"&gt;program main                                                                                                                                                                                  
  implicit none                                                                                                                                                                               
  integer, parameter :: blocks_per_image = 2**16                                                                                                                                              
  integer, parameter :: block_size = 2**10                                                                                                                                                    
  real, dimension(block_size) :: x, y                                                                                                                                                         
  integer :: in_circle[*], unit[*]  ! an integer but each image has a different local copy                                                                                                    
  integer :: i, n_circle, n_total, rec_len, io_id                                                                                                                                             
  real :: step, xfrom                                                                                                                                                                         
                                                                                                                                                                                              
  n_total = blocks_per_image * block_size * num_images()                                                                                                                                      
  step = 1./real(num_images())                                                                                                                                                                
  xfrom = (this_image() - 1) * step                                                                                                                                                           
                                                                                                                                                                                              
  inquire(iolength=rec_len) in_circle, n_total                                                                                                                                                
                                                                                                                                                                                              
  open(newunit=unit,file='output.txt',form='UNFORMATTED',access='DIRECT',recl=rec_len, asynchronous='yes')                                                                                    
                                                                                                                                                                                              
  in_circle = 0                                                                                                                                                                               
  do i=1, blocks_per_image                                                                                                                                                                    
     call random_number(x)                                                                                                                                                                    
     call random_number(y)                                                                                                                                                                    
     in_circle = in_circle + count((xfrom + step * x)** 2 + y**2 &amp;lt; 1.)                                                                                                                        
  end do                                                                                                                                                                                      
                                                                                                                                                                                              
  write(unit,rec=this_image(), asynchronous='yes') in_circle, n_total                                                                                                                         
  sync all                                                                                                                                                                                    
  close(unit) ! async operations finish before it closes                                                                                                                                      
                                                                                                                                                                                              
  ! Reset in_circle, n_total to make sure we read values                                                                                                                                      
  in_circle = 10                                                                                                                                                                              
  n_total = 10                                                        
  open(newunit=unit,file='output.txt',form='UNFORMATTED',access='DIRECT', action='READ', recl=rec_len, status='OLD', asynchronous='yes')
  read(unit,rec=this_image(), asynchronous='yes', id=io_id) in_circle, n_total                                                                                                                
  ! can in principle do computations here, so long as they don't need in_circle, n_total                                                                                                      
                                                                                                                                                                                              
  wait(unit=unit, id=io_id) ! need to wait before printing this, to let asynchronous read complete. unit specifies fileunit, id specifies which particular IO operation.                      
  write(*,*) this_image(), " reads in_circle and n_total: ", in_circle, n_total
                                                                                                                                                                                              
  sync all                                                                                                                                                                                    
                                                                                                                                                                                              
  close(unit)                                                                                                                                                                                 
                                                                                                                                                                                              
end program main&lt;/LI-CODE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
    <pubDate>Wed, 08 Feb 2023 20:48:53 GMT</pubDate>
    <dc:creator>adenchfi</dc:creator>
    <dc:date>2023-02-08T20:48:53Z</dc:date>
    <item>
      <title>Coarray async I/O</title>
      <link>https://community.intel.com/t5/Intel-Fortran-Compiler/Coarray-async-I-O/m-p/1454651#M164981</link>
      <description>&lt;P&gt;What is the expected behavior with the Intel compiler when multiple coarray images read/write the same file asynchronously, in a direct-access pattern, and the different images always access different records? Sample code is attached at the bottom; I compiled it with ifx 2023 on Ubuntu 22.04 using:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;LI-CODE lang="bash"&gt;ifx -debug -threads -coarray=shared -coarray-num-images=8 -o my_caf_prog ./basic_newunit.f90&lt;/LI-CODE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;A relevant discussion I started is &lt;A href="https://fortran-lang.discourse.group/t/parallel-asynchronous-fortran-i-o/5150/7" target="_self"&gt;here&lt;/A&gt;, but I would like to know the Intel compiler specifics. Some related comments:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;I ran the code below 20 times in a row and got the expected (hoped-for) output every time.&lt;/LI&gt;
&lt;LI&gt;I noted that by default, SHARED is true. Does this &lt;EM&gt;guarantee &lt;/EM&gt;that the reads/writes below will not result in data corruption?&lt;/LI&gt;
&lt;LI&gt;Does this access pattern yield any I/O speedup? The idea is that if different images access different records, they can hopefully execute independently.&lt;BR /&gt;
&lt;UL&gt;
&lt;LI&gt;In practice, the underlying hardware (CPU or storage device) or filesystem may not support such parallel I/O operations. Are there current (easy) software-level solutions to this, or is this a "we need to wait for future hardware that might support this" kind of thing?&lt;/LI&gt;
&lt;/UL&gt;
&lt;/LI&gt;
&lt;LI&gt;Does using coarray=shared vs coarray=distributed change any of the answers above?&lt;/LI&gt;
&lt;LI&gt;Does using a single machine (with multiple processors) vs using a cluster change any of the above answers?&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;LI-CODE lang="fortran"&gt;program main                                                                                                                                                                                  
  implicit none                                                                                                                                                                               
  integer, parameter :: blocks_per_image = 2**16                                                                                                                                              
  integer, parameter :: block_size = 2**10                                                                                                                                                    
  real, dimension(block_size) :: x, y                                                                                                                                                         
  integer :: in_circle[*], unit[*]  ! an integer but each image has a different local copy                                                                                                    
  integer :: i, n_circle, n_total, rec_len, io_id                                                                                                                                             
  real :: step, xfrom                                                                                                                                                                         
                                                                                                                                                                                              
  n_total = blocks_per_image * block_size * num_images()                                                                                                                                      
  step = 1./real(num_images())                                                                                                                                                                
  xfrom = (this_image() - 1) * step                                                                                                                                                           
                                                                                                                                                                                              
  inquire(iolength=rec_len) in_circle, n_total                                                                                                                                                
                                                                                                                                                                                              
  open(newunit=unit,file='output.txt',form='UNFORMATTED',access='DIRECT',recl=rec_len, asynchronous='yes')                                                                                    
                                                                                                                                                                                              
  in_circle = 0                                                                                                                                                                               
  do i=1, blocks_per_image                                                                                                                                                                    
     call random_number(x)                                                                                                                                                                    
     call random_number(y)                                                                                                                                                                    
     in_circle = in_circle + count((xfrom + step * x)** 2 + y**2 &amp;lt; 1.)                                                                                                                        
  end do                                                                                                                                                                                      
                                                                                                                                                                                              
  write(unit,rec=this_image(), asynchronous='yes') in_circle, n_total                                                                                                                         
  sync all                                                                                                                                                                                    
  close(unit) ! async operations finish before it closes                                                                                                                                      
                                                                                                                                                                                              
  ! Reset in_circle, n_total to make sure we read values                                                                                                                                      
  in_circle = 10                                                                                                                                                                              
  n_total = 10                                                        
  open(newunit=unit,file='output.txt',form='UNFORMATTED',access='DIRECT', action='READ', recl=rec_len, status='OLD', asynchronous='yes')
  read(unit,rec=this_image(), asynchronous='yes', id=io_id) in_circle, n_total                                                                                                                
  ! can in principle do computations here, so long as they don't need in_circle, n_total                                                                                                      
                                                                                                                                                                                              
  wait(unit=unit, id=io_id) ! need to wait before printing this, to let asynchronous read complete. unit specifies fileunit, id specifies which particular IO operation.                      
  write(*,*) this_image(), " reads in_circle and n_total: ", in_circle, n_total
                                                                                                                                                                                              
  sync all                                                                                                                                                                                    
                                                                                                                                                                                              
  close(unit)                                                                                                                                                                                 
                                                                                                                                                                                              
end program main&lt;/LI-CODE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Wed, 08 Feb 2023 20:48:53 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Fortran-Compiler/Coarray-async-I-O/m-p/1454651#M164981</guid>
      <dc:creator>adenchfi</dc:creator>
      <dc:date>2023-02-08T20:48:53Z</dc:date>
    </item>
    <item>
      <title>Re: Coarray async I/O</title>
      <link>https://community.intel.com/t5/Intel-Fortran-Compiler/Coarray-async-I-O/m-p/1455331#M165007</link>
      <description>&lt;P&gt;Perhaps &lt;A href="https://www.intel.com/content/www/us/en/develop/documentation/fortran-compiler-oneapi-dev-guide-and-reference/top/compiler-reference/data-and-i-o/fortran-i-o/file-sharing-on-linux-and-macos.html" target="_blank" rel="noopener"&gt;this reference&lt;/A&gt; in the Intel Fortran Developer Guide will help. The SHARE specifier on the OPEN statement is an Intel extension.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Fri, 10 Feb 2023 16:45:48 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Fortran-Compiler/Coarray-async-I-O/m-p/1455331#M165007</guid>
      <dc:creator>Barbara_P_Intel</dc:creator>
      <dc:date>2023-02-10T16:45:48Z</dc:date>
    </item>
    <item>
      <title>Re: Coarray async I/O</title>
      <link>https://community.intel.com/t5/Intel-Fortran-Compiler/Coarray-async-I-O/m-p/1455368#M165014</link>
      <description>&lt;P&gt;Indeed, the reference you shared seems to make it clear that multiple processes handling the same file is expected, with various flags, which addresses one of my questions,&lt;/P&gt;
&lt;BLOCKQUOTE&gt;&lt;HR /&gt;
&lt;UL&gt;
&lt;LI class="sub_section_element_selectors"&gt;&lt;SPAN class="sub_section_element_selectors"&gt;I noted that by default, SHARED is true. Does this &lt;/SPAN&gt;&lt;EM class="sub_section_element_selectors"&gt;&lt;SPAN class="sub_section_element_selectors"&gt;guarantee &lt;/SPAN&gt;&lt;/EM&gt;&lt;SPAN class="sub_section_element_selectors"&gt;that the below read/writes will not result in data corruption?&lt;/SPAN&gt;&lt;/LI&gt;
&lt;/UL&gt;
&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;though I would like to make it more precise. Regarding this documentation,&lt;/P&gt;
&lt;BLOCKQUOTE&gt;&lt;HR /&gt;
&lt;P&gt;The Fortran runtime does not coordinate file entry updates during cooperative access. The user needs to coordinate access times among cooperating processes to handle the possibility of simultaneous WRITE and REWRITE statements on the same record positions.&amp;nbsp;&lt;/P&gt;
&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;To be specific on the wording:&lt;/P&gt;
&lt;OL&gt;
&lt;LI&gt;"does not coordinate file entry updates during cooperative access"; does this mean one has to close &amp;amp; open the file again to see "updated" records?&lt;/LI&gt;
&lt;LI&gt;"on the same record positions"; is this referring to the record number, and records are guaranteed to be in different write/storage sectors? (In which case, specifying rec=this_image() or otherwise guaranteeing different record numbers between different processes always has a deterministic outcome?) Or is "position" referring to storage sectors, and two or more records with size &amp;lt; blocksize may share the same WRITE sector, and so a simultaneous WRITE may corrupt the data or have a non-deterministic outcome?&lt;/LI&gt;
&lt;/OL&gt;</description>
      <pubDate>Fri, 10 Feb 2023 18:29:12 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Fortran-Compiler/Coarray-async-I-O/m-p/1455368#M165014</guid>
      <dc:creator>adenchfi</dc:creator>
      <dc:date>2023-02-10T18:29:12Z</dc:date>
    </item>
    <item>
      <title>Re: Coarray async I/O</title>
      <link>https://community.intel.com/t5/Intel-Fortran-Compiler/Coarray-async-I-O/m-p/1455414#M165017</link>
      <description>&lt;P&gt;forrtl: severe (47): write to READONLY file, unit -129, file B:\Users\macne\Documents\Visual Studio 2017\Projects\Program120 - ST\Console3\Console3\output.txt&lt;BR /&gt;In coarray image 1&lt;BR /&gt;Image PC Routine Line Source&lt;BR /&gt;Console3.exe 00007FF7537DF3A2 Unknown Unknown Unknown&lt;BR /&gt;Console3.exe 00007FF7537DC177 Unknown Unknown Unknown&lt;BR /&gt;KERNEL32.DLL 00007FFF0AC2163D Unknown Unknown Unknown&lt;BR /&gt;ntdll.dll 00007FFF0C2BD6F8 Unknown Unknown Unknown&lt;/P&gt;
&lt;P&gt;Press any key to continue . . .&lt;/P&gt;
&lt;P&gt;Using your settings in Visual Studio on Windows, it throws this error.&lt;/P&gt;</description>
      <pubDate>Fri, 10 Feb 2023 21:14:12 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Fortran-Compiler/Coarray-async-I-O/m-p/1455414#M165017</guid>
      <dc:creator>JohnNichols</dc:creator>
      <dc:date>2023-02-10T21:14:12Z</dc:date>
    </item>
    <item>
      <title>Re: Coarray async I/O</title>
      <link>https://community.intel.com/t5/Intel-Fortran-Compiler/Coarray-async-I-O/m-p/1455417#M165018</link>
      <description>&lt;P&gt;I see, so the Windows case yields an error. Did you compile it differently, or do you have enough processors for 8 images? For reference, on my Pop!_OS 22.04 (a close variant of Ubuntu) machine, using a 12700K, I get the expected (hoped-for) output 20 out of 20 times:&lt;/P&gt;
&lt;LI-CODE lang="bash"&gt;./my_caf_prog 
           2  reads in_circle and n_total:     65871670   536870912
           3  reads in_circle and n_total:     63695869   536870912
           5  reads in_circle and n_total:     55407149   536870912
           6  reads in_circle and n_total:     48613368   536870912
           7  reads in_circle and n_total:     38896892   536870912
           1  reads in_circle and n_total:     66933288   536870912
           4  reads in_circle and n_total:     60285902   536870912
           8  reads in_circle and n_total:     21944055   536870912
&lt;/LI-CODE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Fri, 10 Feb 2023 21:22:04 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Fortran-Compiler/Coarray-async-I-O/m-p/1455417#M165018</guid>
      <dc:creator>adenchfi</dc:creator>
      <dc:date>2023-02-10T21:22:04Z</dc:date>
    </item>
    <item>
      <title>Re: Coarray async I/O</title>
      <link>https://community.intel.com/t5/Intel-Fortran-Compiler/Coarray-async-I-O/m-p/1455446#M165020</link>
      <description>&lt;P&gt;Intel i7 with 16 threads.&lt;/P&gt;
&lt;P&gt;I tried to make sure my compiler and linker settings matched yours.&lt;/P&gt;</description>
      <pubDate>Fri, 10 Feb 2023 23:00:23 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Fortran-Compiler/Coarray-async-I-O/m-p/1455446#M165020</guid>
      <dc:creator>JohnNichols</dc:creator>
      <dc:date>2023-02-10T23:00:23Z</dc:date>
    </item>
  </channel>
</rss>

