Wrong behavior of sync all for coarray application

FlyingHermes
New Contributor I
1,389 Views

Here is a short coarray code which gives a wrong output.

! This program gives wrong output: we would expect the line with the '*' characters to be the last line printed.
! Compilation: ifort -coarray main.f90; ./a.out
Program Main
  implicit none
  write(*,"('[Main]: Before sync all: This_Image() = ',i3,' / ',g0)") This_Image(), Num_Images()
  sync all
  if ( This_Image() == 1 ) write(*,"('[Main]: **************************************** < SYNCHRONIZATION')")
End Program

This code gives the following output:

$ ifort -coarray main.f90; ./a.out
[Main]: Before sync all: This_Image() =   1 / 20
[Main]: Before sync all: This_Image() =   3 / 20
[Main]: Before sync all: This_Image() =   6 / 20
[Main]: Before sync all: This_Image() =   7 / 20
[Main]: Before sync all: This_Image() =  15 / 20
[Main]: Before sync all: This_Image() =  12 / 20
[Main]: Before sync all: This_Image() =  13 / 20
[Main]: Before sync all: This_Image() =  14 / 20
[Main]: **************************************** < SYNCHRONIZATION
[Main]: Before sync all: This_Image() =   2 / 20
[Main]: Before sync all: This_Image() =   4 / 20
[Main]: Before sync all: This_Image() =   9 / 20
[Main]: Before sync all: This_Image() =   8 / 20
[Main]: Before sync all: This_Image() =  16 / 20
[Main]: Before sync all: This_Image() =  19 / 20
[Main]: Before sync all: This_Image() =  17 / 20
[Main]: Before sync all: This_Image() =   5 / 20
[Main]: Before sync all: This_Image() =  10 / 20
[Main]: Before sync all: This_Image() =  11 / 20
[Main]: Before sync all: This_Image() =  20 / 20
[Main]: Before sync all: This_Image() =  18 / 20

which is clearly wrong, since we would expect the line with the '*' characters to be the last one printed.

Ifort version is

$ ifort -v
ifort version 14.0.3

Thanks

12 Replies
FlyingHermes
New Contributor I

Here is another example of a synchronization failure, this time using the "sync images" statement:

Program Main
  implicit none
  integer :: me, ne
  me = This_Image()
  ne = Num_Images()
! ================================================================================
  sync all
  if ( me == 1) write(*,"('[Main]: Test image ordering in ascending order')")
  sync all
  if ( me == 1) then
    write(*,"('[Main]: Going up: ',i3,' / ',g0)") me, ne
  else
    sync images (me-1)
    write(*,"('[Main]: Going up: ',i3,' / ',g0)") This_Image(), Num_Images()
  end if
  if (me<ne) sync images (me+1)
! ================================================================================
  sync all
  if ( me == 1) write(*,"('[Main]: Test image ordering in descending order')")
  sync all
  if ( me == ne) then
    write(*,"('[Main]: Going down: ',i3,' / ',g0)") me, ne
  else
    sync images (me+1)
    write(*,"('[Main]: Going down: ',i3,' / ',g0)") This_Image(), Num_Images()
  end if
  if (me>1) sync images (me-1)
End Program

I've tested this code with different optimization levels (-O0/1/2/3) and each time, I observe a kind of erratic behavior.

Sometimes the results are OK, sometimes they are not.

Here is an example of the output:

$ ifort -coarray -O0 main.f90; ./a.out
[Main]: Test image ordering in ascending order
[Main]: Going up:   1 / 20
[Main]: Going up:   3 / 20
[Main]: Going up:   6 / 20
[Main]: Going up:   7 / 20
[Main]: Going up:   4 / 20
[Main]: Going up:   2 / 20
[Main]: Going up:   5 / 20
[Main]: Going up:   8 / 20
[Main]: Going up:   9 / 20
[Main]: Going up:  10 / 20
[Main]: Going up:  11 / 20
[Main]: Going up:  12 / 20
[Main]: Going up:  13 / 20
[Main]: Going up:  14 / 20
[Main]: Going up:  15 / 20
[Main]: Going up:  16 / 20
[Main]: Going up:  17 / 20
[Main]: Going up:  18 / 20
[Main]: Going up:  19 / 20
[Main]: Going up:  20 / 20
[Main]: Test image ordering in descending order
[Main]: Going down:  20 / 20
[Main]: Going down:  19 / 20
[Main]: Going down:  18 / 20
[Main]: Going down:  17 / 20
[Main]: Going down:  16 / 20
[Main]: Going down:  15 / 20
[Main]: Going down:  14 / 20
[Main]: Going down:  13 / 20
[Main]: Going down:  12 / 20
[Main]: Going down:  11 / 20
[Main]: Going down:  10 / 20
[Main]: Going down:   9 / 20
[Main]: Going down:   8 / 20
[Main]: Going down:   7 / 20
[Main]: Going down:   6 / 20
[Main]: Going down:   5 / 20
[Main]: Going down:   4 / 20
[Main]: Going down:   3 / 20
[Main]: Going down:   2 / 20
[Main]: Going down:   1 / 20

 

Izaak_Beekman
New Contributor II

I have observed similar problems with output to the terminal under coarray execution. I wonder whether this issue could be related to I/O buffering.

Kevin_D_Intel
Employee

I am able to reproduce the described behavior using the 14.0.3 release. With the latest IPS XE 2015 (15.0 compiler), I can reproduce it with the first, smaller example but have not yet reproduced it with the second. I may not have run it enough times to see the incorrect output, so I'll keep trying. I will also roll this up to the coarray developer for further analysis and let you know what they say.

Izaak_Beekman
New Contributor II

FWIW I saw this behavior with the most recent beta

FlyingHermes
New Contributor I

I should add that, so far, my coarray codes seem to work fine regarding synchronization, both internally and for I/O on external files.

So, I believe that, as Izaak pointed out, this may be related to io buffering to the terminal.

Izaak_Beekman
New Contributor II

Yeah, in my code, each print statement wouldn’t even be protected… some portions from one statement would get mixed in with those from the other.

FlyingHermes
New Contributor I

Are you experiencing this only for terminal output (unit 6), or also on external files?

Izaak_Beekman
New Contributor II

Haven’t tried on external files, just the terminal (6)

Kevin_D_Intel
Employee

After some additional testing, I reproduced the unexpected results for both programs when using the latest 15.0 compiler. I reported this to Development (see internal tracking id below) for additional investigation and will let you know what I hear back.

(Internal tracking id: DPD200360993)

(Resolution Update on 10/16/2014): Closed as "not a defect"; see subsequent replies below.

Kevin_D_Intel
Employee

A Developer asked whether flush helped, so I tried and it seems to.

Adding a FLUSH 6 statement before line 6 (sync all) in the first short program appears to ensure the SYNCHRONIZATION print appears last.

Similarly, for the second program, adding a FLUSH 6 statement before each sync all and sync images statement (except the sync all at line 7) seems to ensure the expected output ordering.
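
For readers who want to try this, here is a minimal sketch of the first program with the flush added. It assumes unit 6 is the preconnected stdout unit (as it is with ifort) and uses the equivalent FLUSH(6) form. This post only reports that the flush seems to help; the sketch should not be read as a guaranteed ordering.

```fortran
! Compilation (as in the original post): ifort -coarray main.f90; ./a.out
Program Main
  implicit none
  write(6,"('[Main]: Before sync all: This_Image() = ',i3,' / ',g0)") This_Image(), Num_Images()
  ! Ask the run-time to push this image's buffered stdout records out
  ! before the image enters the barrier.
  flush(6)
  sync all
  if ( This_Image() == 1 ) write(6,"('[Main]: **************************************** < SYNCHRONIZATION')")
End Program
```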

I posted the results to the internal tracking record and will let you know what I hear further from Development about this behavior.

Izaak_Beekman
New Contributor II

Huh, IIRC, I tried adding flush and was having issues, but maybe I should try again. Thanks for the update.

Kevin_D_Intel
Employee

More from the Developer, to quote them:

I believe this is the result of a misunderstanding about IO.   The standard specifies the in-program semantics of IO but not what the external file system does to implement those actions.  In particular, file systems on most platforms do a great deal of buffering of IO and can introduce delays.

The standard requires that:

1.   SYNC ALL statements order the segments of different images;
2.   In any one image, IO to a particular unit is ordered based on the order of the calls made to do IO.

But IO to different units in one image is not necessarily ordered at the file-system level and IO to different units in different images is definitely not intrinsically ordered.  The individual images have separate stdout and stderr units, so they are writing to different units in different images and the outputs are therefore not ordered.

Another way of saying this is that when an image WRITEs to unit 6 (stdout), it is writing to its own individual stdout stream.  SYNC ALL doesn't impose an ordering on the IO streams of the different images.
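
One way to check that SYNC ALL itself is doing its job, independently of terminal ordering, is to test the ordering through coarray data instead of through stdout. The following is only an illustrative sketch; the program and variable names (check_sync_with_data, stamp, n_bad) are hypothetical and not from the thread.

```fortran
program check_sync_with_data
  implicit none
  integer :: stamp[*]      ! hypothetical name: one integer per image
  integer :: i, n_bad

  stamp = this_image()     ! first segment: each image defines its own stamp
  sync all                 ! orders the first segment of every image before the next one

  ! Image 1 reads every image's stamp. If SYNC ALL orders the segments as the
  ! standard requires, every stamp is already defined when it is read here.
  if (this_image() == 1) then
    n_bad = 0
    do i = 1, num_images()
      if (stamp[i] /= i) n_bad = n_bad + 1
    end do
    write(*, "('[Main]: images with a missing stamp: ', i0)") n_bad
  end if
end program check_sync_with_data
```

Because only image 1 writes, and only once, the result does not depend on how the per-image stdout streams are merged.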

As a convenience, the Fortran run-time merges the stdout and stderr output from all the images and shows them on the terminal which originated the application.

This quote from the standard may help:

---[ NOTE 9.15 ]---

Even though OUTPUT_UNIT is connected to a separate file on each image, it is expected that the processor could merge the sequences of records from these files into a single sequence of records that is sent to the physical device associated with this unit, such as the user’s terminal. If ERROR_UNIT is associated with the same physical device, the sequences of records from files connected to ERROR_UNIT on each of the images could be merged into the same sequence generated from the OUTPUT_UNIT files. Otherwise, it is expected that the sequence of records in the files connected to ERROR_UNIT on each image could be merged into a single sequence of records that is sent to the physical device associated with ERROR_UNIT.

---[ end quote ]---

If the user wishes to have IO ordered in a predictable way, one way to do this is for one image to be the designated IO image.  All other images would send it details about the IO they wanted to do (e.g. via assignments to coarray elements containing strings and values) and it would do the IO. Then the IO will be ordered based on SYNC ALL.
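
The designated-I/O-image approach described above could look roughly like the sketch below. It is one possible reading of the suggestion, not code from the developer: the names lines, buf and line_len, the 80-character buffer length, and the choice of image 1 as the I/O image are all assumptions made for illustration.

```fortran
program designated_io_image
  use, intrinsic :: iso_fortran_env, only: output_unit
  implicit none
  integer, parameter :: line_len = 80                   ! assumed maximum record length
  character(len=line_len), allocatable :: lines(:)[:]   ! message slots, one per image
  character(len=line_len) :: buf
  integer :: me, ne, i

  me = this_image()
  ne = num_images()
  allocate(lines(ne)[*])   ! allocating a coarray synchronizes all images

  ! Format the message locally, then assign it to this image's slot on image 1.
  write(buf, "('[Main]: Before sync all: This_Image() = ',i3,' / ',i0)") me, ne
  lines(me)[1] = buf

  ! After this barrier, every slot on image 1 has been filled.
  sync all

  ! Only image 1 performs output, so all records go through a single stdout
  ! stream and appear in the order in which image 1 writes them.
  if (me == 1) then
    do i = 1, ne
      write(output_unit, '(a)') trim(lines(i))
    end do
    write(output_unit, "('[Main]: **************************************** < SYNCHRONIZATION')")
  end if
end program designated_io_image
```

With this layout the per-image lines come out in image order and the SYNCHRONIZATION line comes last, because a single image issues all the WRITE statements to one unit.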

Let me know your thoughts and whether you concur this does not appear to be a defect.
