Intel® Fortran Compiler

Program with simple coarrays hangs

Arjen_Markus
Honored Contributor II
Hello,

in my quest to find a solution for the problem I posted in the thread "Internal compiler error with lock/unlock", I stumbled upon another problem, again using Intel Fortran 12.0.3.

Here is the program:

! checkscalar.f90 --
!     Check some odd (erroneous) behaviour with scalar coarrays
!
program checkscalar
    implicit none

    logical, codimension[*] :: new_results
    logical, codimension[*] :: ready

    ready       = .false.
    new_results = .true.   ! Indicates the image has results available

    write(*,*) 'Image', this_image(), new_results

    sync all

    write(*,*) 'Image2', this_image(), new_results

    !
    ! Collect the found primes in image 1, create new tasks
    ! for all images
    !
    do while ( .not. ready )
        if ( this_image() == 1 ) then
            call collect_results
        endif
        call sleepqq(1)
    enddo

contains

!
! Subroutine to collect the results from all
! images (run by image 1)
!
subroutine collect_results
    integer :: i
    integer :: np
    integer :: maxindex

    do i = 1,num_images()
        write(*,*) 'Examine', i, new_results[i]
    enddo

    do i = 1,num_images()
        ready[i] = .true.
    enddo
end subroutine collect_results

end program checkscalar


    It does not do much: in image 1 I print the value of new_results and when the
    loop is finished I set the flag ready in all images so that they will stop. At least
    that is my intention.

    The output of one run with this program is:

    Image 6 T
    Image 1 T
    Image 3 T
    Image2 3 T
    Image 7 T
    Image2 7 T
    Image2 6 T
    Image 5 T
    Image 2 T
    Image2 2 T
    Image 4 T
    Image 8 T
    Image2 4 T
    Image2 1 T
    Examine 1 T
    Image2 8 T
    Image2 5 T

    (after a few seconds of no progress at all, I stopped the program)

    So, all 8 images start, image 1 enters the loop, prints the value of new_results on that
    image, and then hangs - it does not get beyond this!

    Does anyone have any clues? Am I doing something wrong? (Possible, of course, but the
    program is so simple that I cannot believe that.)

    Regards,

    Arjen

    mecej4
    Honored Contributor III
    I know little about coarrays, but this code segment made me suspicious:

    [fortran]do while ( .not. ready )
      if ( this_image() == 1 ) then
         call collect_results
      endif
      call sleepqq(1)
    enddo
    [/fortran]
    If the loop is entered and this_image() returns a value different from 1, the loop will do nothing but consume time endlessly. What event would cause the IF block to be executed in this case?

    With a dual-core CPU, I get the output

    [bash] Image           1 T
     Image           2 T
     Image2           1 T
     Examine           1 T
     Image2           2 T
    [/bash]
    and then the program gets stuck in the DO WHILE loop. The last line printed has this_image() equal to 2, so the loop will keep calling SLEEPQQ and do nothing else.
    Arjen_Markus
    Honored Contributor II

    The function this_image() returns the unique number belonging to the image.

    The intention of the above fragment is to have image 1 (or thread 1 or ...) examine the
    results produced by all images. In the reduced program I posted there are no actual results,
    but the variable that indicates there are is set to .true. at the start. The only task to be done
    in the routine collect_results is to loop over the images, print the value of new_results on that
    image and then to set the variable ready in all images so that the do-while loop stops.

    In short:
    All images except one should wait for image 1 to set "ready" and then terminate the loop
    and stop the program altogether.

    Unfortunately, image 1 seems to get stuck reading the value of new_results on image 2
    and then the program hangs.

    Regards,

    Arjen

    jimdempseyatthecove
    Honored Contributor III
    Regardless of whether the images /= 1 make it out of the loop, image 1 should have been able to run through the "Examine" part for all images. Arjen is pointing out that image 1 is hanging when it should not be.

    Jim Dempsey
    Arjen_Markus
    Honored Contributor II
    No news on this issue?

    Regards,

    Arjen
    jimdempseyatthecove
    Honored Contributor III
    Arjen,

    Sorry I cannot help further, as I do not currently have PS 2011 XE, but I hope to have it shortly (in a few weeks). At that point I should be able to run your example program (although not on Mac). Could you describe your test environment: processor(s), single system or multi-system, connection, OS(es), bitness (32/64) and version of IVF (Fortran Composer)? This would help me in trying to replicate your problem.

    Jim Dempsey

    Steven_L_Intel1
    Employee
    Arjen,

    You should not use things such as SLEEPQQ in coarray applications. Use the language-provided synchronization features instead. For example:

    [fortran]! checkscalar.f90 --
    ! Check some odd (erroneous) behaviour with scalar coarrays
    !
    program checkscalar
    implicit none
    logical, codimension[*] :: new_results
    logical, codimension[*] :: ready

    ready = .false.
    new_results = .true. ! Indicates the image has results available

    write(*,*) 'Image', this_image(), new_results
    sync all
    write(*,*) 'Image2', this_image(), new_results
    !
    ! Collect the found primes in image 1, create new tasks
    ! for all images
    !
    sync all
    if ( this_image() == 1 ) then
        call collect_results
    endif

    contains
    !
    ! Subroutine to collect the results from all
    ! images (run by image 1)
    !
    subroutine collect_results
        integer :: i
        integer :: np
        integer :: maxindex
        do i = 1,num_images()
            write(*,*) 'Examine', i, new_results[i]
        enddo
    end subroutine collect_results
    end program checkscalar[/fortran]
    In particular, you can't count on the local "ready" being updated unless there is a synchronization point. That is probably what is doing you in.
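To illustrate the point about synchronization, here is a minimal sketch (not taken from the thread; the program name flag_after_sync is made up for illustration). It assumes the intent is for image 1 to define a flag on every image and for each image to read its own copy only after an image control statement has ordered the write before the read:

```fortran
! Minimal sketch, assuming a single flag written by image 1.
! The program name and message text are illustrative only.
program flag_after_sync
    implicit none
    logical :: ready[*]
    integer :: i

    ready = .false.
    sync all                      ! every image has initialised its own ready

    if ( this_image() == 1 ) then
        do i = 1, num_images()
            ready[i] = .true.     ! image 1 defines ready on each image
        enddo
    endif

    sync all                      ! orders the writes above before the reads below

    write(*,*) 'Image', this_image(), 'sees ready =', ready
end program flag_after_sync
```

Without the second sync all, the write by image 1 and the read by another image would fall in unordered segments, and the language gives no guarantee about which value the reading image sees.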
    Arjen_Markus
    Honored Contributor II
    Hi Jim,

    I run this program on a 64-bit Linux machine with 8 cores.
    I am using the free version of Intel Fortran for Linux, version number 12.0.3.

    Regards,

    Arjen
    Arjen_Markus
    Honored Contributor II

    Hi Steve,

    thanks for these comments - still learning my way around coarrays. I will experiment with this.

    Regards,

    Arjen

    jimdempseyatthecove
    Honored Contributor III
    Arjen,

    >>I run this program on a 64-bits Linux machine with 8 cores

    Other than as a learning experience with coarrays, there is little reason to use coarrays on such a system: OpenMP would provide better performance and easier programming. The only advantage of using coarrays on that system is that the static data size could be larger, because the images run in multiple process spaces and the local data (stack and process-local allocatables) are acquired from different process virtual address spaces. And lack of space for local data would only be a concern if you were running 32-bit applications.

    Jim Dempsey

    Steven_L_Intel1
    Employee
    While I'll agree that you'd get better performance nowadays with OpenMP on such a system, there is a benefit to using coarrays in that the language rules are simpler and the program will scale to clusters without coding changes. I don't agree that OpenMP is easier, but coarrays definitely have a different model and you have to get your head around that first.
    jimdempseyatthecove
    Honored Contributor III
    Amending my last post: coarrays potentially make sense on your system (assumption on my part) if you have the new Intel Many Integrated Core (e.g. Knights Ferry) and if coarray programming is more efficient than alternative means.

    Jim Dempsey
    Arjen_Markus
    Honored Contributor II
    The main reason for experimenting with coarrays is indeed understanding the programming model.

    I tried Steve's version of my program and that works fine. However, when I introduced a loop with a "sync all" statement, only the first thread finishes - the ready variable seems not to be updated on any other image.

    Here is the program:

    ! checkscalar.f90 --
    !     Check some odd (erroneous) behaviour with scalar coarrays
    !
    program checkscalar
        implicit none

        logical, codimension[*] :: new_results
        logical, codimension[*] :: ready

        ready       = .false.
        new_results = .true.   ! Indicates the image has results available

        write(*,*) 'Image', this_image(), new_results

        sync all

        write(*,*) 'Image2', this_image(), new_results

        !
        ! Collect the found primes in image 1, create new tasks
        ! for all images
        !
        do while ( .not. ready )
            sync all
            if ( this_image() == 1 ) then
                call collect_results
            endif
            sync all
        enddo

        write(*,*) 'Image ',this_image(), ' done'

    contains

    !
    ! Subroutine to collect the results from all
    ! images (run by image 1)
    !
    subroutine collect_results
        integer :: i
        integer :: np
        integer :: maxindex

        do i = 1,num_images()
            write(*,*) 'Examine', i, new_results[i]
        enddo

        do i = 1,num_images()
            ready[i] = .true.
        enddo
    end subroutine collect_results

    end program checkscalar


    What I see in the output is that image 1 finishes after examining all images and the others
    never produce the message "Image n done", despite the "sync all" statements.

    Regards,

    Arjen

    Arjen_Markus
    Honored Contributor II
    Found the solution!

    I replaced the sync all statements in the do while loop by sync images statements:

    do while ( .not. ready )
        if ( this_image() == 1 ) then
            call collect_results
            sync images( * )
        else
            sync images( 1 )
        endif
    enddo

    and then the program finishes nicely. Now with this solution I can continue my experiments
    with the actual program(s).

    Regards,

    Arjen

    jimdempseyatthecove
    Honored Contributor III
    Arjen,

    A potential error (not your error) is if the compiler optimization stripped the second sync all in your do loop (i.e. sees one at top and bottom and assumes redundant). If this is the case then potentially the sync all following image 1's collect_results is never called.

    Two things to try

    a) place sync all after do loop
    b) remove first sync all in do loop (forcing sync all to follow collect_results)

    Wouldn't hurt to test both scenarios as you want to discover what is going on as opposed to simply getting the code to work (i.e. don't stop testing if first test succeeds).

    Jim Dempsey
    Steven_L_Intel1
    Employee
    The optimizer would never remove SYNC statements.

    Rather than using a covariable "ready" and testing it in a loop, use the synchronization tools provided by the language, including locks, critical sections and the various forms of SYNC. It would be an unusual coarray application that needed to use a loop for this.

    The biggest danger of learning coarrays is assuming you can simply translate concepts from shared-memory programming in the past. If you're an MPI programmer, however, it may be an easier transition as coarrays sort of look like one-way MPI.
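As a rough illustration of that advice, the sketch below replaces the polled flag with paired SYNC IMAGES statements. It is not code from the thread: the program name wait_without_flag, the integer coarray results and the dummy value stored in it are assumptions made for the example.

```fortran
! Minimal sketch, assuming each image produces one integer result
! that image 1 collects. No "ready" flag and no polling loop.
program wait_without_flag
    implicit none
    integer :: results[*]
    integer :: i

    results = 10 * this_image()   ! stand-in for real work

    if ( this_image() == 1 ) then
        sync images( * )          ! wait until every other image has its result
        do i = 1, num_images()
            write(*,*) 'Examine', i, results[i]
        enddo
        sync images( * )          ! tell the other images that collection is done
    else
        sync images( 1 )          ! signal image 1: my result is available
        sync images( 1 )          ! wait for image 1 to finish collecting
    endif

    write(*,*) 'Image ', this_image(), ' done'
end program wait_without_flag
```

Each sync images( * ) on image 1 pairs with one sync images( 1 ) on every other image, so the counts must match on both sides; here each side executes two.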
    Arjen_Markus
    Honored Contributor II
    I tried these things, and I also tried "sync all( stat = istat )" (with a write statement) to convince
    the compiler that this statement is required, but the result was still the same.

    Adding a write statement here and there does clarify _what_ is happening, but not why.
    Image 1 leaves the loop, but all others remain in the loop. It definitely looks as if the "ready"
    coarray is not updated. With the "sync images" statement it is. (I actually got my original
    program to work that way.)

    Regards,

    Arjen
    Steven_L_Intel1
    Employee
    There's an open bug report "Write to covariable in other image is not reflected in other image's local copy" which I think is the same as your issue. To see if that's true, try building with -Od and see if that changes the behavior.
    Steven_L_Intel1
    Employee
    I tried your program with optimization and it seemed to work ok. Another thing to try is to test ready[this_image()] which also avoids the bug.
    Arjen_Markus
    Honored Contributor II
    With the statement "do while ( .not. ready[this_image()] )" it does work, and turning off optimisation with
    -O0 works too.
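For reference, a condensed sketch of the looped program with this workaround applied might look as follows. It is a reconstruction for illustration, not the exact code that was run: the program name checkscalar_workaround is made up, and the body of collect_results is inlined to keep the example short.

```fortran
! Minimal sketch of the workaround, assuming the same two logical
! coarrays as in the program above.
program checkscalar_workaround
    implicit none
    logical :: new_results[*]
    logical :: ready[*]
    integer :: i

    ready       = .false.
    new_results = .true.
    sync all

    ! Workaround: test the coindexed ready[this_image()] rather than
    ! the plain local ready
    do while ( .not. ready[this_image()] )
        sync all
        if ( this_image() == 1 ) then
            do i = 1, num_images()
                write(*,*) 'Examine', i, new_results[i]
            enddo
            do i = 1, num_images()
                ready[i] = .true.  ! image 1 sets the flag on every image
            enddo
        endif
        sync all                   ! orders the writes before the next loop test
    enddo

    write(*,*) 'Image ', this_image(), ' done'
end program checkscalar_workaround
```

Every image executes the same number of sync all statements, so the loop terminates on all images after one pass once the flag is seen.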

    Well, that is good to know!

    Regards,

    Arjen
    jimdempseyatthecove
    Honored Contributor III
    At least you have a reasonable workaround to get you past this issue.
    When I introduce workarounds I also insert a comment with a common, unique signature:

    ! **hack**
    ! using "( .not. ready[this_image()] )" as opposed to "(.not. ready)"
    ! due to compiler bug

    Then later on, as I get new versions of the compiler, I can locate all the hacks and test whether the bug has been fixed.
    Also, when it is fixed, I leave the code in but conditionalized out. You never know if a bug will resurface.

    This would seem to indicate that sync all is broken, or at least appears broken under this circumstance.

    Jim Dempsey