Solved: Again about SELECT CASE nested in a loop – Stack overflow

e745200 · ‎02-26-2019

Hi

I came across a segmentation fault that occurs after a SELECT CASE inside a loop has been executed a certain number of times.

Similar cases submitted in the past were sometimes attributed to a buggy implementation of the pthread library when it was linked statically into the executable. I doubt this is the same case.

First, a long time has passed since then, and I would hope that the current pthread implementations have been fixed in the meantime (but this is just a hope).

Second, the issue does not occur when using the PGI compiler, which also uses the same system pthread library (but perhaps it uses it in a different way).

Third, the error also occurs when the pthread library is linked dynamically, and even when only a single thread is used.

What I can say, after a bunch of tests, is that it seems that the Intel compiler does not clean the stack when exiting from the SELECT CASE construct in a “rude way”, by branching outside of it.

I know that this may not conform to a structured programming style, but it does not violate the standard, no warning is issued at compile time, and it works with other compilers (PGI, gfortran).

Below you can find a simple test program. It goes wrong with Intel Fortran Compilers up to Version 19.0.0.117 (and I did not find anything about this issue in the Release notes of the following updates.)
To make the segmentation fault show up earlier, simply reduce the stack size (e.g.: ulimit -s 1024).

Note that the issue does not occur if the case-expr is a simple scalar variable, while it occurs when it is an element of an array (and, I guess, whenever it is an expression).

The number of loop iterations completed before the crash tracks almost exactly the number of branches whose target lies outside the construct. To see this, just replace some of the GOTOs with simple CONTINUEs.

I suspect that a temporary variable is created on the stack to hold the result of the case-expr, and that it is not freed when the construct is exited by jumping outside, without “politely passing through” the END SELECT statement.

I would be happy if someone could tell me if I’m wrong, and/or if there are specific guidelines (additional to the Standard and the reference manual) about the SELECT CASE construct.

In the meantime, since the construct seems unreliable, I had to go back to IF/ELSE IF ladders, with an annoying loss of readability.
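
For illustration only, the IF/ELSE IF fallback mentioned above could look like the following reduced sketch. It is not the original project code: the program name if_ladder, the three-word list and the short loop are assumptions made to keep the example small.

```fortran
program if_ladder
  implicit none
  integer, parameter :: ws = 32, nw = 3
  character(ws), parameter :: wd_0 = 'Word 0', wd_1 = 'Word 1', wd_2 = 'Word 2'
  character(ws) :: words(0:nw-1)
  integer :: i, j

  words = [ wd_0, wd_1, wd_2 ]

  do i = 0, 5
    j = mod(i, nw)
    ! Same dispatch as the SELECT CASE version, written as a ladder.
    ! The comparisons use the array element directly.
    if (words(j) == wd_0) then
      go to 100
    else if (words(j) == wd_1) then
      go to 101
    else
      print *, 'word not found: ', trim(words(j))
    end if
100 continue
101 continue
  end do

  print *, 'ended'
end program if_ladder
```

The ladder keeps the GOTO targets of the legacy code untouched, at the cost of repeating the tested expression in every branch.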

Thanks.

      integer, parameter :: ws = 32 
      character(ws), parameter :: wd_0   = 'Word 0  '
      character(ws), parameter :: wd_1   = 'Word 1  '
      character(ws), parameter :: wd_2   = 'Word 2  '
      character(ws), parameter :: wd_3   = 'Word 3  '
      character(ws), parameter :: wd_4   = 'Word 4  '
      character(ws), parameter :: wd_5   = 'Word 5  '
      character(ws), parameter :: wd_6   = 'Word 6  '
      character(ws), parameter :: wd_7   = 'Word 7  '

      integer, parameter :: nw = 8
      character(ws) :: words(0:nw-1) 
      character(ws) :: myword  
      integer :: i, j 
      
      words = [ wd_0, wd_1, wd_2, wd_3,   &
                wd_4, wd_5, wd_6, wd_7  ] 

      do i = 0, 50000000
      j = mod(i,nw)
      myword = words(j)
      if ( mod(i,10000) == 0 ) print *, i, j, myword
      select case(words(j))
      case ( wd_0 ) ; go to 100
      case ( wd_1 ) ; go to 101
      case ( wd_2 ) ; go to 102
      case ( wd_3 ) ; go to 103
      case ( wd_4 ) ; go to 104
      case ( wd_5 ) ; go to 105
      case ( wd_6 ) ; go to 106
      case ( wd_7 ) ; go to 107
      case default  ; print *, 'word not found: ', myword
      end select

 100  go to 999
 101  go to 999
 102  go to 999
 103  go to 999
 104  go to 999
 105  go to 999
 106  go to 999
 107  go to 999
 999  continue

      end do  

      print *, 'ended' 
      end

Steve_Lionel · ‎02-26-2019

There's nothing wrong with this program - it is not branching into a block.

I am intensely puzzled at what the compiler is doing with your code as given. It is making a temporary copy of words(j) each time through the select case, and then calling for_cpstr (I guess this is a string compare routine) for each case. It then neglects to pop the temp off the stack before exiting the select construct. Why it makes a copy, I'm not sure, but even with optimization it does this.

A simple workaround is to replace words(j) in the select case with myword, since you've already assigned that from words(j). I would urge you to report this to Intel using the Online Service Center.
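
A reduced sketch of that workaround, under stated assumptions: the program name select_on_scalar, the counter nmiss, the three-word list and the loop bound are all made up for this example. Whether selecting on the scalar avoids the stack growth is what this thread reports for ifort 19.0; the sketch itself does not prove it.

```fortran
program select_on_scalar
  implicit none
  integer, parameter :: ws = 32, nw = 3
  character(ws), parameter :: wd_0 = 'Word 0', wd_1 = 'Word 1', wd_2 = 'Word 2'
  character(ws) :: words(0:nw-1), myword
  integer :: i, nmiss

  words = [ wd_0, wd_1, wd_2 ]
  nmiss = 0

  do i = 0, 999999
    ! Copy the array element into a scalar first ...
    myword = words(mod(i, nw))
    ! ... and select on the scalar instead of on words(j).
    select case (myword)
    case (wd_0) ; go to 100
    case (wd_1) ; go to 101
    case default ; nmiss = nmiss + 1
    end select
100 continue
101 continue
  end do

  print *, 'ended, default case taken', nmiss, 'times'
end program select_on_scalar
```

Running it once as written and once with select case (words(mod(i, nw))) under a reduced stack limit is one way to compare the two forms on a given compiler.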

jimdempseyatthecove · ‎02-26-2019

There may be a (an Intel) requirement that you not transfer outside the select case/end select in a similar manner as the requirement for BLOCK.

*** Note, in the IVF V19 documentation of BLOCK there appears to be a contradiction:

...No transfer of control into a block from outside the block is allowed, except for the return from a procedure call. Transfers within a block or out of the block are allowed....You can only branch to an END BLOCK statement from within its BLOCK construct....

If the same restrictions (behavior) apply to select case, then for your exemplar consider:

      case default  ; print *, 'word not found: ', myword
 999  continue
      end select
      cycle

 100  go to 999
 101  go to 999
 102  go to 999
 103  go to 999
 104  go to 999
 105  go to 999
 106  go to 999
 107  go to 999

      end do

Jim Dempsey

e745200 · ‎02-26-2019

Hi, Jim. Thanks for your answer.

However, I cannot adopt your proposal: the aim of my project is moving legacy code to a fully standard compliant form, to guarantee the maximum portability, among platforms and compilers.
Moving the 999-labelled line into the SELECT CASE construct actually works (with the Intel compiler, though many warnings are issued), but it certainly violates the standard. So the workaround (rather, a self-imposed restriction) of using myword instead of words(j) would be preferable.

For example, using your non-standard-compliant solution and compiling using gfortran with the flag -std=f2003 raises an error (instead of the usual warning) at compile time: that flag is tighter than the corresponding Intel's -stand f03.

Actually, the standard reads :

It is permissible to branch to an end-select-stmt only from within its SELECT CASE construct.
and
It is permissible to branch to an end-block-stmt only from within its BLOCK construct.

which clearly says that one can branch to the end-*-stmt, but only from within the construct (and not from outside).
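
A minimal sketch of what the first quoted rule permits, assuming a hypothetical program branch_to_end_select with label 10 placed on the END SELECT statement:

```fortran
program branch_to_end_select
  implicit none
  integer :: k

  do k = 1, 3
    select case (k)
    case (1)
      print *, 'k = 1: reaches END SELECT normally'
    case (2)
      ! Permitted: the branch to the labelled END SELECT
      ! originates inside the same SELECT CASE construct.
      go to 10
    case default
      print *, 'k =', k, ': default case'
10  end select
  end do
  ! A "go to 10" placed here, outside the construct, would break the rule.
end program branch_to_end_select
```

The rule restricts only branches that target the end-select-stmt itself; it says nothing against a branch from inside the construct to a label outside it, which is what the test program does.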

The phrasing you cite from the Intel documentation:

"You can only branch to an END BLOCK statement from within its BLOCK construct"
has some amount of ambiguity: you can read it as the standard above or as:
"from within a BLOCK [SELECT CASE] construct, you can only branch to the END BLOCK [SELECT] statement"
(and only in this case the "contradiction" you mentioned is real)

I guess that the Intel developers had the latter interpretation in mind when implementing the SELECT CASE construct (I haven't tested the BLOCK construct). I guess they thought, as you said too, that one should not branch out of the construct. But I think this is a misinterpretation of the standard, or at least an additional restriction. Moreover, if jumping out of the block were not "allowed" by the Intel compiler, a warning or error should be issued at compile time, and none is.

The hint you gave me seems to confirm that the stack is cleaned only when "passing through" the END SELECT statement and not whenever and however you leave the SELECT CASE construct, as it should be.

So I think the Intel compiler should be fixed, so that it does not impose strange, undocumented and unsignalled restrictions beyond the standard.

jimdempseyatthecove · ‎02-26-2019

>>The hint you gave me seems to confirm that the stack is cleaned only when "passing through" the END SELECT statement and not whenever and however you leave the SELECT CASE construct, as it should be.

That appears to be the case.

The goto out of the select block should clean up the stack, and my hack of placing label 999 inside the select block should be an error. If the goto out of the block cleaned up the stack temporary, as it should, then a goto back into the select, say to a CASE statement, would result in a test against an undefined stack temporary, and potentially in popping a temporary that is no longer on the stack.

My hack should be avoided.

In your modernization efforts, a potential technique to consider is:

      select case(words(j))
      case ( wd_0 ) ; call sub100
      case ( wd_1 ) ; call sub101
      case ( wd_2 ) ; call sub102
      case ( wd_3 ) ; call sub103
      case ( wd_4 ) ; call sub104
      case ( wd_5 ) ; call sub105
      case ( wd_6 ) ; call sub106
      case ( wd_7 ) ; call sub107
      case default  ; print *, 'word not found: ', myword
      end select
! no need for 999 continue here
      end do

      print *, 'ended'
      contains
      subroutine sub100
      ... ! whatever followed tag 100
      end subroutine sub100
      ...
      subroutine sub107
      ... ! whatever followed tag 107
      end subroutine sub107
      end

There may be other issues with using the above in the actual program; however, it will let you reuse the code placed in the contained procedures, and do so without making a comparison of program output (old vs. new) unusable.

Jim Dempsey

FortranFan · ‎02-26-2019

e745200 wrote:
.. the aim of my project is moving legacy code to a fully standard compliant form, to guarantee the maximum portability, among platforms and compilers.
.. the workaround (rather, a self-imposed restriction) of using myword instead of words(j) would be preferrable. ..

@e745200,

Are you trying to replace computed GOTO statements in the legacy code with SELECT CASE headers only in your new standard-compliant form, or something similar? I ask because the GOTOs in the code you show do not make sense. If, in your 'actual' code, you simply use SELECT CASE as the switch construct it has been since its introduction in the Fortran 90 standard revision, without further GO TO branches (e.g., GO TO 100, then an immediate jump at statement 100 with another GO TO 999), do you still face issues? That is, why not refactor the legacy code to do away with GOTOs altogether?

C:\Temp>type p.f90
   integer, parameter :: ws = 32
   character(ws), parameter :: wd_0   = 'Word 0  '
   character(ws), parameter :: wd_1   = 'Word 1  '
   character(ws), parameter :: wd_2   = 'Word 2  '
   character(ws), parameter :: wd_3   = 'Word 3  '
   character(ws), parameter :: wd_4   = 'Word 4  '
   character(ws), parameter :: wd_5   = 'Word 5  '
   character(ws), parameter :: wd_6   = 'Word 6  '
   character(ws), parameter :: wd_7   = 'Word 7  '

   integer, parameter :: nw = 8
   character(ws) :: words(0:nw-1)
   character(ws) :: myword
   integer :: i, j

   words = [ wd_0, wd_1, wd_2, wd_3,   &
      wd_4, wd_5, wd_6, wd_7  ]

   do i = 0, 50000000
      j = mod(i,nw)
      myword = words(j)
      if ( mod(i,10000) == 0 ) print *, i, j, myword
      select case(words(j))
         case ( wd_0 )
         case ( wd_1 )
         case ( wd_2 )
         case ( wd_3 )
         case ( wd_4 )
         case ( wd_5 )
         case ( wd_6 )
         case ( wd_7 )
         case default  ; print *, 'word not found: ', myword
      end select
   end do

   print *, 'ended'
end

C:\Temp>ifort p.f90
Intel(R) Visual Fortran Intel(R) 64 Compiler for applications running on Intel(R) 64,
Version 19.0.2.190 Build 20190117
Copyright (C) 1985-2019 Intel Corporation.  All rights reserved.

Microsoft (R) Incremental Linker Version 14.16.27027.1
Copyright (C) Microsoft Corporation.  All rights reserved.

-out:p.exe
-subsystem:console
p.obj

C:\Temp>p.exe
           0           0 Word 0
       10000           0 Word 0
..
    50000000           0 Word 0
 ended

C:\Temp>

e745200 · ‎02-26-2019

Hi, Steve, thanks for your answer.
I'm glad you got the exact point I was trying to explain.
Yes, as I said, using the scalar variable myword does not raise the problem (I guess in this case nothing is defined on the stack).
That assignment is there to make it easier to play with variations of the test code, to try to understand the behavior of the compiler.
After your comment I'm more confident about submitting this case as a possible (?!) compiler error to the Online Service Center.
Thanks again.

e745200 · ‎02-26-2019

@jimdempseyatthecove, @FortranFan, thanks again.

Perhaps it was not clear that the test program was built in this way just to point out the strange behavior of the compiler in managing the stack created to deal with the SELECT CASE construct.
The several GOTOs were there to show the proportionality between the number of jumps and the number of iterations needed to saturate the stack (replacing them with CONTINUEs, you can change the ratio jumps/iterations).
I can understand it may seem senseless.
Nevertheless, using it I had already found that replacing GOTOs with internal subroutines works fine, I guess because the SELECT CASE construct is exited in "the clean way" through the END SELECT, as Jim also suggested.

As far as my project is concerned, my final goal is in fact having the code fully restructured with no GOTO left.
I need to make a few changes at a time and perform consistent testing for each change session, sessions which are often distant in time.
In the meanwhile, even with the code in an intermediate state (i.e. not fully restructured yet) the program has to work.
The specific procedure I was working on when I found this problem is the heart of a very old interpreter, and I'm trying to unravel a rather complex piece of spaghetti code, to restructure it and make it readable and - hopefully - more understandable.
Eventually, the CASE blocks in the SELECT CASE construct will certainly consist of a sequence of subroutine calls.
So your - very appreciated - suggestions are already in my plan. I appreciate them also because they make me confident that my plan is sound and correct.

Thanks again for your precious attention and cooperation.
GM

Ron_Green · ‎02-27-2019

We have a bug report to our developers on this now. Thanks e745200 for a nice reproducer!

Again about SELECT CASE nested in a loop – Stack overflow