Solved: Given the operations in the

Eric_L_ · ‎12-02-2016

Hi,

I'm rookie in using Fortran. I'm using IVF 2013 and VS 2005 to compile and test a console application written by fortran. The code is intended to transform the residues and poles of a rational function into the form of numerator and denominator. Following is my code:

SUBROUTINE res2num(res,poles,k,norder,den,num)
!*********************************************************
! Convert a rational function from pole-residue form to numerator-denominator form
! k is the constant term
!*********************************************************
    IMPLICIT NONE
    COMPLEX(KIND=8),DIMENSION(:),INTENT(IN)::res,poles
    COMPLEX(KIND=8),INTENT(IN)::k
    INTEGER,INTENT(IN)::norder
    COMPLEX(KIND=8)::num(norder+1),den(norder+1),numtmp(norder)
    COMPLEX(KIND=8)::tmp(2)
    INTEGER::i,j,cntSkip,sizenumtmp   
    
    den=1d0
    num=0d0   
    tmp=0d0
    ! Get the denominator
    DO i=1,norder
        tmp(1)=1
        tmp(2)=-poles(i)
        CALL conv(tmp,den(norder+2-i:norder+1),2,i,den(norder+1-i:norder+1))
    ENDDO
    ! Get the numerator
    DO i=1,norder
        numtmp=1d0
        cntSkip=0d0
        DO j=1,norder
            sizenumtmp=SIZE(numtmp,1)
            IF(j/=i) THEN
              tmp(1)=1
              tmp(2)=-poles(j)
              CALL conv(tmp,numtmp(norder+1-j+cntSkip:norder),2,j-cntSkip,numtmp(norder-j+cntSkip:norder))
            ELSE
                cntSkip=1
             ENDIF             
        ENDDO      
        num(2:norder+1)=res(i)*numtmp(1:norder)+num(2:norder+1)
    ENDDO    
    num(2:norder+1)=num(2:norder+1)+den(2:norder+1)*k
    num(1)=k*den(1)
        RETURN
ENDSUBROUTINE res2num

And the subroutine conv computes the convolution between two 1D vectors vecA and vecB, and outputs another 1D vector whose dimension equals to SIZE(vecA，1）+SIZE(vecB，1）-1. This conv subroutine has been tested OK if it is directly called within the main program.

However, when debugging the subroutine res2num, there is always a run time error, saying the stack was corrupted. I've tried to diagnose which line of the code cause the error and it turns out to be the line

CALL conv(tmp,numtmp(norder+1-j+cntSkip:norder),2,j-cntSkip,numtmp(norder-j+cntSkip:norder))

As the subroutine conv is already tested OK, I'm quite confused why there is such an error

p.s. I've already increase the stack reserve size to 999999999 in the project property windows.

any suggestions?

IanH · ‎12-05-2016

In subroutine conv, convOut (last argument) has a declared size of lenA + lenB, i.e. the sum of the third and fourth arguments.

Given the corresponding actual arguments, that means lenA = 2, lenB = j-cntSkip and convOut is numtmp(norder-j+cntSkip:norder).

LenA + LenB is therefore 2 + j - cntSkip.

But the size of the last actual argument is norder - (norder - j + cntSkip) + 1, or 1 + j - cntSkip.

They are not the same. The actual argument is smaller than the dummy, which is not permitted. The size mismatch then results in the subroutine trampling nearby stuff on the stack.

View solution in original post

IanH · ‎12-05-2016

Given the operations in the code, some sort of out-of-bounds array write/shape mismatch would be my first guess.

Are you running with all runtime diagnostics enabled? (Note though, that I don't think runtime diagnostics catch shape mismatches.)

Without a complete compilable and runnable example it is difficult to say more.

Line 26 has an assignment of a double precision constant to an integer variable that I'd consider questionable style.

What's sizenumtmp for?

FortranFan · ‎12-05-2016

Eric L. wrote:

.. I'm rookie in using Fortran...

In addition to the comments by IanH, hope you're familiar with some of the Fortran resources, particularly Dr Fortran blogs such as

https://software.intel.com/en-us/forums/intel-visual-fortran-compiler-for-windows/topic/275071#comment-1548440

https://software.intel.com/en-us/blogs/2012/01/05/doctor-fortran-gets-explicit-again

https://software.intel.com/en-us/blogs/2009/03/31/doctor-fortran-in-ive-come-here-for-an-argument

As you will notice above, 'USE'ing procedures from Fortran MODULEs can go a long way in diagnosing many of the problems. It's unclear from your code snippet if you are doing so; if not, you may want to look into this.

Also, if you can strive to generate small reproducers of the issues you encounter, you will be able to solve many of them yourself. And as you share them on forums such as these, you can get faster responses than otherwise.

Good luck,

mecej4 · ‎12-05-2016

Your subroutine takes some arguments with dimension (:). The calling subroutine needs to be made aware of that. In the language of the standard, an explicit interface is required.

TimP · ‎12-05-2016

ianh wrote:

Line 26 has an assignment of a double precision constant to an integer variable that I'd consider questionable style.

I agree that I would be concerned about whether the compiler skips the integer conversion and propagates the double on down. No point in providing extra opportunities to expose compiler bugs. Then, the passing of array section to a subroutine is a likely site for allocation of temporary and possible stack overflow.

Eric_L_ · ‎12-05-2016

ianh wrote:

Given the operations in the code, some sort of out-of-bounds array write/shape mismatch would be my first guess.

Are you running with all runtime diagnostics enabled? (Note though, that I don't think runtime diagnostics catch shape mismatches.)

Without a complete compilable and runnable example it is difficult to say more.

Line 26 has an assignment of a double precision constant to an integer variable that I'd consider questionable style.

What's sizenumtmp for?

Thanks for your comments ianh. sizenumtmp is a redundant varibale which should be removed. In addition, it seems the compiler will automatically cast the datatype to integer for Line 26. After spending a whole day for debugging, eventually I found this stack corruption was indeed caused by the subroutine conv. The code of conv are shown below:

SUBROUTINE conv(a,b,lenA,lenB,convOut)
!*********************************************************
!convolution of two polynomials
! a, b, conv stores the coefficient in descending order, i.e. a(1) stores the coefficient for its highest order in polynomial a
! lenA: Length of vector a
! lenB: Length of vector b
!*********************************************************
    IMPLICIT NONE
    COMPLEX(KIND=8),DIMENSION(:),INTENT(IN)::a,b
    COMPLEX(KIND=8)::a2(lenA),b2(lenB)
    INTEGER,INTENT(IN)::lenA,lenB
    COMPLEX(KIND=8)::convOut(lenA+lenB)
    INTEGER::i,j,lenConv
    COMPLEX(KIND=8)::aVal,bVal
    
    a2=a
    b2=b
    lenConv=lenA+lenB-1    
    convOut(1:lenConv)=0d0
!    DO i=1,lenConv
!        convOut(i)=0d0
!    ENDDO
    DO i=1,lenConv
        DO j=0,i-1
            IF ((j+1)>lenA) THEN
            aVal=0d0
            ELSE
            aVal=a2(j+1)
            ENDIF
            
            IF ((i-j)>lenB) THEN
            bVal=0d0
            ELSE
            bVal=b2(i-j)
            ENDIF            
            convOut(i)=aVal*bVal+convOut(i)           
        ENDDO
    ENDDO
    RETURN 
ENDSUBROUTINE conv

Above the is revised version that eliminates the runtime error. Originally, the code for conv is almost the same except in line 19 I wrote:

convOut=0d0. Although the bug is fixed, I do not quite understand why. Any suggestions on that?

Eric_L_ · ‎12-05-2016

Tim P. wrote:

Quote:

ianh wrote:

Line 26 has an assignment of a double precision constant to an integer variable that I'd consider questionable style.

I agree that I would be concerned about whether the compiler skips the integer conversion and propagates the double on down. No point in providing extra opportunities to expose compiler bugs. Then, the passing of array section to a subroutine is a likely site for allocation of temporary and possible stack overflow.

Thanks for your suggestion Tim. During the debugging, I checked the value of cntSkip, the compiler indeed cast the data type from real to integer. But it is a good point that I should not to rely on the conversion provided by the compiler in any case. I've found the root cause for the runtime error, as I replied to ianh's comment. However, I do not have any explanation on that. I'm appreciated if you have any further suggestions.

Eric_L_ · ‎12-05-2016

mecej4 wrote:

Your subroutine takes some arguments with dimension (:). The calling subroutine needs to be made aware of that. In the language of the standard, an explicit interface is required.

Thanks for your advice, mecej4. I've double checked the calling subroutine and pretty sure an array of the same dimension is pass into the subroutine res2num.

IanH · ‎12-05-2016

In subroutine conv, convOut (last argument) has a declared size of lenA + lenB, i.e. the sum of the third and fourth arguments.

Given the corresponding actual arguments, that means lenA = 2, lenB = j-cntSkip and convOut is numtmp(norder-j+cntSkip:norder).

LenA + LenB is therefore 2 + j - cntSkip.

But the size of the last actual argument is norder - (norder - j + cntSkip) + 1, or 1 + j - cntSkip.

They are not the same. The actual argument is smaller than the dummy, which is not permitted. The size mismatch then results in the subroutine trampling nearby stuff on the stack.

jimdempseyatthecove · ‎12-06-2016

Also,

mecej4>>Your subroutine takes some arguments with dimension (:). The calling subroutine needs to be made aware of that. In the language of the standard, an explicit interface is required.

Eric>>Thanks for your advice, mecej4. I've double checked the calling subroutine and pretty sure an array of the same dimension is pass into the subroutine res2num.

mecej4's point is that for subroutines with arguments with dimension (:), that the caller must have (access to) an interface declaration to the subroutine specifying the (:).

Subroutines with arguments with dimension (:) are passed the address of an array descriptor (either the caller's or one manufactured for a slice of the callers). Whereas arguments with dimension (n) are passed the address of the lowest cell in the array.

The fact that the two arrays are the same size has no bearing on if the call functions correctly or not.

Jim Dempsey

Run-Time Check Failure #2 - Stack around the variable '.T5448_' was corrupted.