Intel® Fortran Compiler

Miscounting of DO loop counter

h_amini
Beginner

Hi there

In the following loops, the values of the counters depend on the statements inside the loops. For instance, if line 20 is commented out it works well, but if line 10 is commented out instead, I1 is always 1 and I2 runs from 1 to 150, and then an access violation occurs because the counter exceeds the arrays' bounds.

      DO I1 = 1, NRCONCELEM  ! Loop over concrete elements
         DO I2 = 1, INTPT  ! Loop over Gauss points
 10         print*,'hi',I1,I2
!20         print*,I1,I2
            EPSD_ = EPSD(ELEMNRCONC(I1),I2)
            EPSR_ = EPSR(ELEMNRCONC(I1),I2)
            EPSL_ = EPSL(ELEMNRCONC(I1),I2)
            EPST_ = EPST(ELEMNRCONC(I1),I2)
            CALL FOS(SF, EPSD_, EPSR_, EPSL_, EPST_, EPSDPER, EPSRPER,
     &               EPSLPER, EPSTPER)  ! Calculate SF
         END DO
      END DO

A similar problem occurs if only line 10 is commented out but the arrays are echoed afterwards by:

      PRINT*, EPSD(ELEMNRCONC(I1),I2); PRINT*, EPSR(ELEMNRCONC(I1),I2)
      PRINT*, EPSL(ELEMNRCONC(I1),I2); PRINT*, EPST(ELEMNRCONC(I1),I2)

Interestingly, if the fourth PRINT is omitted it works well again!

I use Visual Fortran Compiler Ver. 11.0.072 on IA-32 and would appreciate your comments very much.

Just as a guess, can using a large number of variables be the reason? If so, what is the solution?

Many thanks

Hamid

1 Solution
Tim_Gallagher
New Contributor II

Usually the loop counters get messed up when you have memory corruption somewhere. The easiest way to check is to compile with the -C option and see whether you are accessing an array out of bounds. The next thing to check, and the cause I usually find when I have these errors, is that all the subroutines you are calling have explicit interfaces. If you call a subroutine and pass it arrays with a different rank or different dimension sizes, or the wrong number or type of arguments, it will show errors like that.

Tim

14 Replies
Steven_L_Intel1
Employee

It is impossible to analyze such a problem from a small code excerpt. A buildable and runnable test case would be needed. However, I'll comment that I have seen such issues before when there is a calling convention mismatch, such as calling a STDCALL routine without declaring it as such to Fortran. Also, an argument type or size mismatch can cause such problems. Does commenting out the call to FOS make this particular error go away?
h_amini
Beginner

Many thanks Steve.

Yes, commenting out CALL FOS solves the problem. Also, if I put these two loops inside a subroutine, the problem is solved for all the cases mentioned:


      SUBROUTINE FA_DIANA(NRCONCELEM, INTPT, EPSD_, EPSR_, EPSL_, EPST_,
     &                    SF, EPSDPER, EPSRPER, EPSLPER, EPSTPER)
      USE GLOBALDATA
      IMPLICIT NONE
      ! Variables
      DOUBLE PRECISION SF, EPSD_, EPSR_, EPSL_, EPST_, EPSDPER, EPSRPER,
     &                 EPSLPER, EPSTPER
      INTEGER, INTENT(INOUT) :: INTPT
      INTEGER I1, I2
      INTEGER(kind=4), INTENT(INOUT) :: NRCONCELEM
      ! Fill A_DIANA
      DO I1 = 1, NRCONCELEM  ! Loop over concrete elements
         DO I2 = 1, INTPT  ! Loop over Gauss points
!           print*,'hi',I1,I2
!           print*,I1,I2
!           PRINT*, EPSD(ELEMNRCONC(I1),I2)
!           PRINT*, EPSR(ELEMNRCONC(I1),I2)
!           PRINT*, EPSL(ELEMNRCONC(I1),I2)
!           PRINT*, EPST(ELEMNRCONC(I1),I2)
            EPSD_ = EPSD(ELEMNRCONC(I1),I2)
            EPSR_ = EPSR(ELEMNRCONC(I1),I2)
            EPSL_ = EPSL(ELEMNRCONC(I1),I2)
            EPST_ = EPST(ELEMNRCONC(I1),I2)
            CALL FOS(SF, EPSD_, EPSR_, EPSL_, EPST_, EPSDPER, EPSRPER,
     &               EPSLPER, EPSTPER)  ! Calculate SF
            IF (SF .GE. 1) THEN
               A_DIANA(ELEMNRCONC(I1),I2) = 1  ! Valid solution
            ELSE
               A_DIANA(ELEMNRCONC(I1),I2) = 0  ! Not a valid solution
            END IF
            print*, ELEMNRCONC(I1), A_DIANA(ELEMNRCONC(I1),I2), SF
         END DO
      END DO
      RETURN
      END SUBROUTINE FA_DIANA

Subroutine FOS is very simple, and there does not seem to be any size or argument mismatch. I don't know what a calling convention mismatch is, so if you think that is the problem, could you explain it? I cannot post my program, as it has about 8000 lines and works in conjunction with commercial software. Subroutine FOS is:



      SUBROUTINE FOS(SF, EPSD_, EPSR_, EPSL_, EPST_, EPSDPER, EPSRPER,
     &               EPSLPER, EPSTPER)
      USE GLOBALDATA
      IMPLICIT NONE
      ! Precision constants
      DOUBLE PRECISION SF, EPSD_, EPSR_, EPSL_, EPST_, EPSDPER, EPSRPER,
     &                 EPSLPER, EPSTPER, MIN_EPS, MAX_EPS,
     &                 SF1_TEMP, SF2_TEMP, SF3_TEMP, SF4_TEMP
      ! Maximum and minimum strains in concrete
      MIN_EPS = MIN(EPSD_,EPSR_)
      MAX_EPS = MAX(EPSD_,EPSR_)
      !______________________________
      ! Safety factors
      IF (MIN_EPS .EQ. 0.D0) THEN  ! Zero strain
         SF1_TEMP = 1E38
      ELSE  ! Nonzero strain
         IF (MIN_EPS .GT. 0.D0) THEN  ! Tensile strain
            SF1_TEMP = EPSRPER / MIN_EPS
         ELSE  ! Compressive strain
            SF1_TEMP = EPSDPER / MIN_EPS
         END IF
      END IF
      !______________________________
      IF (MAX_EPS .EQ. 0.D0) THEN  ! Zero strain
         SF2_TEMP = 1E38
      ELSE  ! Nonzero strain
         IF (MAX_EPS .GT. 0.D0) THEN  ! Tensile strain
            SF2_TEMP = EPSRPER / MAX_EPS
         ELSE  ! Compressive strain
            SF2_TEMP = EPSDPER / MAX_EPS
         END IF
      END IF
      !______________________________
      IF (EPSL_ .EQ. 0.D0) THEN  ! Zero strain
         SF3_TEMP = 1E38
      ELSE  ! Nonzero strain
         SF3_TEMP = EPSLPER / ABS(EPSL_)
      END IF
      !______________________________
      IF (EPST_ .EQ. 0.D0) THEN  ! Zero strain
         SF4_TEMP = 1E38
      ELSE  ! Nonzero strain
         SF4_TEMP = EPSTPER / ABS(EPST_)
      END IF
      !______________________________
      SF = MIN(SF1_TEMP,SF2_TEMP,SF3_TEMP,SF4_TEMP)
      RETURN
      END SUBROUTINE FOS

Thank you!
Hamid

Steven_L_Intel1
Employee
I think the way I would approach this is to experiment with routine FOS, seeing how much of it is required for the error to appear. You could add a RETURN at the beginning, for example, and then move the RETURN further down the routine to enable more code.

You say that you are using a commercial library. Is it possible that this library was written to be used from a language that requires the STDCALL calling convention? Can you provide a pointer to documentation for it? The results you are seeing look like stack corruption to me.
h_amini
Beginner

Thank you again!

I am not using commercial libraries. My program generates batch files that run a commercial finite element package. My program then reads the data file generated by that package and reruns it. It is in fact an iterative procedure.

I am checking subroutine FOS and will keep you updated.

Hamid

h_amini
Beginner

Hi Steve

I've got interesting results! The problem inside FOS goes away if I return before the last line, "SF = MIN(SF1_TEMP,SF2_TEMP,SF3_TEMP,SF4_TEMP)". I thought it might be because of the large values of SF1_TEMP, SF2_TEMP, SF3_TEMP and SF4_TEMP when a zero strain is obtained. So I added ";pause" after each 1E38 in FOS, but just doing that caused a problem in another subroutine of the program!

I also replaced 1E38 with smaller values (1E8 and 10.D0) and got the same error.

I tried all of the above with the loops defined in subroutine FA_DIANA, and in every case the program worked correctly. The pause statements also showed that zero strains were obtained.

What do you think about it? Is it possible that I have found a bug in the compiler?

Hamid

Steven_L_Intel1
Employee
Please see if you can construct a self-contained test case. At this point, I still think you have a problem in your code somewhere, but it may not be in the code you have shown.
h_amini
Beginner

I'm sorry, but I don't know what a self-contained test is, and I couldn't find it on the web. Please wait, or if you can, please explain. I'm not experienced in programming!

Many thanks
Hamid

Steven_L_Intel1
Employee

I mean a small program that has just enough to show the problem. Something I could build and run. Otherwise, you will need to debug the problem on your own since you can't provide us everything.
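
As a rough illustration of what such a small program could look like, here is a minimal sketch in free-form Fortran. It is not the poster's code: the array sizes, the strain values, and the cut-down argument list of FOS are all made up for the example, and the idea is to put back pieces of the real program one at a time until the failure shows up.

```fortran
! Sketch of a self-contained test case (free-form source, e.g. a .f90 file).
! Sizes and values below are invented; FOS is reduced to a single branch.
module globaldata
  implicit none
  integer, parameter :: nelem = 3, ngauss = 2   ! hypothetical sizes
  integer :: elemnrconc(nelem)
  double precision :: epsd(nelem, ngauss), epsr(nelem, ngauss)
end module globaldata

! External routine, called without an explicit interface as in the thread.
subroutine fos(sf, epsd_, epsr_, epsdper, epsrper)
  implicit none
  double precision :: sf, epsd_, epsr_, epsdper, epsrper
  double precision :: min_eps
  min_eps = min(epsd_, epsr_)
  if (min_eps == 0.d0) then
    sf = 1.d38
  else if (min_eps > 0.d0) then
    sf = epsrper / min_eps
  else
    sf = epsdper / min_eps
  end if
end subroutine fos

program loop_test
  use globaldata
  implicit none
  integer :: i1, i2, ncalls
  double precision :: sf, epsd_, epsr_, epsdper, epsrper

  elemnrconc = (/ 1, 2, 3 /)
  epsd = -1.d-3
  epsr = 2.d-3
  epsdper = -3.5d-3
  epsrper = 1.d-2
  ncalls = 0

  do i1 = 1, nelem
    do i2 = 1, ngauss
      epsd_ = epsd(elemnrconc(i1), i2)
      epsr_ = epsr(elemnrconc(i1), i2)
      call fos(sf, epsd_, epsr_, epsdper, epsrper)
      ncalls = ncalls + 1
      print *, i1, i2, sf
    end do
  end do

  ! A healthy run makes nelem*ngauss calls and leaves i1 equal to nelem + 1.
  print *, 'calls =', ncalls, ' final i1 =', i1
end program loop_test
```

If the counters misbehave only after a particular routine or declaration from the real program is added back, that piece is the first place to look.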
Tim_Gallagher
New Contributor II

Usually the loop counters get messed up when you have memory corruption somewhere. The easiest way to check is to compile with the -C option and see whether you are accessing an array out of bounds. The next thing to check, and the cause I usually find when I have these errors, is that all the subroutines you are calling have explicit interfaces. If you call a subroutine and pass it arrays with a different rank or different dimension sizes, or the wrong number or type of arguments, it will show errors like that.

Tim
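
One common way to get explicit interfaces is to place the routines in a module, so the compiler can check every call against the routine's declaration. The sketch below is only an illustration under that assumption: SAFETY_MOD and SCALE_STRAIN are hypothetical names, not routines from this thread, and the source is free-form.

```fortran
! Sketch: a module procedure has an explicit interface in every scope that
! USEs the module, so argument count, type, kind and rank are checked.
module safety_mod                      ! hypothetical module name
  implicit none
contains
  subroutine scale_strain(eps, factor, scaled)   ! hypothetical routine
    double precision, intent(in)  :: eps, factor
    double precision, intent(out) :: scaled
    scaled = eps * factor
  end subroutine scale_strain
end module safety_mod

program interface_demo
  use safety_mod
  implicit none
  double precision :: eps, scaled
  real :: bad_factor          ! single precision: wrong kind for FACTOR

  eps = 1.5d-3
  bad_factor = 2.0

  ! With the explicit interface, the commented call below is rejected at
  ! compile time because BAD_FACTOR has the wrong kind. With an external
  ! routine and no interface, a mismatch like this can compile silently and
  ! the routine may then read or write the wrong number of bytes.
  ! call scale_strain(eps, bad_factor, scaled)

  call scale_strain(eps, dble(bad_factor), scaled)   ! kinds now match
  print *, 'scaled strain =', scaled
end program interface_demo
```

For existing external routines, an INTERFACE block in the caller gives the same compile-time checking, provided the block really matches the routine.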
h_amini
Beginner

Hello again

I am working on the self-contained test. It is very time consuming, and in the meantime I have got a new strange error. The values of global variables defined in 'MODULE GLOBALDATA' change if I echo them with a PRINT statement in the main program! They don't change if I echo them inside the subroutines.

Hamid

h_amini
Beginner

Many thanks Tim. Using -C resulted in the following error, which I don't understand:

forrtl: severe (408): fort: (4): Variable has substring ending point 7 which is greater than the variable length of 6

Image              PC        Routine  Line     Source
proglinktest01_1.  004C1C9A  Unknown  Unknown  Unknown
proglinktest01_1.  004BF389  Unknown  Unknown  Unknown
proglinktest01_1.  00457AAB  Unknown  Unknown  Unknown
proglinktest01_1.  00455042  Unknown  Unknown  Unknown
proglinktest01_1.  0043A3EA  Unknown  Unknown  Unknown
proglinktest01_1.  0040128A  Unknown  Unknown  Unknown
proglinktest01_1.  004C7C43  Unknown  Unknown  Unknown
proglinktest01_1.  004AB1CD  Unknown  Unknown  Unknown
kernel32.dll       7C817077  Unknown  Unknown  Unknown

I am double-checking the variables used inside the subroutines and will let you know if anything is wrong.

Hamid
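
For readers unfamiliar with this run-time check: the message means that a substring reference such as NAME(1:7) was made on a character variable declared with length 6. The following free-form sketch reproduces that situation with made-up names (NAME and LAST are hypothetical) and shows one way to keep the reference in range; it is not taken from the poster's program.

```fortran
program substring_check
  implicit none
  character(len=6) :: name     ! hypothetical 6-character variable
  integer :: last

  name = 'ABCDEF'
  last = 7                     ! hypothetical bad end point, one past the end

  ! Referencing name(1:last) with last = 7 is the kind of access that
  ! bounds checking reports as "substring ending point 7 which is greater
  ! than the variable length of 6". Clamping the end point to LEN(name)
  ! keeps the substring inside the variable.
  print *, name(1:min(last, len(name)))
end program substring_check
```

In a real program the better fix is usually to find out why the end point is too large, or to declare the variable with the length the data actually needs.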

h_amini
Beginner

Hi everyone

Just an update: disabling optimization makes the error go away. At every optimization level, the program fails if the loops are defined in the main program.

Hamid

h_amini
Beginner

Even with optimization disabled by /Od, the global data change unexpectedly. Interestingly, merely echoing the data can cause this problem. For example, when the following loops are defined:

      DO I1 = 1, NRCONCELEM  ! Loop over concrete elements
      DO I2 = 1, INTPT  ! Loop over Gauss points
      PRINT*,I1,I2,A_DIANA(ELEMNRCONC(I1),I2),A_PROG(ELEMNRCONC(I1),I2)
      END DO
      END DO

the global data change and I get:

forrtl: error (72): floating overflow

Image              PC        Routine  Line     Source
proglinktest01_2.  0049AE7E  Unknown  Unknown  Unknown

I guess that in this case some global data are replaced by extremely large or small random values.

However, when the outer loop is replaced by a manually updated counter such as i1:

      i1 = 0
 1000 i1 = i1+1
      if (i1.eq.NRCONCELEM+1) go to 1001
      DO I2 = 1, INTPT  ! Loop over Gauss points
      PRINT*,I1,I2,A_DIANA(ELEMNRCONC(I1),I2),A_PROG(ELEMNRCONC(I1),I2)
      END DO
      go to 1000
 1001 continue

the program works perfectly.

Last year I got a fairly similar error discussed on http://software.intel.com/en-us/forums/showthread.php?t=61721 but that problem was due to uninitialized variables.

Any advice would be very much appreciated.

I am still working on the self-contained test as well as checking the variables inside the subroutines. Is there a straightforward way of checking the variables?

Hamid

h_amini
Beginner

Hi there

I got it! As Steve and Tim said, there was a mismatch in the type of one of the arguments between the main program and one of the subroutines. I found it by defining explicit interfaces, as Tim advised.

Many many thanks for your help.

Hamid
