Intel® Fortran Compiler

Access Violation WIN32(Debug), X64(Debug) & x64(Rel) ok but not WIN32(Rel)

Moore__John1
Beginner
3,353 Views

I've just upgraded to version 2020.0.166, and while testing it on existing code I found some unexplained behavior.

The offending code section is shown below.  The arrays are all dynamic and this code is located in the main program.

The compiled program runs OK for WIN32 debug, x64 debug and x64 release, but fails for WIN32 release.

The WIN32 release fails with: forrtl: severe (157): Program Exception - access violation

This is the problem section of code.

      DO 305 ICASE=1,NLCASE
      TOTFACT(ICASE)=TFAC(LCURVE(ICASE))*CFACT(ICASE)
      DO 305 I=1,NXE
      DO 305 J=1,12
 305  FRC(I,J)=FRC(I,J)+EXFRC(I,J)+CUFRC(I,J,ICASE)*TFAC(LCURVE(ICASE))
     &*CFACT(ICASE)

Rearranging the code has shown that the issue lies in addressing the 3-D array. If a print statement is placed in either of the two inner loops (I or J), the code runs OK. Just adding print* is enough to make it run.

Is there any explanation for this behavior?

 

17 Replies
jimdempseyatthecove
Honored Contributor III

I've seen some issues with nested "DO nnn ..." where all the nests use the same tag.
Also, while the compiler optimization should be able to swap the I and J loops, it won't hurt to help it make the decision.

       DO ICASE=1,NLCASE
         TOTFACT(ICASE)=TFAC(LCURVE(ICASE))*CFACT(ICASE)
         DO J=1,12
           DO I=1,NXE
             FRC(I,J)=FRC(I,J)+EXFRC(I,J)
     &         +CUFRC(I,J,ICASE)*TFAC(LCURVE(ICASE))*CFACT(ICASE)
           END DO
         END DO
       END DO

Jim Dempsey

Moore__John1
Beginner

I've tried numerous arrangements of the loops all with the same result. Even including dummy variables.

It only works when diagnostic print statements are included in an attempt to catch the error, which then fails to occur. It works even when just printing a blank line.

      DO 305 ICASE=1,NLCASE
      TOTFACT(ICASE)=TFAC(LCURVE(ICASE))*CFACT(ICASE)
      DO 305 I=1,NXE
          print*
      DO 305 J=1,12
 305  FRC(I,J)=FRC(I,J)+EXFRC(I,J)+CUFRC(I,J,ICASE)*TFAC(LCURVE(ICASE))
     &*CFACT(ICASE)

jimdempseyatthecove
Honored Contributor III

Did you try changing the nested DO 305 loops to plain DO loops without the 305 label, and adding the three END DO statements after the 305 FRC(... line?

(The 305 label on that line can then be removed.)

If this does not work, please make a simplified reproducer and post it on the Bug page (button on the forum section page).

Jim Dempsey

mecej4
Honored Contributor III

If the array FRC is zero before entering the lines of code shown in #1, you could try using array expressions:

frc(1:nxe,1:12) = nlcase * exfrc(1:nxe,1:12)
totfact(1:nlcase) = tfac(lcurve(1:nlcase))*cfact(1:nlcase)

do icase = 1, nlcase
   frc(1:nxe,1:12) = frc(1:nxe,1:12) + cufrc(1:nxe,1:12,icase) * totfact(icase)
end do
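
As a rough check of the suggestion above, the following sketch compares the array-expression form with the nested-loop form from the first post, starting from a zeroed FRC. The sizes, the data values and the names frc_loop and frc_expr are assumptions made for illustration only; the thread does not show the real declarations. Free-form source is assumed.

```fortran
program array_expr_check
  implicit none
  ! Hypothetical sizes and data; the real arrays in the thread are not shown.
  integer, parameter :: nxe = 5, nlcase = 3, ncurve = 2
  real :: frc_loop(nxe,12), frc_expr(nxe,12), exfrc(nxe,12)
  real :: cufrc(nxe,12,nlcase), tfac(ncurve), cfact(nlcase), totfact(nlcase)
  integer :: lcurve(nlcase), icase, i, j

  call random_number(exfrc)
  call random_number(cufrc)
  tfac   = [1.5, 0.5]
  cfact  = [1.0, 2.0, 3.0]
  lcurve = [1, 2, 1]

  ! Reference: the nested-loop form, starting from FRC = 0.
  frc_loop = 0.0
  do icase = 1, nlcase
     do j = 1, 12
        do i = 1, nxe
           frc_loop(i,j) = frc_loop(i,j) + exfrc(i,j) &
                         + cufrc(i,j,icase)*tfac(lcurve(icase))*cfact(icase)
        end do
     end do
  end do

  ! Array-expression form: EXFRC is added once per load case, hence nlcase*exfrc.
  frc_expr = nlcase * exfrc
  totfact  = tfac(lcurve) * cfact
  do icase = 1, nlcase
     frc_expr = frc_expr + cufrc(:,:,icase) * totfact(icase)
  end do

  ! The summation order differs between the two forms, so compare with a tolerance.
  if (maxval(abs(frc_loop - frc_expr)) > 1.0e-4) error stop 'forms disagree'
  print *, 'max difference:', maxval(abs(frc_loop - frc_expr))
end program array_expr_check
```

If FRC is not zero on entry, the first assignment would need to add to FRC rather than overwrite it.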

 

GVautier
New Contributor III

Moore, John wrote:

It only works when diagnostic print statements are included in an attempt to catch the error, which then fails to occur. It works even when just printing a blank line.

This is often symptomatic of stack corruption caused by an array overflow or by invalid function or subroutine calls somewhere in the program.

Moore__John1
Beginner

There is definitely no issue with the specific code as I tried so many variations.

It works OK if I set the Fortran optimization to Minimum Size and Favor Fast Code. With the previous compiler version I used Maximum Speed and Favor Fast Code with this program. However, the newer compiler appears to produce faster code with the Minimum Size option, so all good.

Just hope there's not a ticking time bomb because in my experience irregular behaviour is usually due to dubious coding. 

GVautier
New Contributor III

If changing the optimization options "solves the problem", your problem lies elsewhere.

LCURVE, CFACT, EXFRC, CUFRC, TFAC: are they arrays or functions?

I still think that you have an array overflow, probably somewhere in a subroutine or function call.

Try to enable all runtime check options.
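
Alongside the compiler's run-time checks (for Intel Fortran on Windows, options such as /check:all and /traceback), a manual sanity check placed just before the suspect loop can confirm that every extent and every subscript value is in range. The sketch below is only an illustration: the sizes and the contents of LCURVE are made up, and the real program's declarations are not shown in the thread.

```fortran
program check_inputs
  implicit none
  integer, parameter :: nxe = 4, nlcase = 2   ! hypothetical sizes
  real, allocatable :: frc(:,:), exfrc(:,:), cufrc(:,:,:), tfac(:), cfact(:)
  integer, allocatable :: lcurve(:)
  logical :: ok

  allocate(frc(nxe,12), exfrc(nxe,12), cufrc(nxe,12,nlcase))
  allocate(tfac(3), cfact(nlcase), lcurve(nlcase))
  lcurve = [1, 3]                             ! made-up curve numbers

  ok = .true.
  ! Every array section touched by the loop must fit inside its allocation.
  if (size(frc,1)   < nxe .or. size(frc,2)   < 12) ok = .false.
  if (size(exfrc,1) < nxe .or. size(exfrc,2) < 12) ok = .false.
  if (size(cufrc,1) < nxe .or. size(cufrc,2) < 12 &
      .or. size(cufrc,3) < nlcase) ok = .false.
  if (size(cfact) < nlcase .or. size(lcurve) < nlcase) ok = .false.

  ! LCURVE is used as a subscript into TFAC, so its values need checking too.
  if (ok) then
     if (any(lcurve(1:nlcase) < 1) .or. &
         any(lcurve(1:nlcase) > size(tfac))) ok = .false.
  end if

  if (.not. ok) error stop 'extent check failed'
  print *, 'all extents consistent'
end program check_inputs
```

A check like this only covers the loop itself; an overflow elsewhere in the program, as suspected above, would still need the run-time checks or a debugger to locate.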

 

 

Moore__John1
Beginner

All variables are dynamic arrays and nothing comes up when all runtime checks are enabled.

After a bit more probing, cause and effect indicate to me an issue with the fast code optimisation option.

Not activating the speed optimisation enabled the solution to run, and the program always ran in debug mode, which has no speed optimisation; both facts indicate that speed optimisation is a controlling factor. Additionally, the fact that a simple print* statement enables the program to run would imply that the presence of the print* statement prevents the optimisation of this section of code.

With this in mind I rearranged the code, i.e. eliminated the inner loop as shown below, and the program runs OK.

      DO ICASE=1,NLCASE
       TOTFACT(ICASE)=TFAC(LCURVE(ICASE))*CFACT(ICASE)
       CUFAC=TOTFACT(ICASE)
       DO I=1,NXE
   !     DO J=1,12
         FRC(I,1:12)=FRC(I,1:12)+EXFRC(I,1:12)+CUFRC(I,1:12,ICASE)*CUFAC
   !     ENDDO
       ENDDO
      ENDDO  

Logic would indicate that there is an issue with the speed optimisation of CUFRC, a dynamic single precision 3D array, in this section of code (using Do constructs). Is it a 3D dynamic array issue? There are about 120 dynamic arrays in the program but only one 3D array.

Replacing CUFRC with a static array scufrc in the above code, by using scufrc(1:nxe,1:12,1:1)=cufrc(1:nxe,1:12,1:1) beforehand, shows that a static array works: the solution runs OK with the J loop active. Then, surprisingly, with the J loop active cufrc also worked, provided that scufrc(1:nxe,1:12,1:1)=cufrc(1:nxe,1:12,1:1) is still present beforehand.

Conclusion: I shall avoid, where possible, using DO constructs to assign arrays in future code. For example, the original five-line code section now takes the following five-line form:

      DO ICASE=1,NLCASE
       TOTFACT(ICASE)=TFAC(LCURVE(ICASE))*CFACT(ICASE)
       CUFAC=TOTFACT(ICASE)
       FRC(1:NXE,1:12)=FRC(1:NXE,1:12)+EXFRC(1:NXE,1:12)+CUFRC(1:NXE,1:12,ICASE)*CUFAC
       ENDDO 

 

 

jimdempseyatthecove
Honored Contributor III

John,

The reason that I suggested you swap the I and J loops, thus:

  DO ICASE=1,NLCASE
    TOTFACT(ICASE)=TFAC(LCURVE(ICASE))*CFACT(ICASE)
    DO J=1,12
      DO I=1,NXE
        FRC(I,J)=FRC(I,J)+EXFRC(I,J)+CUFRC(I,J,ICASE)*TFAC(LCURVE(ICASE)) &
                 *CFACT(ICASE)
      END DO
    END DO
  END DO

is that varying the first index advances one cell in memory. In other words:

array(I+1,J) immediately follows array(I,J)
whereas array(I,J+1) follows array(I,J) by the size of the 1st dimension
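
The storage rule described above can be illustrated with a small free-form sketch. The extents n1 and n2 and the array names are arbitrary choices for the example. Each element is filled with its position in storage order as predicted by the column-major formula, and RESHAPE (which reads elements in array element order) is used to confirm the prediction.

```fortran
program column_major
  implicit none
  integer, parameter :: n1 = 4, n2 = 3      ! hypothetical extents
  integer :: a(n1,n2), flat(n1*n2), i, j

  ! Fill each element with its 1-based position in storage order,
  ! computed from the column-major formula (i-1) + (j-1)*n1 + 1.
  do j = 1, n2
     do i = 1, n1
        a(i,j) = (i - 1) + (j - 1)*n1 + 1
     end do
  end do

  ! RESHAPE to rank 1 takes the elements in array element order.
  flat = reshape(a, [n1*n2])
  if (any(flat /= [(i, i = 1, n1*n2)])) error stop 'not column-major'

  ! Stepping the first index moves one cell; stepping the second moves n1 cells.
  print *, 'a(2,1) - a(1,1) =', a(2,1) - a(1,1)
  print *, 'a(1,2) - a(1,1) =', a(1,2) - a(1,1)
end program column_major
```

With the first index in the innermost loop, consecutive iterations touch adjacent cells, which is the access pattern the swapped loop order provides.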

Your original code was using strided references. This complicates vectorization: when compiling with optimizations, the compiler assesses the opportunities for vectorization and, when opportune, generates vectorized code. And in this case... bad code.

By rearranging the indexing, you would have eliminated the strided reference, and thus might have not only eliminated the error but also improved performance.

Your latest format, eliminating the I and J loops, will permit the compiler to see that it can favorably vectorize the code.

The reason I mention this here is that, while this specific problem is resolved, other places in your code may be using an unfavorable loop nest order. Loop order is something you should be paying attention to.

Jim Dempsey

Steve_Lionel
Honored Contributor III

I will throw out the usual caution that a behavior change under optimization is not necessarily an indication of invalid optimization. More often it reveals a program source bug that was hidden otherwise. The same goes for changes when you insert a print statement.

What I would sometimes do is capture the inputs to the suspect code and create a standalone test case that uses the input to perform whatever the operation was. Ideally this would be at the subroutine/function level with a "driver" main program that sets things up.
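
A minimal sketch of such a standalone test case might look like the following. The module name frc_kernel, the subroutine name accumulate, the sizes and the stand-in input values are all hypothetical; a real reproducer would read the values captured from the failing run and keep the original declarations.

```fortran
module frc_kernel
  implicit none
contains
  ! The suspect loop nest, isolated so it can be compiled and tested alone.
  ! The original loop order (I outer, J inner) is kept on purpose.
  subroutine accumulate(frc, exfrc, cufrc, tfac, cfact, lcurve, totfact)
    real,    intent(inout) :: frc(:,:)
    real,    intent(in)    :: exfrc(:,:), cufrc(:,:,:), tfac(:), cfact(:)
    integer, intent(in)    :: lcurve(:)
    real,    intent(out)   :: totfact(:)
    integer :: icase, i, j
    do icase = 1, size(cufrc, 3)
       totfact(icase) = tfac(lcurve(icase))*cfact(icase)
       do i = 1, size(frc, 1)
          do j = 1, size(frc, 2)
             frc(i,j) = frc(i,j) + exfrc(i,j) + cufrc(i,j,icase)*totfact(icase)
          end do
       end do
    end do
  end subroutine accumulate
end module frc_kernel

program driver
  use frc_kernel
  implicit none
  integer, parameter :: nxe = 3, nlcase = 2     ! hypothetical sizes
  real :: frc(nxe,12), exfrc(nxe,12), cufrc(nxe,12,nlcase)
  real :: tfac(2), cfact(nlcase), totfact(nlcase)
  integer :: lcurve(nlcase)

  ! Stand-in inputs; a real test case would load the captured data here.
  frc = 0.0;  exfrc = 1.0;  cufrc = 2.0
  tfac = [0.5, 1.5];  cfact = [2.0, 4.0];  lcurve = [2, 1]

  call accumulate(frc, exfrc, cufrc, tfac, cfact, lcurve, totfact)

  ! totfact = [1.5*2, 0.5*4] = [3, 2]; each frc element = 2*1 + 2*(3+2) = 12
  if (any(abs(totfact - [3.0, 2.0]) > 1.0e-6)) error stop 'totfact wrong'
  if (any(abs(frc - 12.0) > 1.0e-5)) error stop 'frc wrong'
  print *, 'kernel reproduces expected values'
end program driver
```

Keeping the kernel in its own procedure also makes it easy to compile the same source with different optimisation settings and compare the results.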

It is inappropriate to assume that some particular construct should be avoided just because, in your program, that construct was related to a problem.

As always, if you can't identify a program bug, send support a test case and let them poke at it.

Moore__John1
Beginner

Thanks for advice and suggestions, much appreciated.

Jim,  Yes changing the loop order did eliminate the conflict. The array does nothing special other than act as a fast memory dump for time dependent data so the loop order is not significant.  But this is rarely the case and more often the loop order cannot be changed when e.g. manipulating matrices. What you're implying is that if the code cannot be optimised the compiler can/will produce junk if optimisation is active.  This would be very worrying.

Steve, your comment regarding the non-avoidance of legal constructs is very true, and thankfully so. The original DO construct works perfectly well if speed optimisation is not active or when the fixes below are implemented.

I have two compilers side by side. Both use the same optimisation settings on the same code. When Maximum Speed Optimisation is active, the later version does not run but the earlier version does. This indicates a change in compiler behaviour with identical, perfectly legal code. A bug?

The later version would appear to be attempting to optimise a section of code that it should not, whereas the earlier version is not. Each of the following measures has prevented optimisation with the new compiler and enabled the original DO construct code to run.

  • Adding a print* statement in either of the inner loops
  • Changing the loop order
  • Switching the index order of the 3D array CUFRC(I,J,ICASE) to CUFRC(ICASE,I,J)
  • Using array subscript expressions
  • Removing the EXFRC array from statement 305

In my book this indicates a compiler bug, but one with a simple workaround.

Although upgrading is always a pain, the latest compiler is worth the effort. Indications are that the 64-bit compilation produces code that is 40% faster than the 32-bit version (both with fast code optimisation). The older compiler did not produce such improvements.

 

 

GVautier
New Contributor III

If it is a compiler bug, the following simple code, completed with the array and parameter declarations,

      DO 305 ICASE=1,NLCASE
      TOTFACT(ICASE)=TFAC(LCURVE(ICASE))*CFACT(ICASE)
      DO 305 I=1,NXE
      DO 305 J=1,12
 305  FRC(I,J)=FRC(I,J)+EXFRC(I,J)+CUFRC(I,J,ICASE)*TFAC(LCURVE(ICASE))
     &*CFACT(ICASE)

should reproduce the error. Does it?

What are the values of NLCASE and NXE? Is NXE greater than 12? If it is, what happens if you set NXE to 12?

Moore__John1
Beginner

Rather than compiler bug a more appropriate description would be irregular behaviour under a specific range of circumstances. In other words "I can't demonstrate the problem in a sample of simple code".

Provided the values of nxe (1-32000) and nlcase (1-999) are within the bounds of the array declarations, this code would be expected to run, and it always does, except when speed optimisation is active in the latest compiler and with this particular program. To provide a bit of background, this section of code was added to the program in the late 90s (Watcom 77 compiler) and has never been touched since. Other parts of the code have obviously been changed, but not these specific lines. The program has since been compiled with Compaq and Intel compilers.

I have re-created this section of code in a small program using the same data declarations, even reading the data from the original program prior to the crash. This test program runs without issue: perfect behaviour. Furthermore, the difference between speed and code size optimisation is very evident from the CPU timings. Contrary to an earlier opinion that this section of code may not be optimised, it is: 2.68, 0.32 and 0.125 for no optimisation, size and speed respectively. On fire!
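
For anyone wanting to repeat this kind of comparison, a timing harness along the following lines could be built with each optimisation setting in turn. It is only a sketch: the sizes, the repeat count nrep and the constant cufac are invented for the example, and the figures quoted above came from the original program's data, not from this code.

```fortran
program time_kernel
  implicit none
  integer, parameter :: nxe = 2000, nlcase = 50, nrep = 20   ! hypothetical sizes
  real, allocatable :: frc(:,:), exfrc(:,:), cufrc(:,:,:)
  real :: t0, t1, cufac
  integer :: icase, irep

  allocate(frc(nxe,12), exfrc(nxe,12), cufrc(nxe,12,nlcase))
  call random_number(exfrc)
  call random_number(cufrc)
  frc   = 0.0
  cufac = 0.5

  ! Time only the accumulation, repeated nrep times to lengthen the measurement.
  call cpu_time(t0)
  do irep = 1, nrep
     do icase = 1, nlcase
        frc = frc + exfrc + cufrc(:,:,icase)*cufac
     end do
  end do
  call cpu_time(t1)

  ! Printing a value that depends on the result makes it harder for the
  ! optimiser to discard the timed work as unused.
  print *, 'checksum:', sum(frc)
  print *, 'CPU seconds:', t1 - t0
end program time_kernel
```

The resolution of cpu_time is processor dependent, so very short runs may need a larger nrep to give a meaningful figure.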

Here are another couple of fixes that I discovered prevent the access violation error. Consider the J loop. If J is a 2-byte integer, it will run provided 12 is replaced by a variable. If J is a 4-byte integer, it runs with either the 12 or a variable. That makes 7 minor changes that make this section of code behave as expected. Is this not strange?

However, this behaviour is only on one particular model. On another validation model I was surprised to see that the model's solution did not converge. No violation error, the program ran, but no valid solution.

In my experience WIN32 compiled with Maximum Speed Optimisation is not reliable. 

Speed optimised WIN32 will not be used again. 64 bit for future release code (long overdue).

 

 

 

 

GVautier
New Contributor III

If there is stack corruption somewhere, the error message may not be pertinent, or an array descriptor containing the bounds may have been altered.

If you cannot reproduce the problem in a simple case, that reinforces the suspicion that the problem comes from elsewhere.

Problems of that kind are very difficult to fix.

GVautier
New Contributor III

I encountered a similar problem 25 years ago. The only way I found was to analyze the assembly source code. I have been able to determine it was a compiler bug.

But today, I don't know if the assembly source code generated by optimization can be understood.

jimdempseyatthecove
Honored Contributor III

/Qipo-S generates a multi-file assembly file (ipo_out.asm).

Jim Dempsey

Moore__John1
Beginner

The non-convergence I commented on is not related to the compilation of the speed-optimised 32-bit code; it was data related.

The bottom line is that simply moving the EXFRC array out of the original nested loops eliminates the runtime access violation error. Using a static array, i.e. making EXFRC static within the loop, also causes no conflict. This behavior cannot be demonstrated on smaller sample code.

Reliable results are obtained for both 32-bit and 64-bit release code across a wide range of validation solutions when EXFRC is removed, indicating that the code fix is stable.
