I've just upgraded to version 2020.0.166, and on testing it with existing code I found some unexplained behavior.
The offending code section is shown below. The arrays are all dynamic and this code is located in the main program.
The compiled program runs OK for WIN32 debug, x64 debug and x64 release but fails for WIN32 release.
The WIN32 release fails with: forrtl: severe (157): Program Exception - access violation
This is the problem section of code.
      DO 305 ICASE=1,NLCASE
      TOTFACT(ICASE)=TFAC(LCURVE(ICASE))*CFACT(ICASE)
      DO 305 I=1,NXE
      DO 305 J=1,12
  305 FRC(I,J)=FRC(I,J)+EXFRC(I,J)+CUFRC(I,J,ICASE)*TFAC(LCURVE(ICASE))
     &*CFACT(ICASE)
Re-arranging the code has shown that the issue lies in addressing the 3-D array. If a print statement is placed in either of the two inner loops (I or J), the code runs OK. Just adding print* is enough to make it run.
Is there any explanation for this behavior?
I've seen some issues with nested "DO nnn ..." where all the nests use the same tag.
Also, while the compiler optimization should be able to swap the I and J loops, it won't hurt to help it make the decision.
      DO ICASE=1,NLCASE
        TOTFACT(ICASE)=TFAC(LCURVE(ICASE))*CFACT(ICASE)
        DO J=1,12
          DO I=1,NXE
            FRC(I,J)=FRC(I,J)+EXFRC(I,J)+CUFRC(I,J,ICASE)
     &        *TFAC(LCURVE(ICASE))*CFACT(ICASE)
          END DO
        END DO
      END DO
Jim Dempsey
I've tried numerous arrangements of the loops, all with the same result, even including dummy variables.
It only works when diagnostic print statements are included in an attempt to catch the error, which then fails to occur. It works even when just printing a blank line.
      DO 305 ICASE=1,NLCASE
      TOTFACT(ICASE)=TFAC(LCURVE(ICASE))*CFACT(ICASE)
      DO 305 I=1,NXE
      print*
      DO 305 J=1,12
  305 FRC(I,J)=FRC(I,J)+EXFRC(I,J)+CUFRC(I,J,ICASE)*TFAC(LCURVE(ICASE))
     &*CFACT(ICASE)
Did you try changing the nested DO 305 statements to plain DO statements (without the 305) and adding the three END DO statements after the 305 FRC(... line?
(The 305 label on that line can then be removed.)
If this does not work, please make a simplified reproducer and post it on the Bug page (button on the forum section page).
Jim Dempsey
If the array FRC is zero before entering the lines of code shown in #1, you could try using array expressions:
frc(1:nxe,1:12) = nlcase * exfrc(1:nxe,1:12)
totfact(1:nlcase) = tfac(lcurve(1:nlcase))*cfact(1:nlcase)
do icase = 1, nlcase
   frc(1:nxe,1:12) = frc(1:nxe,1:12) + cufrc(1:nxe,1:12,icase) * totfact(icase)
end do
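The condition that FRC is zero beforehand matters because the original loop adds EXFRC once per load case, which is where the factor nlcase comes from. The following is a minimal sketch that compares the two forms on made-up data; the program name, the array sizes and all input values are assumptions for illustration only, not taken from the original program.

```fortran
program array_expr_check
  implicit none
  ! Hypothetical small sizes and data, chosen only for illustration.
  integer, parameter :: nxe = 4, nlcase = 3, ncurve = 2
  real :: frc_loop(nxe,12), frc_expr(nxe,12), exfrc(nxe,12)
  real :: cufrc(nxe,12,nlcase), tfac(ncurve), cfact(nlcase), totfact(nlcase)
  integer :: lcurve(nlcase), i, j, icase

  call random_number(exfrc)
  call random_number(cufrc)
  tfac   = [1.5, 0.5]
  cfact  = [1.0, 2.0, 3.0]
  lcurve = [1, 2, 1]

  ! Loop form from post #1, starting from FRC = 0.
  frc_loop = 0.0
  do icase = 1, nlcase
     totfact(icase) = tfac(lcurve(icase))*cfact(icase)
     do j = 1, 12
        do i = 1, nxe
           frc_loop(i,j) = frc_loop(i,j) + exfrc(i,j) + cufrc(i,j,icase)*totfact(icase)
        end do
     end do
  end do

  ! Array-expression form: EXFRC is added nlcase times in total.
  frc_expr = nlcase*exfrc
  totfact  = tfac(lcurve)*cfact
  do icase = 1, nlcase
     frc_expr = frc_expr + cufrc(:,:,icase)*totfact(icase)
  end do

  ! The two forms should agree to within single-precision rounding.
  if (maxval(abs(frc_loop - frc_expr)) > 1.0e-4) error stop 'forms disagree'
  print *, 'max difference:', maxval(abs(frc_loop - frc_expr))
end program array_expr_check
```

If FRC held non-zero values on entry, the first assignment in the array-expression form would overwrite them, so the two forms would no longer be equivalent.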
Moore, John wrote: It only works when diagnostic print statements are included in an attempt to catch the error, which then fails to occur. It works even when just printing a blank line.
This is often symptomatic of stack corruption due to an array overflow or to invalid function or subroutine calls somewhere in the program.
There is definitely no issue with the specific code as I tried so many variations.
It works OK if I set the Fortran optimization to Minimum Size and Favor Fast Code. With the previous compiler version I used Maximum Speed and Favor Fast Code with this program. However, the newer compiler would appear to produce faster code with the Minimum Size option - so all good.
Just hope there's not a ticking time bomb because in my experience irregular behaviour is usually due to dubious coding.
If changing optimization options "solves the problem", your problem lies elsewhere.
LCURVE, CFACT, EXFRC, CUFRC, TFAC: are they arrays or functions?
I still think that you have an array overflow, probably somewhere in a subroutine or function call.
Try enabling all runtime check options.
All variables are dynamic arrays and nothing comes up when all runtime checks are enabled.
After a bit more probing, cause and effect indicate to me an optimisation issue with the fast code option.
The solution runs when speed optimisation is not active, and it has always run in debug mode, which has no speed optimisation. This indicates that speed optimisation is a controlling factor. Additionally, the fact that a simple print* statement enables the program to run would imply that the presence of the print* statement prevents the optimisation of this section of code.
With this in mind I re-arranged the code, i.e. eliminated the inner loop as shown below, and the program runs OK.
DO ICASE=1,NLCASE
TOTFACT(ICASE)=TFAC(LCURVE(ICASE))*CFACT(ICASE)
CUFAC=TOTFACT(ICASE)
DO I=1,NXE
! DO J=1,12
FRC(I,1:12)=FRC(I,1:12)+EXFRC(I,1:12)+CUFRC(I,1:12,ICASE)*CUFAC
! ENDDO
ENDDO
ENDDO
Logic would indicate that there is an issue with the speed optimisation of CUFRC, a dynamic single precision 3D array, in this section of code (using Do constructs). Is it a 3D dynamic array issue? There are about 120 dynamic arrays in the program but only one 3D array.
Replacing CUFRC in the above code with a static array scufrc, filled beforehand using scufrc(1:nxe,1:12,1:1)=cufrc(1:nxe,1:12,1:1), shows that a static array works and the solution runs OK with the J loop active. Then, surprisingly, cufrc itself also worked with the J loop active, provided scufrc(1:nxe,1:12,1:1)=cufrc(1:nxe,1:12,1:1) was still present beforehand.
Conclusion: I shall avoid, where possible, using Do constructs to equate arrays in future code. For example, the original 5 line code section is now in the following 5 line format:
DO ICASE=1,NLCASE
TOTFACT(ICASE)=TFAC(LCURVE(ICASE))*CFACT(ICASE)
CUFAC=TOTFACT(ICASE)
FRC(1:NXE,1:12)=FRC(1:NXE,1:12)+EXFRC(1:NXE,1:12)+CUFRC(1:NXE,1:12,ICASE)*CUFAC
ENDDO
John,
the reason that I suggested you swap the i and j loops to thus:
      DO ICASE=1,NLCASE
        TOTFACT(ICASE)=TFAC(LCURVE(ICASE))*CFACT(ICASE)
        DO J=1,12
          DO I=1,NXE
            FRC(I,J)=FRC(I,J)+EXFRC(I,J)+CUFRC(I,J,ICASE)
     &        *TFAC(LCURVE(ICASE))*CFACT(ICASE)
          END DO
        END DO
      END DO
is because the varying of the 1st index advances one cell in memory. IOW
array(I+1,J) immediately follows array(I,J)
whereas array(I,J+1) follows array(I,J) by the size of the 1st dimension
Your original code was using strided references. This complicates the opportunities for vectorization. When compiling with optimizations, the compiler assesses those opportunities and, when opportune, generates vectorized code. And in this case.... bad code.
By rearranging the indexing, you would have eliminated the strided reference, and thus may have not only eliminated the error but also improved performance.
Your latest format, eliminating the I and J loops, will permit the compiler to see that it can favorably vectorize the code.
The reason I mention this here is that, while this specific problem is resolved, other places in your code may be using an unfavorable loop nest order. Loop order is something you should be paying attention to.
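As a small illustration of the storage order described above, the sketch below fills an array in memory order and then visits it with the first index in the innermost loop. The program name and the sizes n1 and n2 are hypothetical; only the column-major layout itself is a property of the language.

```fortran
program column_major_demo
  implicit none
  integer, parameter :: n1 = 3, n2 = 4   ! hypothetical sizes
  integer :: a(n1,n2), i, j, k

  ! RESHAPE fills A in array element (storage) order,
  ! so the k-th cell in memory receives the value k.
  a = reshape([(k, k = 1, n1*n2)], [n1, n2])

  ! a(i+1,j) is the next cell; a(i,j+1) is n1 cells further on.
  if (a(2,1) - a(1,1) /= 1)  error stop 'unexpected layout'
  if (a(1,2) - a(1,1) /= n1) error stop 'unexpected layout'

  ! First index innermost: the loop walks memory with unit stride.
  k = 0
  do j = 1, n2
     do i = 1, n1
        k = k + 1
        if (a(i,j) /= k) error stop 'not visited in storage order'
     end do
  end do
  print *, 'visited', k, 'elements in storage order'
end program column_major_demo
```

Swapping the two DO statements would visit the same elements with a stride of n1 cells between consecutive accesses.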
Jim Dempsey
I will throw out the usual caution that a behavior change under optimization is not necessarily an indication of invalid optimization. More often it reveals a program source bug that was hidden otherwise. The same goes for changes when you insert a print statement.
What I would sometimes do is capture the inputs to the suspect code and create a standalone test case that uses the input to perform whatever the operation was. Ideally this would be at the subroutine/function level with a "driver" main program that sets things up.
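A minimal sketch of such a standalone test case follows. The module name, subroutine name, sizes and input values are all placeholders invented for illustration; in a real test the inputs would be read from data captured from the actual program just before the suspect code.

```fortran
module suspect_mod
  implicit none
contains
  ! The suspect loop nest, isolated as a subroutine.
  subroutine accumulate(frc, exfrc, cufrc, tfac, cfact, lcurve, totfact, nxe, nlcase)
    integer, intent(in) :: nxe, nlcase
    real, intent(inout) :: frc(:,:)
    real, intent(in)    :: exfrc(:,:), cufrc(:,:,:), tfac(:), cfact(:)
    integer, intent(in) :: lcurve(:)
    real, intent(out)   :: totfact(:)
    integer :: i, j, icase
    do icase = 1, nlcase
       totfact(icase) = tfac(lcurve(icase))*cfact(icase)
       do i = 1, nxe
          do j = 1, 12
             frc(i,j) = frc(i,j) + exfrc(i,j) + cufrc(i,j,icase)*totfact(icase)
          end do
       end do
    end do
  end subroutine accumulate
end module suspect_mod

program driver
  use suspect_mod
  implicit none
  integer, parameter :: nxe = 5, nlcase = 2   ! placeholder sizes
  real, allocatable :: frc(:,:), exfrc(:,:), cufrc(:,:,:)
  real, allocatable :: tfac(:), cfact(:), totfact(:)
  integer, allocatable :: lcurve(:)

  allocate(frc(nxe,12), exfrc(nxe,12), cufrc(nxe,12,nlcase))
  allocate(tfac(1), cfact(nlcase), totfact(nlcase), lcurve(nlcase))

  ! Placeholder inputs standing in for captured data.
  frc = 0.0;  exfrc = 1.0;  cufrc = 2.0
  tfac = 0.5; cfact = 1.0;  lcurve = 1

  call accumulate(frc, exfrc, cufrc, tfac, cfact, lcurve, totfact, nxe, nlcase)

  ! Each element: 2 cases * (1.0 + 2.0*0.5) = 4.0
  if (any(abs(frc - 4.0) > 1.0e-6)) error stop 'unexpected result'
  print *, 'frc(1,1) =', frc(1,1)
end program driver
```

Putting the loop in a module procedure gives it an explicit interface, so argument mismatches between the driver and the routine are caught at compile time.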
It is inappropriate to assume that some particular construct should be avoided just because, in your program, that construct was related to a problem.
As always, if you can't identify a program bug, send support a test case and let them poke at it.
Thanks for the advice and suggestions, much appreciated.
Jim, yes, changing the loop order did eliminate the conflict. The array does nothing special other than act as a fast memory dump for time dependent data, so the loop order is not significant. But this is rarely the case, and more often the loop order cannot be changed, e.g. when manipulating matrices. What you're implying is that if the code cannot be optimised, the compiler can/will produce junk if optimisation is active. This would be very worrying.
Steve, your comment regarding the non-avoidance of legal constructs is very true, and thankfully so. The original Do construct works perfectly well if speed optimisation is not active or when the fixes below are implemented.
I have two compilers side by side. Both use the same optimisation settings on the same code. When Maximum Speed Optimisation is active, the code built with the later version does not run but the code built with the earlier version does. This indicates a change in compiler behaviour with identical, perfectly legal code - a bug?
The later version would appear to be attempting to optimise a section of code that it should not, whereas the earlier version does not. Each of the following measures has prevented optimisation with the new compiler and enabled the original Do construct code to run.
- Adding a print* statement in either of the inner loops
- Changing the loop order
- Switching the index order of 3D array CUFRC(I,J,ICASE) to CUFRC(ICASE,I,J)
- Using array subscript expressions
- Removing the EXFRC array in line 305
In my book this indicates a compiler bug, but one with a simple workaround.
Although the upgrade is always a pain, the latest compiler is worth the effort. Indications are that the 64 bit compilation produces code that is 40% faster than the 32 bit version (both with fast code optimisation). The older compiler did not produce such improvements.
If it is a compiler bug, the following simple code, completed with the array and parameter declarations,
      DO 305 ICASE=1,NLCASE
      TOTFACT(ICASE)=TFAC(LCURVE(ICASE))*CFACT(ICASE)
      DO 305 I=1,NXE
      DO 305 J=1,12
  305 FRC(I,J)=FRC(I,J)+EXFRC(I,J)+CUFRC(I,J,ICASE)*TFAC(LCURVE(ICASE))
     &*CFACT(ICASE)
should reproduce the error. Does it?
What are the values of NLCASE and NXE? Is NXE greater than 12? If it is, what happens if you set NXE to 12?
Rather than compiler bug a more appropriate description would be irregular behaviour under a specific range of circumstances. In other words "I can't demonstrate the problem in a sample of simple code".
Provided the values of nxe (1-32000) and nlcase (1-999) are within the bounds of the array declarations, this code would be expected to run, and it always does except when Speed optimisation is active in the latest compiler and with this particular program. Just to provide a bit of background, this section of code was added to this program in the late 90's (Watcom 77 compiler) and has never been touched since that time. Other parts of the code have obviously been changed, but not these specific lines. The program has since been compiled on Compaq/Intel compilers.
I have re-created this section of code in a small program using the same data declarations, and even reading the data from the original program prior to the crash. This test program runs without issue - perfect behaviour. Furthermore, the difference between Speed and Code Size optimisation is very evident from the cpu timings. Contrary to an earlier opinion that this section of code may not be optimised, it is. The timings are 2.68, 0.32 and 0.125 for no, size and speed optimisation respectively - on fire!
Here are another couple of fixes that I discovered prevent the access violation error. Consider the J loop. If J is a 2-byte integer, it will run provided the 12 is replaced by a variable. If J is a 4-byte integer, it runs with either the 12 or a variable. That makes 7 minor changes that make this section of code behave as expected. Is this not strange?
However, this behaviour is only on one particular model. On another validation model I was surprised to see that the model's solution did not converge. No violation error, the program ran, but no valid solution.
In my experience WIN32 compiled with Maximum Speed Optimisation is not reliable.
Speed optimised WIN32 will not be used again. 64 bit for future release code (long overdue).
If there is stack corruption somewhere, the error message may not be pertinent, or an array descriptor containing the bounds may have been altered.
If you cannot reproduce the problem in a simple case, that reinforces the suspicion that the problem comes from elsewhere.
That kind of problem is very difficult to fix.
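One way to look for an altered descriptor is to verify the array bounds explicitly just before the suspect loop, using the standard ALLOCATED, LBOUND, UBOUND and SHAPE intrinsics. The sketch below is only an illustration under assumed sizes; the program name, the check_bounds routine and the values are hypothetical. As with the print statement in this thread, adding such checks may itself change the generated code.

```fortran
program bounds_check
  implicit none
  integer, parameter :: nxe = 5, nlcase = 2   ! hypothetical sizes
  real, allocatable :: frc(:,:), exfrc(:,:), cufrc(:,:,:), tfac(:)
  integer, allocatable :: lcurve(:)

  allocate(frc(nxe,12), exfrc(nxe,12), cufrc(nxe,12,nlcase))
  allocate(tfac(3), lcurve(nlcase))
  lcurve = 1

  call check_bounds()

contains

  ! Stops with a message if any array cannot hold the loop ranges
  ! 1:NXE, 1:12, 1:NLCASE, or if LCURVE would index outside TFAC.
  subroutine check_bounds()
    if (.not. allocated(cufrc)) error stop 'CUFRC not allocated'
    if (any(lbound(cufrc) /= 1)) error stop 'CUFRC lower bounds altered'
    if (any(ubound(cufrc) < [nxe, 12, nlcase])) error stop 'CUFRC too small'
    if (any(ubound(frc)   < [nxe, 12])) error stop 'FRC too small'
    if (any(ubound(exfrc) < [nxe, 12])) error stop 'EXFRC too small'
    if (minval(lcurve(1:nlcase)) < lbound(tfac,1) .or. &
        maxval(lcurve(1:nlcase)) > ubound(tfac,1)) &
       error stop 'LCURVE index outside TFAC'
    print *, 'bounds OK; shape(cufrc) =', shape(cufrc)
  end subroutine check_bounds

end program bounds_check
```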
I encountered a similar problem 25 years ago. The only way forward I found was to analyze the assembly source code. I was able to determine that it was a compiler bug.
But today, I don't know whether the assembly source code generated with optimization can be understood.
/Qipo-S generates a multi-file assembly file (ipo_out.asm).
Jim Dempsey
The non-convergence I commented on is not related to the compilation of the speed optimised 32 bit code; it was data related.
The bottom line is that simply removing the EXFRC array from the original nested loops eliminates the runtime access violation error. Using a static array, i.e. making EXFRC static within the loop, also causes no conflict. This behavior cannot be demonstrated in smaller sample code.
Reliable results are obtained for both 32 and 64 bit release code across a wide range of validation solutions when EXFRC is removed, indicating that the code fix is stable.