The following code snippet is from a DO loop over the variable I. When I compile with /O3, ILA2(I) is overwritten with an impossible value of JJ, which seems to come from the preceding I loop. That should be semantically impossible: ISUM == 1 means the select case branch was taken in the J loop, so JJ must have been updated as well.
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
                ISUM = 0
                DO J=ILA1(I),ILA2(I)
                        select case (MERKV(IAMA(J)))
                        case (MERKV_PREDEFINED_UNUSED,MERKV_UNDEFINED)
                            ISUM= ISUM+1
                            IV  = IAMA(J)
                            JJ  = J
                        end select
                ENDDO
                IF (ISUM .EQ. 1) THEN
                    MERKG(I) = MERKG_USED
                    ILAUF=ILAUF+1
                    
                    select case (MERKV(IV))
                    case (MERKV_UNDEFINED)
                        MERKV(IV) = MERKV_DEFINED
                    case (MERKV_PREDEFINED_UNUSED)
                        MERKV(IV) = MERKV_PREDEFINED_USED
                    end select
                    
                    ILA2(I) = JJ                                    ! <<<--- overwrite happens here
                    NREIH = NREIH+1
                    IREIH(NREIH) = I
                    
                    IF(K .LE. 3) THEN
                        CALL SIMCONV_ADDVARDEF( I, IV )
                    ENDIF
                ENDIF
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
If I replace
                        select case (MERKV(IAMA(J)))
                        case (MERKV_PREDEFINED_UNUSED,MERKV_UNDEFINED)
                            ISUM= ISUM+1
                            IV  = IAMA(J)
                            JJ  = J
                        end select
from the first DO loop with
                        if (MERKV(IAMA(J)) .le. MERKV_UNDEFINED) then
                            ISUM= ISUM+1
                            IV  = IAMA(J)
                            JJ  = J
                        end if
which is semantically equivalent in the program logic, then /O3 no longer produces erroneous code.
regards
Tobias
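The equivalence claimed above depends on the values of the MERKV_* constants, which the post does not show. Below is a minimal sketch that runs both forms of the test over the same data and compares ISUM and JJ. The constant values and the contents of MERKV and IAMA are hypothetical, chosen so that MERKV_PREDEFINED_UNUSED and MERKV_UNDEFINED are the only values less than or equal to MERKV_UNDEFINED.

```fortran
program select_vs_if
    implicit none
    ! Hypothetical values: the thread does not show the real constants.
    integer, parameter :: MERKV_PREDEFINED_UNUSED = -1
    integer, parameter :: MERKV_UNDEFINED         = 0
    integer, parameter :: MERKV_DEFINED           = 1
    integer, parameter :: MERKV_PREDEFINED_USED   = 2
    integer :: MERKV(4), IAMA(4)
    integer :: J, ISUM_A, JJ_A, ISUM_B, JJ_B

    ! Hypothetical data: only the entry at J = 3 should match.
    MERKV = [MERKV_DEFINED, MERKV_PREDEFINED_USED, MERKV_UNDEFINED, MERKV_DEFINED]
    IAMA  = [1, 2, 3, 4]

    ! Form A: the select case test from the original post.
    ISUM_A = 0
    JJ_A   = 0
    do J = 1, 4
        select case (MERKV(IAMA(J)))
        case (MERKV_PREDEFINED_UNUSED, MERKV_UNDEFINED)
            ISUM_A = ISUM_A + 1
            JJ_A   = J
        end select
    end do

    ! Form B: the .le. test from the workaround.
    ISUM_B = 0
    JJ_B   = 0
    do J = 1, 4
        if (MERKV(IAMA(J)) .le. MERKV_UNDEFINED) then
            ISUM_B = ISUM_B + 1
            JJ_B   = J
        end if
    end do

    print *, 'select case: ISUM =', ISUM_A, ' JJ =', JJ_A
    print *, 'if (.le.):   ISUM =', ISUM_B, ' JJ =', JJ_B
    if (ISUM_A == ISUM_B .and. JJ_A == JJ_B) then
        print *, 'both forms agree'
    else
        print *, 'forms differ'
    end if
end program select_vs_if
```

With these assumed values both forms should report ISUM = 1 and JJ = 3. If the real constants are ordered differently, the .le. test is not a drop-in replacement for the select case.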
I just figured out that the optimization error already happens with /O2 /Qparallel /Qpar-threshold:90 /Qvec-threshold:90
Does adding just /Qvec- to disable automatic vectorization eliminate the problem? I just found that the vectorizer appears to be broken with a similar loop with an externally set index being incremented (http://software.intel.com/en-us/forums/topic/472668).
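For narrowing the problem down to a single loop, the Intel-specific `!DIR$ NOVECTOR` directive asks the compiler not to vectorize the DO loop that follows it, which is less drastic than /Qvec- for the whole file. The sketch below applies it to a hypothetical loop of the kind described above, with an index that is set outside the loop nest and incremented inside it. The array name, sizes, and values are made up for illustration, and the thread does not confirm that the directive avoids the miscompile.

```fortran
program novector_sketch
    implicit none
    integer, parameter :: M = 4            ! hypothetical size
    integer :: i, j, n
    integer :: a(M*M)

    a = 0
    n = 0                                  ! index is set outside the loop nest
    do i = 1, M
        ! Intel-specific directive; other compilers treat it as a comment.
        !DIR$ NOVECTOR
        do j = 1, i
            n = n + 1                      ! ... and incremented inside it
            a(n) = 10*i + j
        end do
    end do

    print *, a(1:n)
end program novector_sketch
```

If the wrong results disappear with the directive on one loop but not on another, that points at the loop the vectorizer is mishandling. Placing the directive on the inner loop is an assumption; it may need to go on the outer loop instead.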
Hi Stuart,
You are right: adding /Qvec- eliminates the problem. So the auto-vectorizer is broken, which IMO breaks the whole optimizer, as you don't know what else is broken.
I would really like to see some comments from Intel's quality management on this issue.
regards
Tobias
OK, Tobias. Looks like we are in the same boat. I agree that this is serious.
Cheers,
Stuart
Please provide a complete test case and we'll be glad to look at it. There's nothing we can do with an excerpt. Disabling the vectorizer significantly changes the generated code, and it might be hiding a coding error in the application.
@Steve: here is Stuart's code as a complete program, which gives wrong results when compiled with
/nologo /O3 /Qparallel /Qpar-threshold:1 /Qvec-threshold:1 /module:"Release\\" /object:"Release\\" /libs:static /threads /c
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
    program Console4
    implicit none
    integer*4 i,n,j,k
    
    
    integer*4, parameter :: M1B = 10
    real*8 r,F(M1B),dr,drod,rd(M1B*M1B)
    integer*4 M2(M1B)
    ! Variables
         DO I = 1, M1B
             M2(I)=I
             F(I)=I
         enddo    
        DROD=2.0
        RD= 0
        N = 0
        R = 2.0
         DO I = 1, M1B
          K = M2(I)
          DR = DROD*F(I)
          DO J = 1, K
            R = R + DR
            N = N + 1
            RD(N) = R
            R = R + DR
          END DO
         END DO        
         DO I = 1, N
             print *, RD(I)
         enddo    
         
         N=I
    end program Console4
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
Output of debug version (non-optimized):
   4.00000000000000
   10.0000000000000
   18.0000000000000
   28.0000000000000
   40.0000000000000
   52.0000000000000
   66.0000000000000
   82.0000000000000
   98.0000000000000
   114.000000000000
   132.000000000000
   152.000000000000
   172.000000000000
   192.000000000000
   212.000000000000
   234.000000000000
   258.000000000000
   282.000000000000
   306.000000000000
   330.000000000000
   354.000000000000
   380.000000000000
   408.000000000000
   436.000000000000
   464.000000000000
   492.000000000000
   520.000000000000
   548.000000000000
   578.000000000000
   610.000000000000
   642.000000000000
   674.000000000000
   706.000000000000
   738.000000000000
   770.000000000000
   802.000000000000
   836.000000000000
   872.000000000000
   908.000000000000
   944.000000000000
   980.000000000000
   1016.00000000000
   1052.00000000000
   1088.00000000000
   1124.00000000000
   1162.00000000000
   1202.00000000000
   1242.00000000000
   1282.00000000000
   1322.00000000000
   1362.00000000000
   1402.00000000000
   1442.00000000000
   1482.00000000000
   1522.00000000000
Output of release version (optimized):
   4.00000000000000
   10.0000000000000
   18.0000000000000
   28.0000000000000
   40.0000000000000
   52.0000000000000
   66.0000000000000
   82.0000000000000
   98.0000000000000
   114.000000000000
   132.000000000000
   152.000000000000
   172.000000000000
   192.000000000000
   212.000000000000
   234.000000000000
   258.000000000000
   282.000000000000
   306.000000000000
   330.000000000000
   354.000000000000
   380.000000000000
   408.000000000000
   436.000000000000
   464.000000000000
   492.000000000000
   520.000000000000
   548.000000000000
   578.000000000000
   610.000000000000
   642.000000000000
   674.000000000000
   706.000000000000
   738.000000000000
   770.000000000000
   802.000000000000
   708.000000000000             <<<<<<<---- the first wrong value
   744.000000000000
   780.000000000000
   816.000000000000
   852.000000000000
   888.000000000000
   924.000000000000
   960.000000000000
   852.000000000000
   890.000000000000
   930.000000000000
   970.000000000000
   1010.00000000000
   1050.00000000000
   1090.00000000000
   1130.00000000000
   1170.00000000000
   1210.00000000000
   1090.00000000000
@Stuart: I hope you don't mind that I took your example. (I have no clue what it does, but it does it wrong in the optimized version ;-)
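Comparing the two long listings by eye is error-prone, so here is a sketch of a self-checking variant of the program above. It keeps the same setup and loop nest, then compares every RD entry against a closed form. The closed form is derived here from the posted loop and is not part of the thread: each inner iteration adds 2*DR to R, with DR = 2*I and I inner iterations, so R at the start of outer iteration I is 2 + 4*(1**2 + ... + (I-1)**2), and the stored value for (I,J) is that plus 2*I*(2*J-1). The names idx, sumsq, nbad, and expected are introduced for illustration.

```fortran
program check_rd
    implicit none
    integer, parameter :: M1B = 10
    integer :: i, j, k, n, idx, sumsq, nbad
    integer :: M2(M1B)
    double precision :: r, dr, drod, expected
    double precision :: F(M1B), rd(M1B*M1B)

    ! Same setup and loop nest as the posted Console4 program.
    do i = 1, M1B
        M2(i) = i
        F(i)  = i
    end do
    drod = 2.0d0
    rd   = 0.0d0
    n    = 0
    r    = 2.0d0
    do i = 1, M1B
        k  = M2(i)
        dr = drod*F(i)
        do j = 1, k
            r = r + dr
            n = n + 1
            rd(n) = r
            r = r + dr
        end do
    end do

    ! Compare against the closed form. idx and sumsq are computed
    ! directly from i and j, so the check uses no running counter.
    nbad = 0
    do i = 1, M1B
        sumsq = (i - 1)*i*(2*i - 1)/6        ! 1**2 + ... + (i-1)**2
        do j = 1, i
            idx = i*(i - 1)/2 + j            ! position of (i,j) in rd
            expected = 2.0d0 + 4.0d0*sumsq + 2.0d0*i*(2*j - 1)
            if (rd(idx) /= expected) then
                nbad = nbad + 1
                print *, 'mismatch at', idx, ': got', rd(idx), ' expected', expected
            end if
        end do
    end do

    if (nbad == 0) then
        print *, 'all', n, 'values match'
    else
        print *, nbad, 'of', n, 'values differ'
    end if
end program check_rd
```

All values involved are small integers that are exactly representable in double precision, so the exact comparison is safe here. The closed form reproduces the debug listing, for example 836 at position 37 and 1522 at position 55. Note that the check is compiled with the same optimizer settings as the loop under test, so a clean result is only as trustworthy as the compiler's handling of the checking loop.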
No problem. I just posted an even smaller example for Steve under my original posting.
Thanks - got it.