The following code snippet is from the body of a DO loop over the variable I. When I compile with /O3, ILA2(I) gets overwritten with an impossible value of JJ (which seems to come from the preceding I-loop). This should be semantically impossible: since ISUM == 1, JJ must also have been updated.
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
ISUM = 0
DO J = ILA1(I), ILA2(I)
  select case (MERKV(IAMA(J)))
  case (MERKV_PREDEFINED_UNUSED, MERKV_UNDEFINED)
    ISUM = ISUM + 1
    IV = IAMA(J)
    JJ = J
  end select
ENDDO
IF (ISUM .EQ. 1) THEN
  MERKG(I) = MERKG_USED
  ILAUF = ILAUF + 1
  select case (MERKV(IV))
  case (MERKV_UNDEFINED)
    MERKV(IV) = MERKV_DEFINED
  case (MERKV_PREDEFINED_UNUSED)
    MERKV(IV) = MERKV_PREDEFINED_USED
  end select
  ILA2(I) = JJ   ! <<<--- overwrite happens here
  NREIH = NREIH + 1
  IREIH(NREIH) = I
  IF (K .LE. 3) THEN
    CALL SIMCONV_ADDVARDEF( I, IV )
  ENDIF
ENDIF
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
If I replace
  select case (MERKV(IAMA(J)))
  case (MERKV_PREDEFINED_UNUSED, MERKV_UNDEFINED)
    ISUM = ISUM + 1
    IV = IAMA(J)
    JJ = J
  end select
in the first DO loop with
  if (MERKV(IAMA(J)) .le. MERKV_UNDEFINED) then
    ISUM = ISUM + 1
    IV = IAMA(J)
    JJ = J
  end if
which is semantically equivalent in this program's logic, then /O3 no longer produces erroneous code.
regards
Tobias
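The claimed equivalence of the two tests only holds if the two states listed in the `case` are the only values less than or equal to MERKV_UNDEFINED. The thread does not show the actual constants, so the values below are purely hypothetical; this is a minimal sketch of how the equivalence could be checked for a given set of state codes, not the original program's definitions.

```fortran
program check_equiv
  implicit none
  ! Hypothetical state codes (assumption): the real values are not
  ! given in the thread. They are chosen so that the two "unused /
  ! undefined" states are the smallest ones.
  integer, parameter :: MERKV_PREDEFINED_UNUSED = -1
  integer, parameter :: MERKV_UNDEFINED         = 0
  integer, parameter :: MERKV_DEFINED           = 1
  integer, parameter :: MERKV_PREDEFINED_USED   = 2
  integer :: s
  logical :: by_select, by_if, same

  same = .true.
  do s = MERKV_PREDEFINED_UNUSED, MERKV_PREDEFINED_USED
    ! Form 1: the SELECT CASE test from the original loop
    select case (s)
    case (MERKV_PREDEFINED_UNUSED, MERKV_UNDEFINED)
      by_select = .true.
    case default
      by_select = .false.
    end select
    ! Form 2: the IF test used as a workaround
    by_if = (s <= MERKV_UNDEFINED)
    if (by_select .neqv. by_if) same = .false.
  end do

  if (same) then
    print '(a)', 'PASS'
  else
    print '(a)', 'FAIL'
    error stop 1
  end if
end program check_equiv
```

If the real constants are ordered differently, the IF form selects a different set of states and the workaround would change the program's behavior.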
I just figured out that the optimization error already occurs with /O2 /Qparallel /Qpar-threshold:90 /Qvec-threshold:90.
Does adding just /Qvec- to disable automatic vectorization eliminate the problem? I just found that the vectorizer appears to be broken for a similar loop in which an externally set index is incremented (http://software.intel.com/en-us/forums/topic/472668).
Hi Stuart,
you are right: adding /Qvec- eliminates the problem. So the auto-vectorizer is broken, which IMO breaks the whole optimizer, as you don't know what else is broken.
I would really like to see some comments from Intel's quality management on this issue.
regards
Tobias
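A narrower alternative to switching off vectorization globally with /Qvec- is Intel's `!DIR$ NOVECTOR` directive, which asks the compiler not to vectorize the loop that immediately follows it. Whether this avoids the miscompilation discussed here has not been tested in this thread. The sketch below only shows where the directive would go, using a hypothetical `flags` array in place of the MERKV/IAMA lookup.

```fortran
program novector_sketch
  implicit none
  integer, parameter :: N = 8
  integer :: flags(N)      ! hypothetical stand-in for MERKV(IAMA(J))
  integer :: j, isum, jj

  flags    = 1
  flags(5) = 0             ! exactly one matching entry

  isum = 0
  jj   = -1
  ! Intel-specific directive; other compilers treat it as a comment.
  !DIR$ NOVECTOR
  do j = 1, N
    if (flags(j) == 0) then
      isum = isum + 1
      jj   = j             ! remember the index of the match
    end if
  end do

  if (isum == 1 .and. jj == 5) then
    print '(a)', 'PASS'
  else
    print '(a)', 'FAIL'
    error stop 1
  end if
end program novector_sketch
```

Limiting the directive to the suspect loop keeps vectorization enabled elsewhere, but it also leaves other loops exposed if the same defect affects them.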
OK, Tobias. Looks like we are in the same boat. I agree that this is serious.
Cheers,
Stuart
Please provide a complete test case and we'll be glad to look at it. There's nothing we can do with an excerpt. Disabling the vectorizer significantly changes the code, and it might be hiding a coding error in the application.
@Steve: here is Stuart's code as a complete program, which gives wrong results when compiled with
/nologo /O3 /Qparallel /Qpar-threshold:1 /Qvec-threshold:1 /module:"Release\\" /object:"Release\\" /libs:static /threads /c
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
program Console4
  implicit none
  integer*4 i,n,j,k
  integer*4, parameter :: M1B = 10
  real*8 r,F(M1B),dr,drod,rd(M1B*M1B)
  integer*4 M2(M1B)
  ! Variables
  DO I = 1, M1B
    M2(I)=I
    F(I)=I
  enddo
  DROD=2.0
  RD= 0
  N = 0
  R = 2.0
  DO I = 1, M1B
    K = M2(I)
    DR = DROD*F(I)
    DO J = 1, K
      R = R + DR
      N = N + 1
      RD(N) = R
      R = R + DR
    END DO
  END DO
  DO I = 1, N
    print *, RD(I)
  enddo
  N=I
end program Console4
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
Output of debug version (non-optimized):
4.00000000000000
10.0000000000000
18.0000000000000
28.0000000000000
40.0000000000000
52.0000000000000
66.0000000000000
82.0000000000000
98.0000000000000
114.000000000000
132.000000000000
152.000000000000
172.000000000000
192.000000000000
212.000000000000
234.000000000000
258.000000000000
282.000000000000
306.000000000000
330.000000000000
354.000000000000
380.000000000000
408.000000000000
436.000000000000
464.000000000000
492.000000000000
520.000000000000
548.000000000000
578.000000000000
610.000000000000
642.000000000000
674.000000000000
706.000000000000
738.000000000000
770.000000000000
802.000000000000
836.000000000000
872.000000000000
908.000000000000
944.000000000000
980.000000000000
1016.00000000000
1052.00000000000
1088.00000000000
1124.00000000000
1162.00000000000
1202.00000000000
1242.00000000000
1282.00000000000
1322.00000000000
1362.00000000000
1402.00000000000
1442.00000000000
1482.00000000000
1522.00000000000
Output of release version (optimized):
4.00000000000000
10.0000000000000
18.0000000000000
28.0000000000000
40.0000000000000
52.0000000000000
66.0000000000000
82.0000000000000
98.0000000000000
114.000000000000
132.000000000000
152.000000000000
172.000000000000
192.000000000000
212.000000000000
234.000000000000
258.000000000000
282.000000000000
306.000000000000
330.000000000000
354.000000000000
380.000000000000
408.000000000000
436.000000000000
464.000000000000
492.000000000000
520.000000000000
548.000000000000
578.000000000000
610.000000000000
642.000000000000
674.000000000000
706.000000000000
738.000000000000
770.000000000000
802.000000000000
708.000000000000 <<<<<<<---- the first wrong value
744.000000000000
780.000000000000
816.000000000000
852.000000000000
888.000000000000
924.000000000000
960.000000000000
852.000000000000
890.000000000000
930.000000000000
970.000000000000
1010.00000000000
1050.00000000000
1090.00000000000
1130.00000000000
1170.00000000000
1210.00000000000
1090.00000000000
@Stuart: I hope you don't mind that I took your example. (I have no clue what it does, but it does it wrong in the optimized version ;-)
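Since DR is always positive, every value stored in RD must be larger than the one before it, and the debug output above ends at 1522 after 55 entries. A self-checking variant of the reproducer could test exactly that instead of relying on a visual comparison of the two listings. This is a sketch under those assumptions; the program name `check_rd` and the PASS/FAIL reporting are illustrative and not part of the original test case.

```fortran
program check_rd
  implicit none
  integer, parameter :: M1B = 10
  integer :: i, j, k, n
  integer :: m2(M1B)
  double precision :: r, dr, drod, f(M1B), rd(M1B*M1B)
  logical :: ok

  do i = 1, M1B
    m2(i) = i
    f(i)  = i
  end do
  drod = 2.0d0
  rd   = 0.0d0
  n    = 0
  r    = 2.0d0

  ! Same nested loop as in the reproducer
  do i = 1, M1B
    k  = m2(i)
    dr = drod*f(i)
    do j = 1, k
      r = r + dr
      n = n + 1
      rd(n) = r
      r = r + dr
    end do
  end do

  ! Expected: 55 entries, first 4, last 1522, strictly increasing
  ok = (n == 55) .and. (rd(1) == 4.0d0) .and. (rd(n) == 1522.0d0)
  do i = 2, n
    if (rd(i) <= rd(i-1)) ok = .false.
  end do

  if (ok) then
    print '(a)', 'PASS'
  else
    print '(a)', 'FAIL'
    error stop 1
  end if
end program check_rd
```

In the release listing above, the drop from 802 to 708 would fail the monotonicity test, so a correct build should print PASS and the affected build FAIL.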
No problem. I just posted an even smaller example for Steve under my original posting.
Thanks - got it.
