Hello,
I am compiling Fortran source code for Pentium 4 CPUs with SSE2. To improve speed, all my double precision arrays are 16-byte aligned, and I have added the
!DIR$VECTOR ALIGNED
directive to tell the compiler that the arrays in the following loop are aligned.
The following example fails, however, because the DOUBLE PRECISION variable FAC1 is not aligned on a 16-byte boundary.
!DIR$VECTOR ALIGNED
      DO I=1,N
        E1(I,J)=E1(I,J)-FJAC(I,J)/FAC1
      END DO
FAC1 is an argument of a subroutine. To force 16-byte alignment I used the compiler options
/Qsfalign16
/Zp16
However, FAC1 is still not aligned on a 16-byte boundary. What am I doing wrong, or can somebody tell me what to do?
Regards,
hans
Here is the subroutine:
      SUBROUTINE TEST(N,FJAC,LDJAC,M1,M2,NM1,FAC1,E1,LDE1,IP1,IER,IJOB)
      IMPLICIT REAL*8 (A-H,O-Z)
      DIMENSION FJAC(LDJAC,N),E1(LDE1,NM1),IP1(NM1)
C
      DO J=1,NM1
        JM1=J+M1
        DO I=1,NM1
          E1(I,J)=-FJAC(I,J)
        END DO
        E1(J,J)=E1(J,J)+FAC1
      END DO
      DO J=1,M2
!DIR$VECTOR ALIGNED
        DO I=1,NM1
          E1(I,J)=E1(I,J)-(FJAC(I,J))/FAC1
        END DO
      END DO
      RETURN
      END
3 Replies
You can't force alignment of a dummy argument. You have to attack this in the caller where the associated variable is declared.
Steve
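To make the point about the caller concrete, here is a minimal sketch rather than a definitive recipe. It assumes the Intel-specific ATTRIBUTES ALIGN directive is available in your compiler version and applies it to the variable in the caller that is associated with the dummy argument. The names CALLER and SHOWAL are hypothetical, and LOC is a vendor extension used here only to inspect the address that the subroutine receives.

```fortran
C     Sketch only: align the actual argument where it is declared.
      PROGRAM CALLER
      IMPLICIT NONE
      DOUBLE PRECISION FAC1
C     Intel-specific directive; other compilers see a comment.
!DIR$ ATTRIBUTES ALIGN: 16 :: FAC1
      FAC1 = 2.0D0
      CALL SHOWAL(FAC1)
      END

C     Hypothetical helper: reports the address of its dummy argument.
      SUBROUTINE SHOWAL(X)
      IMPLICIT NONE
      DOUBLE PRECISION X
      INTEGER*8 ADDR
C     LOC is a vendor extension that returns the address of X.
      ADDR = LOC(X)
      PRINT *, 'address mod 16 =', MOD(ADDR, INT(16, 8))
      END
```

If the directive is honored, the remainder printed should be zero. On a compiler that ignores the directive the program still runs, but the alignment is then whatever the compiler chose by default.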
Steve,
Thank you very much for your quick reply.
I am not 100% sure what you mean by dummy arguments. Could you please explain this?
So if I understand you correctly, if I have a subroutine, let's say
      SUBROUTINE FUNC(A,B,C,D)
      code here...
      RETURN
      END
where A, B, C, D are, for example, double precision scalars, then when I call it with, let's say,
      CALL FUNC(PARAM1,PARAM2,PARAM3,PARAM4)
I have to ensure that PARAM1 to PARAM4 are already 16-byte aligned? Am I correct?
And what happens if I call it like this?
      CALL FUNC(1.0D+0, 1.0D+0, 1.0D+0, 1.0D+0)
Regards,
hans
"I have to ensure that PARAM1 to PARAM4 are already 16-byte aligned? Am I correct?"
Yes.
As for constants - hmm, I'm not sure how you can force those to be aligned. There is a switch /Qsfalign16 that supposedly aligns the stack on 16-byte boundaries for functions, but I don't see how that would interact with arguments. You may want to consider putting those constants in variables that are then 16-byte aligned.
Steve
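The suggestion to put the constants in variables could look roughly like the sketch below. It assumes the Intel-specific ATTRIBUTES ALIGN directive is supported by your compiler version. The names CONSTS and P1 to P4 are hypothetical, and FUNC here is a stand-in body, not the routine from the question. Four separate variables are used so that the dummy arguments are not all associated with the same storage.

```fortran
C     Sketch only: pass aligned variables instead of literal constants.
      PROGRAM CONSTS
      IMPLICIT NONE
      DOUBLE PRECISION P1, P2, P3, P4
C     Intel-specific directive; other compilers see a comment.
!DIR$ ATTRIBUTES ALIGN: 16 :: P1, P2, P3, P4
      P1 = 1.0D+0
      P2 = 1.0D+0
      P3 = 1.0D+0
      P4 = 1.0D+0
      CALL FUNC(P1, P2, P3, P4)
      END

C     Hypothetical stand-in for the subroutine being called.
      SUBROUTINE FUNC(A, B, C, D)
      IMPLICIT NONE
      DOUBLE PRECISION A, B, C, D
      PRINT *, 'sum =', A + B + C + D
      END
```

The call site changes from literal constants to named variables, so the alignment is decided where those variables are declared rather than wherever the compiler happens to place the constants.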