- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Colleagues,
I need to form the cross products of a single vector with a long string of other vectors (all vectors 3D). I have not been able to see how to vectorize this process in Fortran. C++ allows insertion of assembler, so something can be hand-crafted in that compiler. But it's not portable. Any suggestions?
David
Link Copied
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Fortran allows the programmer to write out steps using vector syntax that the compiler could, in principle, translate directly into efficient SIMD code. However, I have never seen ifort perform such a translation. Can you perhaps be a bit more specific as to the KINDs of your data and how they are laid out in memory?
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
All the vectors involved use 4-byte floats. The layout in memory can be changed to make vectorization possible. But, for example, in other vectorized operations in the same section of code, the data is already laid out Vector(1:N, 1:3) rather than Vector(1:3,1:N). That is all the x coordinates are in contiguous memory. As with the y and z-coordinates.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Can you explain in a little more detail what your in's and out's are. From my read (which may be incorrect). Pseudo code
LongVectorOf3PointVectors
Crossing3PointVector
For Each 3PointVectorReference v in LongVectorOf3PointVectors
v = v cross Crossing3PointVector
end for
IOW you seem to be asking for an in situ cross. Is this correct?
Or, is the output going to a different LongVectorOf3PointVectors?
You might want to make a variation on this:
! in some module ! AVX four-up double vector type TypeYMM SEQUENCE real(8) :: v(0:3) end type TypeYMM ... RECURSIVE SUBROUTINE CROSS_ymm_ymm_ymm(VA,VB,VC) use MOD_ALL ! contains types type(TypeYMM) :: VA(3), VB(3), VC(3) type(TypeYMM), automatic :: VS(3) if((loc(VC) .eq. loc(VA)) .or. (loc(VC) .eq. loc(VB))) then ! COMPUTE THE CROSS PRODUCT !DEC$ VECTOR ALWAYS VS(1)%v = VA(2)%v*VB(3)%v - VA(3)%v*VB(2)%v !DEC$ VECTOR ALWAYS VS(2)%v = VA(3)%v*VB(1)%v - VA(1)%v*VB(3)%v !DEC$ VECTOR ALWAYS VS(3)%v = VA(1)%v*VB(2)%v - VA(2)%v*VB(1)%v ! TRANSFER THE CROSS PRODUCT RESULT TO THE OUTPUT VECTOR !DEC$ VECTOR ALWAYS VC(1)%v = VS(1)%v !DEC$ VECTOR ALWAYS VC(2)%v = VS(2)%v !DEC$ VECTOR ALWAYS VC(3)%v = VS(3)%v else ! COMPUTE THE CROSS PRODUCT without temps !DEC$ VECTOR ALWAYS VC(1)%v = VA(2)%v*VB(3)%v - VA(3)%v*VB(2)%v !DEC$ VECTOR ALWAYS VC(2)%v = VA(3)%v*VB(1)%v - VA(1)%v*VB(3)%v !DEC$ VECTOR ALWAYS VC(3)%v = VA(1)%v*VB(2)%v - VA(2)%v*VB(1)%v endif END SUBROUTINE CROSS_ymm_ymm_ymm
Jim Dempsey
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
The obvious thing to do is
cross_product = reshape([single(2)*vector(:,3)-single(3)*vector(:,2),single(3)*vector(:,1)-single(1)*vector(:,3),single(1)*vector(:,2)-single(2)*vector(:,1)],shape(vector))
which gives the compiler maximum information and freedom to vectorize. However, don't be surprised if given sufficient rope, the compiler manages to hang itself. Are you targeting systems with xmm, ymm, or zmm registers?
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
In a bored moment I ran some quick cross product tests. Vectorisation ( for real(4) ) did speed it up but even without I could do 600,000 cross products in less than 1ms on a basic laptop (I5 processor). Is it worth worrying about?
- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page