The AVX2 instruction set does not contain an ABS function for either the real(4) or real(8) data types. AVX512 does.
I've noticed, using VTune on at least one section of code, that the compiler generates code to load a bit mask from memory to mask off the sign bit. In the sample code, fetching this mask from memory costs 10x as much as the other parts of the statement being executed. My suggestion is to generate the mask using AVX2 register-only instructions:
xor reg,reg (same register, to zero it)
cmpeq reg,reg (same register, to set all 1's)
srl reg,1 (to zero the sign bit and keep 1's in the remainder)
and (to strip the sign from the value)
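For anyone who wants to try this from C, here is a rough sketch of the sequence above using AVX2 intrinsics for real(4) data. The function name `abs_ps_avx2` and the pointer-based interface are my own placeholders, not anything the compiler emits, and the `target` attribute assumes GCC or Clang. Note that an optimizer may well fold the mask back into a constant loaded from memory, so check the generated assembly before drawing conclusions.

```c
#include <immintrin.h>

/* Hypothetical helper: absolute value of 8 floats using a sign mask
 * that is built in registers instead of being loaded from memory.
 * The target attribute (GCC/Clang) enables AVX2 for this function only. */
__attribute__((target("avx2")))
static void abs_ps_avx2(const float *src, float *dst)
{
    __m256 v = _mm256_loadu_ps(src);

    /* xor reg,reg: zero a register */
    __m256i zero = _mm256_setzero_si256();

    /* cmpeq reg,reg: a register compared with itself yields all 1's */
    __m256i ones = _mm256_cmpeq_epi32(zero, zero);

    /* srl reg,1: logical shift right by one clears the top (sign) bit,
     * leaving 0x7FFFFFFF in every 32-bit lane */
    __m256i mask = _mm256_srli_epi32(ones, 1);

    /* and: strip the sign bit from each element */
    __m256 r = _mm256_and_ps(v, _mm256_castsi256_ps(mask));

    _mm256_storeu_ps(dst, r);
}
```

For real(8) the same idea should apply with `_mm256_cmpeq_epi64`, `_mm256_srli_epi64` and `_mm256_and_pd`.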
Jim Dempsey
1 Reply
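Or, better yet, skip the mask entirely:
add the value to itself as integer(4)/integer(8) (this shifts left by one and discards the sign bit)
srl by 1 (shift right logical, which divides by 2 and inserts 0 in the sign position)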
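A rough sketch of this add-then-shift variant for real(8) data, again with AVX2 intrinsics. The function name `abs_pd_avx2_shift` is a placeholder of mine, and the `target` attribute assumes GCC or Clang. Because both steps are integer instructions operating on floating-point data, some microarchitectures may charge a domain-crossing delay, so whether this actually beats the memory load is something to measure rather than assume.

```c
#include <immintrin.h>

/* Hypothetical helper: absolute value of 4 doubles with no mask at all.
 * The target attribute (GCC/Clang) enables AVX2 for this function only. */
__attribute__((target("avx2")))
static void abs_pd_avx2_shift(const double *src, double *dst)
{
    /* Reinterpret the doubles as 64-bit integers (no conversion) */
    __m256i bits = _mm256_castpd_si256(_mm256_loadu_pd(src));

    /* Integer add of the value to itself: shifts left by one,
     * so the sign bit falls off the top of each 64-bit lane */
    __m256i shifted = _mm256_add_epi64(bits, bits);

    /* Shift right logical by one: restores the exponent and mantissa
     * bits and inserts 0 in the sign position */
    __m256i cleared = _mm256_srli_epi64(shifted, 1);

    _mm256_storeu_pd(dst, _mm256_castsi256_pd(cleared));
}
```

For real(4) the equivalent would use `_mm256_add_epi32` and `_mm256_srli_epi32`.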
Jim Dempsey