For bitwise operations, there should not be any difference between using different kinds of vector types as input.
Then why are there different intrinsics for epi32 and epi64? And will choosing one over the other make any difference?
In looking back over some old forum issues that didn't get addressed, I came across this one and was curious. So I looked in the Intel® Xeon Phi™ Coprocessor Instruction Set Architecture Reference Manual and found that the format of the underlying instructions is like the following one for xor:
vpxord zmm1 {k1}, zmm2, Si32(zmm3/mt)
The underlying instruction allows masking of elements and a swizzle of the zmm3 vector. Both the mask and the swizzle operate per element, so the hardware needs separate instructions for 32-bit and 64-bit elements. On the intrinsics side, there are variants for when the mask and swizzle are used and variants for when they are not, but both kinds still have to map back to the same underlying instruction. So there really isn't any benefit in having an intrinsic that doesn't specify the element size. For a plain, unmasked bitwise operation the result is the same bits whichever element size you pick.
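To make the masking point concrete, here is a minimal scalar sketch in plain C. It does not use the real intrinsics or any Intel types: `vec512`, `model_mask_xor`, `vec_fill`, and `vec_equal` are hypothetical names invented for this illustration, and the model covers only write-masking, not the swizzle. It assumes one mask bit per element, with unselected elements keeping the value from `src`.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define VEC_BYTES 64 /* one 512-bit register, modeled as raw bytes */

/* Hypothetical stand-in for a 512-bit register; not an Intel type. */
typedef struct { uint8_t b[VEC_BYTES]; } vec512;

/* Scalar model of a write-masked XOR.
 * elem_bytes is 4 for the "epi32" view (16 elements) or 8 for the
 * "epi64" view (8 elements). Bit i of k selects element i; elements
 * whose bit is clear keep the bytes from src. */
static vec512 model_mask_xor(vec512 src, uint32_t k, vec512 a, vec512 b,
                             int elem_bytes)
{
    vec512 r = src;
    int elems = VEC_BYTES / elem_bytes;
    for (int e = 0; e < elems; ++e) {
        if (!((k >> e) & 1u))
            continue; /* masked off: leave src bytes untouched */
        for (int i = 0; i < elem_bytes; ++i) {
            int idx = e * elem_bytes + i;
            r.b[idx] = (uint8_t)(a.b[idx] ^ b.b[idx]);
        }
    }
    return r;
}

/* Helper: a register with every byte set to v. */
static vec512 vec_fill(uint8_t v)
{
    vec512 r;
    memset(r.b, v, VEC_BYTES);
    return r;
}

/* Helper: byte-wise comparison of two modeled registers. */
static int vec_equal(vec512 x, vec512 y)
{
    return memcmp(x.b, y.b, VEC_BYTES) == 0;
}
```

With every element selected (mask `0xFFFF` for the 32-bit view, `0xFF` for the 64-bit view) the two views produce identical bytes, which matches the intuition in the original question. With the same partial mask, say `0x1`, the 32-bit view writes only bytes 0 to 3 while the 64-bit view writes bytes 0 to 7, so the element size changes the result.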