bitwise epi32 vs epi64

ale3 · 02-04-2013

For bitwise operations, there should not be any difference between using different kinds of vector types as input.

Then why are there different intrinsics for epi32 and epi64? And will choosing one over the other make any difference?

Frances_R_Intel · 08-11-2015

In looking back over some old forum issues that didn't get addressed, I came across this one and was curious. So I looked in the Intel® Xeon Phi™ Coprocessor Instruction Set Architecture Reference Manual and found that the underlying instructions have a format like the following one for xor:

vpxord zmm1 {k1}, zmm2, Si32(zmm3/mt)

The underlying instruction allows masking of individual elements and a swizzle of the zmm3 vector. Both of these operate per element, which means you need separate instructions for 32-bit and 64-bit elements. The intrinsics come in two forms, one for when the mask and swizzle are used and one for when neither is used, but both forms still need to map back to the same underlying instruction. So there really isn't any benefit in having an intrinsic that doesn't specify element size.
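
To make the element-size point concrete, here is a small scalar sketch in plain C. It is not the real intrinsics, only a model of what I understand the masked forms (_mm512_mask_xor_epi32 and _mm512_mask_xor_epi64) to do: each mask bit selects one whole element, which either receives the XOR result or keeps the value from src. The function names below (mask_xor_epi32_model and so on) are made up for illustration, and the swizzle is left out entirely.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical scalar model of a masked 512-bit XOR with 32-bit elements:
   bit i of k decides whether element i gets a[i] ^ b[i] or keeps src[i]. */
static void mask_xor_epi32_model(uint32_t dst[16], const uint32_t src[16],
                                 uint16_t k, const uint32_t a[16],
                                 const uint32_t b[16])
{
    for (int i = 0; i < 16; i++)
        dst[i] = ((k >> i) & 1u) ? (a[i] ^ b[i]) : src[i];
}

/* Same model with 64-bit elements: only 8 mask bits, each covering 8 bytes. */
static void mask_xor_epi64_model(uint64_t dst[8], const uint64_t src[8],
                                 uint8_t k, const uint64_t a[8],
                                 const uint64_t b[8])
{
    for (int i = 0; i < 8; i++)
        dst[i] = ((k >> i) & 1u) ? (a[i] ^ b[i]) : src[i];
}

/* Counts the nonzero bytes in a buffer. */
static int count_set_bytes(const void *p, size_t n)
{
    const unsigned char *bytes = p;
    int count = 0;
    for (size_t i = 0; i < n; i++)
        count += (bytes[i] != 0);
    return count;
}

/* XORs all-ones with zero into a zeroed src using only mask bit 0, and
   returns how many bytes of the 64-byte result were written. */
static int changed_bytes_with_mask_bit0(int elem_bits)
{
    assert(elem_bits == 32 || elem_bits == 64);
    if (elem_bits == 32) {
        uint32_t src[16] = {0}, b[16] = {0}, a[16], dst[16];
        memset(a, 0xFF, sizeof a);
        mask_xor_epi32_model(dst, src, 0x0001, a, b);
        return count_set_bytes(dst, sizeof dst);
    } else {
        uint64_t src[8] = {0}, b[8] = {0}, a[8], dst[8];
        memset(a, 0xFF, sizeof a);
        mask_xor_epi64_model(dst, src, 0x01, a, b);
        return count_set_bytes(dst, sizeof dst);
    }
}

/* With every mask bit set, both element sizes produce identical bytes. */
static int full_mask_results_match(void)
{
    unsigned char raw_a[64], raw_b[64];
    for (int i = 0; i < 64; i++) {
        raw_a[i] = (unsigned char)(i * 7 + 1);
        raw_b[i] = (unsigned char)(255 - i);
    }
    uint32_t a32[16], b32[16], d32[16], s32[16] = {0};
    uint64_t a64[8], b64[8], d64[8], s64[8] = {0};
    memcpy(a32, raw_a, 64); memcpy(b32, raw_b, 64);
    memcpy(a64, raw_a, 64); memcpy(b64, raw_b, 64);
    mask_xor_epi32_model(d32, s32, 0xFFFF, a32, b32);
    mask_xor_epi64_model(d64, s64, 0xFF, a64, b64);
    return memcmp(d32, d64, 64) == 0;
}
```

In this model, mask bit 0 updates 4 bytes in the 32-bit version and 8 bytes in the 64-bit version, while a full mask gives byte-identical results either way. That matches the original intuition: without a mask the choice between epi32 and epi64 should not change the result, and with a mask it decides how many bytes each mask bit controls.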