
Choose _mm512_permutevar_epi32 or _mm512_shuffle_epi32/_mm512_permute4f128_epi32?

Kaixi_H_

Hi all,

It appears that _mm512_permutevar_epi32 can perform any data-reordering pattern, as specified by the given index vector. In that case, why do we need the _mm512_permute4f128_epi32 or _mm512_shuffle_epi32 instructions for inter-lane or intra-lane data reordering? IIRC, the Xeon Phi vector architecture contains lane muxes and element muxes to perform inter-lane and intra-lane reordering, respectively. Therefore, even with _mm512_permutevar_epi32, the input vector should still go through both types of muxes, is that right?

I have written a test program that reverses a given vector of 16 elements (from 0-15 to 15-0). One version uses a single _mm512_permutevar_epi32 instruction; the other uses two instructions, _mm512_permute4f128_epi32 and _mm512_shuffle_epi32, each with _MM_PERM_ABCD. The former appears to be about twice as fast as the latter.
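For readers who want to check that the two versions compute the same result, here is a minimal scalar sketch in plain C. It only models the data movement as I understand it (one full 16-element permute versus a 128-bit lane permute followed by an in-lane shuffle). The function names are hypothetical stand-ins, not the intrinsics themselves, and the model says nothing about the muxes or about timing.

```c
#define VEC_ELEMS 16   /* 32-bit elements in one 512-bit vector */
#define LANES 4        /* 128-bit lanes per vector */
#define LANE_ELEMS 4   /* 32-bit elements per lane */

/* Hypothetical model of a full permute: dst[i] = src[idx[i]]. */
static void model_permutevar(int dst[VEC_ELEMS], const int idx[VEC_ELEMS],
                             const int src[VEC_ELEMS]) {
    for (int i = 0; i < VEC_ELEMS; i++)
        dst[i] = src[idx[i] & (VEC_ELEMS - 1)];
}

/* Hypothetical model of an inter-lane permute: whole lanes move,
   lane_sel[l] names the source lane for destination lane l. */
static void model_permute_lanes(int dst[VEC_ELEMS], const int src[VEC_ELEMS],
                                const int lane_sel[LANES]) {
    for (int l = 0; l < LANES; l++)
        for (int e = 0; e < LANE_ELEMS; e++)
            dst[LANE_ELEMS * l + e] = src[LANE_ELEMS * lane_sel[l] + e];
}

/* Hypothetical model of an intra-lane shuffle: the same element
   pattern is applied inside every lane. */
static void model_shuffle_in_lanes(int dst[VEC_ELEMS], const int src[VEC_ELEMS],
                                   const int elem_sel[LANE_ELEMS]) {
    for (int l = 0; l < LANES; l++)
        for (int e = 0; e < LANE_ELEMS; e++)
            dst[LANE_ELEMS * l + e] = src[LANE_ELEMS * l + elem_sel[e]];
}

/* Version 1: reverse with a single full permute. */
static void reverse_one_step(int dst[VEC_ELEMS], const int src[VEC_ELEMS]) {
    int idx[VEC_ELEMS];
    for (int i = 0; i < VEC_ELEMS; i++)
        idx[i] = VEC_ELEMS - 1 - i;
    model_permutevar(dst, idx, src);
}

/* Version 2: reverse the lane order, then reverse within each lane. */
static void reverse_two_step(int dst[VEC_ELEMS], const int src[VEC_ELEMS]) {
    static const int reversed[4] = {3, 2, 1, 0};
    int tmp[VEC_ELEMS];
    model_permute_lanes(tmp, src, reversed);
    model_shuffle_in_lanes(dst, tmp, reversed);
}

/* Returns 1 if both versions turn 0..15 into 15..0, otherwise 0. */
static int both_reverse_correctly(void) {
    int src[VEC_ELEMS], a[VEC_ELEMS], b[VEC_ELEMS];
    for (int i = 0; i < VEC_ELEMS; i++)
        src[i] = i;
    reverse_one_step(a, src);
    reverse_two_step(b, src);
    for (int i = 0; i < VEC_ELEMS; i++)
        if (a[i] != VEC_ELEMS - 1 - i || b[i] != VEC_ELEMS - 1 - i)
            return 0;
    return 1;
}
```

Calling both_reverse_correctly() from a small main is enough to confirm the two paths agree. In this model the two-step version can only express patterns that factor into a lane permute plus one pattern shared by all lanes, while the one-step version takes an arbitrary index per element.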

Thanks,

Kaixi
