Hello,
I have an issue with SDE emulating _mm512_permutevar_ps() [aka VPERMPS] in an unexpected way. I understand from the documentation that it should behave as the 512-bit variant of _mm256_permutevar8x32_ps() and be able to do cross-lane shuffling, so the attached file should reverse the contents of the vector. It works with _mm256_permutevar8x32_ps(), but _mm512_permutevar_ps() clearly doesn't produce the expected result; it performs an intra-lane shuffle instead:
iv: iv = 0 1 2 3 4 5 6 7
dv: dv = 7.000 6.000 5.000 4.000 3.000 2.000 1.000 0.000
pv: pv = 0.000 1.000 2.000 3.000 4.000 5.000 6.000 7.000

iv: iv = 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
dv: dv = 15.000 14.000 13.000 12.000 11.000 10.000 9.000 8.000 7.000 6.000 5.000 4.000 3.000 2.000 1.000 0.000
pv: pv = 12.000 13.000 14.000 15.000 8.000 9.000 10.000 11.000 4.000 5.000 6.000 7.000 0.000 1.000 2.000 3.000
Is the emulation wrong, or did I misunderstand something?
Cordially,
- Tags:
- Intel® Advanced Vector Extensions (Intel® AVX)
- Intel® Streaming SIMD Extensions
- Parallel Computing
I found the problem. The documentation at <https://software.intel.com/en-us/node/485351> is wrong: it claims that "_mm512_permutevar_ps" will "Shuffle float32 elements across lanes." It doesn't (unlike _mm512_permutevar_epi32, which does), per the Intrinsics Guide at "https://software.intel.com/sites/landingpage/IntrinsicsGuide/". The intrinsic that permutes across lanes is "_mm512_permutexvar_ps".
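To make the difference concrete, here is a minimal scalar sketch in plain C of the two behaviours discussed above. It is only a model, not Intel's implementation: the function names permute_in_lane and permute_cross_lane are hypothetical, and it assumes that the in-lane form uses only the low 2 bits of each index within a 128-bit lane (4 floats), while the cross-lane form uses the index modulo the vector length.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar model of an in-lane permute, the behaviour observed
 * for _mm512_permutevar_ps: each 128-bit lane holds 4 floats, and only the
 * low 2 bits of each index are used, selecting from the SAME lane. */
void permute_in_lane(float *dst, const float *src,
                     const int32_t *idx, size_t n)
{
    assert(n % 4 == 0);                      /* whole 128-bit lanes only */
    for (size_t i = 0; i < n; i++) {
        size_t lane_base = i & ~(size_t)3;   /* first element of this lane */
        dst[i] = src[lane_base + ((uint32_t)idx[i] & 3u)];
    }
}

/* Hypothetical scalar model of a cross-lane permute, the behaviour expected
 * from _mm512_permutexvar_ps (n = 16) or _mm256_permutevar8x32_ps (n = 8):
 * the low log2(n) bits of each index select from the WHOLE vector. */
void permute_cross_lane(float *dst, const float *src,
                        const int32_t *idx, size_t n)
{
    assert(n != 0 && (n & (n - 1)) == 0);    /* n must be a power of two */
    for (size_t i = 0; i < n; i++) {
        dst[i] = src[(uint32_t)idx[i] & (n - 1)];
    }
}
```

With src[i] = i and idx[i] = 15 - i for n = 16, permute_cross_lane produces the full reversal, while permute_in_lane only reverses each group of four. Assuming the attached program prints elements from highest to lowest, the in-lane result reads 12 13 14 15 8 9 10 11 4 5 6 7 0 1 2 3, which matches the pv line in the question.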