In the User and Reference Guide for the Intel C++ Compiler 15.0, the description of the AVX2 intrinsics _mm256_packus_epi16/32 states (the relevant word is "unsigned" as applied to the source operands):
The _mm256_packus_epi16 intrinsic converts 16 packed unsigned word integers from source operands a and b into 32 packed unsigned byte integers. The _mm256_packus_epi32 intrinsic converts eight packed unsigned doubleword integers from the source operands a and b into 16 packed unsigned word integers.
The signedness of the source words/doublewords in the description disagrees with the first sentence of the summary:
Pack signed word/doubleword integers to unsigned byte/word integers and saturates.
...as well as the description of the corresponding AVX2 instructions (VPACKUSWB and VPACKUSDW) in the Intel 64 and IA-32 Architectures Software Developer's Manual, which states:
Converts 4, 8 or 16 signed word integers from the destination operand (first operand) and 4, 8 or 16 signed word integers from the source operand (second operand) into 8, 16 or 32 unsigned byte integers
Thanks for letting us know about this conflict.
I've checked with our compiler intrinsics expert, and the information in the user's guide is wrong. Here are the details: the inputs to the _mm256_packus_epi16 intrinsic are vectors of signed integers. The significance of this is that negative values, i.e. words with bit patterns in the range [0x8000, 0xFFFF], are saturated to 0. (If the inputs were vectors of unsigned integers, these values would be saturated to 0xFF.) The same applies to _mm256_packus_epi32, where signed doublewords are saturated to the unsigned word range [0, 0xFFFF].
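To make the difference concrete, here is a minimal scalar sketch of what happens to a single 16-bit lane. It is only a model of the saturation rule described above, not the intrinsic itself; the function names packus_word_signed and packus_word_if_unsigned are made up for illustration.

```c
#include <stdint.h>

/* Scalar model of one lane of _mm256_packus_epi16 (hypothetical helper,
   not part of any Intel header): the input word is interpreted as SIGNED
   and clamped to the unsigned byte range [0, 255]. */
static uint8_t packus_word_signed(int16_t w)
{
    if (w < 0)   return 0x00;   /* negative input saturates to 0 */
    if (w > 255) return 0xFF;   /* too-large input saturates to 255 */
    return (uint8_t)w;          /* in-range input passes through */
}

/* Hypothetical counterpart: what the result would be if the same 16 bits
   were treated as UNSIGNED. This is NOT what the instruction does; it is
   here only to show why the signedness in the description matters. */
static uint8_t packus_word_if_unsigned(uint16_t w)
{
    return (w > 255) ? 0xFF : (uint8_t)w;
}
```

For the bit pattern 0xFFFF, the signed reading (-1) gives 0x00 from packus_word_signed, whereas the unsigned reading (65535) would give 0xFF from packus_word_if_unsigned. The documented behavior of the intrinsic matches the first function.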
A ticket (DPD200365002) has been filed to correct the User's Guide.
Thanks,
Jennifer
