Intel® C++ Compiler
Community support and assistance for creating C++ code that runs on platforms based on Intel® processors.

AVX2 _mm256_packus_epi16/32 intrinsics documentation lists incorrect signedness


In the User and Reference Guide for the Intel C++ Compiler 15.0, the description of the AVX2 intrinsics _mm256_packus_epi16/32 states (relevant words highlighted in bold):

The _mm256_packus_epi16 intrinsic converts 16 packed unsigned word integers from source operands a and b into 32 packed unsigned byte integers. The _mm256_packus_epi32 intrinsic converts eight packed unsigned doubleword integers from the source operands a and b into 16 packed unsigned word integers.

The signedness of the source words/doublewords in the description disagrees with the first sentence of the summary:

Pack signed word/doubleword integers to unsigned byte/word integers and saturates.

It also disagrees with the description of the corresponding AVX2 instructions (VPACKUSWB and VPACKUSDW) in the Intel 64 and IA-32 Architectures Software Developer's Manual, which states:

Converts 4, 8 or 16 signed word integers from the destination operand (first operand) and 4, 8 or 16 signed word integers from the source operand (second operand) into 8, 16 or 32 unsigned byte integers

1 Reply

Thanks for letting us know about this conflict. 

I've checked with our compiler intrinsics expert, and the User's Guide information is wrong. Here are the details: The inputs to the _mm256_packus_epi16 intrinsic are vectors of signed integers. The significance of this is that negative values, i.e. words whose bit patterns fall in the range [0x8000, 0xFFFF], are saturated to 0. (If the inputs were vectors of unsigned integers, these values would be saturated to 0xFF.) The same applies to _mm256_packus_epi32, where negative doublewords are saturated to 0 rather than to 0xFFFF.
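To make the difference concrete, here is a minimal scalar sketch of what happens to a single element. It is only a model of the saturation rule described above, not the compiler's or the hardware's implementation, and it does not model how a and b are arranged in the result vector. The function names (packus_word_signed, packus_word_if_unsigned, packus_dword_signed) are made up for illustration and are not part of any Intel header.

```cpp
#include <cstdint>

// Hypothetical scalar model of one element of _mm256_packus_epi16:
// the 16-bit input is read as SIGNED, then clamped to [0, 255].
constexpr std::uint8_t packus_word_signed(std::int16_t x) {
    return x < 0 ? std::uint8_t{0}
         : x > 255 ? std::uint8_t{255}
         : static_cast<std::uint8_t>(x);
}

// Hypothetical counterpart showing what the documentation's wording would
// imply: the same 16 bits read as UNSIGNED, so only the upper clamp applies.
constexpr std::uint8_t packus_word_if_unsigned(std::uint16_t x) {
    return x > 255 ? std::uint8_t{255} : static_cast<std::uint8_t>(x);
}

// Hypothetical scalar model of one element of _mm256_packus_epi32:
// the 32-bit input is read as SIGNED, then clamped to [0, 65535].
constexpr std::uint16_t packus_dword_signed(std::int32_t x) {
    return x < 0 ? std::uint16_t{0}
         : x > 65535 ? std::uint16_t{65535}
         : static_cast<std::uint16_t>(x);
}
```

Under this model, the bit pattern 0xFFFF read as a signed word is -1, so packus_word_signed(-1) gives 0, whereas packus_word_if_unsigned(0xFFFF) gives 0xFF. That is exactly the discrepancy between the intrinsic's actual behavior and the User's Guide wording.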

A ticket (DPD200365002) has been filed to address the User's Guide issue.