Nathan_Weeks
Beginner
63 Views

incorrect types for Intrinsics for Packed Test Operations in the User and Reference Guide 15.0

In the User and Reference Guide for the Intel® C++ Compiler 15.0, the Syntax of several intrinsics in the Intrinsics for Packed Test Operations section lists incorrect formal parameter types and/or return types:

https://software.intel.com/en-us/node/524147

Specifically:

  • For _mm256_testz_pd and _mm_testz_pd, the formal parameter types should be __m256d and __m128d, respectively, rather than __m256 and __m128, and their return types should be int rather than __m256 and __m128.
  • The return types of _mm256_testz_ps, _mm_testz_ps, _mm256_testc_pd, _mm_testc_pd, _mm256_testc_ps, and _mm_testc_ps should be int.

--
Nathan Weeks
Systems Analyst
Iowa State University -- Department of Mathematics
http://weeks.public.iastate.edu/


Hi Nathan,

I have confirmed the issues you reported regarding the document errors. I will submit them to the doc team to fix. Please expect it to be fixed in a future doc version.

Thanks,

Shenghong

Nathan_Weeks

Hi Shenghong,

A couple other issues:

1. The descriptions for _mm256_testz_pd/_mm_testz_pd, _mm256_testz_ps/_mm_testz_ps, _mm256_testc_pd/_mm_testc_pd, and _mm256_testc_ps/_mm_testc_ps are a bit misleading; e.g., the following is stated for _mm256_testc_pd/_mm_testc_pd:

    The CF flag is set based on the result of a bitwise AND and logical NOT
    operation between the first and second source vectors. The corresponding
    instruction, VTESTPD, sets the CF flag if all the resulting bits are 0. If the
    resulting bits are non-zeros, the instruction clears the CF flag.

This description implies that the entire vectors are compared. However, the Intel 64 and IA-32 Architectures Software Developer's Manual indicates that the comparison is only for the sign bits:

    VTESTPD performs a bitwise comparison of all the sign bits of the
    double-precision elements in the first source operation [sic] and
    corresponding sign bits in the second source operand...  If the AND the
    source sign bits with the inverted dest sign bits produces all zeros the
    CF is set else the CF is clear.

    ...

    TEMP[255:0] <- SRC[255:0] AND NOT DEST[255:0]
    IF (TEMP[63] = TEMP[127] = TEMP[191] = TEMP[255] = 0)
       THEN CF <- 1;
       ELSE CF <- 0;
    DEST (unmodified)
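
To make the sign-bit-only behavior concrete, the pseudocode above can be approximated by a minimal scalar C sketch. This is only a model of the 256-bit double-precision case, not the instruction or the intrinsic itself; the names `bits_of` and `vtestpd_cf_model` are hypothetical and do not come from any Intel header.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not an Intel API): raw 64-bit pattern of a double. */
static uint64_t bits_of(double x)
{
    uint64_t u;
    memcpy(&u, &x, sizeof u);   /* copy the object representation */
    return u;
}

/* Scalar model of the quoted pseudocode for the 256-bit form:
 *   TEMP <- SRC AND NOT DEST
 *   CF   <- 1 iff TEMP[63], TEMP[127], TEMP[191], TEMP[255] are all 0
 * Each double is one 64-bit lane, so bit 63 of a lane is its sign bit. */
static int vtestpd_cf_model(const double dest[4], const double src[4])
{
    assert(dest != NULL && src != NULL);
    for (int lane = 0; lane < 4; ++lane) {
        uint64_t temp = bits_of(src[lane]) & ~bits_of(dest[lane]);
        if (temp >> 63)         /* only the top bit of each lane is examined */
            return 0;
    }
    return 1;
}
```

With dest = {-1.0, 2.0, -3.0, 4.0} and src = {-100.0, 0.5, -7.0, 9.0}, the lanes differ in their exponent and mantissa bits, so TEMP as a whole is non-zero, yet the model still reports CF = 1 because the sign bits agree lane by lane. A description that talks about "all the resulting bits" would predict CF = 0 for the same inputs.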

The other related intrinsics in the User and Reference Guide 15.0 (_mm256_testnzc_si256, _mm256_testnzc_pd/_mm_testnzc_pd, _mm256_testnzc_ps/_mm_testnzc_ps) correctly mention that only the sign bits are compared.

2. The above description of VTESTPS/VTESTPD from the Intel 64 and IA-32 Architectures Software Developer's Manual mentions both first/second source operand and "dest", without describing how the "SRC" and "DEST" operands map to the "s1" and "s2" operands in the corresponding intrinsics; e.g.:

int _mm256_testc_pd (__m256d s1, __m256d s2);

From experimentation, it appears that s1 == DEST and s2 == SRC. Since DEST isn't modified, I think it would be clearer to describe the VTESTPD/VTESTPS instructions using SRC1 and SRC2 instead; e.g.:

    TEMP[255:0] <- SRC2[255:0] AND NOT SRC1[255:0]
    IF (TEMP[63] = TEMP[127] = TEMP[191] = TEMP[255] = 0)
       THEN CF <- 1;
       ELSE CF <- 0;
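
The operand order matters because the test is not symmetric. The following scalar C sketch illustrates that, assuming the mapping observed above (s1 == SRC1/DEST, s2 == SRC2/SRC). The function `testc_pd_model` is a hypothetical stand-in operating on plain arrays of four doubles; it is not the real `_mm256_testc_pd` and has not been checked against hardware here.

```c
#include <assert.h>
#include <math.h>

/* Hypothetical scalar stand-in for
 *   int _mm256_testc_pd(__m256d s1, __m256d s2);
 * under the assumption s1 == SRC1 (DEST) and s2 == SRC2 (SRC).
 * Returns the modelled CF: 1 iff no lane has its sign bit set in s2
 * while the sign bit of the same lane is clear in s1. */
static int testc_pd_model(const double s1[4], const double s2[4])
{
    assert(s1 && s2);
    for (int lane = 0; lane < 4; ++lane) {
        int sign1 = signbit(s1[lane]) != 0;
        int sign2 = signbit(s2[lane]) != 0;
        if (sign2 && !sign1)    /* SRC2 AND NOT SRC1, sign bit only */
            return 0;
    }
    return 1;
}
```

For example, with all_neg = {-1.0, -2.0, -3.0, -4.0} and mixed = {-1.0, 2.0, -3.0, 4.0}, the model returns 1 for (all_neg, mixed) but 0 for (mixed, all_neg), so swapping s1 and s2 changes the result. That is why the documentation needs to say which intrinsic argument plays which role.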

Thanks,

-- 
Nathan Weeks
Systems Analyst
Iowa State University -- Department of Mathematics
http://weeks.public.iastate.edu/


Hi Nathan,

Thank you for these additional reports. I can confirm these issues and will submit them to the doc team to fix as well.

Regarding the second issue, in the "Intel 64 and IA-32 Architectures Software Developer's Manual" (SDM), I agree that using "dest" is confusing. However, the SDM is not written by the same team, so I will ask our doc team to contact the owners of the SDM and have it refined in a future version.  :)

Thanks,

Shenghong


FYI: these issues in the compiler docs are fixed. The fixes should already be visible on the web, and 15.0 Update 2 will include them.

Thanks,

Shenghong 
