Intel® ISA Extensions
Bugs in Intrinsics Guide

andysem

Hi,

I've found a few bugs in the Intel Intrinsics Guide 2.7 (I'm using Linux version):

1. When the window is maximized, the search field is stretched vertically while still being a one-line edit box. It should probably be sized accordingly.

2. __m256 _mm256_undefined_si256 () should return __m256i.

3. In some instruction descriptions, like that of _mm_adds_epi8, the operation is described in terms of SignedSaturate, while others, e.g. _mm256_adds_epi16, are described with SaturateToSignedWord. This applies to other operations with unsigned saturation as well. Also, the vector elements are described differently. A more consistent description would be nice.

4. _mm_alignr_epi8 has two descriptions.

5. I'm not sure the _mm_ceil_pd signature and description are correct. It says the intrinsic returns a vector of single-precision floats. Shouldn't it be double-precision?
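
To illustrate item 3: SignedSaturate and SaturateToSignedWord appear to name the same clamping step at different element widths. Below is a minimal sketch in C, assuming an x86 compiler with SSE2 enabled. The helper names (saturate_to_signed_byte, adds_epi8_lane0, check_adds_epi8) are invented for this example and do not come from the guide.

```c
#include <assert.h>
#include <emmintrin.h>  /* SSE2: _mm_adds_epi8 */
#include <stdint.h>

/* Scalar reference (hypothetical helper): clamp a wider value to int8_t. */
int8_t saturate_to_signed_byte(int v)
{
    if (v > INT8_MAX) return INT8_MAX;
    if (v < INT8_MIN) return INT8_MIN;
    return (int8_t)v;
}

/* Lane 0 of _mm_adds_epi8 applied to two broadcast bytes. */
int8_t adds_epi8_lane0(int8_t a, int8_t b)
{
    __m128i r = _mm_adds_epi8(_mm_set1_epi8(a), _mm_set1_epi8(b));
    return (int8_t)_mm_cvtsi128_si32(r);  /* keep only the lowest byte */
}

/* Compare the intrinsic with the scalar model for every input pair. */
void check_adds_epi8(void)
{
    for (int a = INT8_MIN; a <= INT8_MAX; ++a)
        for (int b = INT8_MIN; b <= INT8_MAX; ++b)
            assert(adds_epi8_lane0((int8_t)a, (int8_t)b) ==
                   saturate_to_signed_byte(a + b));
}
```

Calling check_adds_epi8() from a test program exercises all 65,536 input pairs, which is one way to confirm that a single scalar description covers the byte-wide case.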

I didn't read all instructions so there may be more issues. I'll post if I find anything else.

PS: This is not a bug per se but some instructions are missing the Latency & Throughput information. This mostly relates to newer instructions but still this info is useful and I hope it will be added.

Patrick_K_Intel

Yes, you are correct, this will be resolved in the next release, which will be later this month.

andysem

I see there is a major update to the Intrinsics Guide. Nice job, thanks!

There is an error in the tooltip that pops up when I hover the pointer over the non-VEX instructions. Regardless of the instruction, the tooltip always says:

This intrinsic may generate the VEX-encoded instruction vpunpcklwd. If the instruction is not VEX encoded, punpcklwd may cause performance penalties if mixed with 256-bit or 512-bit instructions.

I suppose the text should either be more generic or mention the instructions that correspond to the intrinsic being viewed.

BTW: Is there a downloadable (standalone) version of the guide?

Patrick_K_Intel

That is indeed an issue. The fix is on its way up right now, should be visible soon.

The standalone version is not available at the moment, but hopefully we'll have it ready early next year.

bronxzv

I just noticed a few errors in the Haswell throughput figures for these instructions:

VBLENDVPS/PD : should be 2 instead of 1

VMULPD/PS : should be 0.5 instead of 1

andysem

When I select AVX-512F in the filters, the SVML intrinsics are also listed. This doesn't happen when I select AVX-512 though.

andysem

Some intrinsics are missing timing information that was present in the 3.0.1 (the last standalone) version. For example, _mm_alignr_epi8 and _mm_alignr_pi8.

Is there any news on the standalone version?

Patrick_K_Intel

Regarding the VBLENDVPD/PS and VMULPD/PS throughputs on Haswell, you're correct, those will be updated momentarily.

Regarding SVML intrinsics showing under AVX-512F, that was resolved a while ago, but may not have been universally visible. It should be visible soon.

Regarding missing performance data for _mm_alignr_epi8 and _mm_alignr_pi8, _mm_alignr_epi8 was indeed a mistake and will be added shortly. In the process of validating all the intrinsics performance data across multiple sources, some of the data was identified as invalid and thus removed, which was the case for _mm_alignr_pi8.

Diego_Caballero

Hi,

_mm512_permutevar_epi32 / _mm512_mask_permutevar_epi32 and _mm512_alignr_epi32 are missing for KNC in the latest version.

Is there any plan to include latency and throughput data for the KNC ISA?

Thank you!

Kevin_D_Intel

User Patrick S. wrote about other mistakes here: http://software.intel.com/en-us/forums/topic/500971#comment-1779043. He wrote:

Patrick S. wrote:
I have also found some mistakes:

in the intrinsics guide:

 http://software.intel.com/sites/landingpage/IntrinsicsGuide/
 
the instruction _mm512_alignr_epi32 is not listed under "KNC". It is only listed under AVX-512, but KNC supports the alignr instruction.
 
The same for:

_mm512_mask_alignr_epi32/epi64
_mm512_load_ps/pd
_mm512_store_ps/pd
_mm512_fmadd_ps/pd
_mm512_fnmadd_ps/pd
_mm512_fmsub_ps/pd
_mm512_fnmsub_ps/pd

also all cast instructions like _mm512_castpd_ps are not listed under "KNC".

I guess that there are a lot more mistakes, but these are the ones I remember.

Diego_Caballero

Hi,

I think there is an error in the description of the algorithm of the intrinsics '_mm512_*_extpackstorelo_*' (or maybe I'm missing something):

The condition

IF (storeAddr % 64) == 0 BREAK

should be something like

IF ((addr + storeOffset * downSize) % 64) == 0 BREAK

Otherwise, the first aligned element (hi) will be written by the 'lo' intrinsic and it shouldn't according to my understanding.
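
The effect of the proposed condition can be sketched with a small scalar model in C. This is only an illustration under stated assumptions: it is not the guide's pseudocode, it assumes an all-ones write mask and 16 elements, and packstorelo_count is a hypothetical helper name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical model: how many elements a "lo" pack-store would write
 * when it stops as soon as the NEXT element address is 64-byte aligned,
 * i.e. IF ((addr + storeOffset * downSize) % 64) == 0 BREAK.
 * Assumes every mask bit is set, so all 16 elements are candidates.
 */
size_t packstorelo_count(uintptr_t addr, size_t down_size)
{
    assert(down_size > 0);
    size_t store_offset = 0;
    for (int j = 0; j < 16; ++j) {
        /* element j would be written at addr + store_offset * down_size */
        ++store_offset;
        if ((addr + store_offset * down_size) % 64 == 0)
            break;  /* the next element belongs to the "hi" half */
    }
    return store_offset;
}
```

In this model, a start address of 48 with 4-byte elements yields 4 elements written below the 64-byte boundary, and the element at the boundary itself is left for the 'hi' intrinsic.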

 

Please let me know if I'm wrong.

Thanks.

Patrick_K_Intel

There appear to be a number of issues with KNC intrinsics, including several missing intrinsics (specifically when the name matches an AVX-512 intrinsic), and intrinsics that should be cross listed as both AVX-512 and KNC but are only listed under AVX-512. I am in the process of reviewing all KNC intrinsics and will release an update that should resolve all these issues shortly.

Patrick_S_

The function _mm512_fmadd233_epi32 is listed in the Intrinsics guide as a = b*c. I guess that is also a typo.

 

By the way, I really like the Intrinsics Guide! Would it be possible to add a button for choosing the data type (integer, floating point), like in the software "Intel Intrinsics Guide - v.3.01"?

Another idea for improvement would be to add an "advanced search", e.g. searching for functions with a particular output data type (int, double and so forth). That search option would have saved me a lot of time.

Patrick_K_Intel

I've just updated the Intrinsics Guide (v3.1.5). This should resolve all the KNC issues, as well as the issue with fmadd233 and extpackstorelo.

http://software.intel.com/sites/landingpage/IntrinsicsGuide/

andysem

The _mm_sub_epi16 intrinsic is documented as corresponding to the phsubw instruction, while it should be psubw. The timing data is also given for phsubw instead of psubw.
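
The two instructions compute different things, which a short C sketch can show. It assumes an x86 compiler with SSE2. The horizontal subtract is modelled in scalar code here (hsub_epi16_model is a hypothetical helper) so that the example does not require SSSE3.

```c
#include <assert.h>
#include <emmintrin.h>  /* SSE2: _mm_sub_epi16 */
#include <stdint.h>

/* Vertical subtract (psubw): out[i] = a[i] - b[i]. */
void sub_epi16(const int16_t a[8], const int16_t b[8], int16_t out[8])
{
    __m128i r = _mm_sub_epi16(_mm_loadu_si128((const __m128i *)a),
                              _mm_loadu_si128((const __m128i *)b));
    _mm_storeu_si128((__m128i *)out, r);
}

/* Scalar model of horizontal subtract (phsubw): adjacent pairs of a
 * fill the low half, adjacent pairs of b fill the high half. */
void hsub_epi16_model(const int16_t a[8], const int16_t b[8], int16_t out[8])
{
    for (int i = 0; i < 4; ++i) {
        out[i]     = (int16_t)(a[2 * i] - a[2 * i + 1]);
        out[i + 4] = (int16_t)(b[2 * i] - b[2 * i + 1]);
    }
}

/* Same inputs, different results. */
void check_sub_vs_hsub(void)
{
    const int16_t a[8] = {10, 1, 20, 2, 30, 3, 40, 4};
    const int16_t b[8] = {1, 1, 1, 1, 1, 1, 1, 1};
    int16_t v[8], h[8];
    sub_epi16(a, b, v);
    hsub_epi16_model(a, b, h);
    assert(v[0] == 9 && v[1] == 0 && v[7] == 3);   /* element-wise */
    assert(h[0] == 9 && h[1] == 18 && h[4] == 0);  /* pair-wise */
}
```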

Vladimir_Sedach

There is no compiler version information. For example, _mm_erfcinv_ps appeared in ICC 14.

Patrick_K_Intel

I've resolved the issue with _mm_sub_epi16; the update should appear soon. I've also added the new intrinsics for xsavec, xsaves, and xrstors.

Stefan_M_Intel

Great tool, some shortcomings

  1. _mm_xor_si128() says "bitwisw OR"
  2. All "abs" intrinsics could document their behaviour for the value -2^(N-1), where N is the bit width of the corresponding epi type
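
The edge case in item 2 can be reproduced with a short C sketch. It assumes an x86 compiler with SSE2 and emulates a 16-bit absolute value as max(x, 0 - x) rather than calling the SSSE3 intrinsic _mm_abs_epi16, so that no extra compiler flags are needed. The helper names are invented for this example.

```c
#include <assert.h>
#include <emmintrin.h>  /* SSE2: _mm_sub_epi16, _mm_max_epi16 */
#include <stdint.h>

/* SSE2-only emulation of a 16-bit absolute value, returning lane 0.
 * The subtraction wraps, so 0 - (-32768) is again -32768. */
int16_t abs_epi16_lane0(int16_t x)
{
    __m128i v   = _mm_set1_epi16(x);
    __m128i neg = _mm_sub_epi16(_mm_setzero_si128(), v);
    __m128i r   = _mm_max_epi16(v, neg);
    return (int16_t)_mm_cvtsi128_si32(r);  /* keep only the lowest word */
}

void check_abs_edge_case(void)
{
    assert(abs_epi16_lane0(-5) == 5);
    assert(abs_epi16_lane0(7) == 7);
    /* -2^(N-1) with N = 16 has no positive counterpart in int16_t. */
    assert(abs_epi16_lane0(INT16_MIN) == INT16_MIN);
}
```

The result for -32768 has the bit pattern 0x8000, which is 32768 only if the caller reinterprets the lane as unsigned.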
Stefan_M_Intel

Hello,

I currently use data version 3.1.6 very actively and had trouble compiling the four intrinsics *_bslli_si128() and *_bsrli_si128(). With gcc, they only compile when I remove the b. I do not (yet) use the Intel compiler, but the SW developer manual also lists those four intrinsics without the b.

Intel C/C++ Compiler Intrinsic Equivalent

(V)PSLLDQ: __m128i _mm_slli_si128 ( __m128i a, int imm)

VPSLLDQ: __m256i _mm256_slli_si256 ( __m256i a, const int imm)

Intel C/C++ Compiler Intrinsic Equivalents

(V)PSRLDQ: __m128i _mm_srli_si128 ( __m128i a, int imm)

VPSRLDQ: __m256i _mm256_srli_si256 ( __m256i a, const int imm)
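
For reference, here is a minimal C sketch of the names without the "b", which are the ones listed above. It assumes an x86 compiler with SSE2; check_byte_shift is a hypothetical helper name. The shift count is a byte count and must be a compile-time constant.

```c
#include <assert.h>
#include <emmintrin.h>  /* SSE2: _mm_slli_si128, _mm_srli_si128 */
#include <stdint.h>

/* Shift a 16-byte vector by 3 bytes in each direction and verify
 * that whole bytes move and zeros are shifted in. */
void check_byte_shift(void)
{
    uint8_t in[16], out[16];
    for (int i = 0; i < 16; ++i)
        in[i] = (uint8_t)(i + 1);

    __m128i v = _mm_loadu_si128((const __m128i *)in);

    /* (V)PSLLDQ: bytes move towards higher indices. */
    _mm_storeu_si128((__m128i *)out, _mm_slli_si128(v, 3));
    for (int i = 0; i < 16; ++i)
        assert(out[i] == (i < 3 ? 0 : in[i - 3]));

    /* (V)PSRLDQ: bytes move towards lower indices. */
    _mm_storeu_si128((__m128i *)out, _mm_srli_si128(v, 3));
    for (int i = 0; i < 16; ++i)
        assert(out[i] == (i + 3 < 16 ? in[i + 3] : 0));
}
```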

andysem

Please specify that _mm_madd_epi16 and _mm256_madd_epi16 perform signed multiplication.
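
A small C sketch makes the signedness visible. It assumes an x86 compiler with SSE2; madd_epi16_lane0 and check_madd_signed are hypothetical helper names.

```c
#include <assert.h>
#include <emmintrin.h>  /* SSE2: _mm_madd_epi16 */
#include <stdint.h>

/* Lane 0 of _mm_madd_epi16: a0*b0 + a1*b1 as a 32-bit result. */
int32_t madd_epi16_lane0(int16_t a0, int16_t a1, int16_t b0, int16_t b1)
{
    /* _mm_set_epi16 takes the highest element first, so a0 is last. */
    __m128i a = _mm_set_epi16(0, 0, 0, 0, 0, 0, a1, a0);
    __m128i b = _mm_set_epi16(0, 0, 0, 0, 0, 0, b1, b0);
    return _mm_cvtsi128_si32(_mm_madd_epi16(a, b));
}

void check_madd_signed(void)
{
    /* An unsigned multiply would give 65535 * 2 = 131070 here. */
    assert(madd_epi16_lane0(-1, 0, 2, 0) == -2);
    assert(madd_epi16_lane0(-3, 4, 5, -6) == -39);  /* -15 + -24 */
}
```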

Patrick_K_Intel

Stefan M. wrote:

Hello,

I currently use data version 3.1.6 very actively and had trouble compiling the four intrinsics *_bslli_si128() and *_bsrli_si128(). With gcc, they only compile when I remove the b. I do not (yet) use the Intel compiler, but the SW developer manual also lists those four intrinsics without the b.

Intel C/C++ Compiler Intrinsic Equivalent

(V)PSLLDQ: __m128i _mm_slli_si128 ( __m128i a, int imm)

VPSLLDQ: __m256i _mm256_slli_si256 ( __m256i a, const int imm)

Intel C/C++ Compiler Intrinsic Equivalents

(V)PSRLDQ: __m128i _mm_srli_si128 ( __m128i a, int imm)

VPSRLDQ: __m256i _mm256_srli_si256 ( __m256i a, const int imm)

You can use either name; both perform the same operation, although the "b" names may not be supported by GCC at this point.

Eugen_V_

The Intrinsics Guide is not working.

The link to https://software.intel.com/en-us/sites/landingpage/IntrinsicsGuide is broken.

https://software.intel.com/sites/landingpage/IntrinsicsGuide/ opens, but says "Error Loading Data".

While debugging, I found that "https://software.intel.com/sites/landingpage/IntrinsicsGuide/files/data-3.1.6.xml" is not accessible,

but https://software.intel.com/sites/landingpage/IntrinsicsGuide/files/data-3.1.6.xml works.
