Intel® ISA Extensions
Use hardware-based isolation and memory encryption to provide more code protection in your solutions.

Bugs in Intrinsics Guide

andysem
New Contributor III
27,236 Views

Hi,

I've found a few bugs in the Intel Intrinsics Guide 2.7 (I'm using Linux version):

1. When the window is maximized, the search field is stretched vertically while still being a one-line edit box. It should probably be sized accordingly.

2. __m256 _mm256_undefined_si256 () should return __m256i.

3. In some instruction descriptions, like _mm_adds_epi8, the operation is described in terms of SignedSaturate, while e.g. _mm256_adds_epi16 is described with SaturateToSignedWord. This applies to other operations with unsigned saturation as well. Also, the vector elements are described differently. A more consistent description would be nice.

4. _mm_alignr_epi8 has two descriptions.

5. I'm not sure the _mm_ceil_pd signature and description are correct. It says the intrinsic returns a vector of single-precision floats. Shouldn't it be double-precision?
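
To make item 3 concrete: both names appear to describe the same per-element behaviour, clamping the sum to the range of the element type. Below is a minimal scalar sketch of a signed saturating add for 8-bit and 16-bit elements. The helper names are made up for illustration and do not come from the guide or any Intel header.

```c
#include <stdint.h>

/* Scalar model of one 8-bit lane of a signed saturating add.
   Hypothetical helper, not part of any Intel header. */
static int8_t adds_i8(int8_t a, int8_t b)
{
    int sum = (int)a + (int)b;          /* widen so the sum cannot overflow */
    if (sum > INT8_MAX) return INT8_MAX; /* clamp to +127 */
    if (sum < INT8_MIN) return INT8_MIN; /* clamp to -128 */
    return (int8_t)sum;
}

/* Same idea for one 16-bit lane. */
static int16_t adds_i16(int16_t a, int16_t b)
{
    int32_t sum = (int32_t)a + (int32_t)b;
    if (sum > INT16_MAX) return INT16_MAX; /* clamp to +32767 */
    if (sum < INT16_MIN) return INT16_MIN; /* clamp to -32768 */
    return (int16_t)sum;
}
```

Under this model the only difference between the two descriptions is the element width, which is why a single naming scheme in the guide would be easier to read.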

I didn't read all instructions so there may be more issues. I'll post if I find anything else.

PS: This is not a bug per se but some instructions are missing the Latency & Throughput information. This mostly relates to newer instructions but still this info is useful and I hope it will be added.

0 Kudos
220 Replies
andysem
New Contributor III
1,622 Views

http://software.intel.com/sites/landingpage/IntrinsicsGuide/ works for me now. That said there were site outages a few days ago (the forum was completely inaccessible for a day or two for me), maybe the problems are still happening from time to time.

0 Kudos
Bernard
Valued Contributor I
1,622 Views

andysem wrote:

http://software.intel.com/sites/landingpage/IntrinsicsGuide/ works for me now. That said there were site outages a few days ago (the forum was completely inaccessible for a day or two for me), maybe the problems are still happening from time to time.

Works for me also.

0 Kudos
Patrick_K_Intel
Employee
1,622 Views

Sorry about that, there were some server changes that caused some intermittent issues, but it should be working fine now.

0 Kudos
Xiong_Z_
Beginner
1,622 Views

The description for __m128i _mm_sad_epu8 (__m128i a, __m128i b) is not correct: it says "four unsigned 16-bit integers" where it should say "two unsigned 16-bit integers". Corrected text:

Description

Compute the absolute differences of packed unsigned 8-bit integers in a and b, then horizontally sum each consecutive 8 differences to produce two unsigned 16-bit integers, and pack these unsigned 16-bit integers in the low 16 bits of 64-bit elements in dst.
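
A 128-bit input holds 16 bytes, so there are only two groups of 8 differences and therefore two sums. The following is a scalar sketch of that behaviour, not the intrinsic itself; the function name is hypothetical and the vectors are modelled as plain arrays.

```c
#include <stdint.h>
#include <stdlib.h>

/* Scalar model of a 128-bit sum of absolute differences.
   a, b : 16 unsigned bytes each.
   dst  : two 64-bit elements; each holds one sum in its low 16 bits,
          with the upper bits zero.
   Hypothetical helper, not part of any Intel header. */
static void sad_u8_model(const uint8_t a[16], const uint8_t b[16],
                         uint64_t dst[2])
{
    for (int half = 0; half < 2; ++half) {
        uint32_t sum = 0;
        for (int i = 0; i < 8; ++i) {
            int d = (int)a[half * 8 + i] - (int)b[half * 8 + i];
            sum += (uint32_t)abs(d);
        }
        dst[half] = sum;  /* at most 8 * 255 = 2040, so it fits in 16 bits */
    }
}
```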
0 Kudos
Kevin_D_Intel
Employee
1,622 Views

User Jeremias M. wrote here: https://software.intel.com/en-us/forums/topic/516476#comment-1791398 regarding an issue with filtering results only for KNC, where the search returns _mm512_mask_set1_epi32 as a valid intrinsic for KNC. That is not currently correct. It may become true in a future release as discussed in the cited thread.

0 Kudos
Patrick_K_Intel
Employee
1,622 Views

Thank you for your feedback. I've updated the Intrinsics Guide to resolve the issues with _mm_sad_epu8 and _mm512_mask_set1_epi32, as well as a few other issues with KNC intrinsics.

https://software.intel.com/sites/landingpage/IntrinsicsGuide/

0 Kudos
Jeremias_M_
Beginner
1,622 Views

Hi,

I was using the function _mm512_mask_reduce_gmax_pd, and when I looked for the equivalent integer functions in the guide, they appeared only under AVX-512.

So I checked the zmmintrin.h header and saw that the functions are implemented there. Then I tested some of them (_mm512_mask_reduce_max_epi32 (__mmask16 k, __m512i a), _mm512_reduce_max_epi32 (__m512i a)), and they worked.


I believe it's possible that the functions below were made for KNC too.

int _mm512_reduce_max_epi32 (__m512i a)

__int64 _mm512_reduce_max_epi64 (__m512i a)

unsigned int _mm512_reduce_max_epu32 (__m512i a)

unsigned __int64 _mm512_reduce_max_epu64 (__m512i a)

double _mm512_reduce_max_pd (__m512d a)

float _mm512_reduce_max_ps (__m512 a)
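
For readers unfamiliar with these intrinsics, here is a scalar sketch of what the 32-bit integer maximum reductions compute, with the 512-bit vector modelled as an array of 16 elements. The function names are hypothetical. Treating masked-off elements as INT32_MIN (the identity for max) is an assumption of this sketch and should be checked against the guide.

```c
#include <stdint.h>

/* Scalar model of a masked maximum reduction over 16 signed 32-bit lanes.
   Bit i of k selects lane i. Lanes that are not selected are treated as
   INT32_MIN, which is an assumption of this sketch.
   Hypothetical helper, not part of zmmintrin.h. */
static int32_t mask_reduce_max_i32_model(uint16_t k, const int32_t a[16])
{
    int32_t m = INT32_MIN;
    for (int i = 0; i < 16; ++i) {
        if (((k >> i) & 1u) && a[i] > m) {
            m = a[i];
        }
    }
    return m;
}

/* The unmasked form is the same reduction with every lane selected. */
static int32_t reduce_max_i32_model(const int32_t a[16])
{
    return mask_reduce_max_i32_model(0xFFFFu, a);
}
```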

 

0 Kudos
Patrick_K_Intel
Employee
1,622 Views

You are correct, all the _reduce_ intrinsics are supported on KNC. I've updated the Intrinsics Guide to resolve this issue.

0 Kudos
andysem
New Contributor III
1,622 Views

_mm_test_all_ones intrinsic has multiple different timing values for the same CPUs.

0 Kudos
andysem
New Contributor III
1,622 Views

The Intel Intrinsics Guide page doesn't load for me, or loads really slowly (about a minute or so). It shows the intrinsics categories on the left and "Loading" in the center and hangs this way. I'm using Firefox 32.0.3 on Linux.

On a related note, will there be an offline standalone release? Browser version is not always convenient for me.

 

0 Kudos
Dobratz__Glenn
1,622 Views
I find the opening screen of the guide to be very unreadable. It would be much more readable if only the function name were used at the top level instead of the full function prototypes. Using the prototypes just creates a lot of visual noise that obscures the function names. Since the prototype is easily visible when a function is displayed, IMHO, the extra click needed to see the prototype is outweighed by the improved readability.
0 Kudos
Dobratz__Glenn
1,622 Views
It would be helpful if the description of the intrinsics also had a link to the corresponding instruction's description in the Intel Processor Instruction Set manual, so we can easily get the dirty details on the generated instruction.
0 Kudos
andysem
New Contributor III
1,622 Views

Glenn D. wrote:

It would be much more readable if only the function name were used at the top level instead of the full function prototypes.

I disagree. The prototype is useful for me because I often don't remember the exact signature or arguments of the intrinsic, and all I have to do is just type it in the search field.

 

0 Kudos
andysem
New Contributor III
1,622 Views

andysem wrote:

Please, specify that _mm_madd_epi16 and _mm256_madd_epi16 perform signed multiplication.

Was this forgotten? This information is still missing in 3.3.1.
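
To illustrate why the signedness matters, here is a scalar sketch of the 128-bit case: adjacent pairs of signed 16-bit elements are multiplied and the two 32-bit products are added. The function name is hypothetical and the vectors are modelled as plain arrays.

```c
#include <stdint.h>

/* Scalar model of a multiply-and-add of adjacent signed 16-bit pairs.
   a, b : 8 signed 16-bit elements each.
   dst  : 4 signed 32-bit elements.
   Hypothetical helper, not part of any Intel header. */
static void madd_i16_model(const int16_t a[8], const int16_t b[8],
                           int32_t dst[4])
{
    for (int j = 0; j < 4; ++j) {
        /* Each product of two int16_t values fits in int32_t. */
        int32_t lo = (int32_t)a[2 * j]     * (int32_t)b[2 * j];
        int32_t hi = (int32_t)a[2 * j + 1] * (int32_t)b[2 * j + 1];
        /* Add in unsigned arithmetic so the single overflowing case
           (all four inputs equal to -32768) wraps instead of being
           undefined. The conversion back to int32_t is
           implementation-defined in that one case. */
        dst[j] = (int32_t)((uint32_t)lo + (uint32_t)hi);
    }
}
```

With a[0] = -1 and b[0] = 3 the first product is -3. An unsigned interpretation of the same bits would give 65535 * 3 instead, which is the ambiguity the description should rule out.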

 

0 Kudos
Patrick_K_Intel
Employee
1,622 Views

andysem wrote:

Was this forgotten? This information is still missing in 3.3.1.

I guess so, I'll be sure to include this in the next update.

0 Kudos
Yukimasa__Sugizaki
1,622 Views

Hi.


There are invalid names of constants in Operations in _mm512_{,mask_}extload_*.
(according to zmmintrin.h)

_MM_BROADCAST1X16 should be _MM_BROADCAST_1X16.
_MM_BROADCAST4X16 should be _MM_BROADCAST_4X16.
_MM_BROADCAST1X8 should be _MM_BROADCAST_1X8.
_MM_BROADCAST4X8 should be _MM_BROADCAST_4X8.


Regards,
Sugizaki.

0 Kudos
andysem
New Contributor III
1,622 Views

Please, mention in the description that _mm_maskmoveu_si128 and _mm_maskmove_si64 generate non-temporal memory stores.

0 Kudos
Patrick_K_Intel
Employee
1,593 Views

Thanks guys, I've made these corrections.

0 Kudos
bronxzv
New Contributor II
1,593 Views

there is a series of errors in the Intrinsics Guide for the description of intrinsics mapping to instructions with an immediate operand

operands of the imm8 type (8-bit) are declared as int (32-bit) intrinsic arguments so I'll advise to always use a notation such as imm[7:0] in the Intrinsics Guide

for example the description of _mm256_blend_epi16 at the moment makes some users think that they can use a 16-bit mask

(see https://software.intel.com/en-us/forums/topic/537849)
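
a scalar sketch of the point, assuming the instruction only looks at imm[7:0] and applies the same 8 bits to both 128-bit halves (the function name is hypothetical, and how a given compiler reacts to an out-of-range constant is not covered here):

```c
#include <stdint.h>

/* Scalar model of a 16-element blend of 16-bit values controlled by an
   8-bit immediate. Only the low 8 bits of imm are used, and bit (j % 8)
   controls element j, so both halves of the vector share one mask.
   Hypothetical helper, not part of any Intel header. */
static void blend_i16x16_model(const int16_t a[16], const int16_t b[16],
                               int imm, int16_t dst[16])
{
    unsigned mask = (unsigned)imm & 0xFFu;   /* imm[7:0] only */
    for (int j = 0; j < 16; ++j) {
        dst[j] = ((mask >> (j % 8)) & 1u) ? b[j] : a[j];
    }
}
```

in this model a value such as 0x0100 selects nothing from b, and 0x01 picks elements 0 and 8, so a 16-bit mask cannot address the 16 elements independently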

0 Kudos
Patrick_K_Intel
Employee
1,593 Views

Thanks for reporting this issue. I have updated the documentation around immediate parameters to clarify this better.

0 Kudos
bronxzv
New Contributor II
1,593 Views

Patrick Konsor (Intel) wrote:
I have updated the documentation around immediate parameters to clarify this better.

the description for _mm256_blend_epi16 looks the same as before in the online Intrinsics Guide, I suppose that your changes aren't yet published, right?

0 Kudos