Intel® ISA Extensions
Use hardware-based isolation and memory encryption to provide more code protection in your solutions.
1135 ディスカッション

Bugs in Intrinsics Guide

andysem
新規コントリビューター III
62,186件の閲覧回数

Hi,

I've found a few bugs in the Intel Intrinsics Guide 2.7 (I'm using Linux version):

1. When the window is maximized, the search field is stretched vertically while still being a one-line edit box. It sould probably be sized accordingly.

2. __m256 _mm256_undefined_si256 () should return __m256i.

3. In some instructions description, like _mm_adds_epi8, the operation is described in terms of SignedSaturate while, e.g. _mm256_adds_epi16 is described with SaturateToSignedWord. This applies to other operations with unsigned saturation as well. Also, the vector elements are described differently. More consistent description would be nice.

4. _mm_alignr_epi8 has two descriptions.

5. I'm not sure _mm_ceil_pd signature and description is correct. It says the intrinsic returns a vector of single-precision floats. Shouldn't it be double-precision?

I didn't read all instructions so there may be more issues. I'll post if I find anything else.

PS: This is not a bug per se but some instructions are missing the Latency & Throughput information. This mostly relates to newer instructions but still this info is useful and I hope it will be added.

0 件の賞賛
221 返答(返信)
andysem
新規コントリビューター III
3,969件の閲覧回数

http://software.intel.com/sites/landingpage/IntrinsicsGuide/ works for me now. That said there were site outages a few days ago (the forum was completely inaccessible for a day or two for me), maybe the problems are still happening from time to time.

Bernard
高評価コントリビューター I
3,969件の閲覧回数

andysem wrote:

http://software.intel.com/sites/landingpage/IntrinsicsGuide/ works for me now. That said there were site outages a few days ago (the forum was completely inaccessible for a day or two for me), maybe the problems are still happening from time to time.

Works for me also.

Patrick_K_Intel
従業員
3,969件の閲覧回数

Sorry about that, there were some server changes that caused some intermittent issues, but it should be working fine now.

Xiong_Z_
ビギナー
3,969件の閲覧回数

description for __m128i _mm_sad_epu8 (__m128i a__m128i bis not correct,  

 

Description

Compute the absolute differences of packed unsigned 8-bit integers in a and b, then horizontally sum each consecutive 8 differences to produce four unsigned two unsigned 16-bit integers, and pack these unsigned 16-bit integers in the low 16 bits of 64-bit elements in dst.
Kevin_D_Intel
従業員
3,969件の閲覧回数

User Jeremias M. wrote here: https://software.intel.com/en-us/forums/topic/516476#comment-1791398 regarding an issue filtering results only for KNC and the search returning _mm512_mask_set1_epi32 as a valid intrinsic for KNC. That is not currently incorrect. It may become true in a future release as discussed in the cited thread.

Patrick_K_Intel
従業員
3,969件の閲覧回数

Thank you for your feedback. I've updated the Intrinsics Guide to resolve the issues with _mm_sad_epu8 and _mm512_mask_set1_epi32, as well as a few other issues with KNC intrinsics.

https://software.intel.com/sites/landingpage/IntrinsicsGuide/

Jeremias_M_
ビギナー
3,969件の閲覧回数

Hi,

I was using the function _mm512_mask_reduce_gmax_pd and when I checked for the int same functions in the guide, appeared only for AVX-512 instructions.

So, I checked in zmmintrin.h header and I saw the functions implemented. Then I tested some functions( _mm512_mask_reduce_max_epi32 (__mmask16 k, __m512i a), _mm512_reduce_max_epi32 (__m512i a) ), and they worked.


I believe that it's possible the below functions were made  for KNC too.

int _mm512_reduce_max_epi32 (__m512i a)

__int64 _mm512_reduce_max_epi64 (__m512i a)

unsigned int _mm512_reduce_max_epu32 (__m512i a)

unsigned __int64 _mm512_reduce_max_epu64 (__m512i a)

double _mm512_reduce_max_pd (__m512d a)

float _mm512_reduce_max_ps (__m512 a)

 

Patrick_K_Intel
従業員
3,969件の閲覧回数

You are correct, all the _reduce_ intrinsics are supported on KNC. I've updated the Intrinsics Guide to resolve this issue.

andysem
新規コントリビューター III
3,969件の閲覧回数

_mm_test_all_ones intrinsic has multiple different timing values for the same CPUs.

andysem
新規コントリビューター III
3,969件の閲覧回数

The Intel intrinsics guide page doesn't load for me or loads really slow (about a minute or so). It shows the intrinsics categories on the left and "Loading" in the center and hangs this way. I'm using Firefox 32.0.3 on Linux.

On a related note, will there be an offline standalone release? Browser version is not always convenient for me.

 

Dobratz__Glenn
初心者
3,969件の閲覧回数
I find the opening screen of the guide to be very unreadable. It would be much more readable if only the function name were used at the top level instead of the full function prototypes. Using the prototypes just creates a lot of visual noise that obscures the function names. Since the prototype is easily visible when a function is displayed, IMHO, the extra click needed to see the prototype is outweighed by the improved readability.
Dobratz__Glenn
初心者
3,969件の閲覧回数
It would be helpful if the description of the intrinsics also had a link to the corresponding instruction's description in the Intel Processor Instruction Set manual, so we can easily get the dirty details on the generated instruction.
andysem
新規コントリビューター III
3,969件の閲覧回数

Glenn D. wrote:

It would be much more readable if only the function name were used at the top level instead of the full function prototypes.

I disagree. The prototype is useful for me because I often don't remember the exact signature or arguments of the intrinsic, and all I have to do is just type it in the search field.

 

andysem
新規コントリビューター III
3,969件の閲覧回数

andysem wrote:

Please, specify that _mm_madd_epi16 and _mm256_madd_epi16 perform signed multiplication.

Was this forgotten? This information is still missing in 3.3.1.

 

Patrick_K_Intel
従業員
3,969件の閲覧回数

andysem wrote:

Was this forgotten? This information is still missing in 3.3.1.

I guess so, I'll be sure to include this in the next update.

Yukimasa__Sugizaki
ビギナー
3,969件の閲覧回数

Hi.


There are invalid names of constants in Operations in _mm512_{,mask_}extload_*.
(according to zmmintrin.h)

_MM_BROADCAST1X16 should be _MM_BROADCAST_1X16.
_MM_BROADCAST4X16 should be _MM_BROADCAST_4X16.
_MM_BROADCAST1X8 should be _MM_BROADCAST_1X8.
_MM_BROADCAST4X8 should be _MM_BROADCAST_4X8.


Regards,
Sugizaki.

andysem
新規コントリビューター III
3,969件の閲覧回数

Please, mention in the description that _mm_maskmoveu_si128 and _mm_maskmove_si64 generate non-temporal memory stores.

Patrick_K_Intel
従業員
3,940件の閲覧回数

Thanks guys, I've made these corrections.

bronxzv
新規コントリビューター II
3,940件の閲覧回数

there is a series of errors in the Intrinsics Guide for the description of intrinsics mapping to instructions with an immediate operand

operands of the imm8 type (8-bit) are declared as int (32-bit) intrinsic arguments so I'll advise to always use a notation such as imm[7:0] in the Intrinsics Guide

for example the description of _mm256_blend_epi16 at the moment makes some users think that they can use a 16-bit mask

(see https://software.intel.com/en-us/forums/topic/537849

Patrick_K_Intel
従業員
3,940件の閲覧回数

Thanks for reporting this issue. I have updated the documentation around immediate parameters to clarify this better.

bronxzv
新規コントリビューター II
3,940件の閲覧回数

Patrick Konsor (Intel) wrote:
I have updated the documentation around immediate parameters to clarify this better.

the desciption for _mm256_blend_epi16 looks the same as before in the online Intrinsics Guide, I suppose that your changes aren't yet published, right ?

返信