Intel® ISA Extensions

Bugs in Intrinsics Guide

andysem
New Contributor III

Hi,

I've found a few bugs in the Intel Intrinsics Guide 2.7 (I'm using the Linux version):

1. When the window is maximized, the search field is stretched vertically while still being a one-line edit box. It should probably be sized accordingly.

2. __m256 _mm256_undefined_si256 () should return __m256i.

3. In some instruction descriptions, such as _mm_adds_epi8, the operation is described in terms of SignedSaturate, while others, e.g. _mm256_adds_epi16, are described with SaturateToSignedWord. This applies to other operations with unsigned saturation as well. Also, the vector elements are described differently. A more consistent description would be nice.

4. _mm_alignr_epi8 has two descriptions.

5. I'm not sure the _mm_ceil_pd signature and description are correct. They say the intrinsic returns a vector of single-precision floats. Shouldn't it be double-precision?

I didn't read all instructions so there may be more issues. I'll post if I find anything else.

PS: This is not a bug per se but some instructions are missing the Latency & Throughput information. This mostly relates to newer instructions but still this info is useful and I hope it will be added.

221 Replies

andysem
New Contributor III

http://software.intel.com/sites/landingpage/IntrinsicsGuide/ works for me now. That said there were site outages a few days ago (the forum was completely inaccessible for a day or two for me), maybe the problems are still happening from time to time.

Bernard
Valued Contributor I

andysem wrote:

http://software.intel.com/sites/landingpage/IntrinsicsGuide/ works for me now. That said there were site outages a few days ago (the forum was completely inaccessible for a day or two for me), maybe the problems are still happening from time to time.

Works for me also.

Patrick_K_Intel

Sorry about that, there were some server changes that caused some intermittent issues, but it should be working fine now.

Xiong_Z_
Beginner

The description for __m128i _mm_sad_epu8 (__m128i a, __m128i b) is not correct:

Description

Compute the absolute differences of packed unsigned 8-bit integers in a and b, then horizontally sum each consecutive 8 differences to produce four unsigned 16-bit integers (should be: two unsigned 16-bit integers), and pack these unsigned 16-bit integers in the low 16 bits of 64-bit elements in dst.
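
To make the point concrete, here is a minimal sketch in C that stores the result of _mm_sad_epu8 as two 64-bit values. The helper name sad_two_sums is made up for illustration and is not part of any Intel header; only the SSE2 intrinsics themselves are real.

```c
#include <emmintrin.h>  /* SSE2: _mm_sad_epu8, _mm_loadu_si128, _mm_storeu_si128 */
#include <stdint.h>

/* Hypothetical helper: computes the sums of absolute differences of a and b.
   sums[0] covers bytes 0..7 and sums[1] covers bytes 8..15. */
static void sad_two_sums(const uint8_t a[16], const uint8_t b[16], uint64_t sums[2])
{
    __m128i va = _mm_loadu_si128((const __m128i *)a);
    __m128i vb = _mm_loadu_si128((const __m128i *)b);
    __m128i sad = _mm_sad_epu8(va, vb);
    /* Each 64-bit element carries one sum in its low 16 bits; the upper bits are zero,
       so storing the vector as two uint64_t values yields the two sums directly. */
    _mm_storeu_si128((__m128i *)sums, sad);
}
```

With a[i] = i and b[i] = 0, this yields 28 for the low half and 92 for the high half, i.e. two sums for a 128-bit vector, not four.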

Kevin_D_Intel
Employee

User Jeremias M. wrote here: https://software.intel.com/en-us/forums/topic/516476#comment-1791398 regarding an issue where filtering results for KNC only still causes the search to return _mm512_mask_set1_epi32 as a valid intrinsic for KNC. That is not currently correct. It may become true in a future release, as discussed in the cited thread.

Patrick_K_Intel

Thank you for your feedback. I've updated the Intrinsics Guide to resolve the issues with _mm_sad_epu8 and _mm512_mask_set1_epi32, as well as a few other issues with KNC intrinsics.

https://software.intel.com/sites/landingpage/IntrinsicsGuide/

Jeremias_M_
Beginner

Hi,

I was using the function _mm512_mask_reduce_gmax_pd, and when I checked the guide for the integer versions of the same functions, they appeared only under the AVX-512 instructions.

So I checked the zmmintrin.h header and saw that the functions are implemented there. Then I tested some of them (_mm512_mask_reduce_max_epi32 (__mmask16 k, __m512i a) and _mm512_reduce_max_epi32 (__m512i a)), and they worked.

I believe the functions below may have been made available for KNC too.

int _mm512_reduce_max_epi32 (__m512i a)

__int64 _mm512_reduce_max_epi64 (__m512i a)

unsigned int _mm512_reduce_max_epu32 (__m512i a)

unsigned __int64 _mm512_reduce_max_epu64 (__m512i a)

double _mm512_reduce_max_pd (__m512d a)

float _mm512_reduce_max_ps (__m512 a)

 

Patrick_K_Intel

You are correct, all the _reduce_ intrinsics are supported on KNC. I've updated the Intrinsics Guide to resolve this issue.

andysem
New Contributor III

The _mm_test_all_ones intrinsic lists multiple different timing values for the same CPUs.

andysem
New Contributor III

The Intel Intrinsics Guide page doesn't load for me, or loads really slowly (about a minute or so). It shows the intrinsics categories on the left and "Loading" in the center and hangs this way. I'm using Firefox 32.0.3 on Linux.

On a related note, will there be an offline standalone release? Browser version is not always convenient for me.

 

Dobratz__Glenn
I find the opening screen of the guide to be very unreadable. It would be much more readable if only the function name were used at the top level instead of the full function prototypes. Using the prototypes just creates a lot of visual noise that obscures the function names. Since the prototype is easily visible when a function is displayed, IMHO, the extra click needed to see the prototype is outweighed by the improved readability.

Dobratz__Glenn
It would be helpful if the description of the intrinsics also had a link to the corresponding instruction's description in the Intel Processor Instruction Set manual, so we can easily get the dirty details on the generated instruction.

andysem
New Contributor III

Glenn D. wrote:

It would be much more readable if only the function name were used at the top level instead of the full function prototypes.

I disagree. The prototype is useful for me because I often don't remember the exact signature or arguments of the intrinsic, and all I have to do is just type it in the search field.

 

andysem
New Contributor III

andysem wrote:

Please, specify that _mm_madd_epi16 and _mm256_madd_epi16 perform signed multiplication.

Was this forgotten? This information is still missing in 3.3.1.
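
For anyone unsure why the signedness matters, here is a small sketch in C. The wrapper name madd_pairs is made up for illustration; the intrinsics are the standard SSE2 ones.

```c
#include <emmintrin.h>  /* SSE2: _mm_madd_epi16 */
#include <stdint.h>

/* Hypothetical wrapper: multiplies eight signed 16-bit pairs and adds adjacent
   products, giving four signed 32-bit results. */
static void madd_pairs(const int16_t a[8], const int16_t b[8], int32_t out[4])
{
    __m128i va = _mm_loadu_si128((const __m128i *)a);
    __m128i vb = _mm_loadu_si128((const __m128i *)b);
    /* out[i] = a[2*i] * b[2*i] + a[2*i+1] * b[2*i+1], with signed operands. */
    _mm_storeu_si128((__m128i *)out, _mm_madd_epi16(va, vb));
}
```

With a[0] = -1, b[0] = 3 and zeros in the neighbouring elements, out[0] is -3. If the inputs were treated as unsigned, the same bit patterns would give 65535 * 3 = 196605, which is why the description should say so explicitly.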

 

Patrick_K_Intel

andysem wrote:

Was this forgotten? This information is still missing in 3.3.1.

I guess so, I'll be sure to include this in the next update.

Yukimasa__Sugizaki
Beginner

Hi.


The Operation sections of _mm512_{,mask_}extload_* use invalid constant names
(according to zmmintrin.h):

_MM_BROADCAST1X16 should be _MM_BROADCAST_1X16.
_MM_BROADCAST4X16 should be _MM_BROADCAST_4X16.
_MM_BROADCAST1X8 should be _MM_BROADCAST_1X8.
_MM_BROADCAST4X8 should be _MM_BROADCAST_4X8.


Regards,
Sugizaki.

andysem
New Contributor III

Please, mention in the description that _mm_maskmoveu_si128 and _mm_maskmove_si64 generate non-temporal memory stores.
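
A rough sketch in C of why this is worth documenting. The helper name masked_store is made up for illustration, and the fence placement is an assumption about typical usage rather than something the guide prescribes: non-temporal stores are weakly ordered, so code that publishes the data to other threads commonly follows them with a store fence.

```c
#include <xmmintrin.h>  /* SSE: _mm_sfence */
#include <emmintrin.h>  /* SSE2: _mm_maskmoveu_si128 */
#include <stdint.h>

/* Hypothetical helper: writes src[i] to dst[i] only where the high bit of
   mask[i] is set. dst does not need to be aligned. */
static void masked_store(uint8_t dst[16], const uint8_t src[16], const uint8_t mask[16])
{
    __m128i v = _mm_loadu_si128((const __m128i *)src);
    __m128i m = _mm_loadu_si128((const __m128i *)mask);
    _mm_maskmoveu_si128(v, m, (char *)dst);  /* byte-masked, non-temporal store */
    _mm_sfence();  /* assumed usage: order the store before later stores */
}
```

Bytes whose mask high bit is clear are left untouched in dst. Because the store carries a non-temporal hint, using it on data that is read again soon may be slower than an ordinary blend followed by a regular store.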

Patrick_K_Intel

Thanks guys, I've made these corrections.

bronxzv
New Contributor II

there is a series of errors in the Intrinsics Guide in the descriptions of intrinsics that map to instructions with an immediate operand

operands of the imm8 type (8-bit) are declared as int (32-bit) intrinsic arguments, so I'd advise always using a notation such as imm[7:0] in the Intrinsics Guide

for example, the description of _mm256_blend_epi16 at the moment makes some users think that they can use a 16-bit mask

(see https://software.intel.com/en-us/forums/topic/537849)
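
to illustrate, here is a scalar model in plain C of the selection rule as I understand it from the instruction's documented operation. It is only a sketch: blend_epi16_model is a made-up name, not an intrinsic, and the real intrinsic needs a compile-time constant (how a compiler treats an out-of-range constant is not modelled here).

```c
#include <stdint.h>

/* Hypothetical scalar model of a 16-lane, 16-bit blend controlled by imm[7:0].
   Lane j takes b[j] when bit (j % 8) of the 8-bit immediate is set, else a[j],
   so the same 8 bits control both 128-bit halves. */
static void blend_epi16_model(const int16_t a[16], const int16_t b[16],
                              int imm, int16_t dst[16])
{
    unsigned imm8 = (unsigned)imm & 0xFFu;  /* only imm[7:0] takes part */
    for (int j = 0; j < 16; j++) {
        dst[j] = ((imm8 >> (j % 8)) & 1u) ? b[j] : a[j];
    }
}
```

in this model a mask of 0x0101 selects exactly the same lanes as 0x01 (lanes 0 and 8), and 0xFF00 selects nothing from b, which is not what someone expecting a 16-bit mask would get.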

Patrick_K_Intel

Thanks for reporting this issue. I have updated the documentation around immediate parameters to clarify this better.

bronxzv
New Contributor II

Patrick Konsor (Intel) wrote:
I have updated the documentation around immediate parameters to clarify this better.

the description for _mm256_blend_epi16 looks the same as before in the online Intrinsics Guide; I suppose that your changes aren't published yet, right?
