I've found a few bugs in the Intel Intrinsics Guide 2.7 (I'm using the Linux version):
1. When the window is maximized, the search field is stretched vertically while still being a one-line edit box. It should probably be sized accordingly.
2. __m256 _mm256_undefined_si256 () should return __m256i.
3. In some instruction descriptions, like _mm_adds_epi8, the operation is described in terms of SignedSaturate, while others, e.g. _mm256_adds_epi16, are described with SaturateToSignedWord. This applies to other operations with unsigned saturation as well. Also, the vector elements are described differently. A more consistent description would be nice.
4. _mm_alignr_epi8 has two descriptions.
5. I'm not sure the _mm_ceil_pd signature and description are correct. It says the intrinsic returns a vector of single-precision floats. Shouldn't it be double-precision?
I didn't read all the instructions, so there may be more issues. I'll post if I find anything else.
PS: This is not a bug per se, but some instructions are missing the Latency & Throughput information. This mostly affects newer instructions, but the info is still useful and I hope it will be added.
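To illustrate item 3: both descriptions appear to refer to the same operation, clamping a wider intermediate sum to the lane's signed range, differing only in lane width. Below is a minimal scalar sketch in C of that per-lane behavior. The helper names (saturate_to_int8, adds_epi8_lane, and so on) are hypothetical and are not taken from the guide or from any Intel header.

```c
#include <stdint.h>

/* Hypothetical helper: clamp a wider value to the signed 8-bit range. */
static int8_t saturate_to_int8(int32_t x) {
    if (x > INT8_MAX) return INT8_MAX;
    if (x < INT8_MIN) return INT8_MIN;
    return (int8_t)x;
}

/* Hypothetical helper: clamp a wider value to the signed 16-bit range. */
static int16_t saturate_to_int16(int32_t x) {
    if (x > INT16_MAX) return INT16_MAX;
    if (x < INT16_MIN) return INT16_MIN;
    return (int16_t)x;
}

/* Scalar model of one lane of a signed saturating 8-bit add:
   the sum is formed in a wider type, then clamped. */
static int8_t adds_epi8_lane(int8_t a, int8_t b) {
    return saturate_to_int8((int32_t)a + (int32_t)b);
}

/* Scalar model of one lane of a signed saturating 16-bit add. */
static int16_t adds_epi16_lane(int16_t a, int16_t b) {
    return saturate_to_int16((int32_t)a + (int32_t)b);
}
```

Written this way, the 8-bit and 16-bit cases differ only in the clamp bounds, which is why a single naming scheme in the guide would be easier to follow.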
Still doesn't work today; it seems that no one is in charge.
Osiv, Oleksiy wrote:
Hey, the guide is not working at all today. I checked Chrome & Edge. The developer console shows the following error:
Refused to execute script from 'https://software.intel.com/sites/landingpage/IntrinsicsGuide/files/perf....' because its MIME type ('application/json') is not executable, and strict MIME type checking is enabled.
Hello.... Intel developers...!
Hello, Intel developers,
What does the multiplication of the scaled index by the constant 8 mean in the description of _mm512_i64gather_ps (and similar gather/scatter functions)? As far as I can observe, the actual behavior of such functions does not include the additional constant scaling.
In various gather operation descriptions for AVX2 and AVX-512, the scale is multiplied by 8 (`ZeroExtend64(scale) * 8`). This is incorrect: the scale is the actual byte multiplier applied to the index and is not multiplied by 8 again. This follows from the SDM description of VPGATHERDD/DQ/QD/QQ.
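A minimal scalar sketch in C of the addressing as I understand it from the SDM, assuming scale is a byte multiplier of 1, 2, 4 or 8. The function names gather_lane_f32 and gather_f32 are hypothetical; this models the per-lane address computation only and is not the intrinsic itself.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical scalar model of one gather lane:
   byte address = base + index * scale, with no extra factor of 8. */
static float gather_lane_f32(const void *base, int64_t index, int scale) {
    const unsigned char *p =
        (const unsigned char *)base + (ptrdiff_t)(index * (int64_t)scale);
    float v;
    memcpy(&v, p, sizeof v); /* avoids alignment and aliasing problems */
    return v;
}

/* Hypothetical scalar model of a whole gather over n lanes. */
static void gather_f32(float *dst, const void *base,
                       const int64_t *indices, size_t n, int scale) {
    for (size_t i = 0; i < n; ++i) {
        dst[i] = gather_lane_f32(base, indices[i], scale);
    }
}
```

With a float array and scale set to 4 (the size of a float), index i selects element i. If the scale were multiplied by 8 again, the same indices would address memory 8 times farther from the base, which does not match the behavior observed above.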