New Contributor III
761 Views

Bugs in Intrinsics Guide

Hi,

I've found a few bugs in the Intel Intrinsics Guide 2.7 (I'm using Linux version):

1. When the window is maximized, the search field is stretched vertically while still being a one-line edit box. It should probably be sized accordingly.

2. __m256 _mm256_undefined_si256 () should return __m256i.

3. In some instruction descriptions, like _mm_adds_epi8, the operation is described in terms of SignedSaturate, while others, e.g. _mm256_adds_epi16, use SaturateToSignedWord. This applies to operations with unsigned saturation as well. The vector elements are also described differently. A more consistent description would be nice.

4. _mm_alignr_epi8 has two descriptions.

5. I'm not sure the _mm_ceil_pd signature and description are correct. The guide says the intrinsic returns a vector of single-precision floats. Shouldn't it be double-precision?
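
As an illustration of item 3, here is a minimal scalar sketch in C of what signed saturation means for a single lane of _mm_adds_epi8 and _mm_adds_epi16. The helper names (saturate_to_signed_byte, saturate_to_signed_word, adds_epi8_lane, adds_epi16_lane) are hypothetical and not taken from the guide; the sketch only suggests that the two descriptions amount to the same clamping applied at different element widths.

```c
#include <stdint.h>

/* Hypothetical helper: clamp a wide intermediate value to the int8_t range. */
static int8_t saturate_to_signed_byte(int32_t x) {
    if (x > INT8_MAX) return INT8_MAX;
    if (x < INT8_MIN) return INT8_MIN;
    return (int8_t)x;
}

/* Hypothetical helper: clamp a wide intermediate value to the int16_t range. */
static int16_t saturate_to_signed_word(int32_t x) {
    if (x > INT16_MAX) return INT16_MAX;
    if (x < INT16_MIN) return INT16_MIN;
    return (int16_t)x;
}

/* Scalar model of one 8-bit lane of _mm_adds_epi8. */
static int8_t adds_epi8_lane(int8_t a, int8_t b) {
    return saturate_to_signed_byte((int32_t)a + (int32_t)b);
}

/* Scalar model of one 16-bit lane of _mm_adds_epi16 / _mm256_adds_epi16. */
static int16_t adds_epi16_lane(int16_t a, int16_t b) {
    return saturate_to_signed_word((int32_t)a + (int32_t)b);
}
```

With these models, adds_epi8_lane(100, 100) yields 127 instead of wrapping to -56, and adds_epi16_lane(-30000, -30000) yields -32768.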

I didn't read all instructions so there may be more issues. I'll post if I find anything else.

PS: This is not a bug per se but some instructions are missing the Latency & Throughput information. This mostly relates to newer instructions but still this info is useful and I hope it will be added.

215 Replies
Valued Contributor II
297 Views

Thanks for the feedback! It would be nice to also report these errors online, on the HTML documentation pages where you found the issues. As far as I know, there is a special button for providing feedback.

>>PS: This is not a bug per se but some instructions are missing the Latency & Throughput information. This mostly relates to newer instructions...

This is a known issue and has been raised several times over the last couple of months. Unfortunately, the data is missing even for some older instructions.

Best regards, Sergey
297 Views

Thanks for the feedback, most of this will be addressed in the next release.

1. I'm not able to replicate this issue with maximizing the window on Linux. What distro are you using? What version of Java?

2. This will be resolved in the next release.

3. All the descriptions and operations have been updated for the next release, so they should now be much more consistent.

4. This will be resolved in the next release.

5. This will be resolved in the next release.

I have not added any additional latency and throughput data yet, but I may get to this soon.

Valued Contributor II
297 Views

>>...I have not added any additional latency and throughput data yet, but I may get to this soon.

Thanks for the update and please keep everybody informed!
New Contributor III
297 Views

@Sergey Kostrov

> It would be nice to duplicate these errors online on doc-html-pages where you found issues or problems. As far as I know there is a special button to provide a feedback.

I don't quite understand which pages you mean. Could you provide a link?

@Patrick Konsor

> 1. I'm not able to replicate this issue with maximizing the window on Linux. What distro are you using? What version of Java?

I'm seeing this on Kubuntu 12.04 and 12.10, both x86-64, KDE 4.9.5, with dual monitors attached. I'm using Oracle Java:

java version "1.7.0_11"
Java(TM) SE Runtime Environment (build 1.7.0_11-b21)
Java HotSpot(TM) 64-Bit Server VM (build 23.6-b04, mixed mode)

I've attached a screenshot to illustrate the problem.

New Contributor III
297 Views

A new pack of bugs:

1. _mm_cvtss_f32 is described as equivalent to the cvtss2si instruction. I suppose the intrinsic should not generate any instructions if the compiler uses SSE for math calculations, or should simply store the value to memory or a general purpose register. But it should not convert the float value to an integer.

2. The description of _mm_cvtsi32_si128 says it extends the upper bits of the operand, but it should say that the upper bits are filled with zeros.
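
Both points can be checked with a short sketch, assuming an x86 target with SSE2 available and the default MXCSR rounding mode (round to nearest). The wrapper names (low_float, low_float_as_int, widen_int) are hypothetical; only the _mm_* intrinsics are real. _mm_cvtss_si32 is included purely for contrast, since it is the intrinsic that does correspond to cvtss2si.

```c
#include <emmintrin.h>  /* SSE2 intrinsics; also provides the SSE ones */
#include <stdint.h>

/* Returns the lowest float lane unchanged: no float-to-int conversion. */
static float low_float(float x) {
    return _mm_cvtss_f32(_mm_set_ss(x));
}

/* For contrast: cvtss2si semantics, converting the lowest float lane
 * to a 32-bit integer using the current rounding mode. */
static int low_float_as_int(float x) {
    return _mm_cvtss_si32(_mm_set_ss(x));
}

/* Copies x into lane 0 of a 128-bit vector and stores all four
 * 32-bit lanes to out, so the upper lanes can be inspected. */
static void widen_int(int x, int32_t out[4]) {
    _mm_storeu_si128((__m128i *)out, _mm_cvtsi32_si128(x));
}
```

Calling widen_int(-1, out) leaves out[0] equal to -1 and out[1] through out[3] equal to 0, i.e. the upper 96 bits are zeroed rather than sign-extended.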

297 Views

Version 2.8 has been released:
http://software.intel.com/en-us/articles/intel-intrinsics-guide

Note that this release does include additional latency and throughput data.

Regarding the two new issues:
1. You're correct, cvtss2si is the wrong instruction. movss is the official instruction, although you'll often see different instructions based on context. This will be resolved in the next release.
2. This issue was already resolved in v2.8.

Are you still seeing the issue with the search box expanding on Linux with v2.8? 

New Contributor III
297 Views

Thanks for the updated release.

Yes, the problem with the search box is still present. I must say, it wasn't present before 2.7 (I think that version introduced some interface changes; aside from the search field, I think the fonts also changed).

Valued Contributor II
297 Views

>>Version 2.8 has been released:
>>software.intel.com/en-us/articles/intel-intrinsics-guide
>>
>>Note that this release does include additional latency and throughput data.

Thank you, Patrick.
Beginner
297 Views

This release is great!

Now there are latency and throughput data for Ivy Bridge, too!

I waited for this quite some time. One always had to look in the really big manuals to find that sort of information.

New Contributor III
297 Views

One additional bug: the _mm_max_epu32 signature contains three arguments: __m128i _mm_max_epu32 (__m128i a, __m128i b, __m128i b). I believe the last one should be removed.

Valued Contributor II
297 Views

Yes, that is correct. Here is the declaration from the smmintrin.h header file (Intel version):

...
extern __m128i __ICL_INTRINCC _mm_max_epu32( __m128i, __m128i );
...
New Contributor III
297 Views

__int _mm256_movemask_epi8 (__m256i a)

Please remove the leading underscores in the return type.

297 Views

Thanks, this issue will be fixed in the next release.

Valued Contributor II
297 Views

>>...__int _mm256_movemask_epi8 (__m256i a)

Here is the declaration from the immintrin.h header file (Intel version):

...
/*
 * Returns a 32-bit mask made up of the most significant bit of each byte
 * of the 256-bit vector source operand.
 */
extern int __ICL_INTRINCC _mm256_movemask_epi8(__m256i);
...
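
The behavior stated in that header comment can be sketched as a portable scalar model in C. The function name movemask_epi8_model is hypothetical, and the 256-bit operand is represented as a plain array of 32 bytes; this is an illustration of the documented semantics, not the actual implementation.

```c
#include <stdint.h>

/* Scalar model of _mm256_movemask_epi8: bit i of the result is the most
 * significant bit of byte i of the 256-bit source. The return type is a
 * plain int, matching the header declaration. */
static int movemask_epi8_model(const uint8_t bytes[32]) {
    uint32_t mask = 0;
    for (int i = 0; i < 32; ++i) {
        mask |= (uint32_t)(bytes[i] >> 7) << i;
    }
    /* The conversion of values above INT_MAX is implementation-defined in C;
     * common compilers keep the bit pattern, as the real intrinsic does. */
    return (int)mask;
}
```

Because all 32 bits of the result can be set, callers that treat the value as a bit mask may prefer to cast it to an unsigned type before shifting or comparing.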
New Contributor III
297 Views

The description of the _mm256_shuffle_epi8 intrinsic reads as if it acts cross-lane. Its formal algorithm doesn't clarify this either, because the index value is bounded to [0..15] and is not adjusted for the second lane (this would result in lane 0 of a being distributed to both lanes of the result).
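
For comparison, here is a minimal scalar sketch in C of the per-lane behavior, in which the 4-bit index is offset by the base of the 128-bit lane being written. The function name shuffle_epi8_256_model is hypothetical and the vectors are represented as plain 32-byte arrays; it is meant as a reading aid for the operation, not as the guide's pseudocode.

```c
#include <stdint.h>

/* Scalar model of _mm256_shuffle_epi8 with in-lane indexing:
 * each 128-bit lane of dst selects bytes only from the same lane of a. */
static void shuffle_epi8_256_model(const uint8_t a[32],
                                   const uint8_t b[32],
                                   uint8_t dst[32]) {
    for (int lane = 0; lane < 2; ++lane) {
        int base = lane * 16;                 /* first byte of this lane */
        for (int i = 0; i < 16; ++i) {
            uint8_t ctrl = b[base + i];
            if (ctrl & 0x80) {
                dst[base + i] = 0;            /* high bit set: zero the byte */
            } else {
                dst[base + i] = a[base + (ctrl & 0x0F)];  /* index within lane */
            }
        }
    }
}
```

With every control byte set to 0, the lower lane of dst is filled with a[0] and the upper lane with a[16]; without the base offset, both lanes would receive a[0].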

New Contributor III
297 Views

Just noted that 2.8.1 has been released. Thanks for the update.

The _mm256_shuffle_epi8 description is still confusing, and the original issue with the search bar is not fixed either. I somehow forgot to mention that the problem shows up not only with a maximized window, but also with a normal window taller than a certain size. I suppose the field size is OK when the window height is less than or equal to the total height of all widgets, and when the window is taller, the search field is stretched instead of unused space being added at the bottom. Is there any estimate for the fix?

Valued Contributor II
297 Views

>>Just noted that 2.8.1 has been released...

Here is a link to download the recently released Intel Intrinsics Guide for Windows, version 2.8.1: software.intel.com/sites/default/files/Intel_Intrinsics_Guide-windows-v2.8.1.zip
297 Views

You're correct about _mm256_shuffle_epi8: it is not a cross-lane operation. I will fix the description and operation in the next release. Regarding the search bar issue, I have not been able to reproduce this on Ubuntu.

New Contributor III
297 Views

> Regarding the search bar issue, I have not been able to reproduce this on Ubuntu.

Hmm, I can reproduce it on all 3 of my systems, with Nvidia and AMD graphics and different drivers, on Kubuntu from 12.04 to 13.04. I'm using Oracle Java 1.7.

I have quite large displays though - 2560x1440 on two of my machines and 1920x1200 on another laptop. I'm not sure that a 1920x1080 display is big enough for the problem to manifest itself as this height will be filled with widgets. If you don't have access to a bigger display you can try to attach a second display and arrange it to be below your main display and stretch the window vertically. Or you can do the same with a single display if you move the window to the lower side of the screen (so that the window goes partially below the edge) and then resize the window vertically by dragging its top edge upwards.
