Dear Team,
I would like to report some discrepancies between instruction specifications as documented in the Intel 64 and IA-32 Architectures Software Developer's Manual, Vol. 2, and the actual execution behaviour.
Bug Report 1: vpsravd %xmm3, %xmm2, %xmm1
Semantics as per the above manual:
%ymm1 : 0x0₁₂₈ ∘ ((%ymm2[127:96] sign_shift_right (0x0₂₇ ∘ %ymm3[100:96])) ∘ ((%ymm2[95:64] sign_shift_right %ymm3[95:64]) ∘ ((%ymm2[63:32] sign_shift_right %ymm3[63:32]) ∘ (%ymm2[31:0] sign_shift_right %ymm3[31:0]))))
** ∘ is the concatenate symbol here.
Note that the first term ((%ymm2[127:96] sign_shift_right (0x0₂₇ ∘ %ymm3[100:96])) has only 5 bits selected from '%ymm3'.
But the actual execution behaviour seems to expect all 32 bits from %ymm3, i.e., (%ymm2[127:96] sign_shift_right %ymm3[127:96]).
The following is the pseudocode from the manual:
VPSRAVD (VEX.128 version)
COUNT_0 = SRC2[31 : 0]
(* Repeat Each COUNT_i for the 2nd through 4th dwords of SRC2 *)
COUNT_3 = SRC2[100 : 96]; //<--- Possibly a bug
DEST[31:0] = SignExtend(SRC1[31:0] >> COUNT_0);
(* Repeat shift operation for 2nd through 4th dwords *)
DEST[127:96] = SignExtend(SRC1[127:96] >> COUNT_3);
DEST[MAXVL-1:128] = 0;
I expect the line marked "Possibly a bug" to be a bug; it should read SRC2[127 : 96].
Test Input (in Hex):
%ymm2: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 - **80 00 00 00** - 00 00 00 00 - 00 00 00 00 - 00 00 00 00
%ymm3: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 - **00 00 00 20** - 00 00 00 00 - 00 00 00 00 - 00 00 00 00
As per the manual, we should select just 5 bits from `00 00 00 20`, whereas the hardware execution semantics use all 32 bits.
Output as per manual:
0x0₁₂₈ ∘ ((0x80000000₃₂ sign_shift_right 0x0₃₂) ∘ ((0x0₃₂ sign_shift_right 0x0₃₂) ∘ ((0x0₃₂ sign_shift_right 0x0₃₂) ∘ (0x0₃₂ sign_shift_right 0x0₃₂))))
Output as per actual Intel hardware (Intel(R) Xeon(R) CPU E3-1505M):
0x0₆₄ ∘ 0x0₆₄ ∘ 0xffffffff00000000₆₄ ∘ 0x0₆₄
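To make the difference concrete, here is a minimal Python sketch of a single 32-bit lane under both readings. The helper names (`sra32`, `lane_manual`, `lane_hardware`) are hypothetical and only model the two interpretations discussed above: the manual's pseudocode read literally (count taken from 5 bits) versus the observed behaviour (count taken from the full dword, with counts above 31 filling the lane with the sign bit). It is a simulation, not a measurement on real hardware.

```python
MASK32 = 0xFFFFFFFF


def to_signed32(x):
    """Interpret the low 32 bits of x as a two's-complement signed integer."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


def sra32(value, count):
    """Arithmetic right shift of one 32-bit lane.

    Counts of 32 or more fill the lane with the sign bit, which is the
    same result as shifting by 31.
    """
    return (to_signed32(value) >> min(count, 31)) & MASK32


def lane_manual(src1, src2):
    """Pseudocode read literally: only bits [4:0] of the count dword are used."""
    return sra32(src1, src2 & 0x1F)


def lane_hardware(src1, src2):
    """Observed behaviour: the full 32-bit count dword is used."""
    return sra32(src1, src2 & MASK32)


# Dword 3 of the test input above: %ymm2[127:96] and %ymm3[127:96].
src1 = 0x80000000
src2 = 0x00000020

manual = lane_manual(src1, src2)
hardware = lane_hardware(src1, src2)
print(f"manual:   0x{manual:08x}")
print(f"hardware: 0x{hardware:08x}")
```

With the test input, the 5-bit reading sees a count of 0 and leaves 0x80000000 unchanged, while the 32-bit reading sees a count of 32 and produces 0xffffffff, matching the two outputs listed above.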
The same probable typo appears in the pseudocode for these instructions:
VPSLLVD (VEX.128 version)
VPSLLVD (VEX.256 version)
VPSLLVQ (VEX.256 version)
VPSRAVD (VEX.256 version)
Also, there seems to be a typo in the description text of VPSRAVW/VPSRAVD/VPSRAVQ. There are two paragraphs starting with "The count values..."; the second one should be deleted.
Bug Report 2: packsswb
There seems to be a bug in the descriptive text:
If the signed doubleword value is beyond the range of an unsigned word (i.e. greater than 7FH or less than 80H), ...
In my opinion, the description must say range of a signed word instead.
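For reference, the 7FH and 80H limits quoted above are the bounds of a signed byte, which is what PACKSSWB saturates to when it packs signed words. A small Python sketch of that per-element saturation, assuming the usual signed-saturation rule (the function names are hypothetical, chosen for illustration):

```python
def saturate_signed_word_to_byte(word):
    """Convert one 16-bit signed word to a signed byte with saturation.

    Values above 127 clamp to 0x7F; values below -128 clamp to 0x80.
    The result is returned as an unsigned byte pattern (0..255).
    """
    word &= 0xFFFF
    signed = word - 0x10000 if word & 0x8000 else word
    clamped = max(-128, min(127, signed))
    return clamped & 0xFF


def pack_signed_words(words):
    """Apply the saturating conversion to each word in a list."""
    return [saturate_signed_word_to_byte(w) for w in words]


packed = pack_signed_words(
    [0x0001, 0x007F, 0x0080, 0x7FFF, 0xFFFF, 0xFF80, 0xFF7F, 0x8000]
)
print(" ".join(f"{b:02x}" for b in packed))
```

The inputs 0x0080 and 0x7FFF overflow upward and clamp to 7FH; 0xFF7F and 0x8000 overflow downward and clamp to 80H; the others fit in a signed byte and pass through unchanged.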
- Tags:
- Intel® Advanced Vector Extensions (Intel® AVX)
- Intel® Streaming SIMD Extensions
- Parallel Computing
Thx. There was more going on in the 2nd example that needs fixing. I'll pass these along to our documentation person. Thanks for reporting the issues.
Thanks Mark! Really appreciate your reply.
Do you mind sharing what else is going wrong with the second example? That might help my current work.
The 2nd sentence of that paragraph talks about doublewords and words when it should be talking about words and bytes. The operation section makes it a little clearer.
Indeed. Thanks again.