Dear Team,
I would like to report some discrepancies between the instruction specifications documented in the Intel 64 and IA-32 Architectures Software Developer's Manual, Vol. 2, and the actual execution behaviour.
Bug Report 1: vpsravd %xmm3, %xmm2, %xmm1
Semantics as per the above manual:
%ymm1 : 0x0₁₂₈ ∘ ((%ymm2[127:96] sign_shift_right (0x0₂₇ ∘ %ymm3[100:96])) ∘ ((%ymm2[95:64] sign_shift_right %ymm3[95:64]) ∘ ((%ymm2[63:32] sign_shift_right %ymm3[63:32]) ∘ (%ymm2[31:0] sign_shift_right %ymm3[31:0]))))
** ∘ is the concatenate symbol here.
Note that the first term, (%ymm2[127:96] sign_shift_right (0x0₂₇ ∘ %ymm3[100:96])), takes only 5 bits from %ymm3.
But the actual execution behaviour seems to use all 32 bits from %ymm3, i.e., (%ymm2[127:96] sign_shift_right %ymm3[127:96]).
The following is the pseudocode from the manual:
VPSRAVD (VEX.128 version)
COUNT_0 = SRC2[31 : 0]
(* Repeat Each COUNT_i for the 2nd through 4th dwords of SRC2 *)
COUNT_3 = SRC2[100 : 96]; // <--- Possibly a bug
DEST[31:0] = SignExtend(SRC1[31:0] >> COUNT_0);
(* Repeat shift operation for 2nd through 4th dwords *)
DEST[127:96] = SignExtend(SRC1[127:96] >> COUNT_3);
DEST[MAXVL-1:128] = 0;
I believe the COUNT_3 line marked above is a bug and should read SRC2[127 : 96].
Test Input (in Hex):
%ymm2: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 - **80 00 00 00** - 00 00 00 00 - 00 00 00 00 - 00 00 00 00
%ymm3: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 - **00 00 00 20** - 00 00 00 00 - 00 00 00 00 - 00 00 00 00
As per the manual, we should select just the low 5 bits of `00 00 00 20` (which gives a shift count of 0), whereas the hardware execution semantics use all 32 bits (a shift count of 32).
Output as per manual:
0x0₁₂₈ ∘ ((0x80000000₃₂ sign_shift_right 0x0₃₂) ∘ ((0x0₃₂ sign_shift_right 0x0₃₂) ∘ ((0x0₃₂ sign_shift_right 0x0₃₂) ∘ (0x0₃₂ sign_shift_right 0x0₃₂))))
Output as per actual Intel hardware (Intel(R) Xeon(R) CPU E3-1505M):
0x0₆₄ ∘ 0x0₆₄ ∘ 0xffffffff00000000₆₄ ∘ 0x0₆₄
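To make the difference concrete, here is a minimal Python sketch that models the low 128 bits of VPSRAVD on plain integers. It is only a model under my assumptions, not the manual's pseudocode: the names `sar32`, `vpsravd_128` and the `top_count_bits` parameter are hypothetical, and counts above 31 are assumed to fill the lane with the sign bit, which matches the hardware output above.

```python
MASK32 = 0xFFFFFFFF

def sar32(value, count):
    """Arithmetic right shift of one 32-bit lane.

    Counts above 31 are clamped to 31, so the lane fills with the sign bit.
    """
    signed = value - (1 << 32) if value & 0x80000000 else value
    return (signed >> min(count, 31)) & MASK32

def vpsravd_128(src1, src2, top_count_bits=32):
    """Model the low 128 bits of VPSRAVD on Python ints.

    top_count_bits (hypothetical knob) selects how many low bits of
    SRC2[127:96] are used as COUNT_3: 5 mimics the documented
    SRC2[100:96], 32 mimics SRC2[127:96].
    """
    result = 0
    for lane in range(4):
        value = (src1 >> (32 * lane)) & MASK32
        count = (src2 >> (32 * lane)) & MASK32
        if lane == 3:
            count &= (1 << top_count_bits) - 1
        result |= sar32(value, count) << (32 * lane)
    return result

# Test input from this report: only dword 3 (bits 127:96) is non-zero.
src1 = 0x80000000 << 96
src2 = 0x00000020 << 96

as_documented = vpsravd_128(src1, src2, top_count_bits=5)
as_observed = vpsravd_128(src1, src2, top_count_bits=32)

print(f"{as_documented:032x}")  # top dword stays 80000000
print(f"{as_observed:032x}")    # top dword becomes ffffffff
```

With the 5-bit reading the count for the top dword is 0 and the value is unchanged; with the 32-bit reading the count is 32 and the dword is filled with the sign bit.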
The same probable typo appears in the pseudocode for these instructions:
VPSLLVD (VEX.128 version)
VPSLLVD (VEX.256 version)
VPSLLVQ (VEX.256 version)
VPSRAVD (VEX.256 version)
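For the left-shift variants the effect of the typo would be similar. The following Python sketch models a single dword lane of VPSLLVD, where a count greater than 31 zeroes the lane. The helper name `shl32` is hypothetical, and I have only run VPSRAVD on hardware, so treating VPSLLVD the same way is an assumption on my part.

```python
MASK32 = 0xFFFFFFFF

def shl32(value, count):
    """Logical left shift of one 32-bit lane; counts above 31 give zero."""
    return 0 if count > 31 else (value << count) & MASK32

value = 0x00000001
raw_count = 0x00000020  # 32, as stored in the count dword

# Count taken from all 32 bits of the dword: 32 > 31, so the lane is zeroed.
full = shl32(value, raw_count)

# Count taken from the low 5 bits only (the suspected typo): 32 & 0x1F == 0,
# so the lane is left unchanged.
masked = shl32(value, raw_count & 0x1F)

print(f"{full:08x} {masked:08x}")
```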
Also, there seems to be a typo in the description text of VPSRAVW/VPSRAVD/VPSRAVQ. There are two paragraphs starting with "The count values..."; the second one should be deleted.
Bug Report 2: packsswb
There seems to be a bug in the descriptive text:
If the signed doubleword value is beyond the range of an unsigned word (i.e. greater than 7FH or less than 80H), ...
In my opinion, the description should say "range of a signed word" instead.
- Labels:
- Intel® Advanced Vector Extensions (Intel® AVX)
- Intel® Streaming SIMD Extensions
- Parallel Computing
Thx. There was more going on in the 2nd example that needs fixing. I'll pass these along to our documentation person. Thanks for reporting the issues.
Thanks Mark! I really appreciate your reply.
Do you mind sharing what else is wrong with the second example? That might help my current work.
The 2nd sentence of that paragraph talks about doublewords and words when it should be talking about words and bytes. The Operation section makes it a little clearer.
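As a rough illustration of the words-to-bytes reading, here is a minimal Python sketch of signed saturation from a 16-bit word to an 8-bit byte. The function names are hypothetical and this is not the manual's Operation pseudocode; it only shows that out-of-range words clamp to 7FH or 80H.

```python
def saturate_word_to_signed_byte(word):
    """Clamp a signed 16-bit value to the signed byte range [-128, 127].

    Returns the result as an 8-bit pattern (0x00..0xFF).
    """
    clamped = max(-128, min(127, word))
    return clamped & 0xFF

def pack_signed_words(words):
    """Apply signed saturation to each word in a list (hypothetical helper)."""
    return [saturate_word_to_signed_byte(w) for w in words]

# In-range values pass through; out-of-range values saturate.
packed = pack_signed_words([5, -1, 300, -300])
print([f"{b:02X}" for b in packed])  # ['05', 'FF', '7F', '80']
```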
Indeed. Thanks again.