Intel® ISA Extensions

question on avx instruction encoding

tthsqe
Beginner
1,207 Views
So any instruction that can be encoded with a two-byte VEX prefix can also be encoded with a three-byte prefix. How about the other way around?
Is it true that the two-byte prefix may be used if and only if
vex.W = 0
vex.X = 1
vex.B = 1
vex.mmmmm = 00001
?
I.e., VFMADD132PS may use the two-byte form but VFMADD132PD needs the three-byte form?
0 Kudos
9 Replies
MarkC_Intel
Moderator
1,207 Views
Hi,
The 3-byte VEX sequence (starting with C4) must be used when one needs to set VEX.W=1, VEX.X=0, or VEX.B=0, or when the opcode is in map 0F3A or 0F38. The X and B bits are logically inverted relative to their meaning in the REX prefix. The 2-byte VEX sequence (starting with C5) can be used when the opcode is in the 0F map and none of these other bit settings are required.

For your 2nd question, all the VFMADD* instructions are in map 0F38, so they all must use the 3-byte (C4) VEX sequence.
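
The rule above can be written as a small predicate. This is only a sketch of the conditions listed in this post; the function name and the map constants are made up for illustration, and the X and B arguments are the bits as stored in the prefix (inverted, so 1 means "no extension needed").

```python
# Opcode map selectors as they appear in VEX.mmmmm (names are illustrative).
MAP_0F, MAP_0F38, MAP_0F3A = 0b00001, 0b00010, 0b00011

def can_use_2byte_vex(w, x, b, mmmmm):
    """Return True if the 2-byte (C5) form is sufficient.

    w, x, b are the VEX.W, VEX.X, VEX.B bits as stored in the prefix.
    """
    return w == 0 and x == 1 and b == 1 and mmmmm == MAP_0F

print(can_use_2byte_vex(0, 1, 1, MAP_0F))    # VADDPS, low registers only
print(can_use_2byte_vex(0, 1, 0, MAP_0F))    # VADDPS with ymm8..15 in MODRM.RM
print(can_use_2byte_vex(0, 1, 1, MAP_0F38))  # any VFMADD* (map 0F38)
```

Only the first call returns True; the VFMADD* case fails on the map alone, regardless of W.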

An example of something that could use C4 or C5 is VADDPS, which is in map 0F. Which prefix sequence is required depends on the registers used. Here are two examples using XED from the Intel SDE kit. In the first example, the 3rd operand is YMM13 and is encoded in the MODRM.RM field, so VEX.B'=0 is needed to encode the upper bit of its register identifier (REX.B would be 1, so VEX.B'=0), and that forces the 3-byte C4 sequence. In the second example, the 3rd operand is YMM3, so we don't need to specify VEX.B'=0 and the shorter 2-byte C5 sequence can be used.

% kit/xed -64 -e vaddps ymm0 ymm1 ymm13
Request: VADDPS MODE:2, REG0:YMM0, REG1:YMM1, REG2:YMM13, SMODE:2
OPERAND ORDER: REG0 REG1 REG2
Encodable! C4C17458C5
.byte 0xc4,0xc1,0x74,0x58,0xc5

% kit/xed -64 -e vaddps ymm0 ymm1 ymm3
Request: VADDPS MODE:2, REG0:YMM0, REG1:YMM1, REG2:YMM3, SMODE:2
OPERAND ORDER: REG0 REG1 REG2
Encodable! C5F458C3
.byte 0xc5,0xf4,0x58,0xc3
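
For anyone who wants to check the two byte strings by hand, here is a minimal sketch that assembles them. It is not XED's algorithm, just the field layout applied directly; the function name is hypothetical and it only handles the register-to-register YMM form of VADDPS (opcode 58 in map 0F).

```python
def encode_vaddps_ymm(dst, src1, src2):
    """Encode `vaddps ymm<dst>, ymm<src1>, ymm<src2>` (register form only)."""
    r = 1 - (dst >> 3)       # VEX.R: inverted high bit of MODRM.REG
    b = 1 - (src2 >> 3)      # VEX.B: inverted high bit of MODRM.RM
    x = 1                    # VEX.X: no index register, so stored as 1
    vvvv = (~src1) & 0xF     # first source, stored inverted (ones' complement)
    l, pp, w = 1, 0b00, 0    # 256-bit vector, no SIMD prefix, W = 0
    modrm = 0xC0 | ((dst & 7) << 3) | (src2 & 7)
    if b == 1:
        # W=0, X=1, B=1, map 0F: the 2-byte C5 form is enough.
        prefix = [0xC5, (r << 7) | (vvvv << 3) | (l << 2) | pp]
    else:
        # B=0 can only be expressed in the 3-byte C4 form.
        prefix = [0xC4,
                  (r << 7) | (x << 6) | (b << 5) | 0b00001,
                  (w << 7) | (vvvv << 3) | (l << 2) | pp]
    return bytes(prefix + [0x58, modrm])

print(encode_vaddps_ymm(0, 1, 13).hex())  # c4c17458c5
print(encode_vaddps_ymm(0, 1, 3).hex())   # c5f458c3
```

Both results match the XED output above.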

0 Kudos
tthsqe
Beginner
1,207 Views

Oh, sorry. I overlooked that fmadd isn't in the 0F opcode map. I might try the SDE. Thanks.
0 Kudos
minipli41
Beginner
1,207 Views

% kit/xed -64 -e vaddps ymm0 ymm1 ymm13
Request: VADDPS MODE:2, REG0:YMM0, REG1:YMM1, REG2:YMM13, SMODE:2
OPERAND ORDER: REG0 REG1 REG2
Encodable! C4C17458C5
.byte 0xc4,0xc1,0x74,0x58,0xc5

I'm a little confused. Why is the second byte 0xc1? This would imply VEX.X = 1 but shouldn't it be 0 because REX.X for YMM13 would be 1?

And even more confusing:

% printf '\xc4\xc1\x74\x58\xc5' > vaddps.bin
% printf '\xc4\x81\x74\x58\xc5' >> vaddps.bin
% ./xed -64 -ir vaddps.bin
In raw...
XDIS 0: AVX       AVX   C4C17458C5               vaddps ymm0, ymm1, ymm13
XDIS 5: AVX       AVX   C4817458C5               vaddps ymm0, ymm1, ymm13

...

How can those two byte sequences decode to the same opcode? Shouldn't the former have REG2:YMM2 and only the latter REG2:YMM15?

Regards,

Mathias

0 Kudos
MarkC_Intel
Moderator
1,207 Views
C4 C1 74 58 C5 and C4 81 74 58 C5 differ in the VEX.X bit. The VEX.X bit is not used in the encoding of these forms of these instructions. VEX.X is typically used to extend the index-register operand if there is one.
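
To see this, one can split the byte that follows C4 into its fields. The sketch below uses a hypothetical helper and returns the bits as stored, before any inversion is undone.

```python
def decode_vex3_byte1(byte1):
    """Split the byte following C4 into its stored fields."""
    return {
        "R": (byte1 >> 7) & 1,
        "X": (byte1 >> 6) & 1,
        "B": (byte1 >> 5) & 1,
        "mmmmm": byte1 & 0x1F,
    }

first = decode_vex3_byte1(0xC1)
second = decode_vex3_byte1(0x81)
print(first)
print(second)
# Fields whose stored value differs between the two encodings.
print([name for name in first if first[name] != second[name]])  # ['X']
```

B is 0 in both encodings, which is what selects YMM13 through MODRM.RM; only the unused X bit changes.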
0 Kudos
MarkC_Intel
Moderator
1,207 Views
Oh yeah, and in answer to the first part of your original question: VEX.X is stored inverted. (As are VEX.R and VEX.B). The reason for that has to do with how we re-used the LDS/LES instructions in 32b mode.
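
As a rough illustration of the LDS/LES point, and this is my reading rather than something spelled out in this thread: C4 and C5 are the LES and LDS opcodes, which only accept a memory operand, so a following byte whose top two bits are 11 (a register-form MODRM) was never a valid LES/LDS. In 32-bit mode only registers 0 to 7 exist, so the inverted bits in those two positions are always stored as 1. The helper name below is hypothetical.

```python
def looks_like_vex_in_32bit_mode(byte_after_c4_or_c5):
    """True if the top two bits are 11, i.e. not a legal LES/LDS MODRM."""
    return (byte_after_c4_or_c5 >> 6) == 0b11

# Second bytes from the two VADDPS encodings in this thread.
print(looks_like_vex_in_32bit_mode(0xC1))  # after C4: R=1, X=1
print(looks_like_vex_in_32bit_mode(0xF4))  # after C5: R=1, vvvv high bit=1
# A memory-operand MODRM byte (mod = 00) would still mean LES/LDS.
print(looks_like_vex_in_32bit_mode(0x06))
```

The first two calls return True and the last returns False. Storing the bits non-inverted would have put 00 in those positions for the low registers and collided with ordinary LES/LDS encodings.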
0 Kudos
minipli41
Beginner
1,207 Views
Thanks. I think I got it. Since the vvvv field of the VEX prefix can encode a full YMM register without any further extension bits, the VEX.{R,B,X} bits are used for possible registers 2 to 4, right?
0 Kudos
MarkC_Intel
Moderator
1,207 Views
Hi. Not sure what you mean by "registers 2 to 4". Each instruction description now has a box on the instruction page that specifies where the operands are encoded. Different instructions take their operands from the available fields in slightly different orders.
Given that there are 16 xmm/ymm registers on 64b, we need to have 4 register specifier bits per register operand. You are correct that the VEX.VVVV field is self-sufficient, being 4b wide. In AVX, the other places that registers can be specified (MODRM.REG, MODRM.RM, SIB.BASE, SIB.INDEX) are 3b wide and thus all require another bit. The 4th register specifier bit comes from the VEX.{R,X,B} fields, inverted. In SSE, the 4th bit came from the REX prefix fields.
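
Here is a small sketch of how the 4-bit register numbers come back out of C4 C1 74 58 C5, assuming the field layout described above. The helper names are made up for illustration.

```python
def full_register(stored_vex_bit, field3):
    """Combine an inverted VEX.{R,X,B} bit with a 3-bit MODRM/SIB field."""
    return ((stored_vex_bit ^ 1) << 3) | field3

def vvvv_register(stored_vvvv):
    """VEX.vvvv is 4 bits wide and stored inverted."""
    return (~stored_vvvv) & 0xF

byte1, byte2, modrm = 0xC1, 0x74, 0xC5
vex_r = (byte1 >> 7) & 1
vex_b = (byte1 >> 5) & 1
print(full_register(vex_r, (modrm >> 3) & 7))  # MODRM.REG -> 0  (ymm0)
print(vvvv_register((byte2 >> 3) & 0xF))       # VEX.vvvv  -> 1  (ymm1)
print(full_register(vex_b, modrm & 7))         # MODRM.RM  -> 13 (ymm13)
```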
0 Kudos
minipli41
Beginner
1,207 Views
Hi. Not sure what you mean by "registers 2 to 4". Each instruction description now has a box on the instruction page that specifies where the operands are encoded. Different instructions take their operands from the available fields in slightly different orders.

Oh, somehow I used to ignore that second box in the manual. I only used to look at the first one, describing the different encodings for one instruction. Thanks for making me look a little closer. Now it's clear to me how to encode the different instructions. :)

Given that there are 16 xmm/ymm registers on 64b, we need to have 4 register specifier bits per register operand. You are correct that the VEX.VVVV field is self-sufficient, being 4b wide. In AVX, the other places that registers can be specified (MODRM.REG, MODRM.RM, SIB.BASE, SIB.INDEX) are 3b wide and thus all require another bit. The 4th register specifier bit comes from the VEX.{R,X,B} fields, inverted. In SSE, the 4th bit came from the REX prefix fields.

Yeah, that's what I meant by only needing 3 bits (VEX.{B,R,X}) to encode the missing bits for a maximum of four register arguments.

Thanks again! You helped me a lot!

0 Kudos
mariaosawa
Beginner
1,207 Views
Thank you for a very interesting article. Please continue writing. These facts are amazing. I was searching for at least 5 weeks and didn't get the perfect answer, but in the end I found it on your site. Thanks for posting such an interesting topic.
0 Kudos