Intel® ISA Extensions

How to assign a constant?

CLi37
Beginner
4,496 Views

I do not think x86 has constant registers. When I define a const array, the constants are loaded from memory rather than encoded directly in the instruction. Is there any instruction that can assign a 128-bit/256-bit constant to an SSE/AVX register?

0 Kudos
46 Replies
Thomas_W_Intel
Employee
338 Views

Chang-li,

I understand what you are trying to achieve, but I fear that you would not gain much. Assuming there were a load instruction for YMM registers with an immediate, the encoding would be longer than 32 bytes. This would result in some major hiccups in the core. For example, the loop-stream detector processes the instructions in 32-byte chunks. Therefore, your instruction wouldn't even fit in one chunk!

On the other hand, you have two load ports and can do up to two loads per cycle. Reading a constant from memory can be pipelined nicely with other loads as there are no dependencies. When you are absolutely limited by the number of loads, keeping at least some of the constants in a register might help as a last resort.
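To illustrate the "keep the constant in a register" idea, here is a minimal sketch in C with SSE2 intrinsics. The names `kBias` and `add_bias` are made up for this example, and the sketch assumes the element count is a multiple of 4. The constant is read from memory once before the loop; whether it really stays in an XMM register is up to the compiler and register pressure.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 128-bit constant; it lives in read-only memory. */
static const int32_t kBias[4] = {1, 2, 3, 4};

/* Adds kBias lane-wise to every group of four ints in data.
   Assumption: n is a multiple of 4 (keeps the sketch short). */
void add_bias(int32_t *data, size_t n)
{
    /* Single load before the loop; the compiler can keep the value
       in an XMM register for all iterations. */
    const __m128i bias = _mm_loadu_si128((const __m128i *)kBias);

    for (size_t i = 0; i < n; i += 4) {
        __m128i v = _mm_loadu_si128((const __m128i *)(data + i));
        v = _mm_add_epi32(v, bias);
        _mm_storeu_si128((__m128i *)(data + i), v);
    }
}
```

Inside the loop only the data itself is loaded, so the load ports are not spent on re-reading the constant.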

Kind regards

Thomas

0 Kudos
Bernard
Valued Contributor I
338 Views

>>>On the other hand you have two load ports and can do up to two loads per cycles.>>>

Will it stay the same on the Haswell architecture? I mean the load/store ports.

0 Kudos
CLi37
Beginner
338 Views

Thomas Willhalm (Intel) wrote:

Chang-li,

I understand what you are trying to achieve, but I fear that you would not gain much. Assuming there were a load instruction for YMM registers with an immediate, the encoding would be longer than 32 bytes. This would result in some major hiccups in the core. For example, the loop-stream detector processes the instructions in 32-byte chunks. Therefore, your instruction wouldn't even fit in one chunk!

On the other hand, you have two load ports and can do up to two loads per cycle. Reading a constant from memory can be pipelined nicely with other loads as there are no dependencies. When you are absolutely limited by the number of loads, keeping at least some of the constants in a register might help as a last resort.

Kind regards

Thomas

That is true for YMM*, which is 256 bits (32 bytes). But XMM* is 128 bits (16 bytes), so an instruction with a direct constant could fit in one chunk.

Chang

0 Kudos
SergeyKostrov
Valued Contributor II
338 Views
>>...But XMM* is 128-bit (16 bytes) that a direct constant instruction can be in one chunk...

What about the throughput of instructions? For example, a general-purpose MOV instruction has a throughput of 3 instructions per clock cycle. Take a look at the Intel Optimization Reference Manual for more information.
0 Kudos
CLi37
Beginner
338 Views

Sergey Kostrov wrote:

>>...But XMM* is 128-bit (16 bytes) that a direct constant instruction can be in one chunk...

What about the throughput of instructions? For example, a general-purpose MOV instruction has a throughput of 3 instructions per clock cycle. Take a look at the Intel Optimization Reference Manual for more information.

There is no instruction yet that assigns a direct constant to an XMM* register.
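Even without such an instruction, a few common constants can be produced in an XMM register with no memory load at all, by combining register-only operations. The following is only a sketch in C with SSE2 intrinsics; the helper names are invented for illustration, and the instructions mentioned in the comments are what compilers typically emit, not a guarantee.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>

/* All bits zero: typically a PXOR of a register with itself. */
static inline __m128i const_zero(void)
{
    return _mm_setzero_si128();
}

/* All bits one: comparing a value with itself for equality sets
   every lane to 0xFFFFFFFF (typically PCMPEQD xmm, xmm). */
static inline __m128i const_all_ones(void)
{
    __m128i z = _mm_setzero_si128();
    return _mm_cmpeq_epi32(z, z);
}

/* 1 in every 32-bit lane: logical right shift of all-ones by 31. */
static inline __m128i const_one_epi32(void)
{
    return _mm_srli_epi32(const_all_ones(), 31);
}

/* 0x80000000 in every 32-bit lane (the float sign-bit mask):
   left shift of all-ones by 31. */
static inline __m128i const_sign_mask_epi32(void)
{
    return _mm_slli_epi32(const_all_ones(), 31);
}

/* Helper for inspection: returns the lowest 32-bit lane. */
static inline uint32_t lane0(__m128i v)
{
    return (uint32_t)_mm_cvtsi128_si32(v);
}
```

This covers only constants with a regular bit pattern. An arbitrary 128-bit value still has to come from memory or be assembled from general-purpose registers.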

0 Kudos