Intel® ISA Extensions

How to assign a constant?

CLi37
Beginner

I do not think there are constant registers in x86. When I define a const array, the x86 code reads the constants from memory rather than encoding them directly in the instruction. Is there any instruction that can assign a 128-bit/256-bit constant to an SSE/AVX register?

46 Replies
Thomas_W_Intel
Employee

Chang-li,

I understand what you are trying to achieve, but I fear that you would not gain much. Assuming there were a load instruction for YMM registers with an immediate, the encoding would be longer than 32 bytes. This would result in some major hiccups in the core. For example, the loop-stream detector processes instructions in 32-byte chunks. Therefore, your instruction wouldn't even fit in one chunk!

On the other hand, you have two load ports and can do up to two loads per cycle. Reading a constant from memory can be pipelined nicely with other loads, as there are no dependencies. When you are absolutely limited by the number of loads, keeping at least some of the constants in a register might help as a last resort.

Kind regards

Thomas
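
Editor's note: the advice above (read the constant from memory, or keep it in a register when loads are the bottleneck) can be sketched in C with SSE2 intrinsics. This is only an illustration under stated assumptions: the array `kBias` and the function `add_bias` are hypothetical names that do not appear in the thread, and the compiler remains free to reload the constant or keep it in a register as it sees fit.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 128-bit constant. It lives in memory (read-only data),
   not in the instruction stream. 16-byte aligned for _mm_load_si128. */
static _Alignas(16) const int32_t kBias[4] = {1, 2, 3, 4};

/* Adds kBias to every group of four 32-bit integers in data.
   Assumption: n is a multiple of 4. */
void add_bias(int32_t *data, size_t n)
{
    /* Load the constant once, before the loop. With enough free
       registers the compiler can keep it in an XMM register, so the
       loop body needs only one load per iteration (the data itself). */
    const __m128i bias = _mm_load_si128((const __m128i *)kBias);

    for (size_t i = 0; i < n; i += 4) {
        __m128i v = _mm_loadu_si128((const __m128i *)(data + i));
        v = _mm_add_epi32(v, bias);
        _mm_storeu_si128((__m128i *)(data + i), v);
    }
}
```

If the loop runs out of registers, the compiler may use the constant as a memory operand instead (for example `paddd xmm0, [kBias]`), which is the pipelined load described above.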

Bernard
Valued Contributor I

>>>On the other hand you have two load ports and can do up to two loads per cycle.>>>

Will this stay the same on the Haswell architecture? I mean the load/store ports.

CLi37
Beginner

Thomas Willhalm (Intel) wrote:

Chang-li,

I understand what you are trying to achieve, but I fear that you would not gain much. Assuming there were a load instruction for YMM registers with an immediate, the encoding would be longer than 32 bytes. This would result in some major hiccups in the core. For example, the loop-stream detector processes instructions in 32-byte chunks. Therefore, your instruction wouldn't even fit in one chunk!

On the other hand, you have two load ports and can do up to two loads per cycle. Reading a constant from memory can be pipelined nicely with other loads, as there are no dependencies. When you are absolutely limited by the number of loads, keeping at least some of the constants in a register might help as a last resort.

Kind regards

Thomas

That is true for YMM*, which is 256 bits (32 bytes). But XMM* is 128 bits (16 bytes), so an instruction carrying a direct constant could fit in one chunk.

Chang
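
Editor's note: the size arithmetic in this exchange can be written out as a small C sketch. The 32-byte chunk size is the figure quoted earlier in the thread, and 15 bytes is the architectural limit on the length of a single x86 instruction. The 3 bytes assumed for prefix, opcode, and register selection are a hypothetical figure chosen only for illustration, as are all names below.

```c
#include <stdbool.h>
#include <stddef.h>

enum {
    X86_MAX_INSTRUCTION_BYTES = 15, /* architectural x86 length limit */
    CHUNK_BYTES               = 32, /* chunk size quoted in the thread */
    ASSUMED_OPCODE_BYTES      = 3   /* hypothetical: prefix + opcode + register byte */
};

/* Length of a hypothetical "load full-width immediate" instruction
   for a vector register of the given width in bits. */
size_t hypothetical_length(size_t register_bits)
{
    return ASSUMED_OPCODE_BYTES + register_bits / 8;
}

/* Would the instruction fit in one 32-byte chunk? */
bool fits_in_chunk(size_t length)
{
    return length <= CHUNK_BYTES;
}

/* Would the instruction respect the 15-byte x86 length limit? */
bool is_encodable(size_t length)
{
    return length <= X86_MAX_INSTRUCTION_BYTES;
}
```

Under these assumptions a 128-bit immediate gives a 19-byte instruction: it would fit in one 32-byte chunk, but the 16-byte immediate alone already exceeds the 15-byte limit. A 256-bit immediate gives 35 bytes and fits in neither.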

SergeyKostrov
Valued Contributor II

>>...But XMM* is 128-bit (16 bytes) that a direct constant instruction can be in one chunk...

What about the throughput of instructions? For example, a general-purpose MOV instruction has a throughput of 3 instructions per clock cycle. Take a look at the Intel Optimization Reference Manual for more information.

CLi37
Beginner

Sergey Kostrov wrote:

>>...But XMM* is 128-bit (16 bytes) that a direct constant instruction can be in one chunk...

What about the throughput of instructions? For example, a general-purpose MOV instruction has a throughput of 3 instructions per clock cycle. Take a look at the Intel Optimization Reference Manual for more information.

There is no instruction yet that assigns a direct constant to an XMM* register.
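
Editor's note: although no such instruction exists, some constants can be produced in an XMM register without a memory load. The C sketch below (SSE2 intrinsics) shows a few well-known idioms. These are not proposed anywhere in the thread, all function names are hypothetical, and the mnemonics in the comments are what compilers typically emit; an optimizing compiler may still choose a memory load instead.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>

/* All bits zero: typically PXOR xmm, xmm. */
__m128i const_zero(void)
{
    return _mm_setzero_si128();
}

/* All bits one: comparing a register with itself is always true
   (typically PCMPEQD xmm, xmm). */
__m128i const_all_ones(void)
{
    __m128i z = _mm_setzero_si128();
    return _mm_cmpeq_epi32(z, z);
}

/* 0x7FFFFFFF in every 32-bit lane: all-ones shifted right by one
   bit (typically PCMPEQD followed by PSRLD). */
__m128i const_abs_mask(void)
{
    return _mm_srli_epi32(const_all_ones(), 1);
}

/* A 32-bit value broadcast to all four lanes. For a compile-time
   constant the value starts as an immediate of a general-purpose
   MOV, then goes through MOVD and PSHUFD. */
__m128i const_broadcast32(int32_t value)
{
    return _mm_shuffle_epi32(_mm_cvtsi32_si128(value), 0);
}

/* Helper for inspection: returns 32-bit lane i (0..3) of v. */
int32_t lane(__m128i v, int i)
{
    int32_t out[4];
    _mm_storeu_si128((__m128i *)out, v);
    return out[i];
}
```

Each idiom trades a load for one or more ALU instructions, so whether it helps depends on which execution ports are the bottleneck in the surrounding code.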
