Intel® ISA Extensions

Add Instruction PROTD

sc00bz
Beginner

Bit rotates are very common in cryptography. They matter most when you need to encrypt or decrypt data fast and have chosen a mode that lets blocks be processed out of order, so that several blocks can be handled at once in SIMD registers. This is the case for RC5, RC6, Serpent, and others when used with ECB, CBC (decryption only), CFB (decryption only), or CTR/ICM/SIC. Rotates are also used in MD4, MD5, SHA-1, and the SHA-2 family.

Adding PROTD to AVX would save one register and two instructions compared with the AVX sequence (three instructions compared with SSE2):

SSE2:
MOVDQA   XMM1,XMM0
PSLLD    XMM0,1
PSRLD    XMM1,32-1
POR      XMM0,XMM1

SSE5:
PROTD    XMM0,XMM0,1

AVX:
VPSLLD   XMM1,XMM0,1
VPSRLD   XMM0,XMM0,32-1
VPOR     XMM0,XMM0,XMM1
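
For anyone who wants to check the equivalence, here is a minimal portable C sketch of what the shift/shift/OR sequences above compute in each of the four DWORD lanes. The `vec4x32` type and the `rotl_lanes` name are made up for illustration and only model an XMM register; this is not an intrinsic or any existing API.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for one XMM register holding four 32-bit lanes. */
typedef struct { uint32_t lane[4]; } vec4x32;

/* Rotate every lane left by n bits with the shift/shift/OR pattern.
   n must be in 1..31 so that neither shift count reaches 32. */
static vec4x32 rotl_lanes(vec4x32 v, unsigned n)
{
    assert(n >= 1 && n <= 31);
    vec4x32 r;
    for (int i = 0; i < 4; i++) {
        uint32_t left    = v.lane[i] << n;         /* PSLLD: bits moved up   */
        uint32_t wrapped = v.lane[i] >> (32 - n);  /* PSRLD: bits that wrap  */
        r.lane[i] = left | wrapped;                /* POR: combine both parts */
    }
    return r;
}
```

With n = 1, a lane holding 0x80000001 becomes 0x00000003, which is the result a single per-lane rotate instruction would give directly.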
Mark_B_Intel1
Employee

I agree, but can you post the rest of your loop so we can figure out whether the overall loop gets faster, and by how much?

Regards,

Mark Buxton

ILevi1
Valued Contributor I

It would also be interesting to have an instruction that extracts the carry from each of the DWORDs.
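
To make the request concrete, here is a rough C sketch of one possible reading, assuming "carry" means the carry-out of a packed 32-bit add (PADDD), which the post does not spell out. The `dwords4` type and the `add_with_carry_mask` name are hypothetical and only model four DWORD lanes. Without a dedicated instruction, the carry has to be recovered with an unsigned compare of the sum against one input; SSE2 only has the signed PCMPGTD, so in practice that takes extra work.

```c
#include <stdint.h>

/* Hypothetical model of four packed DWORD lanes. */
typedef struct { uint32_t lane[4]; } dwords4;

/* Add lane-wise (wrapping, like PADDD), store the sums in *sum, and
   return a per-lane mask: all ones where the add carried out, else 0. */
static dwords4 add_with_carry_mask(dwords4 a, dwords4 b, dwords4 *sum)
{
    dwords4 carry;
    for (int i = 0; i < 4; i++) {
        uint32_t s = a.lane[i] + b.lane[i];   /* wraps modulo 2^32 */
        sum->lane[i] = s;
        /* An unsigned sum smaller than an input means a carry-out. */
        carry.lane[i] = (s < a.lane[i]) ? 0xFFFFFFFFu : 0u;
    }
    return carry;
}
```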