sc00bz
Beginner

Add Instruction PROTD

Bit rotates are very common in cryptography. They matter most when you need to encrypt or decrypt data quickly and have chosen a mode that lets blocks be processed out of order, so that several blocks can be handled at once in SIMD registers. That is the case for RC5, RC6, Serpent, and others when used in ECB, CBC (decryption only), CFB (decryption only), or CTR/ICM/SIC mode. Rotates are also used in MD4, MD5, SHA-1, and the SHA-2 family.

Adding PROTD would save one register and two instructions compared with AVX (three compared with SSE2):

SSE2:
MOVDQA   XMM1,XMM0
PSLLD    XMM0,1
PSRLD    XMM1,32-1
POR      XMM0,XMM1

SSE5:
PROTD    XMM0,XMM0,1

AVX:
VPSLLD   XMM1,XMM0,1
VPSRLD   XMM0,XMM0,32-1
VPOR     XMM0,XMM0,XMM1
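
For anyone who wants to try the shift/shift/OR emulation from C, here is a minimal sketch using SSE2 intrinsics. The helper name rotl4_u32 is made up for illustration, the rotate count is assumed to satisfy 0 < n < 32, and the scalar fallback is only there so the code still builds where SSE2 is not available.

```c
#include <stdint.h>

#if defined(__SSE2__)
#include <emmintrin.h>   /* SSE2 intrinsics */
#endif

/* Hypothetical helper: rotate each of four 32-bit lanes left by n bits.
   Assumes 0 < n < 32. */
static void rotl4_u32(uint32_t v[4], int n)
{
#if defined(__SSE2__)
    __m128i x     = _mm_loadu_si128((const __m128i *)v);
    __m128i left  = _mm_sll_epi32(x, _mm_cvtsi32_si128(n));       /* PSLLD */
    __m128i right = _mm_srl_epi32(x, _mm_cvtsi32_si128(32 - n));  /* PSRLD */
    _mm_storeu_si128((__m128i *)v, _mm_or_si128(left, right));    /* POR   */
#else
    /* Scalar fallback: same shift/shift/OR pattern, one lane at a time. */
    for (int i = 0; i < 4; i++)
        v[i] = (v[i] << n) | (v[i] >> (32 - n));
#endif
}
```

The second shift needs its own temporary because the original value is consumed by both shifts, which is the extra register a single rotate instruction would free up.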
Mark_B_Intel1
Employee

I agree, but can you post the rest of your loop so we can work out whether the overall loop gets faster, and by how much?

Regards,

Mark Buxton

ILevi1
Valued Contributor I

It would also be interesting to have an instruction to extract Carry from each of the DWORDs.
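
As a rough illustration of what that costs today, here is a sketch that assumes "Carry" means the carry-out of a packed 32-bit add. SSE2 has no unsigned compare, so the sketch flips the sign bits and uses the signed compare instead. The name add4_carry_u32 is hypothetical, and the scalar path is only a fallback for builds without SSE2.

```c
#include <stdint.h>

#if defined(__SSE2__)
#include <emmintrin.h>   /* SSE2 intrinsics */
#endif

/* Hypothetical helper: sum[i] = a[i] + b[i] (mod 2^32),
   carry[i] = 1 if that add overflowed 32 bits, else 0. */
static void add4_carry_u32(const uint32_t a[4], const uint32_t b[4],
                           uint32_t sum[4], uint32_t carry[4])
{
#if defined(__SSE2__)
    __m128i va   = _mm_loadu_si128((const __m128i *)a);
    __m128i vb   = _mm_loadu_si128((const __m128i *)b);
    __m128i vs   = _mm_add_epi32(va, vb);                 /* PADDD */
    /* A carry occurred exactly when sum < a as unsigned values.
       XOR with 0x80000000 turns the unsigned compare into a signed one. */
    __m128i bias = _mm_set1_epi32(INT32_MIN);
    __m128i lt   = _mm_cmpgt_epi32(_mm_xor_si128(va, bias),
                                   _mm_xor_si128(vs, bias)); /* PCMPGTD */
    _mm_storeu_si128((__m128i *)sum, vs);
    _mm_storeu_si128((__m128i *)carry, _mm_srli_epi32(lt, 31)); /* mask -> 0/1 */
#else
    for (int i = 0; i < 4; i++) {
        sum[i]   = a[i] + b[i];
        carry[i] = sum[i] < a[i];
    }
#endif
}
```

That is two XORs, a compare, and a shift on top of the add, which is the overhead a dedicated carry-extract instruction would remove.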
