Intel® ISA Extensions

Reverse of the PMOVMSKB instruction?

dattrax
Beginner
1,242 Views
I've been searching for days for the reverse of the PMOVMSKB instruction.

I want to collapse a 64-bit result down to 8 bits and then restore it again.

For example:

0xFF00FF00FF00FF00 -> (PMOVMSKB) -> 10101010b -> (reverse) -> 0xFF00FF00FF00FF00
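To make the forward direction concrete, here is a minimal scalar C sketch of what PMOVMSKB computes on a 64-bit MMX value: bit i of the result is the top bit of byte i. The function name `movemask8` is made up for illustration; it is a model of the instruction, not an intrinsic.

```c
#include <stdint.h>

/* Scalar model of "pmovmskb r32, mm": collect the most significant bit
   of each of the 8 source bytes into the low 8 bits of the result.
   movemask8 is a hypothetical name used only for this sketch. */
static unsigned movemask8(uint64_t v)
{
    unsigned mask = 0;
    for (int i = 0; i < 8; ++i) {
        /* bit (8*i + 7) is the sign bit of byte i */
        mask |= (unsigned)((v >> (8 * i + 7)) & 1u) << i;
    }
    return mask;
}
```

With the value from the example, `movemask8(0xFF00FF00FF00FF00ULL)` yields `0xAA`, i.e. 10101010b.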

Can anyone help? If I had hair I'd be pulling it out :-)
4 Replies
capens__nicolas
New Contributor I
Just use a lookup table (LUT). It only needs 256 entries. So the reverse of "pmovmskb eax, mm0" becomes a simple "movq mm0, [LUT+eax*8]".
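As a rough illustration of the table approach, the following C sketch builds the 256-entry table and performs the lookup. The names `lut`, `lut_init` and `lut_expand` are assumptions made for this example, and the plain bit loop used to fill the table is just one straightforward way to do it.

```c
#include <stdint.h>

/* 256 entries x 8 bytes = 2 kB. Names are illustrative only. */
static uint64_t lut[256];

/* Entry m holds 0xFF in byte i wherever bit i of m is set, 0x00 elsewhere. */
static void lut_init(void)
{
    for (unsigned m = 0; m < 256; ++m) {
        uint64_t v = 0;
        for (int i = 0; i < 8; ++i) {
            if (m & (1u << i))
                v |= 0xFFULL << (8 * i);
        }
        lut[m] = v;
    }
}

/* Scalar counterpart of "movq mm0, [LUT+eax*8]". */
static uint64_t lut_expand(uint8_t mask)
{
    return lut[mask];
}
```

After calling `lut_init()` once, `lut_expand(0xAA)` gives `0xFF00FF00FF00FF00`.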

Although a 2 kB lookup table isn't much, here's an alternative that requires no table in case you have very poor L1 cache hit ratios:

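[plain]movd mm0, eax                        ; 8-bit mask in the low byte of mm0
punpcklbw mm0, mm0                   ; duplicate it into the low word
pshufw mm0, mm0, 0x00                ; broadcast that word to all 8 bytes
pand mm0, [mask8040201008040201h]    ; keep only bit i in byte i
pcmpeqb mm0, [mask8040201008040201h] ; 0xFF where the bit was set, else 0x00[/plain]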
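For readers who want to check the logic without an assembler, here is a scalar C model of the same broadcast / AND / compare steps. It is only a sketch: `expand_mask8` is a hypothetical name, and the multiply by 0x0101010101010101 stands in for the movd + punpcklbw + pshufw broadcast.

```c
#include <stdint.h>

/* Scalar model of the table-free sequence; expand_mask8 is an
   illustrative name, not part of any library. */
static uint64_t expand_mask8(uint8_t mask)
{
    const uint64_t bit_select = 0x8040201008040201ULL;

    /* movd + punpcklbw + pshufw: copy the mask byte into all 8 bytes */
    uint64_t v = (uint64_t)mask * 0x0101010101010101ULL;

    /* pand: byte i keeps only bit i of the mask */
    v &= bit_select;

    /* pcmpeqb: byte i becomes 0xFF if it equals the selector byte, else 0x00 */
    uint64_t out = 0;
    for (int i = 0; i < 8; ++i) {
        uint64_t sel = (bit_select >> (8 * i)) & 0xFF;
        if (((v >> (8 * i)) & 0xFF) == sel)
            out |= 0xFFULL << (8 * i);
    }
    return out;
}
```

For the mask 10101010b, `expand_mask8(0xAA)` returns `0xFF00FF00FF00FF00`, matching the original example.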
I hope this helps!

Nicolas
dattrax
Beginner
Thanks for this. I had hoped for a single instruction to do it, but this combination is fine.

Thanks again, Jim
capens__nicolas
New Contributor I
You're welcome. Note that the LUT method really is a single-instruction solution. You can actually use the second method to fill in the table.
levicki
Valued Contributor I
In my opinion the code should restore just the sign bits to the MM register, unless of course you wanted to use the 0xFF and 0x00 bytes as masks for AND or OR instructions later. But if that was the case, it would be much easier just to shift right arithmetically by 7 bits (thus propagating the sign bit) instead of using PMOVMSKB in the first place. Or is this some sort of "compression"?
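A small scalar C sketch of that distinction, under the assumption that "sign bits only" means 0x80 in each selected byte: the first function restores only the sign bits from the 8-bit mask, the second spreads existing sign bits into full 0xFF/0x00 bytes. Both names are hypothetical. Note that MMX provides packed arithmetic right shifts only for words and doublewords (PSRAW, PSRAD), so the model spreads the bit with a shift and multiply rather than a per-byte arithmetic shift.

```c
#include <stdint.h>

/* Restore only the sign bit (0x80) of each byte selected by the mask.
   Function names are illustrative assumptions. */
static uint64_t expand_mask8_signbits(uint8_t mask)
{
    /* broadcast the mask and isolate bit i in byte i */
    uint64_t v = ((uint64_t)mask * 0x0101010101010101ULL)
                 & 0x8040201008040201ULL;
    /* any non-zero byte (at most 0x80) plus 0x7F sets that byte's top bit,
       and no byte can carry into its neighbour */
    return (v + 0x7F7F7F7F7F7F7F7FULL) & 0x8080808080808080ULL;
}

/* Propagate each byte's sign bit to the whole byte (0x80 -> 0xFF). */
static uint64_t spread_signbits(uint64_t v)
{
    uint64_t ones = (v >> 7) & 0x0101010101010101ULL; /* 1 per negative byte */
    return ones * 0xFF;                               /* 0x01 -> 0xFF, no carries */
}
```

Here `expand_mask8_signbits(0xAA)` gives `0x8000800080008000`, and passing that to `spread_signbits` gives `0xFF00FF00FF00FF00`.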
