- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Hi,
I wonder what is the behavior of the KUNPCKBW/KUNPCKWD/KUNPCKDQ instructions. In SDM, the description of the instructions imply that they interleave individual bits of the input registers. This is especially so for KUNPCKWD, for which the description wording is missing the work "masks".
At the same time, the pseudo-code of the operations indicate that the instructions move SRC1 bits above SRC2 without interleaving. Intel Intrinsics Guide also contains this pseudocode.
Based on my previous experience with unpack instructions in SSE and AVX, I would expect the KUNPCK* instructions to interleave individual bits, but in this case the pseudocode is incorrect. Is this the case? If not, it would be better to update the instructions description to make it clear that they do not interleave individual bits.
- Tags:
- Intel® Advanced Vector Extensions (Intel® AVX)
- Intel® Streaming SIMD Extensions
- Parallel Computing
Link Copied
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
I asked our doc person to improve the box descriptions. The pseudocode is accurate.
- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page