Intel® ISA Extensions

Intrinsic guide 2.6 error in documentation

gilgil
Beginner

In the documentation for the intrinsic _mm_mulhrs_epi16, the right shift should be 15, not 14.

Patrick_K_Intel
Employee
14 bits is correct. See the Instruction Set Reference in the Software Developer's Manual: http://www.intel.com/content/www/us/en/processors/architectures-software-developer-manuals.html

PMULHRSW (with 128-bit operand)

temp0[31:0] = INT32 ((DEST[15:0] * SRC[15:0]) >>14) + 1;
temp1[31:0] = INT32 ((DEST[31:16] * SRC[31:16]) >>14) + 1;
temp2[31:0] = INT32 ((DEST[47:32] * SRC[47:32]) >>14) + 1;
temp3[31:0] = INT32 ((DEST[63:48] * SRC[63:48]) >>14) + 1;
temp4[31:0] = INT32 ((DEST[79:64] * SRC[79:64]) >>14) + 1;
temp5[31:0] = INT32 ((DEST[95:80] * SRC[95:80]) >>14) + 1;
temp6[31:0] = INT32 ((DEST[111:96] * SRC[111:96]) >>14) + 1;
temp7[31:0] = INT32 ((DEST[127:112] * SRC[127:112]) >>14) + 1;
DEST[15:0] = temp0[16:1];
DEST[31:16] = temp1[16:1];
DEST[47:32] = temp2[16:1];
DEST[63:48] = temp3[16:1];
DEST[79:64] = temp4[16:1];
DEST[95:80] = temp5[16:1];
DEST[111:96] = temp6[16:1];
DEST[127:112] = temp7[16:1];
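
For readers who want to check the pseudocode by hand, here is a minimal scalar sketch of a single 16-bit lane in C. The function name `mulhrs_lane` is hypothetical (it is not part of any Intel header), and the sketch assumes the compiler implements `>>` on negative signed values as an arithmetic shift, which is the usual behaviour on x86 compilers but is implementation-defined in C.

```c
#include <stdint.h>

/* Hypothetical scalar model of one 16-bit lane of PMULHRSW,
 * written step by step after the pseudocode above.
 * Assumes >> on negative signed integers is an arithmetic shift. */
static int16_t mulhrs_lane(int16_t a, int16_t b)
{
    /* Full 32-bit signed product of the two 16-bit lanes. */
    int32_t product = (int32_t)a * (int32_t)b;

    /* temp[31:0] = (product >> 14) + 1 */
    int32_t temp = (product >> 14) + 1;

    /* DEST = temp[16:1]: drop bit 0 and keep the next 16 bits.
     * The conversion wraps to 0x8000 only for a = b = -32768. */
    return (int16_t)(temp >> 1);
}
```

With `b = 1 << 14`, this model returns roughly half of `a`, for example `mulhrs_lane(32, 1 << 14)` gives 16, because selecting bits [16:1] discards one more bit after the shift by 14.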

gilgil
Beginner

I still do not understand...

I tried the following piece of code:
float factor = 1.f;
__m128i vFactor = _mm_set1_epi16(factor*(1<<14)); // Using fixed point..

__m128i inputVec = _mm_set_epi16(32,54,124,75,35,235,244,36);

__m128i resultVec = _mm_mulhrs_epi16(inputVec,vFactor);

By your explanation I should get resultVec = inputVec, but the result elements are actually half the original values.

sirrida
Beginner
If you carefully read the documentation you will notice an additional hidden shift by 1.
The temp*[16:1] can be read as (temp*[31:0]>>1)[15:0].

It might make sense to make the documentation more explicit about this.
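
To illustrate the point, the shift by 14, the +1, and the extra shift by 1 can be folded into a single rounded shift by 15, so the second operand behaves like a Q15 fixed-point factor (scaled by 1 << 15, not 1 << 14). The sketch below is only a scalar model under that reading. The names `mulhrs_q15` and `to_q15` are hypothetical helpers, not Intel APIs, and arithmetic right shift of negative values is assumed.

```c
#include <stdint.h>

/* Hypothetical single-step model of one PMULHRSW lane:
 * ((p >> 14) + 1) >> 1 equals (p + 0x4000) >> 15.
 * Assumes >> on negative signed integers is an arithmetic shift. */
static int16_t mulhrs_q15(int16_t a, int16_t b)
{
    int32_t product = (int32_t)a * (int32_t)b;
    return (int16_t)((product + 0x4000) >> 15);
}

/* Hypothetical helper: convert a factor to Q15 (scale by 1 << 15),
 * rounding to nearest and clamping to the int16_t range.
 * A factor of exactly 1.0 would need 32768, which does not fit,
 * so it is clamped to 32767. */
static int16_t to_q15(float factor)
{
    float scaled = factor * 32768.0f;
    if (scaled > 32767.0f)  scaled = 32767.0f;
    if (scaled < -32768.0f) scaled = -32768.0f;
    return (int16_t)(scaled < 0.0f ? scaled - 0.5f : scaled + 0.5f);
}
```

Under this model a factor built with `1 << 14` represents 0.5, which matches the halved results reported above. For the small input values in the earlier post, `mulhrs_q15(x, to_q15(1.0f))` returns `x` unchanged, although larger inputs can come out one lower because 32767 is slightly less than 1.0 in Q15.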
gilgil
Beginner
I agree, the documentation for this function is not the best.