gilgil
Beginner
103 Views

Intrinsic guide 2.6 error in documentation

In the documentation for the intrinsic _mm_mulhrs_epi16, the right shift should be 15, not 14.

Patrick_K_Intel
Employee

14 bits is correct. See the Instruction Set Reference in the Software Developer's Manual: http://www.intel.com/content/www/us/en/processors/architectures-software-developer-manuals.html

PMULHRSW (with 128-bit operand)

temp0[31:0] = INT32 ((DEST[15:0] * SRC[15:0]) >>14) + 1;
temp1[31:0] = INT32 ((DEST[31:16] * SRC[31:16]) >>14) + 1;
temp2[31:0] = INT32 ((DEST[47:32] * SRC[47:32]) >>14) + 1;
temp3[31:0] = INT32 ((DEST[63:48] * SRC[63:48]) >>14) + 1;
temp4[31:0] = INT32 ((DEST[79:64] * SRC[79:64]) >>14) + 1;
temp5[31:0] = INT32 ((DEST[95:80] * SRC[95:80]) >>14) + 1;
temp6[31:0] = INT32 ((DEST[111:96] * SRC[111:96]) >>14) + 1;
temp7[31:0] = INT32 ((DEST[127:112] * SRC[127:112]) >>14) + 1;
DEST[15:0] = temp0[16:1];
DEST[31:16] = temp1[16:1];
DEST[47:32] = temp2[16:1];
DEST[63:48] = temp3[16:1];
DEST[79:64] = temp4[16:1];
DEST[95:80] = temp5[16:1];
DEST[111:96] = temp6[16:1];
DEST[127:112] = temp7[16:1];
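
As a rough illustration of the pseudocode above, one 16-bit lane can be modeled in plain C. This is only a sketch, not Intel's reference code: the function name pmulhrsw_lane is made up for this example, and it assumes that >> on a negative signed integer is an arithmetic shift, which holds on mainstream x86 compilers but is implementation-defined in C.

```c
#include <stdint.h>

/* Scalar model of a single 16-bit lane of PMULHRSW, following the
   pseudocode step by step. The name pmulhrsw_lane is hypothetical. */
static int16_t pmulhrsw_lane(int16_t dest, int16_t src)
{
    /* Full 32-bit signed product of the two 16-bit inputs. */
    int32_t product = (int32_t)dest * (int32_t)src;

    /* temp[31:0] = (product >> 14) + 1
       Assumes an arithmetic right shift for negative products. */
    int32_t temp = (product >> 14) + 1;

    /* Result = temp[16:1]: drop bit 0 and keep the next 16 bits.
       The narrowing cast is assumed to wrap, as it does on x86 compilers. */
    return (int16_t)(temp >> 1);
}
```

With this model, pmulhrsw_lane(36, 1 << 14) gives 18, so a multiplier of 1 << 14 acts as 0.5 rather than 1.0.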

gilgil
Beginner

I still do not understand...

I tried the following piece of code:
float factor = 1.f;
__m128i vFactor = _mm_set1_epi16(factor*(1<<14)); // Using fixed point..

__m128i inputVec = _mm_set_epi16(32,54,124,75,35,235,244,36);

__m128i resultVec = _mm_mulhrs_epi16(inputVec,vFactor);

By your explanation I should get resultVec = inputVec, but the result elements are actually half the original values.

sirrida
Beginner

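If you read the documentation carefully, you will notice an additional hidden shift by 1.
The temp*[16:1] can be read as (temp*[31:0]>>1)[15:0], so the total right shift is 15 bits: 14 before the rounding increment and 1 after it. A factor of 1<<14 therefore represents 0.5, not 1.0.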

It might make sense to make this more explicit in the documentation.
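
To make the hidden shift concrete, here is a small sketch in plain C that writes the same lane computation as a single rounding add followed by one shift by 15, plus a helper for building a Q15 multiplier. Both names (mulhrs_q15 and to_q15) are hypothetical and not part of any Intel API, and the code assumes an arithmetic >> for negative values.

```c
#include <stdint.h>

/* One lane of the multiply written with a single shift:
   ((p >> 14) + 1) >> 1 equals (p + 0x4000) >> 15 for every 32-bit
   product p. Assumes >> is arithmetic for negative values. */
static int16_t mulhrs_q15(int16_t a, int16_t b)
{
    int32_t product = (int32_t)a * (int32_t)b;
    return (int16_t)((product + 0x4000) >> 15);
}

/* Hypothetical helper: convert a factor to Q15 (scale by 1 << 15),
   clamping to the int16_t range, so 1.0f becomes 32767. */
static int16_t to_q15(float factor)
{
    float scaled = factor * 32768.0f;
    if (scaled > 32767.0f)  scaled = 32767.0f;
    if (scaled < -32768.0f) scaled = -32768.0f;
    return (int16_t)scaled;
}
```

Under these assumptions, to_q15(0.5f) is 16384, which is the 1<<14 used in the question, and mulhrs_q15(36, to_q15(1.0f)) returns 36. Since 1.0 is not exactly representable in Q15, the clamped multiplier 32767 reproduces the small inputs from the question but can be off by one for inputs near the top of the 16-bit range.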
gilgil
Beginner

I agree, the documentation for this function is not the clearest.