Intel® ISA Extensions

IF -ELSE condition using MMX tech.intrinsics

Smart_Lubobya
Beginner

I need help: _mm_mullo_pi16() cannot multiply big numbers. Any suggestions on what I should do? How do I multiply two 1-D arrays holding large values (positive and negative)? x0, b, b1, s0 and s1 are vectors (arrays); _f, Adjust[_qm] and _scale are scalar integers.
// C++ code

int x0 = s0 + s1;

short b; // declared outside the if/else so the result stays in scope

if (x0 < 0)

    b = (short)(-((((-x0) * Adjust[_qm]) + _f) >> _scale));

else

    b = (short)(((x0 * Adjust[_qm]) + _f) >> _scale);

// MMX intrinsic code

__m64*b1 = (__m64*)b;

__m64 s0,s1,s2,s3,x0;

j=0;

__m64 r0,r1,t0,t1,t2,p0,p1;

r0 =_mm_set_pi16(Adjust[_qm],Adjust[_qm],Adjust[_qm],Adjust[_qm]);

r1 =_mm_set_pi16(_f,_f,_f,_f);

x0 =_mm_add_pi16(s0,s1);

t1 =_mm_cmpgt_pi16(_mm_set1_pi16(0),x0);

t2 = _mm_mullo_pi16((_mm_sub_pi16(_mm_setzero_si64(),x0)),r0);

t0 = _mm_mullo_pi16(x0,r0);

p0 =_mm_srai_pi16(_mm_add_pi16(t0,r1),_scale ); // non-negative path: (x0*Adjust + _f) >> _scale

p1 =_mm_srai_pi16(_mm_add_pi16(t2,r1),_scale );

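// t1 is all ones where x0 < 0: take the negated p1 there, p0 elsewhere
b1[j] =_mm_or_si64(_mm_and_si64(t1,_mm_sub_pi16(_mm_setzero_si64(),p1)),_mm_andnot_si64(t1,p0));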

neni
New Contributor II
What exactly are you trying to do (in C)?
There is also a mulhi (_mm_mulhi_pi16) for the upper 16 bits of each product.
Smart_Lubobya
Beginner
x0 ={21000, 23000,-19000,-14000}
r0 ={11912,11912,11912,11912}

What I want is to multiply x0 and r0 using the __m64 data type, i.e. using _mm_mullo_pi16() and _mm_mulhi_pi16().
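
To see why a single 16-bit multiply is not enough for these values, here is a minimal scalar sketch of what mullo and mulhi each return for one lane. The names Halves, split_product and join_halves are hypothetical helpers made up for this illustration; they are not part of any intrinsics header. It assumes the usual two's complement behaviour with an arithmetic right shift for negative values.

```cpp
#include <cstdint>

// Hypothetical helper for illustration: the two 16-bit halves of a
// signed 16x16-bit product, as mullo/mulhi return them for each lane.
struct Halves {
    int16_t lo;  // low 16 bits of the product (what mullo keeps)
    int16_t hi;  // high 16 bits of the product (what mulhi keeps)
};

static Halves split_product(int16_t a, int16_t b) {
    // The exact product of two 16-bit values always fits in 32 bits.
    int32_t p = static_cast<int32_t>(a) * b;
    Halves h;
    h.lo = static_cast<int16_t>(static_cast<uint16_t>(p & 0xFFFF));
    h.hi = static_cast<int16_t>(p >> 16);  // assumes arithmetic right shift
    return h;
}

// Rebuild the 32-bit product: the high half is signed, the low half is
// treated as an unsigned 16-bit quantity.
static int32_t join_halves(Halves h) {
    return static_cast<int32_t>(h.hi) * 65536 + static_cast<uint16_t>(h.lo);
}
```

For 21000 * 11912 = 250152000 the low half is 1088 and the high half is 3817, so the low half alone loses most of the result. Both halves together are needed to recover the product.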
Thomas_W_Intel
Employee
The results of your computation require 32 bits. If you compute the lower and the upper bits separately, this requires twice the number of multiplications. Instead, you can use SSE, where you can do four 32-bit multiplications in one operation:

__m128i x1 = _mm_loadl_epi64((const __m128i*)&x0); // loads only the 8 bytes that hold the four 16-bit values
__m128i r1 = _mm_loadl_epi64((const __m128i*)&r0); // loads only the 8 bytes that hold the four 16-bit values
__m128i x_sse = _mm_cvtepi16_epi32(x1); // convert lower 4 16-bit values to 32-bit values with sign-extension
__m128i r_sse = _mm_cvtepi16_epi32(r1); // convert lower 4 16-bit values to 32-bit values with sign-extension
__m128i res_sse = _mm_mullo_epi32(x_sse, r_sse); // multiply 4 signed 32-bit values, keeping the low 32 bits of each product
Smart_Lubobya
Beginner
It seems MS Visual Studio 2008 does not recognise _mm_mullo_epi32(), only _mm_mullo_epi16(). Equally, _mm_cvtepi16_epi32() is not recognised on __m128i data types. How do I proceed?
Brijender_B_Intel
Make sure smmintrin.h is included. These are SSE4.1 intrinsics, and MS VS 2008 supports them.
Smart_Lubobya
Beginner
Thanks for all the help given. My processor supports only MMX, SSE, SSE2, SSE3, SSSE3 and EM64T instructions. SSE4.x is not supported. How can I do the multiplication with the available instructions?
Thomas_W_Intel
Employee

You can compute the lower 16 bits and the upper 16 bits of the 32-bit results separately. Afterwards, you will need to interleave them in order to get the full 32-bit results. Something like this should work:

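__m128i hi = _mm_mulhi_epi16(a, b); // upper 16 bits of each signed product

__m128i lo = _mm_mullo_epi16(a, b); // lower 16 bits of each product

__m128i r0 = _mm_unpacklo_epi16(lo, hi); // interleave halves: products 0..3 as 32-bit values

__m128i r1 = _mm_unpackhi_epi16(lo, hi); // interleave halves: products 4..7 as 32-bit values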

a and b each contain the eight 16-bit values that you would like to multiply. r0 contains the first four 32-bit results; r1 contains the remaining four 32-bit results. These instructions come with the SSE2 instruction set.
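
As a rough sketch of how the mulhi/mullo/unpack steps above could be wrapped up and applied to the original sign-dependent computation, the following uses only SSE2. The function names (mul16_to_32, quantize_scalar, quantize8_sse2) and the array-based interface are assumptions made for this illustration, not code from the thread. It further assumes that every x value is greater than -32768, that |x| * adjust + f fits in 32 bits, and that each result fits in 16 bits (the final pack saturates, whereas the scalar cast truncates).

```cpp
#include <cassert>
#include <cstdint>
#include <emmintrin.h>  // SSE2 intrinsics

// Hypothetical helper: full 32-bit products of eight 16-bit pairs.
static void mul16_to_32(const int16_t* a, const int16_t* b, int32_t* out) {
    __m128i va = _mm_loadu_si128(reinterpret_cast<const __m128i*>(a));
    __m128i vb = _mm_loadu_si128(reinterpret_cast<const __m128i*>(b));
    __m128i hi = _mm_mulhi_epi16(va, vb);  // upper 16 bits of each product
    __m128i lo = _mm_mullo_epi16(va, vb);  // lower 16 bits of each product
    _mm_storeu_si128(reinterpret_cast<__m128i*>(out),
                     _mm_unpacklo_epi16(lo, hi));  // products 0..3
    _mm_storeu_si128(reinterpret_cast<__m128i*>(out + 4),
                     _mm_unpackhi_epi16(lo, hi));  // products 4..7
}

// Scalar reference that follows the C++ code in the question.
static int16_t quantize_scalar(int16_t x, int16_t adjust, int32_t f, int scale) {
    int32_t mag = (x < 0) ? -static_cast<int32_t>(x) : x;
    int32_t q = (mag * adjust + f) >> scale;
    return static_cast<int16_t>(x < 0 ? -q : q);
}

// Hypothetical SSE2 version for eight values at a time.
static void quantize8_sse2(const int16_t* x, int16_t adjust, int32_t f,
                           int scale, int16_t* out) {
    assert(scale >= 0 && scale < 32);
    __m128i vx  = _mm_loadu_si128(reinterpret_cast<const __m128i*>(x));
    __m128i neg = _mm_srai_epi16(vx, 15);  // 0xFFFF in lanes where x < 0
    __m128i mag = _mm_sub_epi16(_mm_xor_si128(vx, neg), neg);  // |x|
    __m128i va  = _mm_set1_epi16(adjust);
    __m128i hi  = _mm_mulhi_epi16(mag, va);
    __m128i lo  = _mm_mullo_epi16(mag, va);
    __m128i vf  = _mm_set1_epi32(f);
    __m128i cnt = _mm_cvtsi32_si128(scale);  // shift count in the low 64 bits
    // Widen to 32 bits, add the rounding term, shift right arithmetically.
    __m128i q0 = _mm_sra_epi32(_mm_add_epi32(_mm_unpacklo_epi16(lo, hi), vf), cnt);
    __m128i q1 = _mm_sra_epi32(_mm_add_epi32(_mm_unpackhi_epi16(lo, hi), vf), cnt);
    __m128i q  = _mm_packs_epi32(q0, q1);  // back to 16 bits, with saturation
    q = _mm_sub_epi16(_mm_xor_si128(q, neg), neg);  // negate lanes where x < 0
    _mm_storeu_si128(reinterpret_cast<__m128i*>(out), q);
}
```

Working on the absolute value and restoring the sign afterwards mirrors the if/else in the scalar code without any branch: XOR with the sign mask followed by subtracting the mask negates exactly the lanes where x was negative.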
