64 bit element "duplication" inside zmm register for complex multiplication

Alastair_M_

Dear all,

I am wondering about the best known method for implementing the following operation as part of a complex number multiplication. The input is a zmm register holding four double-precision complex numbers in the following arrangement: {c1.re,c1.im,c2.re,c2.im,c3.re,c3.im,c4.re,c4.im}

I want to separate these into two registers: one containing all four real parts, each duplicated, and one containing all four imaginary parts, each duplicated. That is:

{c1.re,c1.im,c2.re,c2.im,c3.re,c3.im,c4.re,c4.im} -> {c1.re,c1.re,c2.re,c2.re,c3.re,c3.re,c4.re,c4.re} and {c1.im,c1.im,c2.im,c2.im,c3.im,c3.im,c4.im,c4.im}
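To pin down the target layout, the rearrangement can be written as a plain scalar loop. The sketch below is a reference model only, not an intrinsic sequence. The names split_dup_ref and cmul_dup_ref are hypothetical, and the multiplication routine shows one common way the duplicated parts can be consumed, which is an assumption here rather than something stated in the post.

```c
#include <stddef.h>

/* Scalar reference for the requested rearrangement (hypothetical helper).
 * in:     n interleaved doubles {re0, im0, re1, im1, ...}, n even
 * re_dup: receives {re0, re0, re1, re1, ...}
 * im_dup: receives {im0, im0, im1, im1, ...} */
static void split_dup_ref(const double *in, double *re_dup, double *im_dup, size_t n)
{
    for (size_t i = 0; i < n; ++i) {
        re_dup[i] = in[i & ~(size_t)1]; /* even slot of the pair: real part */
        im_dup[i] = in[i | 1];          /* odd slot of the pair: imaginary part */
    }
}

/* Assumed use of the duplicated parts: element-wise complex product of two
 * vectors of four interleaved complex doubles.
 *   out.re = a.re * b.re - a.im * b.im
 *   out.im = a.re * b.im + a.im * b.re */
static void cmul_dup_ref(const double a[8], const double b[8], double out[8])
{
    double a_re[8], a_im[8];
    split_dup_ref(a, a_re, a_im, 8);
    for (size_t i = 0; i < 8; ++i) {
        double sign = (i & 1) ? 1.0 : -1.0; /* subtract in re slots, add in im slots */
        out[i] = a_re[i] * b[i] + sign * a_im[i] * b[i ^ 1]; /* b[i ^ 1]: b with re/im swapped */
    }
}
```

Written this way, each output slot needs the real part and the imaginary part of one operand available in both slots of the pair, which is why the duplication is wanted in the first place.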

At present I am using the following pattern:

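// 0x44 selects dwords {0,1,0,1} of each 128-bit lane: the real part, twice
one_re = (__m512d)_mm512_shuffle_epi32((__m512i)one,0x44);
// 0xEE selects dwords {2,3,2,3} of each 128-bit lane: the imaginary part, twice
one_im = (__m512d)_mm512_shuffle_epi32((__m512i)one,0xEE);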
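To see why 0x44 and 0xEE give the duplicated layout, the shuffle can be modelled in scalar C. This is a sketch that assumes the usual per-128-bit-lane semantics of a 32-bit shuffle with a two-bits-per-element control byte; the function names are hypothetical and the model does not use any Intel header.

```c
#include <stdint.h>
#include <string.h>

/* Scalar model of a per-lane 32-bit shuffle (assumed semantics):
 * result dword j of each 128-bit lane = source dword ((imm >> 2*j) & 3)
 * of the same lane. */
static void shuffle_dwords_model(const uint32_t in[16], uint32_t out[16], unsigned imm)
{
    for (int lane = 0; lane < 4; ++lane)
        for (int j = 0; j < 4; ++j)
            out[lane * 4 + j] = in[lane * 4 + ((imm >> (2 * j)) & 3u)];
}

/* Apply the model to 8 doubles viewed as 16 dwords, mirroring the casts
 * between the double and integer vector types in the post. */
static void shuffle_pd_as_dwords(const double in[8], double out[8], unsigned imm)
{
    uint32_t src[16], dst[16];
    memcpy(src, in, sizeof src);   /* reinterpret 8 doubles as 16 dwords */
    shuffle_dwords_model(src, dst, imm);
    memcpy(out, dst, sizeof dst);  /* reinterpret back as 8 doubles */
}
```

Each 128-bit lane holds exactly one complex number, so dwords 0 and 1 are its real part and dwords 2 and 3 are its imaginary part. With the input {0,1,2,3,4,5,6,7}, the model yields {0,0,2,2,4,4,6,6} for 0x44 and {1,1,3,3,5,5,7,7} for 0xEE.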

It feels like I might be missing something.  Is this the most efficient method for this operation?

Best regards,

Alastair

1 Solution
Alastair_M_

I found the answer to this using a masked swizzle, which seems very slightly faster.

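// 170 = 0b10101010: odd (imaginary) slots take the swizzled value, i.e. the neighbouring real part
__mmask8 real_mask = (__mmask8)_mm512_int2mask(170);
// 85 = 0b01010101: even (real) slots take the swizzled value, i.e. the neighbouring imaginary part
__mmask8 imag_mask = (__mmask8)_mm512_int2mask(85);

__m512d input = {0,1,2,3,4,5,6,7};

// _MM_SWIZ_REG_CDAB swaps adjacent elements; slots with a clear mask bit keep the value from the first argument
__m512d real_parts = _mm512_mask_swizzle_pd(input,real_mask,input,_MM_SWIZ_REG_CDAB);
__m512d imag_parts = _mm512_mask_swizzle_pd(input,imag_mask,input,_MM_SWIZ_REG_CDAB);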
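The effect of the two masks can be checked with a scalar model of the masked swizzle. This is a sketch under the assumption that the CDAB swizzle swaps each adjacent pair of 64-bit elements and that slots whose mask bit is clear are copied from the first argument; the function name is hypothetical.

```c
/* Scalar model of a write-masked CDAB swizzle on 8 doubles (assumed semantics).
 * src:  value kept in slots whose mask bit is 0
 * mask: 8-bit write mask, bit i controls slot i
 * v:    vector that is swizzled
 * out:  result */
static void mask_swizzle_cdab_model(const double src[8], unsigned mask,
                                    const double v[8], double out[8])
{
    for (int i = 0; i < 8; ++i) {
        double swizzled = v[i ^ 1];  /* CDAB: swap adjacent elements */
        out[i] = ((mask >> i) & 1u) ? swizzled : src[i];
    }
}
```

With the input {0,1,2,3,4,5,6,7}, mask 170 gives {0,0,2,2,4,4,6,6} and mask 85 gives {1,1,3,3,5,5,7,7}, matching the layout asked for in the question.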

 

Alastair

2 Replies
TaylorIoTKidd

Alastair,

Thank you for letting the community know.

Regards
--
Taylor
