Intel® ISA Extensions
Use hardware-based isolation and memory encryption to provide more code protection in your solutions.

PADDW __m128i _mm_add_epi16 ( __m128i a, __m128i b) doubt

inteleverywhere
Beginner
810 Views

There are two instruction versions for the same intrinsic, for example vpaddw and paddw. Is there any performance gain if vpaddw is used instead of paddw (_mm_add_epi16)? Is there a separate intrinsic for vpaddw?

VPADDW (VEX.128 encoded version)

DEST[15:0] ← SRC1[15:0]+SRC2[15:0]
DEST[31:16] ← SRC1[31:16]+SRC2[31:16]
DEST[47:32] ← SRC1[47:32]+SRC2[47:32]
DEST[63:48] ← SRC1[63:48]+SRC2[63:48]
DEST[79:64] ← SRC1[79:64]+SRC2[79:64]
DEST[95:80] ← SRC1[95:80]+SRC2[95:80]
DEST[111:96] ← SRC1[111:96]+SRC2[111:96]
DEST[127:112] ← SRC1[127:112]+SRC2[127:112]
DEST[255:128] ← 0

PADDW (128-bit Legacy SSE version)

DEST[15:0] ← DEST[15:0]+SRC[15:0]
DEST[31:16] ← DEST[31:16]+SRC[31:16]
DEST[47:32] ← DEST[47:32]+SRC[47:32]
DEST[63:48] ← DEST[63:48]+SRC[63:48]
DEST[79:64] ← DEST[79:64]+SRC[79:64]
DEST[95:80] ← DEST[95:80]+SRC[95:80]
DEST[111:96] ← DEST[111:96]+SRC[111:96]
DEST[127:112] ← DEST[127:112]+SRC[127:112]
DEST[255:128] (Unmodified)

PADDW __m128i _mm_add_epi16 ( __m128i a, __m128i b)
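
For readers comparing the two listings above, the arithmetic on the low 128 bits is the same in both: eight independent 16-bit lanes are added with wraparound. The listings differ in the operand form (two sources versus destination-as-source) and in what happens to bits 255:128. A minimal scalar sketch of the lane arithmetic in C follows; the function name `paddw_model` and the array-of-eight-words layout are assumptions made for illustration and do not come from the manual.

```c
#include <stdint.h>

/* Scalar model of the low 128 bits of PADDW / VPADDW.
   Lane i corresponds to bits [16*i+15 : 16*i] of the register.
   paddw_model is a hypothetical helper name, not an Intel API. */
static void paddw_model(const uint16_t src1[8], const uint16_t src2[8],
                        uint16_t dest[8])
{
    for (int lane = 0; lane < 8; ++lane) {
        /* The cast truncates to 16 bits, so each lane wraps around
           modulo 65536 instead of saturating or carrying into the
           next lane. */
        dest[lane] = (uint16_t)(src1[lane] + src2[lane]);
    }
}
```

Calling `paddw_model(a, b, d)` with `a[7] = 65535` and `b[7] = 1` leaves `d[7]` equal to 0, which shows that no carry leaves the lane.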


thanks

1 Reply
Brijender_B_Intel
Employee
810 Views

Performance depends not only on the instruction but also on the context in which it is used, so you need to try it in your own application and measure. The compiler can generate AVX instructions for the same source code if you compile with /arch:AVX.
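
As an illustration of the point about compiler flags, here is a minimal sketch in C that uses only the `_mm_add_epi16` intrinsic from the question. The wrapper name `add_words` and the unaligned load/store choice are assumptions made for this example. Built without AVX enabled, compilers typically encode the addition as `paddw`; built with AVX enabled (`/arch:AVX`, or `-mavx` on GCC and Clang, the latter being an addition not mentioned in the reply), the same intrinsic is typically encoded as `vpaddw`. Checking the generated assembly for your own compiler is the reliable way to confirm this.

```c
#include <emmintrin.h>  /* SSE2 intrinsics, including _mm_add_epi16 */
#include <stdint.h>

/* add_words is a hypothetical wrapper used only for this example.
   It adds eight 16-bit lanes of a and b and stores the result in out. */
static void add_words(const uint16_t a[8], const uint16_t b[8],
                      uint16_t out[8])
{
    /* Unaligned loads, so the caller's arrays need no special alignment. */
    __m128i va = _mm_loadu_si128((const __m128i *)a);
    __m128i vb = _mm_loadu_si128((const __m128i *)b);

    /* One intrinsic in the source; the instruction encoding
       (legacy SSE or VEX) is chosen by the compiler flags. */
    __m128i sum = _mm_add_epi16(va, vb);

    _mm_storeu_si128((__m128i *)out, sum);
}
```

Because the source is identical in both builds, timing the two binaries on the real workload is a direct way to see whether the VEX encoding makes any difference in that context.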
