Intel® ISA Extensions

PADDW __m128i _mm_add_epi16 ( __m128i a, __m128i b) doubt

inteleverywhere

There are two versions of the same instruction, for example VPADDW and PADDW. Is there any performance gain if VPADDW is used instead of PADDW (_mm_add_epi16)? Is there an intrinsic for VPADDW?

VPADDW (VEX.128 encoded version)

DEST[15:0] ← SRC1[15:0]+SRC2[15:0]
DEST[31:16] ← SRC1[31:16]+SRC2[31:16]
DEST[47:32] ← SRC1[47:32]+SRC2[47:32]
DEST[63:48] ← SRC1[63:48]+SRC2[63:48]
DEST[79:64] ← SRC1[79:64]+SRC2[79:64]
DEST[95:80] ← SRC1[95:80]+SRC2[95:80]
DEST[111:96] ← SRC1[111:96]+SRC2[111:96]
DEST[127:112] ← SRC1[127:112]+SRC2[127:112]
DEST[255:128] ← 0

PADDW (128-bit Legacy SSE version)

DEST[15:0] ← DEST[15:0]+SRC[15:0]
DEST[31:16] ← DEST[31:16]+SRC[31:16]
DEST[47:32] ← DEST[47:32]+SRC[47:32]
DEST[63:48] ← DEST[63:48]+SRC[63:48]
DEST[79:64] ← DEST[79:64]+SRC[79:64]
DEST[95:80] ← DEST[95:80]+SRC[95:80]
DEST[111:96] ← DEST[111:96]+SRC[111:96]
DEST[127:112] ← DEST[127:112]+SRC[127:112]
DEST[255:128] (Unmodified)

PADDW: __m128i _mm_add_epi16 (__m128i a, __m128i b)


thanks

Brijender_B_Intel

Performance depends not just on the instruction but also on the context in which it is used. You need to try it in your application and measure. The compiler can generate AVX instructions for the same code if you compile with /arch:AVX.
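To make the point about the compiler concrete: there is no separate intrinsic for VPADDW. The same _mm_add_epi16 call is emitted as PADDW in a plain SSE2 build and as VPADDW when AVX code generation is enabled (for example /arch:AVX with MSVC, or -mavx with GCC and Clang). Below is a minimal sketch, assuming an x86 target with SSE2. The helper name add_epi16_x8 is hypothetical and only there for illustration.

```c
#include <emmintrin.h>  /* SSE2: _mm_loadu_si128, _mm_add_epi16, _mm_storeu_si128 */
#include <stdint.h>

/* Hypothetical helper (not from the thread): adds eight 16-bit lanes.
 * The addition wraps around on overflow, as PADDW/VPADDW do.
 * Whether the compiler emits PADDW or VPADDW for this function depends
 * only on the build flags, not on the source code. */
static void add_epi16_x8(const int16_t *a, const int16_t *b, int16_t *out)
{
    /* Unaligned loads, so the caller does not need 16-byte aligned arrays. */
    __m128i va = _mm_loadu_si128((const __m128i *)a);
    __m128i vb = _mm_loadu_si128((const __m128i *)b);

    /* Lane-wise 16-bit add: one PADDW or VPADDW instruction. */
    __m128i sum = _mm_add_epi16(va, vb);

    _mm_storeu_si128((__m128i *)out, sum);
}
```

Building this once with and once without the AVX flag and comparing the generated assembly shows which encoding was chosen. Any speed difference still has to be measured in the real application, as the reply says.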
