Intel® ISA Extensions

Intel C++: _mm256_set1_ps suboptimal?

bronxzv

I'm in the process of porting a (huge) piece of code from SSE to AVX. Looking at the ASM generated by the compiler (Intel C++ Pro 11.1 build #38, IA32 / Windows), I have just noticed that _mm256_set1_ps emits this convoluted sequence:

movss xmm0, DWORD PTR [edi+eax*4]
unpcklps xmm0, xmm0
movlhps xmm0, xmm0
vinsertf128 ymm1, ymm0, xmm0, 1


instead of the much simpler:

vbroadcastss ymm0, DWORD PTR [edi+eax*4]


Did I miss something, or is it simply something that should be improved in a forthcoming version of the compiler?
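
One possible workaround, sketched here as an assumption and not a confirmed fix for this compiler build: the AVX intrinsic _mm256_broadcast_ss takes a pointer to a float and corresponds to vbroadcastss with a memory operand, so it can be used directly when the value to splat already lives in memory. The helper name broadcast8 is hypothetical, and the scalar branch is there only so the sketch still builds when AVX code generation is not enabled.

```cpp
#include <cstddef>
#if defined(__AVX__)
#include <immintrin.h>
#endif

// Hypothetical helper: writes 8 copies of the float at *src into out[0..7].
inline void broadcast8(const float* src, float* out) {
#if defined(__AVX__)
    // Explicit broadcast from memory; this intrinsic corresponds to
    // "vbroadcastss ymm, m32" rather than a movss/unpcklps/movlhps chain.
    __m256 v = _mm256_broadcast_ss(src);
    // Unaligned store, so 'out' needs no particular alignment.
    _mm256_storeu_ps(out, v);
#else
    // Scalar fallback (assumption: AVX not enabled at compile time).
    for (std::size_t i = 0; i < 8; ++i) {
        out[i] = *src;
    }
#endif
}
```

In the loop from the post, the call would look like broadcast8(&table[i], out), with table and i standing in for whatever edi and eax address. Whether a given compiler version then emits a single vbroadcastss still needs to be checked in the generated ASM.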
