Yunqi_Z_
Beginner
221 Views

About shufps instruction

Hi all,

I have a question about the shufps instruction. What kind of C code would usually cause the compiler to generate shufps?

Thank you for your help!

3 Replies
TimP
Black Belt

We can't guess your target without more hints. Typical candidates are code that gathers or scatters elements to and from a packed vector, compiled with an SSE2 code option, possibly with the help of #pragma vector always. Setting SSE4 options would promote newer instructions for the same purpose.
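To make "code which gathers or scatters elements" a little more concrete, here is a minimal sketch of a plain C loop that permutes elements inside each group of floats. The function name swap_pairs and the choice of a pair swap are illustrative assumptions, not something from this thread. A vectorizing compiler may implement such a loop with a shuffle such as shufps, but whether it actually does depends on the compiler, its version, the optimization level and the target options, so check the generated assembly.

```c
#include <stddef.h>

/* Hypothetical example: swap each adjacent pair of floats,
 * {a0, a1, a2, a3, ...} -> {a1, a0, a3, a2, ...}.
 * A trailing unpaired element (odd n) is left untouched.
 * restrict tells the compiler that dst and src do not overlap,
 * which makes auto-vectorization easier. */
void swap_pairs(float *restrict dst, const float *restrict src, size_t n)
{
    for (size_t i = 0; i + 1 < n; i += 2) {
        dst[i]     = src[i + 1];
        dst[i + 1] = src[i];
    }
}
```

Compiling with optimization enabled and inspecting the assembly listing (for example with the -S option that most compilers accept) shows which instruction, if any shuffle at all, was chosen.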

SergeyKostrov
Valued Contributor II

>>...So what kind of C code would usually generate shufps by the compiler?

I agree with Tim that your question is really hard to answer. So I've looked at the Intel headers with intrinsic functions, and here are some details:

immintrin.h
...
/*
 * Shuffle Packed Single Precision Floating-Point Values
 * **** VSHUFPS ymm1, ymm2, ymm3/m256, imm8
 * Moves two of the four packed single-precision floating-point values
 * from each double qword of the first source operand into the low
 * quadword of each double qword of the destination; moves two of the four
 * packed single-precision floating-point values from each double qword of
 * the second source operand into to the high quadword of each double qword
 * of the destination. The selector operand determines which values are moved
 * to the destination.
 */
extern __m256 __ICL_INTRINCC _mm256_shuffle_ps(__m256, __m256, const int);
...

A very generic answer could look like this: a C/C++ compiler will generate the instruction if the C/C++ code uses the _mm256_shuffle_ps intrinsic function, or contains inline assembler code for the instruction (assuming that generation of AVX instructions is enabled).

Also, look at the Intel Instruction Set Reference Manual (Volumes 2A, 2B and 2C) for a more detailed description of the instruction.
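As a small illustration of the intrinsic route, the sketch below uses _mm_shuffle_ps, the 128-bit SSE counterpart of _mm256_shuffle_ps declared in xmmintrin.h, which corresponds to the non-VEX shufps form asked about. The helper names reverse4 and low_a_high_b are made up for this example. Even with the intrinsic, an optimizing compiler is free to pick an equivalent instruction, so treat this as a sketch and verify the assembly.

```c
#include <xmmintrin.h>  /* SSE: __m128, _mm_shuffle_ps, _MM_SHUFFLE */

/* Hypothetical helper: reverse four floats,
 * {x0, x1, x2, x3} -> {x3, x2, x1, x0}.
 * Both source operands are the same register here. */
void reverse4(float *dst, const float *src)
{
    __m128 v = _mm_loadu_ps(src);
    /* _MM_SHUFFLE lists selectors from the highest result element down:
     * result[3] = v[0], result[2] = v[1], result[1] = v[2], result[0] = v[3]. */
    __m128 r = _mm_shuffle_ps(v, v, _MM_SHUFFLE(0, 1, 2, 3));
    _mm_storeu_ps(dst, r);
}

/* Hypothetical helper: the low two result elements come from a, the high
 * two from b, matching the header comment (first source feeds the low
 * quadword, second source feeds the high quadword).
 * Result: {a0, a1, b2, b3}. */
void low_a_high_b(float *dst, const float *a, const float *b)
{
    __m128 va = _mm_loadu_ps(a);
    __m128 vb = _mm_loadu_ps(b);
    __m128 r  = _mm_shuffle_ps(va, vb, _MM_SHUFFLE(3, 2, 1, 0));
    _mm_storeu_ps(dst, r);
}
```

The selector must be a compile-time constant because it is encoded as the imm8 operand of the instruction.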
Yunqi_Z_
Beginner

Sorry for the confusion. I meant to ask what kind of C code could possibly be translated into shufps by the compiler. I think the problem has been solved. Thank you, guys! :-)
