
Yunqi_Z_ · ‎05-12-2013

Hi all,

I have a question about shufps instructions. So what kind of C code would usually generate shufps by the compiler?

Thank you for your help!

TimP · ‎05-12-2013

I'm sure we can't guess your target without hints. One candidate is code that gathers elements into a packed vector, or scatters them out of one, when built with an SSE2 code option, possibly with the help of #pragma vector always. Setting SSE4 options would promote newer instructions for the same purpose.
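To make the gather case concrete, here is a minimal sketch of a loop that reads every other element of an interleaved array. The function name deinterleave and its parameters are hypothetical, chosen only for illustration. A vectorizing compiler targeting SSE2 may implement the stride-2 loads with shufps, but whether it does depends on the compiler, its version, and the options used, so it is worth checking the generated assembly. The pragma is guarded because it is specific to the Intel compiler.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: split n interleaved (re, im) pairs from src
 * into two separate arrays. The stride-2 reads are the "gather"
 * pattern that a vectorizer may turn into shuffles. */
void deinterleave(const float *restrict src, float *restrict re,
                  float *restrict im, size_t n)
{
    assert(src != NULL && re != NULL && im != NULL);
#ifdef __INTEL_COMPILER
#pragma vector always   /* Intel-specific hint; other compilers skip it */
#endif
    for (size_t i = 0; i < n; ++i) {
        re[i] = src[2 * i];       /* even elements */
        im[i] = src[2 * i + 1];   /* odd elements */
    }
}
```

Compiling with optimization enabled and an assembly listing (for example -S with GCC or Clang) shows which shuffle instructions, if any, were chosen.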

SergeyKostrov · ‎05-12-2013

>>...So what kind of C code would usually generate shufps by the compiler?

I agree with Tim that your question is really hard to answer. So I've looked at the Intel headers with intrinsic functions, and here are some details:

immintrin.h
...
/*
 * Shuffle Packed Single Precision Floating-Point Values
 * **** VSHUFPS ymm1, ymm2, ymm3/m256, imm8
 * Moves two of the four packed single-precision floating-point values
 * from each double qword of the first source operand into the low
 * quadword of each double qword of the destination; moves two of the four
 * packed single-precision floating-point values from each double qword of
 * the second source operand into to the high quadword of each double qword
 * of the destination. The selector operand determines which values are moved
 * to the destination.
 */
extern __m256 __ICL_INTRINCC _mm256_shuffle_ps(__m256, __m256, const int);
...

A very generic answer could look like this: a C/C++ compiler will generate the instruction if the C/C++ code uses the _mm256_shuffle_ps intrinsic function or contains inline assembler code for the instruction (assuming that generation of AVX instructions is enabled).

Also, you need to look at the Intel Instruction Set Reference Manual (Volumes 2A, 2B and 2C) for a more detailed description of the instruction.
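As an illustration of the intrinsic route, here is a minimal sketch using _mm_shuffle_ps, the 128-bit SSE counterpart of _mm256_shuffle_ps and the one that corresponds to the non-VEX shufps form asked about. The helper name reverse4 is hypothetical. The compiler is free to pick an equivalent instruction for the same shuffle, so treat this as a sketch and inspect the assembly rather than assuming shufps appears.

```c
#include <assert.h>
#include <xmmintrin.h>  /* SSE: __m128, _mm_shuffle_ps, _MM_SHUFFLE */

/* Hypothetical helper: write in[3], in[2], in[1], in[0] to out. */
void reverse4(const float in[4], float out[4])
{
    assert(in != NULL && out != NULL);
    __m128 v = _mm_loadu_ps(in);   /* unaligned load of 4 floats */
    /* The low two result lanes are selected from the first operand and
     * the high two from the second. _MM_SHUFFLE lists the selectors
     * from the highest lane down to the lowest, so (0, 1, 2, 3) yields
     * { v[3], v[2], v[1], v[0] } in lane order 0..3. */
    __m128 r = _mm_shuffle_ps(v, v, _MM_SHUFFLE(0, 1, 2, 3));
    _mm_storeu_ps(out, r);         /* unaligned store of 4 floats */
}
```

Passing the same register as both operands gives a full permutation of one vector; passing two different registers mixes two elements from each, as the header comment describes.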

Yunqi_Z_ · ‎05-12-2013

Sorry for the confusion. I meant to ask what kind of C code could possibly be translated into shufps by the compiler. I think the problem has been solved. Thank you guys! :-)

About shufps instruction