Hi all,
I have a question about the shufps instruction. What kind of C code would usually cause the compiler to generate shufps?
Thank you for your help!
3 Replies
We can't guess your target without more hints. Typical candidates are loops that gather or scatter elements to and from a packed vector, compiled with an SSE2 code option, possibly with the help of #pragma vector always. Setting SSE4 options would promote newer instructions for the same purpose.
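As a rough illustration of the kind of code meant here, the sketch below splits interleaved pairs into two separate arrays. The function name `deinterleave` and its parameters are made up for this example. Whether a given compiler actually emits shufps for it depends on the compiler, its version, and the options used, so the generated assembly should be checked (for example with -S) rather than assumed.

```c
#include <stddef.h>

/* Hypothetical example: split interleaved (even, odd) pairs into two arrays.
 * The stride-2 loads force the vectorizer to rearrange elements inside
 * packed registers, which is where a shuffle such as shufps may appear
 * when vectorization for SSE2 is enabled. */
void deinterleave(const float *restrict in,
                  float *restrict even,
                  float *restrict odd,
                  size_t n)
{
    /* With the Intel compiler, "#pragma vector always" could be placed
     * before this loop, as suggested in the reply above. */
    for (size_t i = 0; i < n; ++i) {
        even[i] = in[2 * i];      /* elements 0, 2, 4, ... */
        odd[i]  = in[2 * i + 1];  /* elements 1, 3, 5, ... */
    }
}
```

The `restrict` qualifiers (C99) tell the compiler the arrays do not overlap, which usually makes vectorization easier to prove safe.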
>>...So what kind of C code would usually generate shufps by the compiler?
I agree with Tim that your question is really hard to answer. So, I've looked at Intel headers with intrinsic functions and here are some details:
immintrin.h
...
/*
* Shuffle Packed Single Precision Floating-Point Values
* **** VSHUFPS ymm1, ymm2, ymm3/m256, imm8
* Moves two of the four packed single-precision floating-point values
* from each double qword of the first source operand into the low
* quadword of each double qword of the destination; moves two of the four
* packed single-precision floating-point values from each double qword of
* the second source operand into the high quadword of each double qword
* of the destination. The selector operand determines which values are moved
* to the destination.
*/
extern __m256 __ICL_INTRINCC _mm256_shuffle_ps(__m256, __m256, const int);
...
A very generic answer could look like this: a C/C++ compiler will generate the instruction if the code uses the _mm256_shuffle_ps intrinsic function (or _mm_shuffle_ps for the 128-bit SSE form, shufps), or contains inline assembler code for the instruction. For the AVX form, it is assumed that support for generating AVX instructions is enabled.
Also, you need to look at the Intel Instruction Set Reference Manual (Volumes 2A, 2B and 2C) for a more detailed description of the instruction.
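For the 128-bit SSE case asked about in the original question, a minimal sketch using the _mm_shuffle_ps intrinsic and the _MM_SHUFFLE macro from xmmintrin.h might look like the following. The function names `shuffle_low_high` and `reverse4` are invented for illustration. A compiler will normally map the intrinsic to shufps, but it is free to pick an equivalent instruction, so this is a likely outcome and not a guarantee.

```c
#include <xmmintrin.h>  /* SSE: __m128, _mm_shuffle_ps, _MM_SHUFFLE */

/* Hypothetical helper: out = { a[0], a[1], b[2], b[3] }.
 * The two low result elements are selected from the first operand,
 * the two high result elements from the second operand. */
void shuffle_low_high(const float *a, const float *b, float *out)
{
    __m128 va = _mm_loadu_ps(a);  /* unaligned load of 4 floats */
    __m128 vb = _mm_loadu_ps(b);
    /* _MM_SHUFFLE(z, y, x, w) selects: r[0]=va[w], r[1]=va[x],
     * r[2]=vb[y], r[3]=vb[z]. */
    __m128 r = _mm_shuffle_ps(va, vb, _MM_SHUFFLE(3, 2, 1, 0));
    _mm_storeu_ps(out, r);
}

/* Hypothetical helper: reverse the four floats of one vector by passing
 * the same register as both source operands. */
void reverse4(const float *in, float *out)
{
    __m128 v = _mm_loadu_ps(in);
    __m128 r = _mm_shuffle_ps(v, v, _MM_SHUFFLE(0, 1, 2, 3));
    _mm_storeu_ps(out, r);
}
```

The selector must be a compile-time constant, because it is encoded as the imm8 operand of the instruction.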
Sorry for the confusion. I meant to ask what kind of C code could possibly be translated into shufps by the compiler. I think the problem has been solved. Thank you guys! :-)