Intel® ISA Extensions

Seek help for AVX shuffle parameters

Zhibin_Niu__Intel_
Hi guys,
I am writing AVX code that needs a shuffle, but I cannot work out how the parameters should be set. Could anyone give me some help?
The source is:
U: u7,u6,u5,u4,u3,u2,u1,u0;
V: v7,v6,v5,v4,v3,v2,v1,v0;
N: n7,n6,n5,n4,n3,n2,n1,n0;
I want to shuffle it to:
v1,u4,n4,v4,u4,n0,v0,u0;
u6,n2,v2,u2,n5,v5,u5,n1;
n7,v7,u7,n3,v3,u3,n6,v6;
3 Replies
sirrida
Beginner
Your first result vector should probably be v1,u1,n4,v4,u4,n0,v0,u0.
Please look at this linked article where a very similar problem is solved.
You might need to modify the load operations by using unpcklps.
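For readers unfamiliar with how the unpack instructions behave on 256-bit registers, here is a small scalar model in C. It is only an illustrative sketch: the function names are made up for this example, and array index 0 is assumed to be the lowest element (the rightmost one in the notation used above). The point it shows is that the 256-bit forms (`_mm256_unpacklo_ps` / `_mm256_unpackhi_ps`) interleave within each 128-bit lane separately, which is why elements 0-3 and 4-7 never mix without an extra cross-lane permute.

```c
#include <assert.h>
#include <stddef.h>

/* Scalar model of 256-bit vunpcklps (_mm256_unpacklo_ps):
   in each 128-bit lane, interleave the two LOW elements of a and b. */
static void unpacklo_ps_model(const float a[8], const float b[8], float r[8])
{
    for (size_t lane = 0; lane < 2; ++lane) {
        size_t base = lane * 4;          /* first element of this lane */
        r[base + 0] = a[base + 0];
        r[base + 1] = b[base + 0];
        r[base + 2] = a[base + 1];
        r[base + 3] = b[base + 1];
    }
}

/* Scalar model of 256-bit vunpckhps (_mm256_unpackhi_ps):
   in each 128-bit lane, interleave the two HIGH elements of a and b. */
static void unpackhi_ps_model(const float a[8], const float b[8], float r[8])
{
    for (size_t lane = 0; lane < 2; ++lane) {
        size_t base = lane * 4;
        r[base + 0] = a[base + 2];
        r[base + 1] = b[base + 2];
        r[base + 2] = a[base + 3];
        r[base + 3] = b[base + 3];
    }
}
```

Applied to U and V, the low unpack gives u0,v0,u1,v1 in the low lane and u4,v4,u5,v5 in the high lane, which already resembles the pairing in the requested output.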
Zhibin_Niu__Intel_
Thanks for pointing out the typo. Yes, it is u1.
I have written the shuffle code, but in order to get a sequence like the one in the article, I have to use 9 permutes, 6 unpacks, and 6 shuffles. In the end, my AVX code takes about the same time as the SSE version.
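For anyone who wants to check a permute/unpack/shuffle sequence against the target layout, a scalar reference like the following may help. It is a sketch, not the vectorized code discussed here: the name `shuffle_ref` is hypothetical, the three output registers are assumed to be stored back to back in a 24-float array, and index 0 is assumed to be the rightmost element in the notation above (with the u1 correction applied).

```c
#include <assert.h>
#include <stddef.h>

/* Scalar reference for the requested layout.
   out[0..7]   = first result vector  (u0,v0,n0,u4,v4,n4,u1,v1)
   out[8..15]  = second result vector (n1,u5,v5,n5,u2,v2,n2,u6)
   out[16..23] = third result vector  (v6,n6,u3,v3,n3,u7,v7,n7)
   Each list is written lowest element first. */
static void shuffle_ref(const float u[8], const float v[8],
                        const float n[8], float out[24])
{
    for (size_t t = 0; t < 8; ++t) {
        /* Triples appear in source order 0,4,1,5,2,6,3,7. */
        size_t src = (t >> 1) + 4 * (t & 1);
        out[3 * t + 0] = u[src];
        out[3 * t + 1] = v[src];
        out[3 * t + 2] = n[src];
    }
}
```

Comparing the stored output of the AVX version against this array element by element is a quick way to confirm that every shuffle immediate is set correctly before measuring performance.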
TimP
Black Belt
The AVX architecture does support SSE shuffles well. In some cases the compiler generates SSE shuffles only when VECTOR ALWAYS is set (because they would be slow on the original SSE CPUs); in those cases the SSE VECTOR ALWAYS code runs as fast on an AVX-capable CPU as code built with AVX options would. Where shuffle performance is limited by the issue rate for memory reads, the AVX CPU doubles that rate.