I encountered a problem:
I load four 32-bit integers into an xmm register
and I want to use the four integers separately.
In my program I have to shift by 32 bits each time, which is very time consuming.
movd %xmm5, %eax
psrldq $4, %xmm5
movsd (%rdi,%rax,8), %xmm2 #xvalue ---one a time
movd %xmm5, %eax
psrldq $4, %xmm5
movhpd (%rdi,%rax,8), %xmm2
mulpd %xmm2, %xmm0
movd %xmm5, %eax
psrldq $4, %xmm5
movsd (%rdi,%rax,8), %xmm6
movd %xmm5, %eax
movhpd (%rdi,%rax,8), %xmm6
Is there an instruction that gives me the part of the register I need without shifting?
I haven't found such an instruction so far.
Does anyone know of one?
I'm not sure if I understand your problem correctly.
Do you want to insert four different 32-bit integers into an XMM register? Then pinsrd might be the instruction that you are looking for.
If you want to have 4 copies of the same value, you could use vbroadcastss.
It's just the opposite of insert.
I want to extract four integers from an xmm register into four 32-bit registers.
I looked up the instruction set manual and found that pextrw can do this.
But it is also time consuming: it is decoded into 2 uops, and its latency is 3 clocks on Nehalem processors.
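For what it's worth, here is a minimal sketch of one possible alternative, assuming an x86-64 target with SSE2: move 64 bits at a time into a general-purpose register (movq) and split each half with ordinary integer operations, so only two register transfers are needed instead of four. The helper name unpack_indices and its array parameters are hypothetical and only for illustration. Whether this is actually faster than the movd/psrldq sequence on Nehalem would have to be measured.

```c
#include <stdint.h>
#include <string.h>
#if defined(__x86_64__) && defined(__SSE2__)
#include <emmintrin.h>
#define HAVE_SSE2_X64 1
#endif

/* Hypothetical helper (not from the thread): unpack four packed 32-bit
 * indices into four separate integers using two 64-bit moves. */
static void unpack_indices(const int32_t packed[4], uint32_t out[4])
{
#ifdef HAVE_SSE2_X64
    __m128i v = _mm_loadu_si128((const __m128i *)packed);
    /* movq: low 64 bits (elements 0 and 1) into a general-purpose register */
    uint64_t lo = (uint64_t)_mm_cvtsi128_si64(v);
    /* punpckhqdq + movq: high 64 bits (elements 2 and 3) */
    uint64_t hi = (uint64_t)_mm_cvtsi128_si64(_mm_unpackhi_epi64(v, v));
    /* Split each 64-bit value with plain integer operations. */
    out[0] = (uint32_t)lo;
    out[1] = (uint32_t)(lo >> 32);
    out[2] = (uint32_t)hi;
    out[3] = (uint32_t)(hi >> 32);
#else
    /* Fallback for targets without SSE2 on x86-64: plain copy. */
    memcpy(out, packed, 4 * sizeof(uint32_t));
#endif
}
```

If the four indices are already in memory before they are loaded into the xmm register, it may be simpler still to read them from memory with ordinary 32-bit loads and skip the xmm register for the indices entirely.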