In my program (written in assembly language) I have run into a problem:
I load four 32-bit integers into an xmm register
and I want to use the four integers separately.
Right now I have to shift by 32 bits each time, which is very time consuming.
Like this:
movd %xmm5, %eax
psrldq $4, %xmm5
movsd (%rdi,%rax,8), %xmm2    # x value, one at a time

movd %xmm5, %eax
psrldq $4, %xmm5
movhpd (%rdi,%rax,8), %xmm2

mulpd %xmm2, %xmm0

movd %xmm5, %eax
psrldq $4, %xmm5
movsd (%rdi,%rax,8), %xmm6

movd %xmm5, %eax
movhpd (%rdi,%rax,8), %xmm6
Is there an instruction that gives me the part of the register I need without shifting?
I haven't found one so far.
Does anyone know of such an instruction?
2 Replies
I'm not sure I understand your problem correctly.
Do you want to insert four different 32-bit integers into an XMM register? Then pinsrd might be the instruction you are looking for.
If you want four copies of the same value, you could use vbroadcastss.
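As a rough sketch of the two suggestions above: the registers used here (%eax, %ebx, %ecx, %edx as sources, %xmm5 and %xmm0 as destinations, %rsi as a pointer) are hypothetical and chosen only for illustration. pinsrd requires SSE4.1. The register-source form of vbroadcastss requires AVX2, so the memory form (AVX) is shown, along with an SSE2-only alternative using pshufd.

```asm
# Build one xmm register from four separate 32-bit values.
movd    %eax, %xmm5            # lane 0; the upper 96 bits are zeroed
pinsrd  $1, %ebx, %xmm5        # lane 1
pinsrd  $2, %ecx, %xmm5        # lane 2
pinsrd  $3, %edx, %xmm5        # lane 3

# Four copies of the same 32-bit value.
vbroadcastss (%rsi), %xmm0     # AVX: broadcast a 32-bit value from memory

movd    %eax, %xmm0            # SSE2 alternative: value into lane 0 ...
pshufd  $0, %xmm0, %xmm0       # ... then replicate lane 0 into all lanes
```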
Thanks for your reply.
What I need is the opposite of insert.
I want to extract four integers from an xmm register into four 32-bit registers.
I looked in the instruction set manual and found that pextrw can do this.
But it is also time consuming: it is decoded into 2 uops and has a latency of 3 clocks on Nehalem processors.
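For reference, here is a minimal sketch of two ways to read all four 32-bit lanes without shifting %xmm5, using the same registers as the code in the question (%xmm5 holds the four indices, %rdi is the table base). Note that pextrw extracts a 16-bit word; the 32-bit form is pextrd (SSE4.1), which is what the first variant uses. The second variant stores the vector to memory once and reloads each index with an ordinary 32-bit load. Using the 16 bytes below %rsp as scratch space is an assumption: it relies on the x86-64 System V red zone in a leaf function. Neither variant has been timed here, and which one is faster will depend on the microarchitecture.

```asm
# Variant 1: pextrd (SSE4.1) reads any lane directly.
movd    %xmm5, %eax              # lane 0 (writing %eax zero-extends into %rax)
movsd   (%rdi,%rax,8), %xmm2
pextrd  $1, %xmm5, %eax          # lane 1
movhpd  (%rdi,%rax,8), %xmm2
pextrd  $2, %xmm5, %eax          # lane 2
movsd   (%rdi,%rax,8), %xmm6
pextrd  $3, %xmm5, %eax          # lane 3
movhpd  (%rdi,%rax,8), %xmm6

# Variant 2: store the vector once, then reload each index from memory.
movdqu  %xmm5, -16(%rsp)         # assumed scratch space (red zone)
movl    -16(%rsp), %eax          # lane 0
movsd   (%rdi,%rax,8), %xmm2
movl    -12(%rsp), %eax          # lane 1
movhpd  (%rdi,%rax,8), %xmm2
movl    -8(%rsp), %eax           # lane 2
movsd   (%rdi,%rax,8), %xmm6
movl    -4(%rsp), %eax           # lane 3
movhpd  (%rdi,%rax,8), %xmm6
```

Both variants leave %xmm5 unmodified, so the four extractions no longer form a dependency chain through psrldq.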