Hello,
I am writing a paper about interval arithmetic using SSE2 instructions, which forms part of my library for exact real-number computation. While working on it, I realized that SSE3 could have been quite helpful had it been designed slightly differently.
My exact question is this: why did Intel choose to include an addsub instruction rather than a multiplication with one of the arguments negated, i.e. something like
mulpnpd xmm1,xmm2
giving xmm1.1 * xmm2.1, (-xmm1.0) * xmm2.0
With such an instruction, addsubpd would not be needed to compute complex multiplications and divisions.
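To make the comparison concrete, here is a minimal lane-by-lane model in Python, not real SSE code. It assumes a register is a pair (lane 0, lane 1) holding (re, im), that addsubpd subtracts in lane 0 and adds in lane 1, and that the proposed mulpnpd negates lane 0 of its first operand as described above. The helper names (swap, cmul_addsub, cmul_mulpn) are made up for illustration.

```python
# A 128-bit register is modelled as (lane0, lane1); lane0 is the low double.

def mulpd(x, y):
    return (x[0] * y[0], x[1] * y[1])

def addpd(x, y):
    return (x[0] + y[0], x[1] + y[1])

def addsubpd(x, y):
    # SSE3 ADDSUBPD: subtract in the low lane, add in the high lane.
    return (x[0] - y[0], x[1] + y[1])

def mulpnpd(x, y):
    # Hypothetical instruction from the post: lane 0 of x is negated
    # before the multiplication, lane 1 is multiplied as usual.
    return ((-x[0]) * y[0], x[1] * y[1])

def swap(x):
    # Stand-in for a shuffle that exchanges the two lanes.
    return (x[1], x[0])

def cmul_addsub(a, b):
    # a and b hold (re, im).
    t1 = mulpd(a, (b[0], b[0]))        # (a.re*b.re, a.im*b.re)
    t2 = mulpd(swap(a), (b[1], b[1]))  # (a.im*b.im, a.re*b.im)
    return addsubpd(t1, t2)            # (re, im) of a*b

def cmul_mulpn(a, b):
    t1 = mulpd(a, (b[0], b[0]))          # (a.re*b.re, a.im*b.re)
    t2 = mulpnpd(swap(a), (b[1], b[1]))  # (-a.im*b.im, a.re*b.im)
    return addpd(t1, t2)                 # (re, im) of a*b
```

Both routines use two multiplications and one addition-type operation, so in this model the proposed instruction replaces addsubpd with a plain addpd at no extra cost.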
What I believe to be more important, however, is how Intel's sample SSE3 code for complex multiplication behaves when the rounding mode is set to something other than round-to-nearest. More specifically, that code does not compute upper bounds for the product when rounding toward +inf, nor lower bounds when rounding toward -inf, because the product that is subsequently subtracted is rounded in the wrong direction.
This would not be the case if a mulpn instruction were available instead of addsub, because the result of the multiplication would be rounded the correct way. A mulpn would also be very useful for single or double precision interval arithmetic using the SIMD registers.
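The effect can be reproduced with a small Python sketch that emulates rounding toward +inf using exact rational arithmetic; it does not touch the hardware rounding mode. The function names are illustrative, and math.nextafter assumes Python 3.9 or later. Only the real part of the complex product is modelled, since that is the component that gets subtracted.

```python
from fractions import Fraction
import math

def round_up(x):
    """Round the exact rational x to a double, toward +infinity."""
    f = float(x)                 # correctly rounded to nearest
    if Fraction(f) < x:          # nearest value fell below the exact one
        f = math.nextafter(f, math.inf)
    return f

def mul_up(x, y):
    return round_up(Fraction(x) * Fraction(y))

def add_up(x, y):
    return round_up(Fraction(x) + Fraction(y))

def sub_up(x, y):
    return round_up(Fraction(x) - Fraction(y))

def real_part_addsub(a_re, a_im, b_re, b_im):
    # addsub style: both products rounded up, then subtracted.
    return sub_up(mul_up(a_re, b_re), mul_up(a_im, b_im))

def real_part_negated(a_re, a_im, b_re, b_im):
    # mulpn style: the negated product is rounded up, then added.
    return add_up(mul_up(a_re, b_re), mul_up(-a_im, b_im))

eps = 2.0 ** -52
x = 1.0 + eps                                   # x*x is not representable
exact = Fraction(1) - Fraction(x) * Fraction(x)  # exact real part of (1 + x i)^2

below = real_part_addsub(1.0, x, 1.0, x)
above = real_part_negated(1.0, x, 1.0, x)
```

For this input the addsub-style result lies strictly below the exact real part, so it is not an upper bound, while the negated-multiply result is one. Rounding toward -inf fails in the mirror-image way.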
Does anyone know why Intel preferred addsub to this?
- Tags:
- Intel® Advanced Vector Extensions (Intel® AVX)
- Intel® Streaming SIMD Extensions
- Parallel Computing
3 Replies
Greetings from Intel Software Network Support. We will check with our engineering contacts and let you know what we find out.
Regards,
Lexi S.
Intel Software Network Support
email: ISN.support@intel.com
Our engineering contacts responded:
The addsub instruction was added for complex arithmetic. It seemed more natural to handle the "-" with an add-type instruction rather than with a mul as described above. Interval arithmetic was not a factor at all in the decision to add this instruction, but significant improvements in math libraries were obtained with these instructions, confirming that they are useful.
We are always looking for new instructions and feedback to make our architectures better suited to our customers' needs. If you would like to write up your request with a bit more detail and send it to us here, we would be glad to forward the information to our architects to consider the request for future architectures. We would also need to know what you want to use it for.
Regards,
Lexi S.
Intel Software Network Support
Message Edited by intel.software.network.support on 11-15-2005 11:18 PM
I know this is an old post but I am curious to hear if the author has updated his code. There is an instruction BLENDVPD in SSE 4.1 which makes conditional selection of double precision values easier.
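For readers unfamiliar with it, BLENDVPD selects each lane from one of two sources according to the sign bit of the matching lane in a mask register. A rough Python model of that selection follows, with registers again written as (lane 0, lane 1) tuples; this is only an illustration of the semantics, not code from the original poster.

```python
import math

def blendvpd(dst, src, mask):
    """Per lane: take src where the mask's sign bit is set, else keep dst."""
    return tuple(
        s if math.copysign(1.0, m) < 0 else d
        for d, s, m in zip(dst, src, mask)
    )

# Lane 0 has its mask sign bit set (-0.0), lane 1 does not.
picked = blendvpd((1.0, 2.0), (3.0, 4.0), (-0.0, 0.0))
```

Because only the sign bit is tested, the result of a comparison or a sign-carrying operand can serve directly as the mask, which is what makes per-lane case selection convenient.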