Derek_Woodman

Beginner

03-16-2010
04:59 PM

Ipps--multiplying two 16-bit numbers needs 32-bit product

I want to use IPPS to do vector arithmetic. I found the following function:

IppStatus ippsMul_16s_Sfs(const Ipp16s* pSrc1, const Ipp16s* pSrc2, Ipp16s* pDst, int len, int scaleFactor);

However, what if the product of the two numbers requires 32 bits? I want something like:

IppStatus ippsMul_16s32s(const Ipp16s* pSrc1, const Ipp16s* pSrc2, Ipp32s* pDst, int len);

Does such a function exist? Am I just not looking in the right place?

Thanks!
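
For reference, the operation being asked for can be sketched in plain C. This is only a model of the arithmetic, not an IPP function: `mul_16s32s` is a hypothetical name, and the fixed-width types from `<stdint.h>` stand in for `Ipp16s` and `Ipp32s`. It shows why a 32-bit destination is needed: each operand is widened before the multiply, so a product such as 300 * 300 = 90000 is kept instead of being clipped to the 16-bit range.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not part of IPP): multiply two 16-bit vectors
 * element by element and store the full 32-bit products. */
static void mul_16s32s(const int16_t *src1, const int16_t *src2,
                       int32_t *dst, int len)
{
    assert(len >= 0);
    for (int i = 0; i < len; ++i) {
        /* Widen first so the multiply is carried out in 32 bits. */
        dst[i] = (int32_t)src1[i] * (int32_t)src2[i];
    }
}
```

The largest possible magnitude, (-32768) * (-32768) = 2^30, still fits in a signed 32-bit value, so no saturation is needed in this model.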

3 Replies

Derek_Woodman

Beginner

03-16-2010
05:04 PM

IppStatus ippsMul_16s32s_Sfs(const Ipp16s* pSrc1, const Ipp16s* pSrc2, Ipp32s* pDst, int len, int scaleFactor);

That should work for multiplying two vectors. But what about the MulC variant? Do I have to copy the 16-bit vector into a 32-bit vector and then just use the 32s variant?

Also, I have a more general question. How long does it take to copy vectors? And how do in-place functions compare with not-in-place functions? Is there a large performance difference between them?

Derek_Woodman

Beginner

03-16-2010
05:09 PM

But I meant to comment on the functions I found:

When there is a scale factor at the end, does it affect performance much? I don't really need it, but there isn't a version without the scale factor. If I set it to 1, is that really the same as not having a scale factor?
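
On the scale factor value: in IPP's integer functions with the `_Sfs` suffix, the result is multiplied by 2^(-scaleFactor) and then saturated to the output type, so a value of 0 (not 1) corresponds to no scaling, and a value of 1 halves every result. The plain-C sketch below models that behavior for a 16-bit multiply. The names `saturate_16s` and `mul_16s_sfs_model` are hypothetical, only non-negative scale factors are handled, and the rounding used here (truncation toward zero) is a simplification; the library's exact rounding should be checked in the IPP reference.

```c
#include <assert.h>
#include <stdint.h>

/* Clamp a 32-bit value to the signed 16-bit range. */
static int16_t saturate_16s(int32_t v)
{
    if (v > INT16_MAX) return INT16_MAX;
    if (v < INT16_MIN) return INT16_MIN;
    return (int16_t)v;
}

/* Hypothetical model (not an IPP call) of a scaled 16-bit multiply:
 * dst[i] = saturate(src1[i] * src2[i] / 2^scale_factor). */
static void mul_16s_sfs_model(const int16_t *src1, const int16_t *src2,
                              int16_t *dst, int len, int scale_factor)
{
    assert(len >= 0);
    assert(scale_factor >= 0 && scale_factor < 31);
    int32_t divisor = (int32_t)1 << scale_factor;
    for (int i = 0; i < len; ++i) {
        /* Full-precision product, then scale, then saturate. */
        int32_t product = (int32_t)src1[i] * (int32_t)src2[i];
        dst[i] = saturate_16s(product / divisor);
    }
}
```

With a scale factor of 0, 100 * 100 gives 10000 while 300 * 300 saturates to 32767; with a scale factor of 1, 100 * 100 gives 5000.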

Also, there is a Sub function that looks like this:

IppStatus ippsSub_16s32f(const Ipp16s* pSrc1, const Ipp16s* pSrc2, Ipp32f* pDst, int len);

I don't really want a floating point representation because I am just working with integers. Why isn't there a 32s version?
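
An integer-only version of that subtraction is easy to model in plain C. In the sketch below, `sub_16s32s` is a hypothetical name that is not claimed to exist in IPP, and the operand order (first source subtracted from the second) is an assumption meant to mirror the convention of the IPP Sub functions. The difference of two 16-bit values needs at most 17 bits, so a 32-bit destination never overflows.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not part of IPP): dst[i] = src2[i] - src1[i],
 * computed and stored in 32 bits. The operand order is an assumption. */
static void sub_16s32s(const int16_t *src1, const int16_t *src2,
                       int32_t *dst, int len)
{
    assert(len >= 0);
    for (int i = 0; i < len; ++i) {
        /* Widen both operands so the difference cannot wrap. */
        dst[i] = (int32_t)src2[i] - (int32_t)src1[i];
    }
}
```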

renegr

New Contributor I

03-17-2010
04:34 AM

If you need to process data at a higher bit depth, it is often faster to convert once at the beginning of the algorithm, process the 32-bit data, and convert back to 16-bit at the end.

Every function with different input and output bit depths has to convert each value (with SSE2, 8 values at a time) to the higher bit depth before doing the operation. This adds considerable latency and therefore reduces performance.
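
The approach described above (widen once, do all the arithmetic in 32 bits, narrow once at the end) can be sketched in plain C as follows. All names here are hypothetical stand-ins, not IPP calls, and the two processing steps (multiply by a constant, then add an offset) are only placeholders for a real algorithm. The caller supplies the 32-bit work buffer.

```c
#include <assert.h>
#include <stdint.h>

/* Clamp a 64-bit intermediate to the signed 32-bit range. */
static int32_t clamp_32s(int64_t v)
{
    if (v > INT32_MAX) return INT32_MAX;
    if (v < INT32_MIN) return INT32_MIN;
    return (int32_t)v;
}

/* Clamp a 32-bit value to the signed 16-bit range. */
static int16_t clamp_16s(int32_t v)
{
    if (v > INT16_MAX) return INT16_MAX;
    if (v < INT16_MIN) return INT16_MIN;
    return (int16_t)v;
}

/* Hypothetical pipeline: dst[i] = clamp16(src[i] * gain + offset).
 * work must hold at least len 32-bit elements. */
static void scale_and_offset_16s(const int16_t *src, int16_t *dst,
                                 int32_t *work, int len,
                                 int32_t gain, int32_t offset)
{
    assert(len >= 0);

    /* Step 1: widen the 16-bit input once. */
    for (int i = 0; i < len; ++i)
        work[i] = src[i];

    /* Step 2: do every arithmetic step in place on the 32-bit data. */
    for (int i = 0; i < len; ++i)
        work[i] = clamp_32s((int64_t)work[i] * gain);
    for (int i = 0; i < len; ++i)
        work[i] = clamp_32s((int64_t)work[i] + offset);

    /* Step 3: narrow back to 16 bits once, with saturation. */
    for (int i = 0; i < len; ++i)
        dst[i] = clamp_16s(work[i]);
}
```

Whether this beats mixed-width functions in practice depends on vector length and the number of processing steps, so it is worth measuring on the target machine.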

Of course, some functions with different input and output bit depths are missing, but I think even Intel needs time to react to customer wishes and implement them. So maybe they will be there someday :)
