need help optimizing a function...

Joe_Wezorek — Tue, 23 Aug 2011 18:00:20 GMT

I need help optimizing an implementation of a function to compute the normal probability density function of a vector of n values. This is for use in a mex call from Matlab, so basically my goal is for the IPP-implemented version of the function to be faster than Matlab's "normpdf" toolbox function.

I have a working implementation that uses ippsSub (not in place), followed by ippsSqr, ippsMulC, ippsExp, and ippsDiv (all in place). I take the input values from Matlab without copying them and allocate the output array with a MathWorks-provided call. I then call the not-in-place subtraction from the input to the output buffer and do the rest of the arithmetic in place on the output buffer.

The result is faster than Matlab for single-precision data when n is less than about a million. Above a million, Matlab is consistently faster than my implementation. This doesn't make sense to me: if anything, I would expect the IPP-based implementation to be slower for small n and faster for large n. Can anyone explain the behavior I'm seeing, or suggest optimizations I can make?

Also, in the IPP "Reference Manual, Volume 1: Signal Processing" from March 2009, I see that there used to be a call in the "Speech Recognition Functions" section called ippsExpNegSqr, but it doesn't seem to be around any more. What happened to this function?

Naveen_G_Intel — Thu, 25 Aug 2011 10:53:56 GMT

Hi,

Regarding the speech recognition functions: as stated in the IPP 7.0 release notes, "The speech recognition functions (ippSR domain) are not part of this release; this domain will continue to be supported in the IPP 6.1 product."

http://software.intel.com/en-us/articles/intel-ipp-70-library-release-notes/

Regards,

Naveen Gv

levicki — Thu, 25 Aug 2011 22:04:01 GMT

Regarding your question about n: if I am not mistaken, IPP is faster for smaller data sets, while MKL should be faster for larger ones. Your mileage may vary.
