Intel® Integrated Performance Primitives
Deliberate problems developing high-performance vision, signal, security, and storage applications.

Feature request: partial sum

Dan_R_
Beginner
995 Views

Hi, 

It would be great to have a partial sum (a.k.a. cumulative sum) function added to ipps for 1D arrays, similar to C++'s std::partial_sum / std::inclusive_scan.

The closest existing alternative I've found is ippiIntegral in IPP's image processing domain, but it has some drawbacks:

1) When it is used to sum a single row, the i-th element of the row isn't included in the i-th element of the partial sum (like C++'s std::exclusive_scan), which makes it awkward to use when what you want is an inclusive scan.

2) It supports 2D arrays (for images), which means:

- The interface is overly complex for the simple, common use case of partial-summing a 1D array

- Some performance is probably left on the table, since the algorithm is more generic than the 1D use case needs

I suggest adding both flavors of the 1D partial sum: the inclusive one and the exclusive one.
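To make the two flavors concrete, here is a minimal sketch in plain C++ (no IPP calls). The helper names inclusive_sum, exclusive_sum and inclusive_from_exclusive are hypothetical, chosen only for illustration. The last helper shows the extra pass needed today if all you have is an exclusive result.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Inclusive scan: out[i] = in[0] + ... + in[i].
std::vector<float> inclusive_sum(const std::vector<float>& in) {
    std::vector<float> out(in.size());
    std::partial_sum(in.begin(), in.end(), out.begin());
    return out;
}

// Exclusive scan: out[i] = in[0] + ... + in[i-1], with out[0] = 0.
std::vector<float> exclusive_sum(const std::vector<float>& in) {
    std::vector<float> out(in.size());
    float running = 0.0f;
    for (std::size_t i = 0; i < in.size(); ++i) {
        out[i] = running;   // store the sum of everything before element i
        running += in[i];
    }
    return out;
}

// Turning an exclusive result into an inclusive one costs one more pass:
// inclusive[i] = exclusive[i] + in[i].
std::vector<float> inclusive_from_exclusive(const std::vector<float>& in,
                                            const std::vector<float>& excl) {
    assert(in.size() == excl.size());
    std::vector<float> out(in.size());
    for (std::size_t i = 0; i < in.size(); ++i) {
        out[i] = excl[i] + in[i];
    }
    return out;
}
```

For the input {1, 2, 3, 4}, the inclusive result is {1, 3, 6, 10} and the exclusive result is {0, 1, 3, 6}.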

 

I've seen additional indications of interest in this functionality in these posts:

https://software.intel.com/en-us/forums/intel-integrated-performance-primitives/topic/310184

https://software.intel.com/en-us/forums/intel-integrated-performance-primitives/topic/306961

https://software.intel.com/en-us/forums/intel-integrated-performance-primitives/topic/309742

 

Thanks!

Dan

6 Replies
Jonghak_K_Intel
Employee
995 Views

Hi Dan,

 

Thanks for your suggestion.

Could you share more about your situation, for example which project you are working on that needs this particular function?

It sounds great, but it needs to go through internal discussion before it can enter our process.

 

Could you submit your request through https://supporttickets.intel.com/?lang=en-US ?

This is the official online service center.

 

Thank you

Dan_R_
Beginner
995 Views

Done, the request number is 03510375.

As for the use case: we make audio processing effects at Sound Radix, for which we implement various algorithms. This is a function we would use in those algorithms.

Thanks,

Dan

Jonghak_K_Intel
Employee
995 Views

The feature request has been escalated and is being processed in the Service Center. 

 

Dan_R_
Beginner
995 Views

Thanks!

Dan

Jonghak_K_Intel
Employee
995 Views

Hi Dan, 

 

Unfortunately, the engineering team decided not to implement this feature.

The reason is that it can't be vectorized because of the feedback dependency (each output point depends on all previous ones), so we can't expect any significant speedup from optimization in IPP.

Instead, one can use a very simple C loop and compile it with ICC using the corresponding architecture switch, but we don't guarantee that this will give any significant speedup over the MS standard library.
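The simple loop mentioned above could look roughly like the following sketch. The function name cumulative_sum is hypothetical and is not an IPP function. The loop-carried accumulator acc is the feedback dependency described in the reply.

```cpp
#include <cassert>
#include <cstddef>

// Inclusive cumulative sum of src into dst (hypothetical helper, not part of IPP).
// dst may be the same pointer as src, in which case the sum is computed in place.
void cumulative_sum(const float* src, float* dst, std::size_t len) {
    assert(len == 0 || (src != nullptr && dst != nullptr));
    float acc = 0.0f;               // running total carried between iterations
    for (std::size_t i = 0; i < len; ++i) {
        acc += src[i];              // depends on the result of iteration i - 1
        dst[i] = acc;
    }
}
```

Each element is read before the same index is written, so calling it as cumulative_sum(buf, buf, n) is safe.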

 

For the reason above, we decided not to implement this functionality in IPP.

 

Thank you for your request and for using IPP.

BMart1
New Contributor II
995 Views

Can't you adapt https://developer.nvidia.com/gpugems/GPUGems3/gpugems3_ch39.html ? Is the 2d ippi version vectorized?
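For context, the linked GPU Gems chapter covers parallel prefix sum (scan). The simplest scheme of that kind restructures the dependency into about log2(n) passes, where every element within a pass can be updated independently. Below is a sequential C++ sketch of that log-step idea (often called a Hillis-Steele scan). The name scan_log_steps is hypothetical, and this is an illustration only: it performs more total additions than the plain loop, and nothing in this thread establishes whether it would be faster with CPU SIMD.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Inclusive scan in about log2(n) passes (Hillis-Steele scheme).
// Within one pass, every next[i] depends only on values from the previous pass,
// so the inner loop has no loop-carried dependency.
std::vector<float> scan_log_steps(std::vector<float> data) {
    std::vector<float> next(data.size());
    for (std::size_t offset = 1; offset < data.size(); offset *= 2) {
        for (std::size_t i = 0; i < data.size(); ++i) {
            // Add the partial sum that ends `offset` positions earlier, if any.
            next[i] = (i >= offset) ? data[i] + data[i - offset] : data[i];
        }
        data.swap(next);  // the pass output becomes the next pass input
    }
    return data;
}
```

For the input {1, 2, 3, 4}, the first pass gives {1, 3, 5, 7} and the second gives {1, 3, 6, 10}, matching the sequential inclusive sum.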
