ippsFIRSRGetSize results in extremely large bufSize

Hi,

I am using IPPS versions 2018 Update 3 and 2019 Update 1, and both give the same result for the following call:

ippsFIRSRGetSize(TAPS_LEN, ipp32f, &specSize, &bufSize);

No matter what TAPS_LEN is, bufSize is > 32K. This is an extremely large buffer for, e.g., a 4-tap FIR filter. Both specSize and bufSize are of type int, as the documentation says. A general-purpose IIR filter of the same order takes up much less memory.

Is this an error in IPPS? Or what could the reason be?

6 Replies
Employee

Hi Bo.

Many IPP customers need the so-called in-place mode of functions (pDst == pSrc), where the source and destination vectors are the same. To handle this case correctly and store temporary data, FIRSR needs about 32K (the L1 cache size) of reserved buffer. The ippsFIRSRGetSize API has no information about whether the call will be in-place or not-in-place, so it requests the maximum buffer size.

Thanks.
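
Editor's note: to illustrate why in-place filtering needs temporary storage at all, here is a minimal plain-C sketch of a direct-form FIR that stays correct when dst == src. It is not IPP's implementation. The names fir_block, MAX_TAPS and BLOCK are hypothetical, and the block size is an arbitrary choice for the example, far smaller than the ~32K that IPP reserves.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_TAPS 64   /* hypothetical upper bound for this sketch */
#define BLOCK    256  /* hypothetical block size, in samples */

/* Direct-form FIR: dst[n] = sum over k of taps[k] * src[n - k].
 * dly holds the previous tapsLen - 1 input samples, oldest first.
 * Safe when dst == src: each input block is copied into scratch
 * before any output sample of that block is written. */
void fir_block(const float *src, float *dst, size_t n,
               const float *taps, size_t tapsLen, float *dly)
{
    float scratch[MAX_TAPS - 1 + BLOCK];
    size_t hist;

    assert(tapsLen >= 1 && tapsLen <= MAX_TAPS);
    hist = tapsLen - 1;

    for (size_t pos = 0; pos < n; ) {
        size_t len = (n - pos < BLOCK) ? n - pos : BLOCK;

        /* scratch = [history | current input block] */
        memcpy(scratch, dly, hist * sizeof(float));
        memcpy(scratch + hist, src + pos, len * sizeof(float));

        for (size_t i = 0; i < len; i++) {
            float acc = 0.0f;
            for (size_t k = 0; k < tapsLen; k++)
                acc += taps[k] * scratch[hist + i - k];
            dst[pos + i] = acc;   /* may overwrite src; scratch keeps the input */
        }

        /* The last hist samples of scratch become the new history. */
        memcpy(dly, scratch + len, hist * sizeof(float));
        pos += len;
    }
}
```

Without the copy into scratch, writing dst[pos + i] would destroy input samples that later outputs still need whenever dst and src alias.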

 

Beginner

Hi Andrey,

Is there a workaround for this? We are not using in-place processing, but we are working with a hard limit of < 32K for mallocs. The reason is that we are working in an MS APO context, so we MUST use AERT_Allocate to allocate memory, which is limited to 32K. In addition, a few bytes are wasted due to IPP's memory alignment requirements.

https://docs.microsoft.com/en-us/windows/desktop/api/baseaudioprocessingobject/nf-baseaudioprocessin...

Best regards

Troels Blum
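
Editor's note: for scale, a hand-written not-in-place FIR needs only the taps and a delay line. The plain-C sketch below keeps everything in one caller-supplied block and includes worst-case alignment padding in the requested size. It is an illustration under stated assumptions, not an IPP API and not a confirmed workaround: SmallFir, small_fir_bytes, small_fir_init, small_fir_step, small_fir_demo and the 64-byte FIR_ALIGN are all hypothetical, and malloc merely stands in for whatever size-limited allocator is in use.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define FIR_ALIGN 64  /* assumed alignment, chosen for illustration */

typedef struct {
    size_t tapsLen;
    size_t pos;    /* index of the newest sample in dly */
    float *taps;   /* tapsLen coefficients */
    float *dly;    /* circular delay line, tapsLen samples */
} SmallFir;

/* Bytes to request from the allocator, including alignment padding. */
size_t small_fir_bytes(size_t tapsLen)
{
    return sizeof(SmallFir) + 2 * tapsLen * sizeof(float) + (FIR_ALIGN - 1);
}

/* Lay out header, taps and delay line inside the caller's block. */
SmallFir *small_fir_init(void *mem, const float *taps, size_t tapsLen)
{
    uintptr_t p = ((uintptr_t)mem + (FIR_ALIGN - 1)) & ~(uintptr_t)(FIR_ALIGN - 1);
    SmallFir *f = (SmallFir *)p;

    assert(tapsLen >= 1);
    f->tapsLen = tapsLen;
    f->pos = 0;
    f->taps = (float *)(f + 1);
    f->dly = f->taps + tapsLen;
    memcpy(f->taps, taps, tapsLen * sizeof(float));
    memset(f->dly, 0, tapsLen * sizeof(float));
    return f;
}

/* Push one input sample and return one output sample. */
float small_fir_step(SmallFir *f, float x)
{
    float acc = 0.0f;
    size_t idx;

    f->pos = (f->pos + 1) % f->tapsLen;
    f->dly[f->pos] = x;
    idx = f->pos;
    for (size_t k = 0; k < f->tapsLen; k++) {
        acc += f->taps[k] * f->dly[idx];          /* taps[k] * x[n - k] */
        idx = (idx == 0) ? f->tapsLen - 1 : idx - 1;
    }
    return acc;
}

/* Usage sketch: a 4-tap moving average fed with ones. */
float small_fir_demo(void)
{
    const float taps[4] = {0.25f, 0.25f, 0.25f, 0.25f};
    void *mem = malloc(small_fir_bytes(4));
    float y = 0.0f;

    if (mem == NULL) return 0.0f;
    SmallFir *f = small_fir_init(mem, taps, 4);
    for (int i = 0; i < 4; i++) y = small_fir_step(f, 1.0f);
    free(mem);
    return y;
}
```

For a 4-tap filter the requested size is far below the 32K limit, but a scalar loop like this gives up the vectorized code paths that make IPP attractive in the first place.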

 


Hi Andrey, thank you for your answer.

Just to let you know, Troels Blum is my colleague, and I second his question.

//Bo

Employee

Hi Bo and Troels,

How many taps do you use?

Internally, in addition to supporting "in-place" mode, ippsFIR has at least 3 different algorithm implementations. For fairly small filter orders (roughly < 32; the criterion also depends on the CPU architecture) it uses so-called "vertical" unrolling. For roughly 32 to 64 taps it uses so-called "horizontal" unrolling. For higher filter orders it uses an FFT-based algorithm (convolution theorem). I guess it's clear that the last one also requires more memory for internal buffers than the first two.

regards, Igor
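
Editor's note: the selection Igor describes can be pictured as a simple dispatch on the number of taps. The sketch below is only a reading of his post: FirAlgo and pick_fir_algo are hypothetical names, and the thresholds 32 and 64 are the approximate figures he quotes, which he says also depend on the CPU architecture. It does not reflect IPP's actual internal code.

```c
#include <assert.h>

typedef enum {
    FIR_VERTICAL,    /* "vertical" unrolling, small filter orders */
    FIR_HORIZONTAL,  /* "horizontal" unrolling, medium filter orders */
    FIR_FFT          /* FFT-based convolution, large filter orders */
} FirAlgo;

/* Pick an implementation from the tap count.
 * Thresholds are approximate values taken from the post above. */
FirAlgo pick_fir_algo(int tapsLen)
{
    assert(tapsLen >= 1);
    if (tapsLen < 32)  return FIR_VERTICAL;
    if (tapsLen <= 64) return FIR_HORIZONTAL;
    return FIR_FFT;
}
```

Since ippsFIRSRGetSize receives tapsLen, a dispatch like this could in principle size the buffer per algorithm. According to the earlier reply, though, the in-place reserve of about 32K is requested regardless.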

Moderator

Hello Bo and Troels,

If you are really interested in having this feature implemented, could you submit a feature request to the Intel Online Service Center?

 


Hi Igor and Gennady F.,

Thank you for your answers. We will consider our options and get back to you if it is relevant.

//Bo
