Hi,
I am using IPPS version 2018 Update 3 and 2019 Update 1, and both give the same result for the following call:
ippsFIRSRGetSize(TAPS_LEN, ipp32f, &specSize, &bufSize);
Whatever the value of TAPS_LEN, bufSize is larger than 32K. That is an extremely large buffer for, e.g., a 4-tap FIR filter. Both specSize and bufSize are of type int, as the documentation says. A general-purpose IIR filter of the same order takes up much less memory.
Is this an error in IPPS? If not, what could the reason be?
Hi Bo.
Many IPP customers need the so-called in-place mode of the functions (pDst == pSrc), where the source and destination vectors are the same. To handle this case properly and store temporary data, FIRSR needs about 32K (the L1 size) of reserved buffer. The ippsFIRSRGetSize API has no information about whether not-in-place or in-place mode will be used, so it requests the maximum buffer size.
Thanks.
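
To illustrate why in-place operation needs temporary storage at all, here is a minimal sketch in plain C. It is not IPP code and not how FIRSR is implemented; `fir_inplace_capable`, `hist` and `scratch` are hypothetical names chosen for this example. The point it shows: when dst == src, the last tapsLen-1 input samples must be saved somewhere before they are overwritten, because the next block still needs them as history.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper, NOT part of IPP: single-rate FIR that tolerates dst == src.
 * taps[0] multiplies the newest sample.
 * hist holds the taps_len-1 input samples preceding src[0], oldest first,
 * and is updated so the next call continues the stream.
 * scratch is temporary storage for taps_len-1 samples. */
void fir_inplace_capable(const float *taps, int taps_len,
                         const float *src, float *dst, int len,
                         float *hist, float *scratch)
{
    int hlen = taps_len - 1;
    assert(taps_len >= 1 && len >= 0);

    /* Save the future history BEFORE any output is written,
     * since writing dst may destroy src when they alias. */
    for (int i = 0; i < hlen; ++i) {
        int idx = len - hlen + i;            /* negative: still in old history */
        scratch[i] = (idx >= 0) ? src[idx] : hist[hlen + idx];
    }

    /* Walk backwards: y[n] reads x[n], x[n-1], ..., so overwriting slot n
     * never disturbs an input that a later iteration (smaller n) reads. */
    for (int n = len - 1; n >= 0; --n) {
        float acc = 0.0f;
        for (int k = 0; k < taps_len; ++k) {
            int idx = n - k;
            float x = (idx >= 0) ? src[idx] : hist[hlen + idx];
            acc += taps[k] * x;
        }
        dst[n] = acc;
    }

    if (hlen > 0)
        memcpy(hist, scratch, (size_t)hlen * sizeof(float));
}
```

In this sketch the extra storage is only tapsLen-1 samples; a library that also blocks the data for cache efficiency may reasonably ask for much more, which is consistent with the L1-sized figure mentioned above.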
Hi Andrey,
Is there a workaround for this? We are not using in-place processing, but we are working with a hard limit of < 32K for mallocs. The reason is that we are working in an MS APO context, so we MUST use AERT_Allocate to allocate memory, which is limited to 32K. In addition, a few bytes are wasted due to IPP's memory alignment requirements.
Best regards
Troels Blum
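
The alignment overhead mentioned above can be made concrete with a small sketch. Everything here is an assumption for illustration: the 64-byte alignment value, the limit being passed in as a byte count, and the helper names `padded_size`, `align_up` and `fits_under_limit`, none of which come from IPP or the APO API. The idea is simply that a general-purpose allocator may return an unaligned pointer, so one requests `needed + alignment - 1` bytes and rounds the pointer up.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Worst-case number of bytes to request so that an aligned region
 * of `needed` bytes is guaranteed to fit inside the allocation. */
size_t padded_size(size_t needed, size_t alignment)
{
    return needed + alignment - 1;
}

/* Round a pointer up to the next multiple of `alignment`
 * (alignment must be a power of two). */
void *align_up(void *p, size_t alignment)
{
    uintptr_t addr = (uintptr_t)p;
    uintptr_t mask = (uintptr_t)alignment - 1;
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (void *)((addr + mask) & ~mask);
}

/* Nonzero if the padded request stays within the allocator's limit. */
int fits_under_limit(size_t needed, size_t alignment, size_t limit)
{
    return padded_size(needed, alignment) <= limit;
}
```

With these assumptions, a reported buffer size of exactly 32768 bytes already fails a 32768-byte limit once up to 63 bytes of padding are added, which is why a size of "about 32K" is a problem here even before it exceeds the limit on paper.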
Hi Andrey, thank you for your answer.
Just so you know, Troels Blum is my colleague, and I second his question.
//Bo
Hi Bo and Troels,
How many taps do you use?
Internally, in addition to supporting in-place mode, ippsFIR has at least three different algorithm implementations. For rather small filter orders (below roughly 32; the exact criterion also depends on the CPU architecture) it uses so-called "vertical" unrolling. For orders of roughly 32 to 64 it uses so-called "horizontal" unrolling. For higher filter orders it uses an FFT-based algorithm (convolution theorem). I guess it's clear that the last one also requires more memory for internal buffers than the first two.
regards, Igor
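
The selection described above can be pictured as a simple dispatch on the filter length. This is only a sketch under stated assumptions: the enum, the function name `choose_fir_algo` and the hard thresholds of 32 and 64 are hypothetical, since the real cut-over points are approximate and CPU dependent, and the library's internal logic is not public in this thread.

```c
#include <assert.h>

/* Hypothetical labels for the three strategies named in the post. */
typedef enum {
    FIR_ALGO_VERTICAL_UNROLL,    /* small filters */
    FIR_ALGO_HORIZONTAL_UNROLL,  /* medium filters */
    FIR_ALGO_FFT                 /* long filters, needs the most work memory */
} fir_algo;

/* Illustrative thresholds only; the post gives them as "about 32"
 * and "about 64" and notes that they vary with the CPU architecture. */
fir_algo choose_fir_algo(int taps_len)
{
    assert(taps_len >= 1);
    if (taps_len < 32)
        return FIR_ALGO_VERTICAL_UNROLL;
    if (taps_len <= 64)
        return FIR_ALGO_HORIZONTAL_UNROLL;
    return FIR_ALGO_FFT;
}
```

Under this picture a 4-tap filter would take the first branch, so the large buffer reported for it would come from the size query having to cover the worst case rather than from the algorithm actually used.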
Hello Bo and Troels,
If you are really interested in having this feature implemented, could you submit a feature request to the Intel Online Service Center?
Hi Igor and Gennady F.,
Thank you for your answers. We will consider our options and get back to you if it is relevant.
//Bo