Intel® Integrated Performance Primitives
Deliberate problems developing high-performance vision, signal, security, and storage applications.

Bug in ippsFIRMR_16sc

kwadolowski
Novice
1,939 Views

I am seeing inconsistent behavior between two variants of the FIRMR function, which performs multirate filtering. ippsFIRMR_32fc works as expected; ippsFIRMR_16sc does not.
Below is the code that demonstrates the difference, followed by a printout of the results. After the code and results I highlight the problem areas.
Note that the 32fc and 16sc versions are practically identical.

Code:

#include <iomanip>
#include <iostream>
extern "C"
{
#include <ipps.h>
}

void fc32_ver();
void sc16_ver();

int main() {
fc32_ver();
std::cout << std::endl << std::endl;
sc16_ver();
return 0;
}

#define PRINT_PTR(NAME, PTR, LEN) \
{ \
std::cout << NAME << ": "; \
for (size_t iii = 0; iii < LEN; ++iii) \
std::cout << "(" << std::setw(2) << PTR[iii].re << ", " << std::setw(2) << PTR[iii].im << "), "; \
std::cout << std::endl; \
}

void fc32_ver() {
std::cout << "Complex Float 32 version" << std::endl;
IppsFIRSpec_32fc* pSpec;
constexpr int tapsLen = 17;
Ipp32fc pTaps[tapsLen];
for (int i = 0; i < tapsLen; ++i)
pTaps[i] = {1, 0};

constexpr int upFactor = 1;
constexpr int upPhase = 0;
constexpr int downFactor = 2;
constexpr int downPhase = 0;
int specSize, bufSize;
constexpr int num_iters = 3;
constexpr int total_input_samples = num_iters * downFactor;
constexpr int total_output_samples = num_iters * upFactor;
constexpr int delay_length = (tapsLen + upFactor - 1) / upFactor;

// Initialize the FIRMR specification structure
Ipp8u* pBuf;
ippsFIRMRGetSize(tapsLen, upFactor, downFactor, ipp32fc, &specSize, &bufSize);
pSpec = (IppsFIRSpec_32fc*)ippsMalloc_8u(specSize);
pBuf = ippsMalloc_8u(bufSize);
ippsFIRMRInit_32fc(pTaps, tapsLen, upFactor, upPhase, downFactor, downPhase, pSpec);

// Initialize input
Ipp32fc pSrc[total_input_samples];
for (int i = 0; i < total_input_samples; ++i)
pSrc[i] = {1, -1};

// Output and delay-line buffers for the single-call run
Ipp32fc pDst_one_call[total_output_samples];
Ipp32fc pDlyDst_one_call[delay_length];

// Output and delay-line buffers for the three-call run (one delay line per call)
Ipp32fc pDst_three_calls[total_output_samples];
Ipp32fc pDlyDst_three_calls_0[delay_length];
Ipp32fc pDlyDst_three_calls_1[delay_length];
Ipp32fc pDlyDst_three_calls_2[delay_length];

// One call version
{
ippsFIRMR_32fc(pSrc, pDst_one_call, num_iters, pSpec, nullptr, pDlyDst_one_call, pBuf);
}

// Three call version
{
ippsFIRMR_32fc(pSrc + 0 * downFactor, pDst_three_calls + 0 * upFactor, 1, pSpec, nullptr, pDlyDst_three_calls_0, pBuf);
ippsFIRMR_32fc(pSrc + 1 * downFactor, pDst_three_calls + 1 * upFactor, 1, pSpec, pDlyDst_three_calls_0, pDlyDst_three_calls_1, pBuf);
ippsFIRMR_32fc(pSrc + 2 * downFactor, pDst_three_calls + 2 * upFactor, 1, pSpec, pDlyDst_three_calls_1, pDlyDst_three_calls_2, pBuf);
}

PRINT_PTR("Output One Calls", pDst_one_call, total_output_samples);
PRINT_PTR("Output Three Calls", pDst_three_calls, total_output_samples);
std::cout << std::endl << std::endl;

std::cout << "Delay Destination at End of Calls" << std::endl;
PRINT_PTR("Delay Dst One Calls ", pDlyDst_one_call, delay_length);
PRINT_PTR("Delay Dst Three Calls 2", pDlyDst_three_calls_2, delay_length);

std::cout << std::endl << std::endl;
std::cout << "All delay dest results from three call version" << std::endl;
PRINT_PTR("Delay Dst Three Calls 0", pDlyDst_three_calls_0, delay_length);
PRINT_PTR("Delay Dst Three Calls 1", pDlyDst_three_calls_1, delay_length);
PRINT_PTR("Delay Dst Three Calls 2", pDlyDst_three_calls_2, delay_length);

// Release the buffers allocated with ippsMalloc_8u
ippsFree(pSpec);
ippsFree(pBuf);
}

void sc16_ver() {
std::cout << "Complex Int 16 version" << std::endl;
IppsFIRSpec_32fc* pSpec;
constexpr int tapsLen = 17;
Ipp32fc pTaps[tapsLen];
for (int i = 0; i < tapsLen; ++i)
pTaps[i] = {1, 0};

constexpr int upFactor = 1;
constexpr int upPhase = 0;
constexpr int downFactor = 2;
constexpr int downPhase = 0;
int specSize, bufSize;
constexpr int num_iters = 3;
constexpr int total_input_samples = num_iters * downFactor;
constexpr int total_output_samples = num_iters * upFactor;
constexpr int delay_length = (tapsLen + upFactor - 1) / upFactor;

// Initialize the FIRMR specification structure
Ipp8u* pBuf;
ippsFIRMRGetSize(tapsLen, upFactor, downFactor, ipp32fc, &specSize, &bufSize);
pSpec = (IppsFIRSpec_32fc*)ippsMalloc_8u(specSize);
pBuf = ippsMalloc_8u(bufSize);
ippsFIRMRInit_32fc(pTaps, tapsLen, upFactor, upPhase, downFactor, downPhase, pSpec);

// Initialize input
Ipp16sc pSrc[total_input_samples];
for (int i = 0; i < total_input_samples; ++i)
pSrc[i] = {1, -1};

// Output and delay-line buffers for the single-call run
Ipp16sc pDst_one_call[total_output_samples];
Ipp16sc pDlyDst_one_call[delay_length];

// Output and delay-line buffers for the three-call run (one delay line per call)
Ipp16sc pDst_three_calls[total_output_samples];
Ipp16sc pDlyDst_three_calls_0[delay_length];
Ipp16sc pDlyDst_three_calls_1[delay_length];
Ipp16sc pDlyDst_three_calls_2[delay_length];

// One call version
{
ippsFIRMR_16sc(pSrc, pDst_one_call, num_iters, pSpec, nullptr, pDlyDst_one_call, pBuf);
}

// Three call version
{
ippsFIRMR_16sc(pSrc + 0 * downFactor, pDst_three_calls + 0 * upFactor, 1, pSpec, nullptr, pDlyDst_three_calls_0, pBuf);
ippsFIRMR_16sc(pSrc + 1 * downFactor, pDst_three_calls + 1 * upFactor, 1, pSpec, pDlyDst_three_calls_0, pDlyDst_three_calls_1, pBuf);
ippsFIRMR_16sc(pSrc + 2 * downFactor, pDst_three_calls + 2 * upFactor, 1, pSpec, pDlyDst_three_calls_1, pDlyDst_three_calls_2, pBuf);
}

PRINT_PTR("Output One Calls", pDst_one_call, total_output_samples);
PRINT_PTR("Output Three Calls", pDst_three_calls, total_output_samples);
std::cout << std::endl << std::endl;

std::cout << "Delay Destination at End of Calls" << std::endl;
PRINT_PTR("Delay Dst One Calls ", pDlyDst_one_call, delay_length);
PRINT_PTR("Delay Dst Three Calls 2", pDlyDst_three_calls_2, delay_length);

std::cout << std::endl << std::endl;
std::cout << "All delay dest results from three call version" << std::endl;
PRINT_PTR("Delay Dst Three Calls 0", pDlyDst_three_calls_0, delay_length);
PRINT_PTR("Delay Dst Three Calls 1", pDlyDst_three_calls_1, delay_length);
PRINT_PTR("Delay Dst Three Calls 2", pDlyDst_three_calls_2, delay_length);

// Release the buffers allocated with ippsMalloc_8u
ippsFree(pSpec);
ippsFree(pBuf);
}

 


Output:
Complex Float 32 version
Output One Calls: ( 1, -1), ( 3, -3), ( 5, -5),
Output Three Calls: ( 1, -1), ( 3, -3), ( 5, -5),


Delay Destination at End of Calls
Delay Dst One Calls : ( 0, 0), ( 0, 0), ( 0, 0), ( 0, 0), ( 0, 0), ( 0, 0), ( 0, 0), ( 0, 0), ( 0, 0), ( 0, 0), ( 0, 0), ( 1, -1), ( 1, -1), ( 1, -1), ( 1, -1), ( 1, -1), ( 1, -1),
Delay Dst Three Calls 2: ( 0, 0), ( 0, 0), ( 0, 0), ( 0, 0), ( 0, 0), ( 0, 0), ( 0, 0), ( 0, 0), ( 0, 0), ( 0, 0), ( 0, 0), ( 1, -1), ( 1, -1), ( 1, -1), ( 1, -1), ( 1, -1), ( 1, -1),


All delay dest results from three call version
Delay Dst Three Calls 0: ( 0, 0), ( 0, 0), ( 0, 0), ( 0, 0), ( 0, 0), ( 0, 0), ( 0, 0), ( 0, 0), ( 0, 0), ( 0, 0), ( 0, 0), ( 0, 0), ( 0, 0), ( 0, 0), ( 0, 0), ( 1, -1), ( 1, -1),
Delay Dst Three Calls 1: ( 0, 0), ( 0, 0), ( 0, 0), ( 0, 0), ( 0, 0), ( 0, 0), ( 0, 0), ( 0, 0), ( 0, 0), ( 0, 0), ( 0, 0), ( 0, 0), ( 0, 0), ( 1, -1), ( 1, -1), ( 1, -1), ( 1, -1),
Delay Dst Three Calls 2: ( 0, 0), ( 0, 0), ( 0, 0), ( 0, 0), ( 0, 0), ( 0, 0), ( 0, 0), ( 0, 0), ( 0, 0), ( 0, 0), ( 0, 0), ( 1, -1), ( 1, -1), ( 1, -1), ( 1, -1), ( 1, -1), ( 1, -1),


Complex Int 16 version
Output One Calls: ( 1, -1), ( 3, -3), ( 5, -5),
Output Three Calls: ( 1, -1), ( 2, -2), ( 3, -3),


Delay Destination at End of Calls
Delay Dst One Calls : ( 0, 0), ( 0, 0), ( 0, 0), ( 0, 0), ( 0, 0), ( 0, 0), ( 0, 0), ( 0, 0), ( 0, 0), ( 0, 0), ( 0, 0), ( 1, -1), ( 1, -1), ( 1, -1), ( 1, -1), ( 1, -1), ( 1, -1),
Delay Dst Three Calls 2: ( 0, 0), ( 0, 0), ( 0, 0), ( 0, 0), ( 0, 0), ( 0, 0), ( 0, 0), ( 0, 0), ( 0, 0), ( 0, 0), ( 0, 0), ( 1, -1), ( 0, 0), ( 1, -1), ( 0, 0), ( 1, -1), ( 1, -1),


All delay dest results from three call version
Delay Dst Three Calls 0: ( 0, 0), ( 0, 0), ( 0, 0), ( 0, 0), ( 0, 0), ( 0, 0), ( 0, 0), ( 0, 0), ( 0, 0), ( 0, 0), ( 0, 0), ( 0, 0), ( 0, 0), ( 0, 0), ( 0, 0), ( 1, -1), ( 1, -1),
Delay Dst Three Calls 1: ( 0, 0), ( 0, 0), ( 0, 0), ( 0, 0), ( 0, 0), ( 0, 0), ( 0, 0), ( 0, 0), ( 0, 0), ( 0, 0), ( 0, 0), ( 0, 0), ( 0, 0), ( 1, -1), ( 0, 0), ( 1, -1), ( 1, -1),
Delay Dst Three Calls 2: ( 0, 0), ( 0, 0), ( 0, 0), ( 0, 0), ( 0, 0), ( 0, 0), ( 0, 0), ( 0, 0), ( 0, 0), ( 0, 0), ( 0, 0), ( 1, -1), ( 0, 0), ( 1, -1), ( 0, 0), ( 1, -1), ( 1, -1),

 

As can be seen, calling the function once on the whole signal or three times consecutively gives the same result in the 32fc case, as expected.
In the 16sc case, calling the function once matches the 32fc result, which is the expected one. However, the 16sc three-call version produces the wrong output: (1, -1), (2, -2), (3, -3) instead of (1, -1), (3, -3), (5, -5).
I've found that the delay destination array is not filled in properly in the 16sc case: after the second and third calls it contains (0, 0) entries where the 32fc version has (1, -1). The delay destination buffer does not affect the output when the whole signal is processed in a single call, hence no difference there. But when a signal is processed in parts, the delay destination buffer has to carry the filter state from one call to the next, and that is not being done properly in the 16sc case.
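To make the expected behavior concrete, here is a minimal plain C++ sketch of a decimating FIR that carries its state through an explicit delay line. It is only a reference model under stated assumptions, not IPP code: the names Sample, firDecimate and chunkedMatchesSingleCall are hypothetical, it covers only upFactor = 1 and downPhase = 0, and it assumes the delay line holds the last tapsLen input samples, oldest first, which is what the 32fc printout above suggests.

```cpp
#include <cassert>
#include <complex>
#include <cstddef>
#include <vector>

using Sample = std::complex<float>;

// Hypothetical reference model, not IPP code: FIR filter followed by
// decimation by `down` (upFactor = 1, downPhase = 0 only).
// dlySrc holds the previous taps.size() input samples, oldest first
// (nullptr means an all-zero history); dlyDst receives the updated history.
std::vector<Sample> firDecimate(const std::vector<Sample>& taps, int down,
                                const Sample* src, int numIters,
                                const std::vector<Sample>* dlySrc,
                                std::vector<Sample>& dlyDst)
{
    assert(down > 0 && numIters > 0 && !taps.empty());
    const std::size_t histLen = taps.size();
    assert(dlySrc == nullptr || dlySrc->size() == histLen);

    // Work buffer = history followed by the new input samples.
    std::vector<Sample> work(histLen, Sample(0.0f, 0.0f));
    if (dlySrc != nullptr)
        work = *dlySrc;
    work.insert(work.end(), src, src + static_cast<std::size_t>(numIters) * down);

    std::vector<Sample> dst(static_cast<std::size_t>(numIters));
    for (int n = 0; n < numIters; ++n) {
        // Position in `work` of the newest input sample used by output n.
        const std::size_t pos = histLen + static_cast<std::size_t>(n) * down;
        Sample acc(0.0f, 0.0f);
        for (std::size_t k = 0; k < taps.size(); ++k)
            acc += taps[k] * work[pos - k];
        dst[static_cast<std::size_t>(n)] = acc;
    }

    // The new history is simply the last histLen samples seen so far.
    dlyDst.assign(work.end() - static_cast<std::ptrdiff_t>(histLen), work.end());
    return dst;
}

// Mirrors the setup in this thread: 17 unit taps, decimation by 2,
// six input samples of (1, -1), processed in one call and in three calls.
bool chunkedMatchesSingleCall()
{
    const int down = 2;
    const int numIters = 3;
    const std::vector<Sample> taps(17, Sample(1.0f, 0.0f));
    const std::vector<Sample> src(static_cast<std::size_t>(numIters * down),
                                  Sample(1.0f, -1.0f));

    std::vector<Sample> dlyOne;
    const std::vector<Sample> one =
        firDecimate(taps, down, src.data(), numIters, nullptr, dlyOne);

    std::vector<Sample> chunked;
    std::vector<Sample> dly;
    for (int i = 0; i < numIters; ++i) {
        std::vector<Sample> dlyNext;
        const std::vector<Sample> out = firDecimate(
            taps, down, src.data() + i * down, 1,
            i == 0 ? nullptr : &dly, dlyNext);
        chunked.push_back(out[0]);
        dly = dlyNext;  // carry the state into the next call
    }
    return chunked == one && dly == dlyOne;
}
```

Calling chunkedMatchesSingleCall() from a main function should return true for this model: feeding each call's delay destination into the next call as the delay source is what makes the chunked output identical to the single-call output. That is the property the 32fc function shows above and the 16sc function does not.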

5 Replies
VidyalathaB_Intel
Moderator
1,929 Views

Hi Karol,


Thanks for reaching out to us.

Could you please provide us with the OS environment details and IPP version being used in this case?


Regards,

Vidya.


kwadolowski
Novice
1,920 Views

Hi Vidya,

 

I am using CentOS 7 with IPP version 2021.6.1

 

Thanks,

Karol

VidyalathaB_Intel
Moderator
1,846 Views

Hi Karol,


Thanks for providing the details.

We were able to reproduce the issue on our end as well.

We are working on it and we will get back to you soon.


Regards,

Vidya.


VidyalathaB_Intel
Moderator
1,658 Views

Hi Karol,

 

Thanks for your patience.

We would like to inform you that the fix for the issue you raised will be available in the IPP 2021.8 release. If the issue still persists after that, you can get back to us at any time by opening a new thread.

Please check the pinned posts in this forum for updates about the latest oneAPI release.

As the issue is addressed, we are closing this thread. Please post a new question if you need any additional assistance from Intel, as this thread will no longer be monitored.

 

Regards,

Vidya.

 

kwadolowski
Novice
1,650 Views

Thank you for addressing this issue!
