Mismatch between legacy90ippiFilterColumn_32f_C1R (IPP legacy) and ippiFilterColumn_32f_C1R (IPP 7.0)

Zion_H_ · 11-26-2018

Hi,

I was using ippiFilterColumn_32f_C1R earlier and am now upgrading to IPP 2017. I have replaced ippiFilterColumn_32f_C1R with the legacy90ippiFilterColumn_32f_C1R API and the results do not match, while ippiFilterRow_32f_C1R works fine.

I have attached sample images, kernel value and output results.

Kernels -> gauss.txt & dgauss.txt
Sample images -> lighthouse.tif & houses.tif
output images
- with gauss.txt kernel -> light_gauss_legacy.tif(from ipp legacy) & light_gauss_old.tif(from old ipp7.0)
- with dgauss.txt kernel -> houses_dgauss_legacy.tif(from ipp legacy) & houses_dgauss_old.tif(from old ipp7.0)

Please suggest a workaround for this issue and explain the reason for the mismatch.

Regards
Ubhay

Pavel_B_Intel1 · 11-26-2018

Hi Ubhay,

thank you, we will check it.

Pavel

Igor_A_Intel · 11-26-2018

Hi Ubhay,

The ippiFilterColumn functionality is supported via the ippiFilterBorder function in the latest IPP releases: use your kernel with width=1 and the optimized code branch for column processing will be used. It is better to switch to the latest IPP releases, as they support modern Intel architectures while the legacy libraries do not.

regards, Igor.

Zion_H_ · 11-27-2018

Hi Igor,

Currently we have to stay on the legacy API because we need backward compatibility with bit-exact results.
I have tried replacing it with the ippiFilterBorder API, and the results do not match either.

Please look into the legacy API, as backward compatibility is the priority here.

Regards,
Ubhay

Igor_A_Intel · 11-27-2018

Hi Ubhay,

Bit-to-bit matching for floating-point functions is very hard to achieve. Internally, IPP code has several branches depending on kernel size, data alignment and image size. The code also differs between architectures (ia32/Intel64; SSE2, SSSE3, ..., AVX512) and can differ between operating systems (Windows, Linux, Mac OS X). Which IPP version did you compare against? (That is, you compared legacy with some old IPP version: which one?)
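To illustrate why different code branches can give different bits, here is a minimal plain C sketch (not IPP code; the function names are hypothetical). It adds the same floats in two orders: a scalar left-to-right loop, and a loop with two interleaved accumulators, roughly the way a 2-lane SIMD branch would accumulate before its final horizontal add.

```c
/* Sum in plain left-to-right order, as a scalar loop would. */
float sum_sequential(const float *x, int n)
{
    volatile float acc = 0.0f;  /* volatile: round to float after every add */
    for (int i = 0; i < n; ++i)
        acc = acc + x[i];
    return acc;
}

/* Sum with two interleaved accumulators (even and odd elements),
   then combine them at the end, similar to a 2-lane SIMD loop. */
float sum_two_lanes(const float *x, int n)
{
    volatile float lane0 = 0.0f, lane1 = 0.0f;
    int i = 0;
    for (; i + 1 < n; i += 2) {
        lane0 = lane0 + x[i];
        lane1 = lane1 + x[i + 1];
    }
    if (i < n)                  /* odd-length tail goes to lane 0 */
        lane0 = lane0 + x[i];
    return lane0 + lane1;
}
```

With x = {1e8f, 1.0f, -1e8f, 1.0f}, sum_sequential(x, 4) returns 1.0f while sum_two_lanes(x, 4) returns 2.0f, although both are valid float evaluations of the same sum.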

regards, Igor

Zion_H_ · 11-27-2018

Hi Igor,

We are running on a Windows machine (Windows 7). As an example, the ippiFilterRow API gives a bit-to-bit floating-point match, while in the same process ippiFilterColumn does not. All inputs (image, kernel and all other parameters) are the same.

Old IPP Version
IPP_VERSION_MAJOR 7
IPP_VERSION_MINOR 0
IPP_VERSION_BUILD 205
Image Size is 512 X 512 with 32-bit float data.
kernel size 1 X 51 with 32-bit float data.
Border type is taken as mirror/symmetric.
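
For anyone reproducing this setup, a double-precision reference filter can help judge which of the two outputs is closer to the exact result. The sketch below is plain C, not IPP code. The function names, the mirror convention (edge sample not repeated) and the correlation form without kernel flipping are assumptions, and may differ from what ippiFilterColumn and the application's border padding actually do.

```c
/* Reflect index i into [0, n) without repeating the edge sample
   (... 2 1 | 0 1 2 ...). Assumes n > 1 and an overshoot smaller than n. */
int mirror_index(int i, int n)
{
    if (i < 0)
        i = -i;
    if (i >= n)
        i = 2 * (n - 1) - i;
    return i;
}

/* Column (vertical) filter on a densely packed float image:
   dst(y, x) = sum over k of kernel[k] * src(y + k - anchor, x).
   The sum is accumulated in double and rounded to float once. */
void filter_column_ref(const float *src, float *dst, int width, int height,
                       const float *kernel, int ksize, int anchor)
{
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            double acc = 0.0;
            for (int k = 0; k < ksize; ++k) {
                int yy = mirror_index(y + k - anchor, height);
                acc += (double)kernel[k] * (double)src[yy * width + x];
            }
            dst[y * width + x] = (float)acc;
        }
    }
}
```

For the case described here this would be called with width = height = 512, ksize = 51 and, assuming a centered kernel, anchor = 25.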

Regards
Ubhay

Zion_H_ · 12-17-2018

Hi Igor,

Were you able to reproduce the bug?
If yes, please share the findings and the solution with us.

Regards
Ubhay

Pavel_B_Intel1 · 12-17-2018

Hi Ubhay,

Igor is on vacation now; he should return next week. Please wait a bit.

Pavel

Zion_H_ · 01-21-2019

Hi Igor,

Hope you had a good vacation.
If you have found the bug and a solution, please share them with us.

Regards
Ubhay

Igor_A_Intel · 01-22-2019

Hi Ubhay,

1) Please insert several lines of code into your app to make sure that the same optimization path is used in both libraries:

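/* Requires <stdio.h> and the IPP image-processing header (ippi.h). */
const IppLibraryVersion *lib = ippiGetLibVersion();

/* Print the dispatched CPU-specific library so both builds can be compared. */
printf( "CPU : %s\n", lib->targetCpu );
printf( "Name : %s\n", lib->Name );
printf( "Version : %s\n", lib->Version );
printf( "Build date: %s\n", lib->BuildDate );
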
The main difference between IPP 7.0.7 and 8.2.legacy is that AVX2 support was added in 8.2. If you run your app on an AVX2 machine (or higher), 7.0.7 dispatches the e9/g9 code (AVX, the highest level supported at that time), while 8.2.legacy dispatches h9/l9. This can be one of the reasons.

2) There is no difference in the code of the e9/g9 optimization for ippiFilterColumn_32f_C1R between IPP 7.0.7 and 8.2 legacy, but the two releases were built with different compiler versions (7.0.7 with Compiler XE 12.1, 8.2.legacy with 14.0.2). We have to switch to new compiler versions in order to support new architectures and instruction sets. As the code of this function was developed with intrinsics, different compilers may produce a different order of CPU instructions (the same algorithm and logic with a slightly different order of calculations; in AVX2 code an FMA instruction can be used, which leads to different intermediate rounding compared with a mul/add pair).
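
The FMA effect can be reproduced in a few lines of plain C. This is a sketch with hypothetical function names, not IPP code. The fused variant is emulated through double precision, which is an assumption that holds for the values shown here (the float product is exact in double) but is not a general replacement for a hardware FMA.

```c
/* Multiply, round to float, then add and round again (two roundings),
   as a separate mul/add instruction pair would do. */
float mul_then_add(float a, float b, float c)
{
    volatile float product = a * b;  /* volatile forces rounding to float here */
    return product + c;
}

/* Emulated fused multiply-add: the product of two floats is exact in double,
   so the product is not rounded before the addition. */
float fused_mul_add(float a, float b, float c)
{
    return (float)((double)a * (double)b + (double)c);
}
```

With a = b = 1.0f + 1.0f / 4096.0f and c = -(1.0f + 1.0f / 2048.0f), mul_then_add returns 0.0f, because the low bit of the product is rounded away before the add, while fused_mul_add returns 2^-24 (about 5.96e-08).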

3) The IPP library does not provide (and does not claim) bit-to-bit equal FP results across different code branches (different optimizations such as SSE2, SSSE3, SSE4.2, AVX, AVX2; different input/output data alignment; etc.). Equality for FP functions should be checked with an epsilon that depends on the number of FP operations per point/pixel. For example, the weight of the least significant bit for the 32f data type (normalized) is 1.19209289e-07f, so this epsilon cannot be smaller than 1.19209289e-07f * 0.5 * N, where N is the number of FP operations per pixel.
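
A comparison along these lines could look like the sketch below. It is plain C with hypothetical function names. Scaling the tolerance by the magnitude of the values (with a floor of 1.0) is an added assumption for data that is not normalized; the 0.5 * FLT_EPSILON * N term is the bound described above.

```c
#include <float.h>   /* FLT_EPSILON == 1.19209289e-07f */

/* Return 1 if a and b agree within half an ULP per FP operation, else 0.
   n_ops is the number of FP operations per pixel. */
int nearly_equal(float a, float b, int n_ops)
{
    float diff  = (a > b) ? (a - b) : (b - a);
    float abs_a = (a < 0.0f) ? -a : a;
    float abs_b = (b < 0.0f) ? -b : b;
    float scale = (abs_a > abs_b) ? abs_a : abs_b;
    if (scale < 1.0f)
        scale = 1.0f;            /* assumption: treat small values as normalized */
    return diff <= 0.5f * FLT_EPSILON * (float)n_ops * scale;
}

/* Count the pixels of two images that differ by more than the tolerance. */
int count_mismatches(const float *a, const float *b, int count, int n_ops)
{
    int mismatches = 0;
    for (int i = 0; i < count; ++i)
        if (!nearly_equal(a[i], b[i], n_ops))
            ++mismatches;
    return mismatches;
}
```

For a 51-tap kernel, one reasonable choice is n_ops = 101 (51 multiplications plus 50 additions), which gives a tolerance of about 6e-06 for values near 1.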

The summary:

1) Please make sure that the same optimized code path is used in both cases. If not, it can be switched with the ippSetCpuFeatures() function.

2) If #1 does not solve your issue, see #2 and #3 above. In that case we cannot help, as this is not an IPP bug but a property of almost all FP code.

regards, Igor