Intel® Integrated Performance Primitives
Deliberate problems developing high-performance vision, signal, security, and storage applications.

ippiSqr function very slow

Pablo_N_
Beginner

Hi,

When I square an Ipp32f image, I find that calling ippiMul_32f_C1R (multiplying the image by itself) is about 7x faster than calling ippiSqr_32f_C1R.

I am evaluating a trial version: ippIP AVX (e9), version 7.1.0 (r36264).
I use:
Intel(R) Xeon(R) CPU E31235 @ 3.20GHz
KMP_AFFINITY=verbose,granularity=core,compact,0,0
1 packages x 4 cores/pkg x 2 threads/core (4 total cores)
gcc version 4.6.3 (Ubuntu/Linaro 4.6.3-1ubuntu5)


I observe that ippiSqr uses more cores even with the affinity configuration above.

Thanks,
Pablo

2 Replies
Chuck_De_Sylva
Beginner
Pablo, do you have a code snippet that we could use to replicate this issue? That would help a lot. Thanks, Chuck
Gennady_F_Intel
Moderator
Pablo, it might be because ippiSqr_* is not threaded while ippiMul is threaded. Please compare the performance of these two functions when linking with the serial version of IPP.
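
For anyone who wants to reproduce the comparison, a minimal timing sketch in C might look like the one below. This is not Pablo's code. The helper names (square_sqr, square_mul, time_kernel) and the USE_IPP macro are hypothetical and only serve the illustration. Built with -DUSE_IPP it calls ippiSqr_32f_C1R and ippiMul_32f_C1R on a contiguous buffer; built without it, plain C loops stand in for the two calls so the harness compiles without IPP installed. The sketch assumes an image with no row padding, so the step is simply width * sizeof(float).

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stddef.h>
#include <time.h>

#ifdef USE_IPP
#include <ipp.h>   /* assumption: IPP headers are on the include path */
#endif

/* Square an image through the dedicated "square" call. */
static void square_sqr(const float *src, float *dst, int width, int height)
{
    assert(width > 0 && height > 0);
#ifdef USE_IPP
    IppiSize roi = { width, height };
    int step = width * (int)sizeof(float);   /* contiguous rows, no padding */
    ippiSqr_32f_C1R(src, step, dst, step, roi);
#else
    /* Stand-in loop, used only when IPP is not available. */
    size_t n = (size_t)width * (size_t)height;
    for (size_t i = 0; i < n; ++i)
        dst[i] = src[i] * src[i];
#endif
}

/* Square an image by multiplying it with itself. */
static void square_mul(const float *src, float *dst, int width, int height)
{
    assert(width > 0 && height > 0);
#ifdef USE_IPP
    IppiSize roi = { width, height };
    int step = width * (int)sizeof(float);
    ippiMul_32f_C1R(src, step, src, step, dst, step, roi);
#else
    size_t n = (size_t)width * (size_t)height;
    for (size_t i = 0; i < n; ++i)
        dst[i] = src[i] * src[i];
#endif
}

typedef void (*square_fn)(const float *, float *, int, int);

/* Average wall-clock seconds per call over `reps` repetitions.
   Wall-clock time is used on purpose: CPU time would add up the work
   of all threads and hide any effect of internal threading. */
static double time_kernel(square_fn fn, const float *src, float *dst,
                          int width, int height, int reps)
{
    struct timespec t0, t1;
    fn(src, dst, width, height);             /* warm-up call, not timed */
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (int r = 0; r < reps; ++r)
        fn(src, dst, width, height);
    clock_gettime(CLOCK_MONOTONIC, &t1);
    double elapsed = (double)(t1.tv_sec - t0.tv_sec)
                   + (double)(t1.tv_nsec - t0.tv_nsec) * 1e-9;
    return elapsed / reps;
}
```

Calling time_kernel(square_sqr, ...) and time_kernel(square_mul, ...) on the same buffers, once with the threaded libraries and once with the serial ones, should show whether the gap comes from threading. With the threaded libraries in IPP 7.x, calling ippSetNumThreads(1) before timing may be another way to take threading out of the picture.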