Hi,
To square an Ipp32f image, I find that ippiMul_32f_C1R is many times (~7x) faster than ippiSqr_32f_C1R.
I am evaluating a trial version of IPP. The library reports: ippIP AVX (e9), version 7.1.0 (r36264).
I use:
Intel(R) Xeon(R) CPU E31235 @ 3.20GHz
KMP_AFFINITY=verbose,granularity=core,compact,0,0
1 packages x 4 cores/pkg x 2 threads/core (4 total cores)
gcc version 4.6.3 (Ubuntu/Linaro 4.6.3-1ubuntu5)
I observe that ippiSqr uses more cores even with the affinity configuration above.
Thanks,
Pablo
2 Replies
Pablo,
Do you have a code snippet that we could use to replicate this issue? That would help a lot.
Thanks,
Chuck
Pablo,
It might be because ippiSqr_* is not threaded while ippiMul is threaded.
Please compare the performance of these two functions when linking against the serial (non-threaded) version of IPP.
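For anyone who wants to reproduce the comparison, here is a minimal sketch of a timing harness. It is not Pablo's code. The helper names (`square_by_mul`, `square_by_sqr`, `time_square`, `force_single_thread`) are made up for illustration, and the image is assumed to be stored contiguously with no row padding. Building with `-DUSE_IPP` and linking against IPP calls the real functions. Without that flag a plain loop stands in for both, so the harness still compiles but the timings mean nothing. Calling `ippSetNumThreads(1)` is, as far as I know, a way to approximate the serial library from a threaded build of IPP 7.x.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stddef.h>
#include <time.h>

#ifdef USE_IPP
#include <ipp.h>
#endif

typedef void (*square_fn)(const float *src, float *dst, int width, int height);

/* Wall-clock seconds. CPU time (clock()) would add up all threads and hide the effect. */
double now_sec(void) {
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec + 1e-9 * (double)ts.tv_nsec;
}

/* Ask IPP for one thread, to approximate the serial library (no-op without IPP). */
void force_single_thread(void) {
#ifdef USE_IPP
    ippInit();
    ippSetNumThreads(1);
#endif
}

/* Square by multiplying the image with itself. */
void square_by_mul(const float *src, float *dst, int width, int height) {
#ifdef USE_IPP
    IppiSize roi = { width, height };
    int step = width * (int)sizeof(float);   /* assumes no row padding */
    ippiMul_32f_C1R(src, step, src, step, dst, step, roi);
#else
    /* Stand-in loop so the sketch builds without IPP. */
    for (size_t i = 0; i < (size_t)width * (size_t)height; ++i) dst[i] = src[i] * src[i];
#endif
}

/* Square with the dedicated function. */
void square_by_sqr(const float *src, float *dst, int width, int height) {
#ifdef USE_IPP
    IppiSize roi = { width, height };
    int step = width * (int)sizeof(float);   /* assumes no row padding */
    ippiSqr_32f_C1R(src, step, dst, step, roi);
#else
    /* Same stand-in loop as above. */
    for (size_t i = 0; i < (size_t)width * (size_t)height; ++i) dst[i] = src[i] * src[i];
#endif
}

/* Largest absolute difference between two buffers, to check both paths agree. */
float max_abs_diff(const float *a, const float *b, size_t n) {
    float worst = 0.0f;
    for (size_t i = 0; i < n; ++i) {
        float d = a[i] > b[i] ? a[i] - b[i] : b[i] - a[i];
        if (d > worst) worst = d;
    }
    return worst;
}

/* Average seconds per call over reps calls, after one untimed warm-up call. */
double time_square(square_fn fn, const float *src, float *dst,
                   int width, int height, int reps) {
    assert(reps > 0);
    fn(src, dst, width, height);
    double t0 = now_sec();
    for (int r = 0; r < reps; ++r) fn(src, dst, width, height);
    return (now_sec() - t0) / reps;
}
```

A test program would allocate two buffers of `width * height` floats, call `time_square` once with `square_by_mul` and once with `square_by_sqr`, then call `force_single_thread()` and time both again. If the gap closes in the single-threaded run, threading is the likely cause. On older glibc versions `clock_gettime` may also need `-lrt` at link time.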