Intel® Integrated Performance Primitives
Deliberate problems developing high-performance vision, signal, security, and storage applications.

ippiSuperSampling_8u_C3R much slower than ippiResizeSqrPixel_8u_C3R using SUPER

Thomas_Jensen1
Beginner
I was busy moving from the old ippiResize_8u_C3R to the new ippiResizeSqrPixel_8u_C3R.
I control Interpolation to use IPPI_INTER_SUPER for shrinking and IPPI_INTER_CUBIC for enlarging.
My source image has a resolution of 3888x4362 RGB, that's some 17 MPixels, quite a heavy image to work with.
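
For readers following along, the shrink/enlarge choice described above could be sketched roughly like this. It is only an illustration, not the poster's actual code: `ResizeMode`, `MODE_SUPER`, `MODE_CUBIC` and `choose_resize_mode` are hypothetical names standing in for the IPP constants IPPI_INTER_SUPER and IPPI_INTER_CUBIC, and treating a mixed case (shrinking on one axis, enlarging on the other) as cubic is an assumption.

```c
/* Hypothetical stand-ins for the IPP interpolation constants; real code
   would pass IPPI_INTER_SUPER or IPPI_INTER_CUBIC from the IPP headers. */
typedef enum { MODE_SUPER, MODE_CUBIC } ResizeMode;

/* Pick supersampling only when the image shrinks along both axes.
   Anything else (same size, enlarging, or mixed) falls back to cubic;
   that fallback is an assumption made for this sketch. */
static ResizeMode choose_resize_mode(int srcW, int srcH, int dstW, int dstH)
{
    if (dstW < srcW && dstH < srcH)
        return MODE_SUPER;
    return MODE_CUBIC;
}
```

With the 3888x4362 source, a 1944x2181 target would select the supersampling mode and a 7776x8724 target would select cubic.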

To my pleasant surprise, ippiResizeSqrPixel is *much* faster than ippiResize when using IPPI_INTER_SUPER.

I then moved on to implement using ippiSuperSampling, since I wanted SuperSampling anyway.
That dedicated function should be faster I thought, and the IPP documentation also stated that supposition.

However, when using ippiSuperSampling, I found it was just as slow as ippiResize.
Buggers.

In all cases, those functions are not threaded by me, or by IPP (it's not in the threaded-functions list).
They are all SSE-sensitive though.


Any suggestions why the dedicated function is much slower than the new general function?


I ran the test on an AMD Phenom X4 @3.2GHz on Windows 7 32-bit using IPP 6.1.2 in my own custom DLL.
Ying_H_Intel
Employee
Hi Thomas,

The ippiResizeSqrPixel function is a threaded function. (This should be a documentation bug in 6.1.x, and it was fixed in the latest version, 7.0.2.)

Could you tell me which IPP libraries you are linking with your application?
Are they
1) dynamic threaded libraries like
ippi.lib ipps.lib ippcore.lib (ipp*.dll's import library)

or
2) static serial libraries like
ippiemerged.lib ippimerged.lib ippsemerged.lib ippsmerged.lib ippcorel.lib

or
3) static threaded libraries like
ippiemerged.lib ippimerged_t.lib ippsemerged.lib ippsmerged_t.lib ippcore_t.lib

Best Regards,
Ying H.
Thomas_Jensen1
Beginner
So you are sure ippiResizeSqrPixel is a threaded function since 6.1?
Good to hear. I'm not yet ready to use 7.x, because I have a custom DLL based on 6.x, and 7 has changed a lot.

I use manual waterfall in merged.c, it finds t7 (SSE3).
I call ippInit();
I call ippGetNumCoresOnDie() = 4.
I call set_num_threads(4);
I call ippSetNumThreads(4);

I know that IPP threading works, because DFT is very fast, and threading in the samples works too, because JPEG 2000 uses 100% CPU.


Going back to my original question, what can you say about the fact that ippiSuperSampling seems to be much slower than ippiResizeSqrPixel using interpolation=super ?

Is ippiSuperSampling threaded in 6.1.x?

Can you describe why I should use ippiSuperSampling instead of ippiResizeSqrPixel, when interpolation=super ?

pvonkaenel
New Contributor III
I'd also be interested to know the difference, besides the threading. I've been using the super sampling interpolation mode available in ResizeSqrPixel because of its speed and quality for downsampling. Is the SuperSampling-specific routine different? Is it slower but higher quality?

Thanks,
Peter
Ying_H_Intel
Employee
Hi Thomas, Peter,

The two functions use the same algorithm, so the quality should be the same. ippiSuperSampling is not threaded; ippiResizeSqrPixel is threaded. So in a multi-threaded environment, ippiResizeSqrPixel is faster.

If you don't consider multi-threading, the performance of ippiSuperSampling is better for certain widely used scaling factors, as the manual notes:

Note: Performance is better if you use the following scaling factors along the x and y axes: 1/2, 2/3, 3/4, 4/5, 5/6, 8/9, 1/3, 2/5, 3/5, 3/7, 4/9, 7/10, 1/4, 2/7, 3/8, 1/8

Regards,
Ying H.
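
As a rough aid, a check like the one below could tell you whether a given resize falls on one of the factors listed in the note quoted above. This is a minimal sketch, not part of IPP: `gcd_int` and `is_listed_factor` are hypothetical helper names, and it assumes the factor is exactly the ratio of destination to source size along one axis, reduced to lowest terms.

```c
#include <stdbool.h>
#include <stddef.h>

/* Greatest common divisor, used to reduce dst/src to lowest terms. */
static int gcd_int(int a, int b)
{
    while (b != 0) {
        int t = a % b;
        a = b;
        b = t;
    }
    return a;
}

/* Returns true when dst/src, reduced to lowest terms, is one of the
   scaling factors listed in the manual note. Sizes must be positive. */
static bool is_listed_factor(int src, int dst)
{
    static const int listed[][2] = {
        {1, 2}, {2, 3}, {3, 4}, {4, 5}, {5, 6}, {8, 9}, {1, 3}, {2, 5},
        {3, 5}, {3, 7}, {4, 9}, {7, 10}, {1, 4}, {2, 7}, {3, 8}, {1, 8}
    };
    if (src <= 0 || dst <= 0)
        return false;

    int g = gcd_int(src, dst);
    int num = dst / g;   /* numerator of the reduced factor */
    int den = src / g;   /* denominator of the reduced factor */

    for (size_t i = 0; i < sizeof listed / sizeof listed[0]; ++i) {
        if (listed[i][0] == num && listed[i][1] == den)
            return true;
    }
    return false;
}
```

For the 3888x4362 image from the question, halving to 1944x2181 is a listed factor on both axes, while shrinking the width to 1000 (which reduces to 125/486) is not.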
Thomas_Jensen1
Beginner
OK, that clears it up.
I actually missed the notes about better performance for specific resizing factors.
I only saw the note that SuperSample was better suited for video performance-wise, so that is why I was looking at it.

Anyway, I now know that I will stick to ResizeSqrPixel because it is threaded.
I use any factor for resizing...

Maybe the notes should indicate something other than "better suited for video performance-wise", since that is not true for threaded cases...

It is a bit confusing when the threaded-functions list does not include ResizeSqrPixel (even if it is threaded), and the manual promotes SuperSample (for video), even if it is not threaded.

Ying_H_Intel
Employee
Hi Thomas,

Thanks for the feedback.

Just for your information, IPP 7.0.3 was released today, and the threaded-functions list now includes ResizeSqrPixel.

Thanks
Ying