9 Replies


I'm not sure how you determined which instructions this function uses and which it does not. The FilterBox function does have processor-specific code, and you may notice that its performance differs from one processor to another.

Regards,

Vladimir


Vladimir


Where can I find this performance data in my IPP release?


ps_ippi.exe -f=ippiFilterBox_8u_C1R -YHIGH -dMethod=Manual -dImageSize=1920x1080 -dNumLoops=50000 -N1 -TPP -RPP.csv

ps_ippi.exe -f=ippiFilterBox_8u_C1R -YHIGH -dMethod=Manual -dImageSize=1920x1080 -dNumLoops=50000 -N1 -TSSE3 -RSSE3.csv

The first one is for simple assembler optimization and the second is for P4 with SSE3.

The result for the first one is:

ippiFilterBox,8u,C1R,1920x1080,3x3,-,-,-,-,nLps=5000,18.2,pxch,1.16e+004,-

ippiFilterBox,8u,C1R,1920x1080,5x5,-,-,-,-,nLps=5000,18.2,pxch,1.16e+004,-

and for the second:

ippiFilterBox,8u,C1R,1920x1080,3x3,-,-,-,-,nLps=5000,8.75,pxch,5.56e+003,-

ippiFilterBox,8u,C1R,1920x1080,5x5,-,-,-,-,nLps=5000,20.5,pxch,1.31e+004,-

For the box filter with a 3x3 window, *t7.dll actually runs faster: 8.75 ticks per pixel compared to 18.2 tpp. But filtering with bigger windows even loses some speed: 20.5 tpp vs 18.2 tpp!
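To make the comparison concrete, here is a small sketch that turns the ticks-per-pixel figures quoted above into speedup ratios. The dictionary layout and the `speedup` helper are only illustrative names, not part of the ps_ippi tool.

```python
# Ticks per pixel copied from the two result listings above
# (-TPP run and -TSSE3 run).
tpp = {
    ("PP", "3x3"): 18.2,
    ("PP", "5x5"): 18.2,
    ("SSE3", "3x3"): 8.75,
    ("SSE3", "5x5"): 20.5,
}


def speedup(window):
    """Ratio of -TPP ticks to -TSSE3 ticks; values below 1 mean SSE3 is slower."""
    return tpp[("PP", window)] / tpp[("SSE3", window)]


for window in ("3x3", "5x5"):
    print(f"{window}: SSE3 speedup over PP = {speedup(window):.2f}x")
```

A ratio above 1 means the SSE3 code path is faster for that window size, and a ratio below 1 means it is slower.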

I don't know how to set a window bigger than 90x90 in this tool. Looking in the disassembler, I found three different modes: one for 3x3 windows, one for windows smaller than 90x90, and one for bigger window sizes.

Can you please ask an actual developer of the IPP image library to clarify in which cases, and for which window sizes, this function is optimized? I need to know this!

If Intel does not optimize this function for windows bigger than 3x3, I will try to find a faster implementation of the box filter, or even try to write my own implementation that utilizes the power of SSE instructions. But if Intel tried to make it faster and failed, then I will not try to speed up this part of the filter for now and will focus on the other bottleneck of the program.


ippiFilterBox_8u_C1R has low-level optimizations for 3x3, 5x5 and 7x7 kernels. Note that the 5x5 and 7x7 cases are optimized for Core 2 processors and higher.

Regards,

Vladimir


-TPP:

ippiFilterBox,8u,C1R,1920x1080,3x3,-,-,-,-,nLps=5000,7.31,pxch,5.42e+003,-

ippiFilterBox,8u,C1R,1920x1080,5x5,-,-,-,-,nLps=5000,7.31,pxch,5.41e+003,-

ippiFilterBox,8u,C1R,1920x1080,3x3,-,-,-,-,nLps=5000,3.72,pxch,2.75e+003,-

ippiFilterBox,8u,C1R,1920x1080,5x5,-,-,-,-,nLps=5000,11.3,pxch,8.36e+003,-

-TSSE41:

ippiFilterBox,8u,C1R,1920x1080,3x3,-,-,-,-,nLps=5000,2.20,pxch,1.63e+003,-

ippiFilterBox,8u,C1R,1920x1080,5x5,-,-,-,-,nLps=5000,11.5,pxch,8.5e+003,-

So... 5x5 is still slower in the modern optimized libraries... How could that be? Have you never tested it? Too bad for Intel...

Anyway, I need to use a box filter with windows bigger than 7x7. A box filter can be done in linear time, and one part of the filter needs to add and subtract a row; I believe this could be done faster with SIMD.
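For reference, the linear-time scheme mentioned here can be sketched as follows. This is a minimal plain-Python illustration of the running-sum idea, not IPP's implementation: the function name `box_sums` is hypothetical, it returns raw window sums (divide by k*k to get the mean), and it only covers windows that fit fully inside the image, with no border handling.

```python
def box_sums(img, k):
    """Sum of every k x k window that fits inside img (a list of equal-length rows).

    The cost per output pixel does not depend on k: each new output row
    adds one incoming image row and subtracts one outgoing row.
    """
    h, w = len(img), len(img[0])
    out_h, out_w = h - k + 1, w - k + 1

    # col_sums[x] = sum of the k image rows currently covered, in column x.
    col_sums = [0] * w
    for y in range(k):
        for x in range(w):
            col_sums[x] += img[y][x]

    out = []
    for y in range(out_h):
        if y > 0:
            top, bottom = img[y - 1], img[y + k - 1]
            # Row update: every column is independent, which is the part
            # that maps naturally onto SIMD adds and subtracts.
            for x in range(w):
                col_sums[x] += bottom[x] - top[x]

        # Horizontal pass: slide a k-wide window over the column sums.
        s = sum(col_sums[:k])
        row = [s]
        for x in range(1, out_w):
            s += col_sums[x + k - 1] - col_sums[x - 1]
            row.append(s)
        out.append(row)
    return out


if __name__ == "__main__":
    image = [[(7 * y + 3 * x) % 256 for x in range(12)] for y in range(9)]
    sums = box_sums(image, 5)
    means = [[s // 25 for s in row] for row in sums]
    print(len(means), len(means[0]))
```

The row update touches each column independently, so it is the natural candidate for vectorization; the horizontal pass carries a running sum from one pixel to the next and is harder to parallelize directly.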


Regards,

Vladimir
