Intel® Integrated Performance Primitives

## Filter with single diagonal kernel

Beginner
241 Views

Hello,

I've been filtering with eight single-width directional kernels (0, 45, 90, ..., 270 and 315 degrees).

For the horizontal and vertical kernels I use FilterRow and FilterColumn.
But for the diagonal directions there is no single-width filter comparable to FilterRow and FilterColumn.

So I've been using kernels like the one below for the diagonal directions.

0 0 0 0 k4
0 0 0 k3 0
0 0 k2 0 0
0 k1 0 0 0
k0 0 0 0 0

Filtering with these kernels is much slower than single-row or single-column filtering.

How can I speed up filtering with single-width diagonal kernels?
Any good ideas?

Thanks & regards.

Dongkyu.

1 Solution
Employee
241 Views

Hello Dongkyu,

It is impossible to provide special optimizations for every possible kernel with some distribution of zeros. For your particular case I see at least three solutions: (1) rotate the image, filter it with the row or column filter, then rotate it back (I guess this will be slower than direct filtering with the 2D kernel); (2) use a simple C loop and the Intel compiler, which has a very good vectorizer and can generate very fast code; (3) use a buffer of roi.width elements and several IPP function calls in a loop:

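ippsMulC_32f(row0,k0,dst,roi.width);          /* dst = k0 * row0 */

ippsMulC_32f(row1+1,k1,buffer,roi.width);     /* buffer = k1 * row1, shifted by one pixel */
ippsAdd_32f_I(buffer,dst,roi.width);          /* dst += buffer */

ippsMulC_32f(row2+2,k2,buffer,roi.width);     /* buffer = k2 * row2, shifted by two pixels */
ippsAdd_32f_I(buffer,dst,roi.width);          /* dst += buffer */

.................. etc.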

regards, Igor
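
For reference, option (2) might look roughly like the sketch below. It is not taken from the thread and is not an IPP function: the name `filter_antidiag_32f`, the element-based strides, and the choice to skip border pixels are all assumptions made for illustration. The tap order follows the kernel in the question (k[0] at the bottom-left, k[K-1] at the top-right), applied as a correlation without flipping the kernel.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of IPP): applies a K-tap anti-diagonal
   kernel, k[0] at bottom-left through k[K-1] at top-right, to the
   interior of a float image. Strides are counted in elements, not bytes.
   Pixels closer than K/2 to any edge are left untouched. */
void filter_antidiag_32f(const float *src, size_t src_stride,
                         float *dst, size_t dst_stride,
                         int width, int height,
                         const float *k, int K)
{
    assert(K > 0 && K % 2 == 1);   /* odd length so the kernel has a centre */
    const int r = K / 2;

    for (int y = r; y < height - r; ++y) {
        float *out = dst + (size_t)y * dst_stride;

        for (int x = r; x < width - r; ++x)
            out[x] = 0.0f;

        /* Tap loop outside, pixel loop inside: the inner loop reads and
           writes contiguous memory, which suits an auto-vectorizer. */
        for (int t = 0; t < K; ++t) {
            /* Tap t sits (r - t) rows below and (t - r) columns right
               of the centre pixel. */
            const float *in = src + (size_t)(y + r - t) * src_stride + (t - r);
            const float kt = k[t];
            for (int x = r; x < width - r; ++x)
                out[x] += kt * in[x];
        }
    }
}
```

Only the K nonzero taps are touched per pixel, instead of the K×K multiplications a general 2D filter performs on the mostly-zero kernel. Whether the compiler actually vectorizes the inner loop depends on the compiler and flags, so it is worth checking the vectorization report.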

3 Replies
Employee
241 Views

PS there is one great function for this purpose (I mean case #3):

IPPAPI(IppStatus, ippsAddProductC_32f, (const Ipp32f* pSrc, const Ipp32f val, Ipp32f* pSrcDst, int len))
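
A rough sketch of how case #3 could be arranged around such a multiply-accumulate call is shown below. To keep it compilable without IPP, it uses a plain C stand-in, `add_product_c_32f`, which is assumed to behave like `ippsAddProductC_32f` (`pSrcDst[i] += pSrc[i] * val`); the helper `diag_filter_row_32f` and its `rows` argument are hypothetical names, not part of the library.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Plain C stand-in, assumed to match ippsAddProductC_32f:
   src_dst[i] += src[i] * val for i in [0, len). */
static void add_product_c_32f(const float *src, float val,
                              float *src_dst, int len)
{
    for (int i = 0; i < len; ++i)
        src_dst[i] += src[i] * val;
}

/* Hypothetical helper: computes one output row of a K-tap diagonal
   filter. rows[t] must already point at the first source sample for
   tap t, i.e. the caller applies the per-row horizontal shift
   (row0, row1 + 1, row2 + 2, ... as in the reply above). */
void diag_filter_row_32f(const float *const *rows, const float *k, int K,
                         float *dst, int len)
{
    assert(K > 0 && len >= 0);
    memset(dst, 0, (size_t)len * sizeof *dst);   /* accumulator starts at 0.0f */
    for (int t = 0; t < K; ++t)
        add_product_c_32f(rows[t], k[t], dst, len);
}
```

With this arrangement no intermediate buffer is needed, since each tap is accumulated directly into `dst`. In an IPP build the stand-in would presumably be replaced by the real call, with `len` set to roi.width.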

Beginner
241 Views

Hi, Igor