
Filter with single diagonal kernel


 Hello,

 I've been filtering with 8 single-width directional kernels (0, 45, 90, ..., 270 and 315 degrees).

 For the horizontal and vertical kernels I use FilterRow and FilterColumn.
 But for the diagonal directions there is no single-width kernel filter like FilterRow and FilterColumn.

 So I've been using kernels like the one below for the diagonal directions.

 0 0 0 0 k4
 0 0 0 k3 0
 0 0 k2 0 0
 0 k1 0 0 0
 k0 0 0 0 0

 Filtering with these kernels is much slower than single-row or single-column filtering.

 How can I speed up filtering with single-width diagonal kernels?
 Any good ideas?

 Thanks & regards.

 Dongkyu.


Accepted Solutions

Hello Dongkyu,

It is impossible to provide special optimizations for every possible kind of kernel with some distribution of zeros. For your particular case I see at least 3 solutions: (1) rotate the image, filter it with the row or column filter, then rotate it back (I guess this will be slower than direct filtering with the 2D kernel); (2) try a simple C loop with the Intel compiler, which has a very good vectorizer and can generate very fast code; (3) use a buffer of roi.width elements and several IPP function calls in a loop:

ippsMulC_32f(row0, k0, dst, roi.width);
ippsMulC_32f(row1+1, k1, buffer, roi.width);
ippsAdd_32f(dst, buffer, dst, roi.width);
ippsMulC_32f(row2+2, k2, buffer, roi.width);
ippsAdd_32f(dst, buffer, dst, roi.width);

.................. etc.
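
As an illustration of option (2), here is a minimal sketch of such a loop for the 5-tap case from the question. The function name `diag5_filter_row_32f` and its parameters are hypothetical and not part of IPP. It assumes `r0` ... `r4` point at the five source rows in the same order as `row0`, `row1`, ... in the snippet above, that each row is readable up to index `width + 3`, and that `dst` does not overlap the source rows. Whether the loop is actually vectorized depends on the compiler and its flags.

```c
#include <assert.h>

/* Hypothetical helper (not an IPP function): computes one output row of a
 * 5-tap single-width diagonal filter in a single pass, with no temporary buffer.
 *   dst[x] = k[0]*r0[x] + k[1]*r1[x+1] + ... + k[4]*r4[x+4]
 * Assumption: every source row has at least width + 4 readable samples. */
void diag5_filter_row_32f(const float *r0, const float *r1, const float *r2,
                          const float *r3, const float *r4,
                          const float k[5], float *dst, int width)
{
    assert(r0 && r1 && r2 && r3 && r4 && k && dst);

    /* Copy the taps into locals so they can stay in registers. */
    const float k0 = k[0], k1 = k[1], k2 = k[2], k3 = k[3], k4 = k[4];

    /* Unit-stride reads and writes: a shape that auto-vectorizers usually handle. */
    for (int x = 0; x < width; ++x) {
        dst[x] = k0 * r0[x]
               + k1 * r1[x + 1]
               + k2 * r2[x + 2]
               + k3 * r3[x + 3]
               + k4 * r4[x + 4];
    }
}
```

Calling it once per output row, with all row pointers advanced by one image step each time, covers the whole ROI. Passing the rows top-to-bottom or bottom-to-top selects which of the two diagonals is sampled. Declaring the pointers `restrict` may help the vectorizer if the buffers are guaranteed not to alias.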

regards, Igor

P.S. There is one great function for this purpose (I mean case #3):

IPPAPI(IppStatus, ippsAddProductC_32f,       ( const Ipp32f* pSrc, const Ipp32f val, Ipp32f* pSrcDst, int len ))
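
A rough sketch of how case #3 could look with this function: one accumulate call per tap, with no temporary buffer. To keep the example self-contained, `add_product_c_32f` below is a portable stand-in written for illustration, under the assumption that `ippsAddProductC_32f` computes `pSrcDst[n] += pSrc[n] * val`; with IPP linked, the real call would take its place. The wrapper name `diag_accumulate_row_32f` and the `rows` array are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Portable stand-in for ippsAddProductC_32f.
 * Assumed semantics: pSrcDst[n] += pSrc[n] * val for n in [0, len). */
static void add_product_c_32f(const float *pSrc, float val,
                              float *pSrcDst, int len)
{
    for (int n = 0; n < len; ++n)
        pSrcDst[n] += pSrc[n] * val;
}

/* Hypothetical wrapper: one output row of a single-width diagonal filter.
 * rows[i] is the source row paired with tap k[i]; it is read from index i
 * to i + width - 1, so it needs at least width + taps - 1 samples. */
void diag_accumulate_row_32f(const float *const rows[], const float *k,
                             int taps, float *dst, int width)
{
    assert(rows && k && dst && taps > 0 && width > 0);

    /* Start from zero (all-bits-zero is 0.0f on IEEE-754 platforms). */
    memset(dst, 0, (size_t)width * sizeof *dst);

    /* Accumulate tap i from row i, shifted by i columns along the diagonal. */
    for (int i = 0; i < taps; ++i)
        add_product_c_32f(rows[i] + i, k[i], dst, width);
}
```

Compared with the MulC/Add pairs, this halves the number of calls per tap and drops the roi.width scratch buffer.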
 


 Hi Igor,

 Thanks for the reply.

 I'm going to try #2.

 Regards, Dongkyu.
