Intel® Integrated Performance Primitives
Community support and discussions relating to developing high-performance vision, signal, security, and storage applications.

Filter with single diagonal kernel

Dongkyu
Beginner

 Hello,

 I've been filtering with 8 single-width directional kernels (0, 45, 90, ..., 270 and 315 degrees).

 For the horizontal and vertical kernels I use FilterRow and FilterColumn.
 But for the diagonal directions there is no single-width filter comparable to FilterRow and FilterColumn.

 So I've been using kernels like the one below for the diagonal directions.

 0 0 0 0 k4
 0 0 0 k3 0
 0 0 k2 0 0
 0 k1 0 0 0
 k0 0 0 0 0

 Filtering with these kernels is much slower than single-row or single-column filtering.

 How can I speed up filtering with single-width diagonal kernels?
 Any good ideas?

 Thanks & regards.

 Dongkyu.

1 Solution
Igor_A_Intel
Employee

Hello Dongkyu,

It is impossible to provide special optimizations for every possible kernel with some distribution of zeroes. For your particular case I see at least 3 solutions: (1) rotate the image, filter it with the row or column filter, then rotate it back (I guess this will be slower than direct filtering with the 2D kernel); (2) try a simple C loop with the Intel compiler, which has a very good vectorizer and can generate very fast code for you; (3) use a buffer of roi.width elements and several IPP function calls in a loop:

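ippsMulC_32f(row0, k0, dst, roi.width);        /* dst = k0 * row0 */
ippsMulC_32f(row1 + 1, k1, buffer, roi.width); /* buffer = k1 * row1, shifted by 1 pixel */
ippsAdd_32f(dst, buffer, dst, roi.width);      /* dst += buffer */
ippsMulC_32f(row2 + 2, k2, buffer, roi.width); /* buffer = k2 * row2, shifted by 2 pixels */
ippsAdd_32f(dst, buffer, dst, roi.width);      /* dst += buffer */
.................. etc.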

regards, Igor
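
For readers wondering what option (2) might look like, here is a minimal sketch of a plain C loop, not code from the thread. The function name filter_diagonal_32f, its parameters, the strides measured in elements, the tap ordering, and the choice to leave border pixels untouched are all assumptions made for illustration. The innermost loop walks contiguous memory with a constant coefficient, which is the pattern an auto-vectorizer tends to handle well.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not an IPP function): filter a float image with a
 * single-width diagonal kernel of `taps` coefficients.
 * Tap t reads the pixel at (y + t - r, x + dir * (t - r)), with r = taps / 2.
 * dir = +1 follows the main diagonal, dir = -1 the anti-diagonal.
 * Strides are in elements. Pixels within r of any edge are left untouched. */
static void filter_diagonal_32f(const float *src, size_t src_stride,
                                float *dst, size_t dst_stride,
                                int width, int height,
                                const float *k, int taps, int dir)
{
    assert(taps > 0 && (dir == 1 || dir == -1));
    const int r = taps / 2;

    for (int y = r; y < height - r; ++y) {
        float *out = dst + (size_t)y * dst_stride;

        /* Clear the interior of the output row before accumulating. */
        for (int x = r; x < width - r; ++x)
            out[x] = 0.0f;

        /* One pass per tap: a shifted source row times a constant. */
        for (int t = 0; t < taps; ++t) {
            const float *in = src + (size_t)(y + t - r) * src_stride
                                  + dir * (t - r);
            const float kt = k[t];
            for (int x = r; x < width - r; ++x)
                out[x] += kt * in[x];
        }
    }
}
```

Whether this beats the zero-padded 2D kernel depends on the compiler, the flags, and the image size, so it is worth timing both.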

Igor_A_Intel
Employee

PS: there is a great function for this purpose (I mean case #3):

IPPAPI(IppStatus, ippsAddProductC_32f, (const Ipp32f* pSrc, const Ipp32f val, Ipp32f* pSrcDst, int len))
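
As a rough illustration of why a multiply-accumulate call helps in case #3, the sketch below builds one output row without a temporary buffer. So that it compiles without IPP, it uses a portable stand-in, add_product_c_32f, which is assumed to mirror the semantics of ippsAddProductC_32f (pSrcDst[i] += pSrc[i] * val). The helper diagonal_row_32f and the rows array are hypothetical names, not part of IPP. With IPP available, the stand-in call would presumably be replaced by ippsAddProductC_32f with the same argument order.

```c
#include <assert.h>

/* Portable stand-in, assumed to match ippsAddProductC_32f semantics:
 * pSrcDst[i] += pSrc[i] * val for i in [0, len). */
static void add_product_c_32f(const float *pSrc, float val,
                              float *pSrcDst, int len)
{
    for (int i = 0; i < len; ++i)
        pSrcDst[i] += pSrc[i] * val;
}

/* Hypothetical helper: compute one output row of a diagonal filter.
 * rows[t] points at the source row used by tap t, and tap t is shifted
 * right by t pixels (row0, row1 + 1, row2 + 2, ...).
 * The caller must guarantee len + taps - 1 readable elements per row. */
static void diagonal_row_32f(const float *const *rows, const float *k,
                             int taps, float *dst, int len)
{
    assert(taps > 0 && len >= 0);

    /* Start from zero, then accumulate one scaled, shifted row per tap. */
    for (int i = 0; i < len; ++i)
        dst[i] = 0.0f;
    for (int t = 0; t < taps; ++t)
        add_product_c_32f(rows[t] + t, k[t], dst, len);
}
```

Compared with the MulC/Add pair, this makes one call per tap instead of two and needs no roi.width scratch buffer.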
 

Dongkyu
Beginner

 Hi, Igor

 Thanks for the reply.

 I'm gonna try #2.

 Regards, Dongkyu.
