Intel® Integrated Performance Primitives
Deliberate problems developing high-performance vision, signal, security, and storage applications.

2D Convolution performance between 8Bit and 16Bit

KIAN_HUI_T_
Beginner
339 Views

Hi,

For IPP version 8.1 or 8.2, should I expect a big difference in speed for the 2D convolution function between unsigned 8-bit and signed 16-bit fixed-point data? Should the unsigned 8-bit convolution function be 2 times or even 4 times faster than the signed 16-bit convolution function?

Thanks!

3 Replies
Igor_A_Intel
Employee

Hi,

2D convolution uses two algorithms internally, and the choice between them is based on the source sizes. For the FFT-based algorithm there is no visible difference in performance between the 8u and 16s versions, because both use a 32f 2D FFT internally. For the direct algorithm the 8u data type is also not significantly faster: for a "valid" ROI with src1size=720x480 and src2size from 8x8 to 16x16, the difference is no greater than 1.5x.
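To make the direct case concrete, here is a minimal plain C sketch of a "valid" 2D convolution. It is not IPP code and says nothing about how IPP implements or vectorizes the operation; `conv2d_valid_16s` is a hypothetical helper name, and the sketch assumes tightly packed rows and that each output sum fits in 32 bits. It shows that the number of multiply-accumulates depends only on the image and kernel sizes, not on the element width, so any gain from a narrower type has to come from elsewhere (for example, from how many elements fit in a SIMD register).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Reference "valid" 2D convolution, written for clarity rather than speed.
 * Hypothetical helper, not an IPP function.
 * Assumes tightly packed rows and that every sum fits in int32_t.
 * dst must hold (srcW - kerW + 1) * (srcH - kerH + 1) elements.
 * Returns the number of multiply-accumulate operations performed. */
size_t conv2d_valid_16s(const int16_t *src, int srcW, int srcH,
                        const int16_t *ker, int kerW, int kerH,
                        int32_t *dst)
{
    assert(src != NULL && ker != NULL && dst != NULL);
    assert(kerW > 0 && kerH > 0 && srcW >= kerW && srcH >= kerH);

    const int dstW = srcW - kerW + 1;   /* "valid" output width  */
    const int dstH = srcH - kerH + 1;   /* "valid" output height */
    size_t macs = 0;

    for (int y = 0; y < dstH; ++y) {
        for (int x = 0; x < dstW; ++x) {
            int32_t acc = 0;
            for (int j = 0; j < kerH; ++j) {
                for (int i = 0; i < kerW; ++i) {
                    /* True convolution: the kernel is flipped in both axes. */
                    acc += (int32_t)src[(y + j) * srcW + (x + i)]
                         * (int32_t)ker[(kerH - 1 - j) * kerW + (kerW - 1 - i)];
                    ++macs;
                }
            }
            dst[y * dstW + x] = acc;
        }
    }
    return macs;   /* equals dstW * dstH * kerW * kerH */
}
```

For the sizes quoted above (720x480 with a 16x16 kernel), the "valid" output is 705x465, and each output pixel costs 256 multiply-accumulates regardless of whether the input is 8u or 16s.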

regards Igor.

KIAN_HUI_T_
Beginner

Hi,

Thank you for the explanation.

To probe further, is there a guide to which source sizes cause the 2D convolution to choose the direct algorithm or the FFT-based algorithm internally?

Also, I benchmarked the convolution function for 8u, 16s and 32f with src1size=1024x768 and src2size=41x41, and found that 32f is significantly slower than 8u and 16s. Am I right to say that, with these source sizes, the convolution function internally chose the direct algorithm and not the FFT-based algorithm?

Thanks!

Igor_A_Intel
Employee

All flavors of 2D convolution use two algorithms, FFT-based and direct, but with different criteria for choosing between them. In 8.2, in addition to the "old" deprecated ConvFull and ConvValid APIs, you can find a new one, ippiConv_xx_yyy. This API is more flexible because it lets you choose direct or FFT yourself. The internal criteria depend heavily on the CPU architecture (SSE, SSE2, ... AVX2) and on the function flavor, and cannot be optimal for all hardware available on the market, so please use the new API and experiment with switching algorithms yourself.
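One way to experiment with the switching is a small timing harness like the plain C sketch below. It is only an illustration under stated assumptions: `conv_fn`, `time_variant`, `pick_fastest` and `count_calls` are hypothetical names, not part of IPP, and each callback is assumed to wrap one convolution call with a fixed algorithm choice (for example, one callback for direct and one for FFT, using whatever selection argument the IPP reference documents for the new API). `clock()` measures processor time, so for threaded builds a wall-clock timer may be more appropriate.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* One candidate to be timed. In real use the callback would wrap a single
 * convolution call with a fixed algorithm choice (assumption: the caller
 * supplies that wrapper; nothing here calls IPP). */
typedef void (*conv_fn)(void *ctx);

/* Average processor time in seconds per call, after one warm-up call. */
double time_variant(conv_fn fn, void *ctx, int reps)
{
    assert(fn != NULL && reps > 0);
    fn(ctx);                              /* warm-up: caches, lazy init */
    clock_t t0 = clock();
    for (int r = 0; r < reps; ++r) {
        fn(ctx);
    }
    clock_t t1 = clock();
    return (double)(t1 - t0) / CLOCKS_PER_SEC / reps;
}

/* Index of the variant with the lowest measured average time. */
size_t pick_fastest(const conv_fn *variants, size_t count, void *ctx, int reps)
{
    assert(variants != NULL && count > 0);
    size_t best = 0;
    double best_time = time_variant(variants[0], ctx, reps);
    for (size_t k = 1; k < count; ++k) {
        double t = time_variant(variants[k], ctx, reps);
        if (t < best_time) {
            best_time = t;
            best = k;
        }
    }
    return best;
}

/* Stand-in workload for trying the harness: counts how often it is called.
 * ctx must point to an int. */
void count_calls(void *ctx)
{
    ++*(int *)ctx;
}
```

Running the harness once per target machine, image size and data type, and caching the winning index, follows the advice above: the best choice depends on the hardware and the function flavor, so it is measured and not assumed.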

regards, Igor
