Intel® Integrated Performance Primitives
Community support and discussions relating to developing high-performance vision, signal, security, and storage applications.
This community is designed for sharing of public information. Please do not share Intel or third-party confidential information here.

Curious cross correlation timing

Running IPPI6.1.5.054
Testing function:ippiCrossCorrValid_Norm_8u32f_C1R
We use cross-correlation and were investigating the possibility of using a pyramid approach to speed up our process.
So we looked at how long the cross correlation was taking for a fixed source and different sizes of template. To our suprise we found that the timings didn't gradually get faster as the template size was reduced. Instead there were step reductions, usually of around 50%.
As an example: we look for a 600 * 400 template in a 640*480 source: it takes around 100ms
Then we look for a 590 * 400 template in a 640*480 source: it takes around 100ms.
This continues until we look for a 512 * 490 image and suddendly it takes around 50ms!
We then ran a long soak test where our template was varied from 4*4 to 636*476 (in steps of 4 for both axis), and made a large surface plot of the timings. This confirmed that the timings change in discreet steps, which is suggestive of a certain beahviour within the algorithm.
The results of this suggest that trying to optimise our algorithm by using pyramids (so we can progressively refine the template search area) will not have any benefit.
So - we are wondering whether this is as excepted for this feature of IPPI, and would love to know if anyone can offer an explanation as this is quite intriguing.
The chart below is from our spreadsheet. The height is duration and the x and y are the size of template used against a 640*480 source.
Thanks - Jon.
0 Kudos
1 Reply

Please find some comments from our function experts:

CrossCorr is based on 2D FFT internally so its performance changes only when template crosses the next 2^n boundary, within [2^(n-1) 2^n] its always the same as requires the same amount of calculations. Direct algorithm is used for very small templates only.