I'm evaluating the performance of IPP 7.0, which is giving us great improvements over our current (Blitz) baseline.
However, I'm seeing some strange results when benchmarking IPP against itself (see below): the Win32 build performs substantially better than the x64 build.
Setup: VS2010, Core i7 2.0 GHz, MSVC optimization options enabled. Times are in seconds for 2000 iterations.
What might cause this? Are the IPP 7.0 routines not optimized for x64?
Conv. Type      Main     Filter  Type   WIN32  x64   Routines
2D              190x190  9x9     float  0.87   1.61  ippiConvFull_32f_C1R
                                 short  0.62   1.58  ippiConvFull_16s_C1R
Separable 2x1D  190x190  2 9x1   float  0.23   0.33  ippiFilterColumn_32f_C1R
                                 short  0.18   0.48  ippiFilterColumn_16u_C1R
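
For anyone who wants to reproduce this kind of comparison, here is a minimal sketch of how a "total seconds for N iterations" figure could be measured. It is not the exact benchmark code behind the numbers above. `time_iterations` and `conv_full` are hypothetical names, not IPP functions: `conv_full` is a naive full 2D convolution used only as a stand-in for the spot where a call such as ippiConvFull_32f_C1R would go. The sketch assumes a C++11 compiler for `std::chrono`, which VS2010 does not ship.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// Hypothetical helper: calls fn() `iterations` times and returns the total
// elapsed wall-clock time in seconds.
template <typename Fn>
double time_iterations(Fn&& fn, int iterations) {
    assert(iterations > 0);
    typedef std::chrono::steady_clock clock;
    const clock::time_point start = clock::now();
    for (int i = 0; i < iterations; ++i) {
        fn();
    }
    const std::chrono::duration<double> elapsed = clock::now() - start;
    return elapsed.count();
}

// Naive stand-in workload (assumption: NOT an IPP routine).
// Full 2D convolution of a row-major sw x sh image with a kw x kh kernel;
// the output is (sw + kw - 1) wide and (sh + kh - 1) high.
std::vector<float> conv_full(const std::vector<float>& src, int sw, int sh,
                             const std::vector<float>& ker, int kw, int kh) {
    const int dw = sw + kw - 1;
    const int dh = sh + kh - 1;
    std::vector<float> dst(static_cast<std::size_t>(dw) * dh, 0.0f);
    for (int y = 0; y < sh; ++y) {
        for (int x = 0; x < sw; ++x) {
            const float s = src[static_cast<std::size_t>(y) * sw + x];
            // Scatter this source pixel through every kernel tap.
            for (int j = 0; j < kh; ++j) {
                for (int i = 0; i < kw; ++i) {
                    dst[static_cast<std::size_t>(y + j) * dw + (x + i)] +=
                        s * ker[static_cast<std::size_t>(j) * kw + i];
                }
            }
        }
    }
    return dst;
}
```

To mirror the 2D float row of the table, one would fill a 190x190 image and a 9x9 kernel, pass a lambda that calls the routine under test to `time_iterations` with 2000 iterations, and build the same source once as Win32 and once as x64. Allocating buffers outside the timed lambda and running a few untimed warm-up iterations first helps keep allocation and cache effects out of the comparison.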