Bogdan_B_1
Beginner
158 Views

Time performance - ippiCrossCorrNorm_32f_C1R

Hi,
 
I have compared the time performance of cross-correlation (with normalized coefficients) in IPP 7.0 vs. IPP 2018. In my test, the old version is 2x - 3x faster. I have disabled Hyper-Threading in the BIOS. I have tried the library versions for different processors (y8, e9, I9).
VTune (trial) shows that the IPP 2018 version of cross-correlation runs single-threaded, whereas the IPP 7.0 version is multi-threaded.
 
Is there something I'm missing? The only difference between the 2 tests is the call for cross-correlation.
 
My processor is an Intel Core i7 4790.
 
---------------------------------------
Test #1: Ipp 2018
 
Lib info:
targetCpu: I9
Name: "ippCV AVX2 (I9)"
Version: "2018.0.3 (r58644)"
BuildDate: "Apr 7 2018"
 
Code:
 
CGenericImage imInput; // 2048 x 2048, 8-bit image loaded from hdd
CGenericImage imInput_32f; // input image converted to Ipp32f
CGenericImage imTemplate_32f; // generated gaussian template, 9x9
CGenericImage imOutput_32f; // score image
 
IppiSize szImage = { imInput.m_nWidth, imInput.m_nHeight };
IppiSize szTemplate = { imTemplate_32f.m_nWidth, imTemplate_32f.m_nHeight };
 
st |= ippiConvert_8u32f_C1R((Ipp8u*)imInput.m_pData, imInput.m_nStep, (Ipp32f*)imInput_32f.m_pData, imInput_32f.m_nStep, szImage);
 
IppEnum algType = (IppEnum)(ippAlgAuto | ippiROISame | ippiNormCoefficient); // must be defined before the buffer-size query

Ipp8u* pBuffer = NULL;
int nBufferSize = 0;

st |= ippiCrossCorrNormGetBufferSize(szImage, szTemplate, algType, &nBufferSize);
pBuffer = ippsMalloc_8u(nBufferSize);
 
st |= ippiCrossCorrNorm_32f_C1R((Ipp32f*)imInput_32f.m_pData, imInput_32f.m_nStep, szImage
, (Ipp32f*)imTemplate_32f.m_pData, imTemplate_32f.m_nStep, szTemplate
, (Ipp32f*)imOutput_32f.m_pData, imOutput_32f.m_nStep, algType, pBuffer);
 
---------------------------------------
Test #2: Ipp 7.0
 
Lib info:
targetCpu: e9
Name: "ippcve9-7.0.dll"
Version: "7.0 build 250.85"
BuildDate: "Nov 27 2011"
 
Code:
 
CGenericImage imInput; // 2048 x 2048, 8-bit image loaded from hdd
CGenericImage imInput_32f; // input image converted to Ipp32f
CGenericImage imTemplate_32f; // generated gaussian template, 9x9
CGenericImage imOutput_32f; // score image
 
IppiSize szImage = { imInput.m_nWidth, imInput.m_nHeight };
IppiSize szTemplate = { imTemplate_32f.m_nWidth, imTemplate_32f.m_nHeight };
 
st = ippiConvert_8u32f_C1R((Ipp8u*)imInput.m_pData, imInput.m_nStep, (Ipp32f*)imInput_32f.m_pData, imInput_32f.m_nStep, szImage);
 
st = ippiCrossCorrSame_NormLevel_32f_C1R((Ipp32f*)imInput_32f.m_pData, imInput_32f.m_nStep, szImage
, (Ipp32f*)imTemplate_32f.m_pData, imTemplate_32f.m_nStep, szTemplate
, (Ipp32f*)imOutput_32f.m_pData, imOutput_32f.m_nStep);
 
---------------------------------------
3 Replies
Jing_X_Intel
Employee

Hi,

May I know how you linked your program against IPP?

Did you link the program against the multi-threaded version of IPP 2018?

Bogdan_B_1
Beginner

Hi,

I have solved the "mystery".

"Intel IPP 8.0 continues the process of deprecating threading inside Intel IPP functions that was started in version 7.1. Though not installed by default, the threaded libraries can be installed so code written with these libraries will still work as before. However, moving to external threading is recommended."

It's funny how I was able to find this information only after this topic was created :)
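
For anyone landing here later, a minimal sketch of what "external threading" could look like for the 2048 x 2048 case above: split the output rows into one band per physical core and run each band on its own thread. This only covers the partitioning and threading. The names RowBand, splitRows, forEachBand and countRowsProcessed are hypothetical helpers, not part of IPP, and the actual IPP call per band is left as a comment. I am assuming that, to match the ippiROISame result across band boundaries, each band's source region would also need extra halo rows (template height minus 1 in total), which is not shown here.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <functional>
#include <thread>
#include <vector>

// Half-open range of output rows [begin, end) handled by one worker.
struct RowBand {
    int begin;
    int end;
};

// Split `height` rows into at most `nBands` contiguous, near-equal bands.
std::vector<RowBand> splitRows(int height, int nBands) {
    std::vector<RowBand> bands;
    if (height <= 0 || nBands <= 0) return bands;
    nBands = std::min(nBands, height);  // never create empty bands
    const int base = height / nBands;
    const int extra = height % nBands;  // the first `extra` bands get one more row
    int row = 0;
    for (int i = 0; i < nBands; ++i) {
        const int rows = base + (i < extra ? 1 : 0);
        bands.push_back({row, row + rows});
        row += rows;
    }
    return bands;
}

// Run `work` once per band, each on its own thread, and wait for all of them.
void forEachBand(const std::vector<RowBand>& bands,
                 const std::function<void(const RowBand&)>& work) {
    std::vector<std::thread> workers;
    workers.reserve(bands.size());
    for (const RowBand& band : bands)
        workers.emplace_back([&work, band] { work(band); });
    for (std::thread& t : workers) t.join();
}

// Stand-in for the real job: counts the rows each worker was given.
int countRowsProcessed(int height, int nThreads) {
    std::atomic<int> rows{0};
    forEachBand(splitRows(height, nThreads), [&rows](const RowBand& band) {
        // Assumption: a real worker would call the single-threaded IPP
        // function here on this band of the image (plus halo rows),
        // using its own work buffer, since the buffer cannot be shared.
        rows += band.end - band.begin;
    });
    assert(rows.load() == height);  // every row is covered exactly once
    return rows.load();
}
```

With the Core i7 4790 and Hyper-Threading off, something like countRowsProcessed(2048, 4) would give four bands of 512 rows each. Each thread needs its own pBuffer, so the buffer size query would have to be made with the per-band size.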

Jing_X_Intel
Employee

Hi,

Bravo.

Good to hear that.
