Intel® Integrated Performance Primitives
Deliberate problems developing high-performance vision, signal, security, and storage applications.

Time performance - ippiCrossCorrNorm_32f_C1R

Bogdan_B_1
Beginner
742 Views
Hi,
 
I have compared the time performance of cross-correlation (with normalized coefficients) in IPP 7.0 vs. IPP 2018. In my test, the old version is 2x - 3x faster. I have disabled Hyper-Threading in the BIOS. I have tried the library versions for different processors (y8, e9, I9).
VTune (trial) shows that the IPP 2018 version of cross-correlation runs single-threaded, whereas the IPP 7.0 version is multi-threaded.
 
Is there something I'm missing? The only difference between the 2 tests is the cross-correlation call.
 
My processor is an Intel Core i7 4790.
 
---------------------------------------
Test #1: Ipp 2018
 
Lib info:
targetCpu: I9
Name: "ippCV AVX2 (I9)"
Version: "2018.0.3 (r58644)"
BuildDate: "Apr 7 2018"
 
Code:
 
CGenericImage imInput; // 2048 x 2048, 8-bit image loaded from hdd
CGenericImage imInput_32f; // input image converted to Ipp32f
CGenericImage imTemplate_32f; // generated gaussian template, 9x9
CGenericImage imOutput_32f; // score image
 
IppiSize szImage = { imInput.m_nWidth, imInput.m_nHeight };
IppiSize szTemplate = { imTemplate_32f.m_nWidth, imTemplate_32f.m_nHeight };
 
st |= ippiConvert_8u32f_C1R((Ipp8u*)imInput.m_pData, imInput.m_nStep, (Ipp32f*)imInput_32f.m_pData, imInput_32f.m_nStep, szImage);
 
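// algType has to be defined before it is passed to ippiCrossCorrNormGetBufferSize
IppEnum algType = (IppEnum)(ippAlgAuto | ippiROISame | ippiNormCoefficient);
 
Ipp8u* pBuffer = NULL;
int nBufferSize = 0;
 
// query the work buffer size for this image/template/algType combination, then allocate it
st |= ippiCrossCorrNormGetBufferSize(szImage, szTemplate, algType, &nBufferSize);
pBuffer = ippsMalloc_8u(nBufferSize);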
 
st |= ippiCrossCorrNorm_32f_C1R((Ipp32f*)imInput_32f.m_pData, imInput_32f.m_nStep, szImage
, (Ipp32f*)imTemplate_32f.m_pData, imTemplate_32f.m_nStep, szTemplate
, (Ipp32f*)imOutput_32f.m_pData, imOutput_32f.m_nStep, algType, pBuffer);
 
ippsFree(pBuffer); // release the work buffer once the call has returned
 
---------------------------------------
Test #2: Ipp 7.0
 
Lib info:
targetCpu: e9
Name: "ippcve9-7.0.dll"
Version: "7.0 build 250.85"
BuildDate: "Nov 27 2011"
 
Code:
 
CGenericImage imInput; // 2048 x 2048, 8-bit image loaded from hdd
CGenericImage imInput_32f; // input image converted to Ipp32f
CGenericImage imTemplate_32f; // generated gaussian template, 9x9
CGenericImage imOutput_32f; // score image
 
IppiSize szImage = { imInput.m_nWidth, imInput.m_nHeight };
IppiSize szTemplate = { imTemplate_32f.m_nWidth, imTemplate_32f.m_nHeight };
 
st = ippiConvert_8u32f_C1R((Ipp8u*)imInput.m_pData, imInput.m_nStep, (Ipp32f*)imInput_32f.m_pData, imInput_32f.m_nStep, szImage);
 
st = ippiCrossCorrSame_NormLevel_32f_C1R((Ipp32f*)imInput_32f.m_pData, imInput_32f.m_nStep, szImage
, (Ipp32f*)imTemplate_32f.m_pData, imTemplate_32f.m_nStep, szTemplate
, (Ipp32f*)imOutput_32f.m_pData, imOutput_32f.m_nStep);
 
---------------------------------------
3 Replies
Jing_Xu
Employee

Hi,

May I know how you linked your program against IPP?

Did you link it against the multi-threaded version of IPP 2018?

Bogdan_B_1
Beginner

Hi,

I have solved the "mystery".

"Intel IPP 8.0 continues the process of deprecating threading inside Intel IPP functions that was started in version 7.1. Though not installed by default, the threaded libraries can be installed so code written with these libraries will still work as before. However, moving to external threading is recommended."

It's funny how I was able to find this information only after this topic was created :)
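
For anyone landing here later, this is roughly what "external threading" from the quote above can look like: the caller splits the output rows into bands and runs a single-threaded kernel on each band in its own thread. It is only a minimal sketch. correlateRows and correlateThreaded are hypothetical names, and correlateRows is a plain C++ stand-in (un-normalized, zero-padded, same-size correlation) so that the example builds without IPP. It is not the IPP implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical stand-in for the per-band kernel: un-normalized, zero-padded,
// "same"-size cross-correlation for output rows [rowBegin, rowEnd).
// In a real build, a single-threaded library call on the band would go here.
static void correlateRows(const std::vector<float>& src, int width, int height,
                          const std::vector<float>& tpl, int tplW, int tplH,
                          std::vector<float>& dst, int rowBegin, int rowEnd)
{
    const int ax = tplW / 2, ay = tplH / 2; // template anchor at its center
    for (int y = rowBegin; y < rowEnd; ++y) {
        for (int x = 0; x < width; ++x) {
            float acc = 0.0f;
            for (int j = 0; j < tplH; ++j) {
                const int sy = y + j - ay;
                if (sy < 0 || sy >= height) continue; // zero padding
                for (int i = 0; i < tplW; ++i) {
                    const int sx = x + i - ax;
                    if (sx < 0 || sx >= width) continue; // zero padding
                    acc += src[static_cast<std::size_t>(sy) * width + sx] *
                           tpl[static_cast<std::size_t>(j) * tplW + i];
                }
            }
            dst[static_cast<std::size_t>(y) * width + x] = acc;
        }
    }
}

// External threading: one thread per band of output rows.
// Bands write disjoint rows of dst, so no locking is needed.
static void correlateThreaded(const std::vector<float>& src, int width, int height,
                              const std::vector<float>& tpl, int tplW, int tplH,
                              std::vector<float>& dst, int numThreads)
{
    assert(src.size() == static_cast<std::size_t>(width) * height);
    assert(dst.size() == src.size());

    numThreads = std::max(1, std::min(numThreads, height));
    const int rowsPerBand = (height + numThreads - 1) / numThreads; // ceiling division

    std::vector<std::thread> workers;
    for (int t = 0; t < numThreads; ++t) {
        const int rowBegin = t * rowsPerBand;
        const int rowEnd = std::min(height, rowBegin + rowsPerBand);
        if (rowBegin >= rowEnd) break; // no rows left for this band
        workers.emplace_back([&, rowBegin, rowEnd] {
            correlateRows(src, width, height, tpl, tplW, tplH, dst, rowBegin, rowEnd);
        });
    }
    for (std::thread& w : workers) w.join();
}
```

Called as correlateThreaded(src, 2048, 2048, tpl, 9, 9, dst, 4), this should give the same result as one pass over all rows. If the lambda body is swapped for ippiCrossCorrNorm_32f_C1R on each band, my assumption is that every thread needs its own work buffer and that the band edges need extra source rows (half the template height on each side) to match the full-image result. I have not measured that variant.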

Jing_Xu
Employee

Hi,

Bravo.

Good to hear that.
