Intel® Integrated Performance Primitives
Deliberate problems developing high-performance vision, signal, security, and storage applications.

CPU Usage Problem for FIR Filter.

dolsei0
Beginner
1,249 Views

Hi,

I am running C++ code that uses IPP to apply an FIR filter to 512 samples, and I repeat this 640*480 times.

When I run the code on a 2-way quad-core Xeon system (8 cores in total), CPU usage is only 12%.

Only one core is working.

What is my mistake?

My system is as follows:

Xeon E5320 Quad - 2 way : 8 Core

Windows Server 2003 R2 x86

IPP : 5.3.3

IPP Dynamic Link Library

Visual Studio 6.0 + Intel Compiler 10.1

int nTotalPixel = m_ImageSizeX * m_ImageSizeY;

int i;
int k;
int len = m_nImage;
IppsFIRState_32f* pState;
IppStatus st;
// ippsMalloc_* takes a number of elements, not a number of bytes.
Ipp64f* taps = ippsMalloc_64f(tapslen);
Ipp32f* taps_32f = ippsMalloc_32f(tapslen);
Ipp32f* pSrc = ippsMalloc_32f(len);
Ipp32f* FIRDst = ippsMalloc_32f(len);
Ipp32f* pDL = ippsMalloc_32f(tapslen);
ippsZero_32f(pDL,tapslen);

// COMPUTES TAPSLEN COEFFICIENTS FOR BANDPASS FIR FILTER..
ippsFIRGenBandpass_64f( LowFreq, HighFreq, taps, tapslen, ippWinHamming, ippTrue);
ippsConvert_64f32f(taps,taps_32f,tapslen);

// INITIALIZE FIR FILTER..
ippsFIRInitAlloc_32f(&pState,taps_32f,tapslen, pDL);

BYTE *pImageBuffer;

for(i = 0; i < nTotalPixel; ++i)
{
    pImageBuffer = &m_pImageBuffer[i*m_nBuffer];

    // GENERATE SOURCE VECTOR
    ippsConvert_8u32f(pImageBuffer,pSrc,len);

    // FILTER AN INPUT VECTOR
    ippsFIR_32f(pSrc, FIRDst, len, pState);

    ippsAddC_32f_I(128.,FIRDst,len);
    ippsConvert_32f8u_Sfs(FIRDst,pImageBuffer,len,ippRndNear,0);
}

ippsFIRFree_32f(pState);

ippsFree(pSrc);
ippsFree(FIRDst);
ippsFree(taps);
ippsFree(taps_32f);
ippsFree(pDL);

5 Replies
Vladimir_Dudnik
Employee

IPP internally estimates the size of the data to process and turns on threading only when it gives a performance gain.

Vladimir

dolsei0
Beginner
Thank you for your reply.
I am only applying the FIR filter to 512 samples,
and repeating this same job 640 * 480 times.
I don't understand why IPP cannot use multiple threads for this workload.
Vladimir_Dudnik
Employee

Do you expect the library to launch 8 threads to process 64 elements of data each?

Are you aware of the cost of launching a thread? It takes between 2000 and 3000 processor clocks.

Vladimir
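
The trade-off described in this reply can be put into rough numbers. The sketch below is only an illustration: the function name threading_pays_off and the cost model (the serial work divided evenly across threads, plus one fixed launch overhead per call) are assumptions made for this example, not how IPP actually decides.

```cpp
#include <cassert>

// Hypothetical back-of-the-envelope model (an assumption, not IPP's logic):
// splitting a call across threads is worthwhile only if the per-thread share
// of the work plus the launch overhead is smaller than the serial cost.
bool threading_pays_off(double serial_cycles, int threads, double launch_cycles)
{
    assert(threads > 0);
    const double parallel_cycles = serial_cycles / threads + launch_cycles;
    return parallel_cycles < serial_cycles;
}
```

Under this model, a call that costs about 2000 cycles serially does not benefit from 8 threads when launching them costs 2500 cycles (2000 / 8 + 2500 = 2750), whereas a call that costs a million cycles clearly does.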

dolsei0
Beginner
Oh,
I understand now. Thank you for your reply.
Then would it help to run this loop in parallel using OpenMP,
since each IPP function call uses only one CPU?
Vladimir_Dudnik
Employee
If you can filter your image with a 2D FIR, then you may consider calling the ippiFIR filter function. Of course, it is also possible to launch threads on top of IPP. You may use OpenMP or any other threading API.
Vladimir
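
A minimal sketch of what threading on top of the per-pixel loop might look like with OpenMP is shown below. To keep it compilable without IPP, a plain C++ direct-form FIR stands in for the ippsConvert_8u32f / ippsFIR_32f / ippsAddC_32f_I / ippsConvert_32f8u_Sfs sequence from the original post. The function name filter_all_pixels and its parameters are hypothetical. The sketch also assumes that every pixel is filtered independently with a zeroed delay line, whereas the original code reuses one pState, so its delay line carries over from pixel to pixel. With IPP, each thread would need its own FIR state and its own pSrc/FIRDst buffers, just as each thread has its own scratch buffer here.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: filters nPixels independent series of `len` samples in
// place. Series i starts at image[i * stride]. The +128 offset and 8-bit
// saturation mirror the original post.
void filter_all_pixels(std::vector<std::uint8_t>& image, int nPixels, int len,
                       int stride, const std::vector<float>& taps)
{
    assert(len > 0 && stride >= len && !taps.empty());
    assert(image.size() >= static_cast<std::size_t>(nPixels) * stride);
    const int nTaps = static_cast<int>(taps.size());

    #pragma omp parallel
    {
        // Per-thread scratch buffer, so threads never share working memory.
        std::vector<float> src(len);

        // Pixels are divided among the threads; each pixel is independent.
        #pragma omp for schedule(static)
        for (int i = 0; i < nPixels; ++i) {
            std::uint8_t* p = &image[static_cast<std::size_t>(i) * stride];
            for (int n = 0; n < len; ++n)
                src[n] = static_cast<float>(p[n]);

            for (int n = 0; n < len; ++n) {
                // Direct-form FIR; samples before the start count as zero.
                float acc = 0.0f;
                const int tMax = std::min(n, nTaps - 1);
                for (int t = 0; t <= tMax; ++t)
                    acc += taps[t] * src[n - t];

                // Add the 128 offset, round to nearest, saturate to 0..255.
                const float r = std::nearbyint(acc + 128.0f);
                p[n] = static_cast<std::uint8_t>(
                    std::min(255.0f, std::max(0.0f, r)));
            }
        }
    }
}
```

The pragmas take effect only when OpenMP is enabled in the compiler (for example -fopenmp with GCC, or the corresponding OpenMP switch of the Intel compiler); otherwise they are ignored and the loop runs serially with the same result. Because the threads are started once for the whole 640*480 loop rather than once per 512-sample call, the launch cost mentioned earlier in the thread is paid only once.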
