Intel® Integrated Performance Primitives
Deliberate problems developing high-performance vision, signal, security, and storage applications.

Any IPP primitive for de-spread function

rohitspandey
Beginner
223 Views
Hi,
Is there a despread function in IPP?
I have implemented this in C. Is there an IPP function that does the same?
for (sym = 0; sym < N_corr; sym++)
{
    sum = 0.0f;
    for (chip = 0; chip < sf; chip++)
    {
        sum += (*pSrc++) * (*pW++); // multiply chip by code and accumulate
    }
    *pDest++ = sum; // one despread symbol
    pW -= sf; // rewind the code for the next symbol
}
Regards
Rohit
3 Replies
Chao_Y_Intel
Moderator

Rohit,

The inner loop looks like a dot product, and it is the hot spot of the code:

for (chip = 0; chip < sf; chip++){

sum += (*pSrc++) * (*pW++); //

}

If sf is large, the loop could be replaced with an ippsDotProd_ function.
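
Written out, the point above is that each despread symbol is a single dot product over sf chips. The symbols x (input samples at pSrc), w (code at pW), y (output at pDest) and the index s are notation introduced here for illustration, not names from the thread:

```latex
y[s] = \sum_{c=0}^{\mathit{sf}-1} x[s \cdot \mathit{sf} + c] \, w[c],
\qquad s = 0, \dots, N_{corr} - 1
```

The code w does not depend on s, which is why the C version rewinds pW after every symbol.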

Thanks,
Chao

igorastakhov
New Contributor II
I see only a few possible variants based on the currently available IPP functionality:
1) already mentioned by Chao:

for (sym = 0; sym < N_corr; sym++){

Ipp32f sum;

ippsDotProd_32f( pSrc, pW, sf, &sum );
*pDest++ = sum; //
pSrc += sf;
}
2) with temp buf of size sf:
Ipp32f sum;
for( sym = 0; sym < N_corr; sym++ ){
ippsMul_32f( pSrc, pW, pBuf, sf );
pSrc += sf;
ippsSum_32f( pBuf, sf, &sum, ippAlgHintFast );
*pDest++ = sum;
}
3) based on the AddProduct function, which I guess is not as efficient

I think the 1st one should be the most efficient for a reasonable sf. Otherwise there is no alternative to C code compiled with the Intel compiler, but for efficient code generation (vectorization) you should rewrite your code with arrays and indexes and use the "ivdep" pragma.
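
A minimal sketch of what the array-and-index rewrite could look like in plain C. The function name despread_indexed, the restrict qualifiers, and the compiler-macro guard around the pragma are assumptions added for illustration, not part of the thread or of IPP. Whether "ivdep" is actually needed for a simple reduction like this depends on the compiler, so treat it as a hint to experiment with rather than a guaranteed speedup.

```c
#include <stddef.h>

/* Despread n_sym symbols. Each output is the dot product of one sf-chip
 * block of src with the code w. Hypothetical helper, not an IPP call. */
static void despread_indexed(const float *restrict src,
                             const float *restrict w,
                             float *restrict dst,
                             size_t n_sym, size_t sf)
{
    for (size_t sym = 0; sym < n_sym; sym++) {
        /* Start of the chip block belonging to this symbol. */
        const float *block = src + sym * sf;
        float sum = 0.0f;

        /* Only emit the pragma for Intel compilers, so other compilers
         * do not warn about an unknown pragma. */
#if defined(__INTEL_COMPILER) || defined(__INTEL_LLVM_COMPILER)
#pragma ivdep
#endif
        for (size_t chip = 0; chip < sf; chip++) {
            sum += block[chip] * w[chip];
        }
        dst[sym] = sum;
    }
}
```

Indexing with sym * sf removes the pointer increments and the pW rewind from the original loop, which makes the access pattern easier for a vectorizer to analyze.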

Regards,
Igor

rohitspandey
Beginner
Hi,

Thanks for the help. Compared to the C version, the dot product performs well for significant sf sizes.

Regards
Rohit