Intel® Integrated Performance Primitives
Community support and discussions relating to developing high-performance vision, signal, security, and storage applications.

Any IPP primitive for de-spread function

rohitspandey
Beginner
Hi,
Is there a despread function in IPP?
I have implemented it in C as shown below. Is there an IPP function that does the same?
for (sym = 0; sym < N_corr; sym++)
{
    sum = 0.0f;
    for (chip = 0; chip < sf; chip++)
    {
        sum += (*pSrc++) * (*pW++);  /* correlate one chip with the code */
    }
    *pDest++ = sum;  /* one despread symbol */
    pW -= sf;        /* rewind the code for the next symbol */
}
Regards
Rohit
3 Replies
Chao_Y_Intel
Employee

Rohit,

The inner loop looks like a dot product, and it is the hot spot of the code:

for (chip = 0; chip < sf; chip++) {
    sum += (*pSrc++) * (*pW++);
}

If sf is large, this loop could be replaced with an ippsDotProd_ function.

Thanks,
Chao

igorastakhov
New Contributor II
I see only 2 practical variants based on the currently available IPP functionality, plus a third that is probably less efficient:
1) already mentioned by Chao:

for (sym = 0; sym < N_corr; sym++) {
    Ipp32f sum;
    ippsDotProd_32f( pSrc, pW, sf, &sum );
    *pDest++ = sum;
    pSrc += sf;
}
2) with a temporary buffer pBuf of size sf:
Ipp32f sum;
for (sym = 0; sym < N_corr; sym++) {
    ippsMul_32f( pSrc, pW, pBuf, sf );
    pSrc += sf;
    ippsSum_32f( pBuf, sf, &sum, ippAlgHintFast );
    *pDest++ = sum;
}
3) based on the AddProduct function; I would guess this is not as efficient.

I think the 1st one should be the most efficient for a reasonable sf. Otherwise there is no alternative to C code compiled with the Intel compiler, but for efficient code generation (vectorization) you should rewrite your code with arrays and indexes and use the "ivdep" pragma.
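As a rough illustration of the array-and-index rewrite, here is a minimal sketch in plain C. The function name despread_indexed and its parameter names are made up for this example and are not part of IPP. The restrict qualifiers assume the three buffers do not overlap, and the pragma is guarded so that only Intel compilers see it. Whether the compiler actually vectorizes the reduction also depends on its floating-point settings, so treat this as a starting point rather than a tuned implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Despread n_sym symbols. Each output is the dot product of sf input
 * chips with the same sf-chip code w.
 * Hypothetical helper for illustration only; not an IPP function.
 * Assumes src holds n_sym * sf floats and that src, w and dst
 * do not overlap (hence restrict). */
void despread_indexed(const float *restrict src,
                      const float *restrict w,
                      float *restrict dst,
                      int n_sym, int sf)
{
    assert(n_sym >= 0 && sf > 0);

    for (int sym = 0; sym < n_sym; sym++) {
        /* Start of the chips that belong to this symbol. */
        const float *chips = &src[(size_t)sym * (size_t)sf];
        float sum = 0.0f;

        /* ivdep is an Intel compiler hint; other compilers skip it. */
#if defined(__INTEL_COMPILER) || defined(__INTEL_LLVM_COMPILER)
#pragma ivdep
#endif
        for (int chip = 0; chip < sf; chip++) {
            sum += chips[chip] * w[chip];
        }
        dst[sym] = sum;
    }
}
```

With src = {1, 2, 3, 4, 5, 6}, w = {1, -1, 1}, n_sym = 2 and sf = 3, this writes 2 and 5 to dst. Indexing from a per-symbol base pointer also removes the need to rewind the code pointer after every symbol.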

Regards,
Igor

rohitspandey
Beginner
Hi,

Thanks for the help. The performance of the dot product function is good compared to plain C for significant sf sizes.

Regards
Rohit