Intel® Integrated Performance Primitives
Deliberate problems developing high-performance vision, signal, security, and storage applications.

Any IPP primitive for de-spread function

rohitspandey
Beginner
118 Views
Hi,
Is there an IPP function for despreading?
I have implemented it in C as shown below. Is there an IPP function that does the same?
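for (sym = 0; sym < N_corr; sym++)
{
    sum = 0.0f;
    for (chip = 0; chip < sf; chip++)
    {
        sum += (*pSrc++) * (*pW++); // multiply each chip by the code and accumulate
    }
    *pDest++ = sum; // one despread symbol
    pW -= sf;       // rewind to the start of the spreading code
}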
Regards
Rohit
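
For reference, the loop above amounts to one dot product per symbol. In the formula below, x, w and y are shorthand (not names from the post) for the data behind pSrc, pW and pDest, and the indexing assumes the chips are stored contiguously, as the pointer increments imply:

```latex
y[n] = \sum_{k=0}^{sf-1} x[n \cdot sf + k] \, w[k], \qquad n = 0, 1, \dots, N_{corr} - 1
```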
3 Replies
Chao_Y_Intel
Employee

Rohit,

The inner loop looks like a dot product, and it is the hot spot of the code:

for (chip = 0; chip < sf; chip++) {
    sum += (*pSrc++) * (*pW++);
}

If sf is large, this loop could be replaced with an ippsDotProd_ function.

Thanks,
Chao

igorastakhov
New Contributor II
I see only the following possible variants based on the currently available IPP functionality:
1) already mentioned by Chao:

for (sym = 0; sym < N_corr; sym++) {
    Ipp32f sum;
    ippsDotProd_32f( pSrc, pW, sf, &sum );
    *pDest++ = sum;
    pSrc += sf;
}
2) with a temporary buffer pBuf of size sf:
Ipp32f sum;
for( sym = 0; sym < N_corr; sym++ ){
    ippsMul_32f( pSrc, pW, pBuf, sf );
    pSrc += sf;
    ippsSum_32f( pBuf, sf, &sum, ippAlgHintFast );
    *pDest++ = sum;
}
3) based on the AddProduct function, though I guess this is not so efficient

I think the first one should be the most efficient for a reasonable sf. Otherwise there is no alternative to C code compiled with the Intel compiler, but for efficient code generation (vectorization) you should rewrite your code with arrays and indexes and use the "ivdep" pragma.
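
The array-and-index rewrite suggested above might look like the following sketch. It is not an IPP routine: the function name despread_indexed, the restrict qualifiers and the compiler guard around the pragma are assumptions added for illustration, and whether the loop actually vectorizes depends on the compiler and its options.

```c
#include <stddef.h>

/* Despread n_sym symbols. Each output is the dot product of sf consecutive
 * input chips with the spreading code w[0..sf-1].
 * despread_indexed is a hypothetical name, not an IPP function. */
void despread_indexed(const float *restrict src, const float *restrict w,
                      float *restrict dst, size_t n_sym, size_t sf)
{
    for (size_t sym = 0; sym < n_sym; sym++) {
        const float *chips = src + sym * sf; /* first chip of this symbol */
        float sum = 0.0f;

        /* The pragma is only emitted for Intel compilers, so other
         * compilers do not warn about an unknown pragma. */
#if defined(__INTEL_COMPILER) || defined(__INTEL_LLVM_COMPILER)
#pragma ivdep
#endif
        for (size_t chip = 0; chip < sf; chip++) {
            sum += chips[chip] * w[chip];
        }
        dst[sym] = sum;
    }
}
```

With indexes instead of moving pointers, the spreading code never has to be rewound, and the trip count of the inner loop is explicit, which tends to make the loop easier for a compiler to analyze.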

Regards,
Igor

rohitspandey
Beginner
Hi,

Thanks for the help. For significant sf sizes, the dot product version performs well compared to the plain C code.

Regards
Rohit