coolsandyforyou
Beginner

problem with GetCpuClocks

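#include <stdio.h>
#include <ipp.h>

int main(void)
{
    enum { SIZE = 256 };   /* compile-time constant for the array sizes */
    Ipp8u pSrc[SIZE], pDst[SIZE];
    Ipp64u begin, end;
    int i;

    for (i = 0; i < SIZE; i++) pSrc[i] = (Ipp8u)i;

    /* time the plain C copy loop */
    begin = ippGetCpuClocks();
    for (i = 0; i < SIZE; i++) pDst[i] = pSrc[i];
    end = ippGetCpuClocks();
    printf("time taken in c=%llu\n", (unsigned long long)(end - begin));

    /* time the IPP copy */
    begin = ippGetCpuClocks();
    ippsCopy_8u(pSrc, pDst, SIZE);
    end = ippGetCpuClocks();
    printf("time taken in ipp=%llu\n", (unsigned long long)(end - begin));

    return 0;
}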

I am surprised to see that the time taken by IPP is 6 times larger than the time taken by the C loop. Is there anything wrong with the code?
Sergey_K_Intel
Employee

If you compile with optimization enabled, the compiler may optimize away the entire C loop, so the timed region contains only two successive calls to ippGetCpuClocks. Meanwhile, ippsCopy really does copy all the data from src to dst.
Try your sample with the "-Od" compiler option, i.e. without optimization.

P.S. This behaviour is usual for optimizing compilers. If they see that a variable is never used later in the code, they may not generate any code for it at all.

Regards,
Sergey
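
A minimal sketch of one way to keep an optimizing compiler from discarding the timed loop, so that optimization can stay on during the measurement. The helper names copy_bytes_observable and checksum are hypothetical and not part of IPP. The volatile destination forces every store to happen, and the checksum gives the copied data a later use. Note that volatile also prevents vectorization, so this variant may run slower than an ordinary loop.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: byte copy with the same shape as the timed C loop.
   Writing through a volatile pointer means the compiler must perform
   every store, so the loop cannot be removed as dead code. */
static void copy_bytes_observable(const unsigned char *src,
                                  volatile unsigned char *dst,
                                  size_t n)
{
    size_t i;
    assert(src != NULL && dst != NULL);
    for (i = 0; i < n; i++)
        dst[i] = src[i];
}

/* Hypothetical helper: sum of all bytes in buf. Printing this value after
   the timed region makes the result of the copy observable. */
static unsigned long checksum(const unsigned char *buf, size_t n)
{
    unsigned long sum = 0;
    size_t i;
    assert(buf != NULL);
    for (i = 0; i < n; i++)
        sum += buf[i];
    return sum;
}
```

In the test from the question, the idea would be to call copy_bytes_observable(pSrc, pDst, SIZE) between the two ippGetCpuClocks() calls and print checksum(pDst, SIZE) after the second one.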
coolsandyforyou
Beginner

I turned optimization off and still got the same result, but when I tried other functions such as SAD() it worked fine...
Sergey_K_Intel
Employee

Quoting - coolsandyforyou
I turned optimization off and still got the same result, but when I tried other functions such as SAD() it worked fine...

Hi,
Indeed, the performance numbers are as you described. The problem appears to be a "cold" instruction cache. Our engineers say that when IPP functions are called rarely and on short data, plain C/C++ loops are faster than the IPP calls. But try the following modification of your test (the two added lines are the dummy arrays and the warm-up call to ippsCopy_8u) and you will see different performance data:

#include <stdio.h>
#include <ipp.h>

int main(void)
{
    enum { SIZE = 256 };   /* compile-time constant for the array sizes */
    Ipp8u pSrc[SIZE], pDst[SIZE];
    Ipp64u begin, end;
    Ipp8u pSrc1[SIZE] = {0}, pDst1[SIZE]; // dummy arrays
    int i;

    for (i = 0; i < SIZE; i++) pSrc[i] = (Ipp8u)i;

    ippsCopy_8u(pSrc1, pDst1, SIZE); // instruction cache warming

    begin = ippGetCpuClocks();
    for (i = 0; i < SIZE; i++) pDst[i] = pSrc[i];
    end = ippGetCpuClocks();
    printf("time taken in c=%llu\n", (unsigned long long)(end - begin));

    begin = ippGetCpuClocks();
    ippsCopy_8u(pSrc, pDst, SIZE);
    end = ippGetCpuClocks();
    printf("time taken in ipp=%llu\n", (unsigned long long)(end - begin));

    return 0;
}

Regards,
Sergey
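
The warm-up idea above can be taken one step further by timing many calls and averaging, which reduces the weight of any one-off cost such as a cold cache. The following is only a sketch under stated assumptions: it uses the standard clock() as a stand-in for ippGetCpuClocks and memcpy as a stand-in for ippsCopy_8u so that it builds without IPP, and the names copy_fn, copy_loop, copy_memcpy and average_ticks_per_call are hypothetical. An aggressive optimizer may still simplify the repeated calls, so the numbers should be treated as rough.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <time.h>

/* Common signature for the copy routines being compared. */
typedef void (*copy_fn)(const unsigned char *src, unsigned char *dst, size_t n);

/* Plain C loop, as in the original test. */
static void copy_loop(const unsigned char *src, unsigned char *dst, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++)
        dst[i] = src[i];
}

/* Library copy; an assumption standing in for ippsCopy_8u. */
static void copy_memcpy(const unsigned char *src, unsigned char *dst, size_t n)
{
    memcpy(dst, src, n);
}

/* Makes one untimed warm-up call, then times `reps` calls and returns the
   average number of clock() ticks per call. */
static double average_ticks_per_call(copy_fn fn,
                                     const unsigned char *src,
                                     unsigned char *dst,
                                     size_t n, long reps)
{
    clock_t begin, end;
    long r;

    assert(fn != NULL && reps > 0);

    fn(src, dst, n);              /* warm-up: code and data enter the caches */

    begin = clock();
    for (r = 0; r < reps; r++)
        fn(src, dst, n);
    end = clock();

    return (double)(end - begin) / (double)reps;
}
```

A comparison would then call average_ticks_per_call(copy_loop, pSrc, pDst, SIZE, reps) and average_ticks_per_call(copy_memcpy, pSrc, pDst, SIZE, reps) with the same reps and print both results.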
