Hello,
I have a quick question about IPP speed-up.
I thought IPP was supposed to accelerate computation. But when I compare the speed of 'ippsConcat_8u' with the classic 'strcat', 'strcat' is just as fast or even faster.
Maybe I misunderstood the principle, or my source code is bad.
Are there any speed comparisons of IPP functions against 'identical' non-IPP functions?
Maybe string operations are not suitable for demonstrating a speed-up, or I am doing it completely wrong...
Thanks for any advice
Tomas
(Sorry for my English; Google Translate is better than me :-) )
Source code and technical details are included below.
I am running openSUSE 11.2 (x86_64).
I have an AMD Athlon 64 X2 Dual Core Processor 5200+ (is this the biggest mistake?)
I am using ipp/6.1.6.063/em64t
CpuType = 42
CpuFeatures = 15
Here is the source:
___________________________________
#include <stdio.h>   /* printf */
#include "ipp.h"
#include "ippcore.h"
#include "ipps.h"
#include "ippch.h"
#include <string.h>  /* strlen, strcat */
#include <omp.h>     /* omp_get_wtime */
IppStatus concat_ipp( void ) {
    int i;
    Ipp8u string[301] = "";
    Ipp8u suffix[4] = "100";
    for (i = 0; i < 100; i++)
    {
        /* append suffix in place: pDst is the same buffer as pSrc1 */
        ippsConcat_8u(string, (int)strlen((const char*)string),
                      suffix, (int)strlen((const char*)suffix), string);
    }
    //printf("%s\n",string);
    return ippStsNoErr;
}
IppStatus strcat_normal( void ) {
    int i;
    IppStatus st = ippStsNoErr;  /* was returned uninitialized */
    char string[301] = "";
    char suffix[4] = "100";
    for (i = 0; i < 100; i++)
    {
        strcat(string, suffix);
    }
    //printf("%s\n",string);
    return st;
}
int main()
{
    double wtime;
    wtime = omp_get_wtime();
    strcat_normal();
    wtime = omp_get_wtime() - wtime;
    printf("Time strcat: \t\t%fs\n", wtime);
    wtime = omp_get_wtime();
    concat_ipp();
    wtime = omp_get_wtime() - wtime;
    printf("Time ippsConcat_8u: \t%fs\n", wtime);
    return 0;
}
_____________________________________
Makefile
ipp_lib_patch = /opt/intel/ipp/6.1.6.063/em64t
ipp_static = /opt/intel/ipp/6.1.6.063/em64t/tools/staticlib
strings: strings.o
	gcc -o strings strings.o -I $(ipp_lib_patch)/include -L -ltbb -L $(ipp_lib_patch)/sharedlib -lippimx -lippsmx -liomp5 -lpthread -lippchmx -lippcoreem64t
strings.o: strings.c
	clear
	gcc -c strings.c -I $(ipp_lib_patch)/include -I $(ipp_static)
____________________________________
Output
Time strcat: 0.000008s
Time ippsConcat_8u: 0.000032s
1 Reply
Hello,
From your linker options I see that you are linking with the generic C implementation of IPP (the MX libraries). That basically means that no SIMD instructions are used in the IPP code. I would recommend linking with the IPP dispatcher libraries so that IPP can select the best code for your CPU. However, I'm not sure whether the processor you are running on supports SSE4 or a later instruction set.
You should also take into account that the IPP concatenation functions are implemented as simple calls to the IPP copy function. This adds some call overhead, which may diminish the performance gain if the strings are not big enough.
Below is pseudocode for the ippsConcat function.
IppStatus ippsConcat_8u(
    const Ipp8u* pSrc1, int len1,
    const Ipp8u* pSrc2, int len2,
    Ipp8u* pDst)
{
    /* copy the first string, then place the second right after it */
    ippsCopy_8u(pSrc1, pDst, len1);
    ippsCopy_8u(pSrc2, pDst + len1, len2);
    return ippStsNoErr;
}
Regards,
Vladimir
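To see how much per-call overhead matters for short strings, something like the sketch below may help. It mirrors the two-copy pseudocode above in plain C, with memcpy standing in for ippsCopy_8u so that it builds without IPP (that substitution is an assumption, not how IPP is implemented internally). The names concat_two_copies and seconds_per_call are hypothetical helpers made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <time.h>

/* Stand-in for ippsConcat_8u: two plain copies, as in the pseudocode.
   memcpy replaces ippsCopy_8u here (assumption, so no IPP is needed).
   dst must not overlap either source buffer. */
static int concat_two_copies(const unsigned char *src1, int len1,
                             const unsigned char *src2, int len2,
                             unsigned char *dst)
{
    assert(len1 >= 0 && len2 >= 0);
    memcpy(dst, src1, (size_t)len1);
    memcpy(dst + len1, src2, (size_t)len2);
    return 0; /* plays the role of ippStsNoErr */
}

/* Average CPU seconds per call over `reps` calls (hypothetical helper).
   Repeating the call many times keeps timer resolution and one-time
   start-up costs from dominating the result. */
static double seconds_per_call(const unsigned char *a, int la,
                               const unsigned char *b, int lb,
                               unsigned char *dst, long reps)
{
    clock_t start = clock();
    for (long r = 0; r < reps; r++) {
        concat_two_copies(a, la, b, lb, dst);
    }
    return (double)(clock() - start) / CLOCKS_PER_SEC / (double)reps;
}
```

Calling seconds_per_call once with 3-byte inputs and once with inputs of a megabyte or so, using a repetition count large enough that each run takes a noticeable fraction of a second, should give a rough idea of how much of the short-string time is fixed cost per call. An optimizing compiler may remove part of such a loop, so it is worth using the destination buffer afterwards or building without aggressive optimization when trying this.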