Test program:
[cpp]
#include <stdio.h>
#include <string.h>
#include <sys/time.h>

#define N 2000000000

char s[N + 1];

/* Wall-clock time in microseconds. */
long read_time()
{
    struct timeval tv;
    gettimeofday(&tv, NULL);
    return tv.tv_sec * 1000000L + tv.tv_usec;
}

int main()
{
    memset(s, 'a', N);   /* s[N] stays '\0' because s has static storage */
    long t0 = read_time();
    int l = strlen(s);
    long t1 = read_time();
    printf("len=%d time=%ldus\n", l, t1 - t0);
    return 0;
}
[/cpp]
"-O2 -xSSE2" outputs "len=2000000000 time=268183us"
"-O2 -xSSE4.2" outputs "len=2000000000 time=286701us"
(icc 11.1.056; Linux amd64; Xeon E5530; DDR3 1067MHz)
The SSE2 version is roughly 6.5% faster. Perhaps strlen should default to the SSE2 code path even under -xSSE4.2?
1 Reply
To be a fair test, create a series of strings of varying length:
0 bytes
1 byte
2 bytes
3 bytes
4 bytes
5 bytes
8 bytes
9 bytes
i.e. each power of 2 and each power of 2 plus 1.
Use nested loops, varying the outer loop's iteration count so that each string length gets a reasonable run time.
Then compare run time as a function of string length.
Jim Dempsey