Intel® ISA Extensions

Any faster memcpy/memset?

missing__zlw
Beginner
371 Views
I wonder whether I can write my own implementation of memset/memcpy that beats the built-in version. I am using the Intel compiler on Linux.
I am thinking of using SSE, but I am not sure whether the Intel compiler already applies it. Also, I am linking with the TCMalloc library.

Thanks.
2 Replies
SHIH_K_Intel
Employee
You might want to take a look at the implementations in the latest glibc (2.13). Look under sysdeps/x86_64/multiarch.
Your mileage will vary depending on the metrics you choose and the test data sets you measure with.
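Since results depend on the metric and the data sizes, one way to compare candidates is a small timing harness. The following is a minimal sketch, not a rigorous benchmark: the names `copy_fn` and `measure_copy_gbps` are hypothetical, it uses C11 `timespec_get` wall-clock time, and it measures only one buffer size and alignment per call.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Hypothetical signature shared by memcpy and any candidate replacement. */
typedef void *(*copy_fn)(void *, const void *, size_t);

/* Copies `bytes` bytes `reps` times with `fn` and returns throughput in
 * GB/s (1e9 bytes per second). Returns -1.0 if allocation fails and 0.0
 * if the measured interval is not positive. `bytes` must be nonzero. */
double measure_copy_gbps(copy_fn fn, size_t bytes, int reps)
{
    unsigned char *src = malloc(bytes);
    unsigned char *dst = malloc(bytes);
    if (!src || !dst) {
        free(src);
        free(dst);
        return -1.0;
    }
    /* Touch both buffers so page faults are not part of the timing. */
    memset(src, 0xA5, bytes);
    memset(dst, 0, bytes);

    /* Calling through a volatile pointer keeps the compiler from
     * inlining or eliding the copy under test. */
    copy_fn volatile call = fn;
    volatile unsigned char sink = 0;

    struct timespec t0, t1;
    timespec_get(&t0, TIME_UTC);
    for (int r = 0; r < reps; r++) {
        call(dst, src, bytes);
        sink ^= dst[bytes - 1];   /* consume the result */
    }
    timespec_get(&t1, TIME_UTC);
    (void)sink;

    double secs = (double)(t1.tv_sec - t0.tv_sec)
                + (double)(t1.tv_nsec - t0.tv_nsec) * 1e-9;
    free(src);
    free(dst);
    if (secs <= 0.0)
        return 0.0;
    return (double)bytes * (double)reps / secs / 1e9;
}
```

Calling `measure_copy_gbps(memcpy, size, reps)` over a range of sizes (for example from a few dozen bytes up to well beyond the last-level cache) and repeating with a candidate function gives a rough picture of where, if anywhere, the candidate wins.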
TimP
Black Belt
You could use nm to determine which references to memset and memcpy have been replaced by the __intel_fast_ versions from the icc library. There should be no built-in version with icc, unless you mean those __intel_fast_ versions. As the other response indicated, current glibc versions should be good for most purposes as well. I can't see what your choice of malloc would imply; maybe you are asking which memset/memcpy functions your non-standard malloc uses. Again, nm should be a useful tool.
Apparently, you're not asking about AVX optimizations; those don't have great importance on the Sandy Bridge implementation, since the hardware splits 256-bit moves into 128-bit pieces. The main issue for big aligned memset/memmove operations is the cutover point to nontemporal stores, which would be application dependent.
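To illustrate what a nontemporal cutover looks like, here is a minimal sketch using SSE2 intrinsics. It is not how glibc or the icc library implement it; the function name `copy_with_nt_cutover` and the 1 MiB threshold `NT_CUTOVER_BYTES` are assumptions chosen for illustration, and the right threshold has to be found by measuring the actual application.

```c
#include <emmintrin.h>  /* SSE2: _mm_loadu_si128, _mm_stream_si128 */
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical cutover size; tune it by measurement. */
#define NT_CUTOVER_BYTES ((size_t)1 << 20)

/* Copies n bytes from src to dst (regions must not overlap).
 * Small copies go to the library memcpy; large ones use streaming
 * (nontemporal) stores that bypass the cache on the destination side. */
void *copy_with_nt_cutover(void *dst, const void *src, size_t n)
{
    if (n < NT_CUTOVER_BYTES)
        return memcpy(dst, src, n);

    unsigned char *d = (unsigned char *)dst;
    const unsigned char *s = (const unsigned char *)src;

    /* Copy up to 15 leading bytes so the destination is 16-byte
     * aligned, which _mm_stream_si128 requires. */
    size_t head = (size_t)((0 - (uintptr_t)d) & 15u);
    memcpy(d, s, head);
    d += head;
    s += head;
    n -= head;

    /* Main loop: unaligned 16-byte loads, aligned streaming stores. */
    size_t blocks = n / 16;
    for (size_t i = 0; i < blocks; i++) {
        __m128i v = _mm_loadu_si128((const __m128i *)(s + 16 * i));
        _mm_stream_si128((__m128i *)(d + 16 * i), v);
    }
    /* Make the streaming stores globally visible before returning. */
    _mm_sfence();

    /* Copy the remaining tail of fewer than 16 bytes. */
    size_t done = blocks * 16;
    memcpy(d + done, s + done, n - done);
    return dst;
}
```

Streaming stores tend to help only when the destination will not be read again soon; if the copied data is used right away, keeping it in cache with ordinary stores is often faster, which is why the cutover is application dependent.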