Intel® ISA Extensions

Any faster memcpy/memset?

missing__zlw
Beginner
I wonder whether my own implementation of memset/memcpy could beat the built-in version. I am using the Intel compiler on Linux.
I am thinking of using SSE, but I am not sure whether the Intel compiler already applies it. I am also linking with the TC-malloc library.

Thanks.
SHIH_K_Intel
Employee
You might want to take a look at the implementations in the latest glibc (2.13). Look under sysdeps/x86_64/multiarch.
Your mileage will vary depending on the metrics you choose and the test data sets you measure with.
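
Since the result depends on the metric and the data set, it helps to time candidates on your own sizes before deciding. The following is only a minimal sketch of such a measurement, not a rigorous benchmark: `time_copy_ns` and `copy_fn` are hypothetical names introduced here, it assumes a POSIX system with `clock_gettime`, and the inline-asm barrier assumes a GCC-compatible compiler (icc, gcc, clang).

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <time.h>

/* Hypothetical signature shared by memcpy and any replacement under test. */
typedef void *(*copy_fn)(void *dst, const void *src, size_t n);

/* Return the average wall-clock time in nanoseconds of one call to fn
 * copying n bytes, measured over reps calls after one warm-up call. */
static double time_copy_ns(copy_fn fn, void *dst, const void *src,
                           size_t n, int reps)
{
    struct timespec t0, t1;

    assert(fn != NULL && reps > 0);

    fn(dst, src, n);                      /* warm-up: touch pages and caches */

    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (int i = 0; i < reps; i++) {
        fn(dst, src, n);
        /* Compiler barrier (GCC-style extension, an assumption) so the
         * repeated copies are not optimized away. */
        __asm__ volatile("" : : "r"(dst) : "memory");
    }
    clock_gettime(CLOCK_MONOTONIC, &t1);

    double ns = (double)(t1.tv_sec - t0.tv_sec) * 1e9
              + (double)(t1.tv_nsec - t0.tv_nsec);
    return ns / reps;
}
```

Calling `time_copy_ns(memcpy, dst, src, n, reps)` and then the same with your own routine, over the sizes and alignments your application actually uses, gives a first comparison. Small and large sizes, and cached versus uncached buffers, can rank the candidates differently.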
TimP
Honored Contributor III
You could use nm to determine which references to memset and memcpy have been replaced by the __intel_fast_ versions from the icc library. There should be no built-in version with icc, unless you mean those __intel_fast_ versions. As the other response indicated, current glibc versions should be good for most purposes as well. I can't see what your choice of malloc would imply; maybe you are asking which functions your non-standard malloc uses. Again, nm should be a useful tool.
Apparently, you're not asking about AVX optimizations; those don't have great importance on the Sandy Bridge implementation, since the hardware splits 256-bit moves into 128-bit pieces. The main issue for big aligned memset/memmove strings is the cutover point to nontemporal, which would be application dependent.
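
To illustrate the cutover idea, here is a minimal sketch of a memset that falls back to the library routine for small fills and switches to SSE2 nontemporal stores above a threshold. `memset_nt` and `NT_CUTOVER_BYTES` are hypothetical names, and the 1 MiB value is only a placeholder assumption; the right cutover has to be found by measurement in the application. It assumes an x86 target with SSE2.

```c
#include <assert.h>
#include <emmintrin.h>   /* SSE2: _mm_set1_epi8, _mm_stream_si128, _mm_sfence */
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Placeholder cutover (an assumption): tune per application and cache size. */
#define NT_CUTOVER_BYTES ((size_t)1 << 20)

/* Fill n bytes at dst with the byte value c, like memset.
 * Below the cutover, defer to the library memset; above it, use
 * nontemporal (cache-bypassing) 16-byte stores on the aligned middle. */
static void *memset_nt(void *dst, int c, size_t n)
{
    assert(dst != NULL || n == 0);

    if (n < NT_CUTOVER_BYTES)
        return memset(dst, c, n);

    unsigned char *p = (unsigned char *)dst;

    /* Head: bytes needed to reach 16-byte alignment (0..15). */
    size_t head = (size_t)(-(uintptr_t)p & 15);
    memset(p, c, head);
    p += head;
    n -= head;

    /* Middle: aligned 16-byte nontemporal stores. */
    __m128i v = _mm_set1_epi8((char)c);
    size_t blocks = n / 16;
    for (size_t i = 0; i < blocks; i++)
        _mm_stream_si128((__m128i *)(p + 16 * i), v);

    /* Make the streaming stores globally visible before returning. */
    _mm_sfence();

    /* Tail: remaining 0..15 bytes. */
    memset(p + 16 * blocks, c, n % 16);
    return dst;
}
```

Nontemporal stores avoid evicting useful data from the cache when the filled buffer is too large to be reused soon, but they hurt if the data is read back right away, which is why the cutover is application dependent.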