__intel_ssse3_rep_memcpy and __intel_new_memset

Pierluigi_D_

Hi there,

I am profiling an application with Intel VTune Amplifier 16. The application was compiled with the Intel Fortran compiler from Intel Composer XE-2016 (version 16.0.0).

The profile shows enormous usage of the __intel_ssse3_rep_memcpy and __intel_new_memset functions (26% of the execution time), and I would like to know exactly what these functions do. Can anyone help me?

TimP

Such functions can be invoked automatically by the Intel compiler, either as a substitute for the standard memset() and memcpy() functions or when it recognizes a loop (a for loop in C, a DO loop or array assignment in Fortran) that performs the equivalent operation. The optimization report at level 4 (-qopt-report=4 on Linux, /Qopt-report:4 on Windows) may flag where loops are replaced by these functions.
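As a rough illustration of the kind of loop meant here, the C sketch below shows two plain loops that have the same effect as memset() and memcpy(). The function names zero_fill and copy_elems are made up for this example, and whether a given compiler actually substitutes a library call depends on the compiler version and options, so treat this as an assumption to check against the optimization report rather than guaranteed behavior.

```c
#include <stddef.h>

/* Hypothetical example: a zeroing loop. An optimizing compiler that
 * recognizes the idiom may replace it with a call to a memset routine. */
void zero_fill(double *dst, size_t n)
{
    for (size_t i = 0; i < n; ++i) {
        dst[i] = 0.0;
    }
}

/* Hypothetical example: an element-wise copy. With restrict telling the
 * compiler the arrays do not overlap, it may be replaced with a call to
 * a memcpy routine. */
void copy_elems(double *restrict dst, const double *restrict src, size_t n)
{
    for (size_t i = 0; i < n; ++i) {
        dst[i] = src[i];
    }
}
```

If the profile attributes time to a memcpy or memset routine, loops of this shape in the hot call paths are a reasonable place to start looking.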

With ifort, temporaries generated for array section operations may also give rise to fast_memcpy calls.

One reason such an "optimization" can be counterproductive (when the copy itself can't be avoided) is that the strings being copied may be too short for the library functions to come up to speed, and the compiler can't recognize that at compile time. Short strings might be better handled by plain loops with alignment and length assertions, not to mention arranging the application to reduce the copying of data.
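A minimal sketch of what "a loop plus alignment and length assertions" could look like in C follows. Everything named here is an assumption for illustration: copy_small and SMALL_MAX are hypothetical, the 16-byte alignment is an arbitrary choice, and __builtin_assume_aligned is the GCC/Clang-style builtin rather than anything the reply specifies. Whether this actually beats the library call has to be measured for the application in question.

```c
#include <assert.h>
#include <stddef.h>

#define SMALL_MAX 16  /* hypothetical upper bound on a "short" copy */

/* Hypothetical short-copy helper. The caller promises that n is small
 * and that both pointers are 16-byte aligned and do not overlap. */
void copy_small(double *restrict dst, const double *restrict src, size_t n)
{
    /* Length assertion: checked in debug builds. */
    assert(n <= SMALL_MAX);

#if defined(__GNUC__) || defined(__clang__)
    /* Alignment assertion: passing unaligned pointers here is undefined. */
    dst = __builtin_assume_aligned(dst, 16);
    src = __builtin_assume_aligned(src, 16);
#endif

    for (size_t i = 0; i < n; ++i) {
        dst[i] = src[i];
    }
}
```

The point of the assertions is to give the compiler the facts it could not infer on its own: a small trip count and known alignment.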
