Analyzers
Talk to fellow users of Intel Analyzer tools (Intel VTune™ Profiler, Intel Advisor)

__intel_ssse3_rep_memcpy and __intel_new_memset

Pierluigi_D_
Beginner
1,440 Views

Hi there,

I am profiling an application with Intel VTune Amplifier 16. The application was compiled with the Intel Fortran compiler from Intel Composer XE-2016 (version 16.0.0).

The profile shows enormous usage of the __intel_ssse3_rep_memcpy and __intel_new_memset functions (26% of the execution time), and I would like to know exactly what these functions do. Can anyone help me?

1 Reply
TimP
Honored Contributor III
1,440 Views

Such functions may be invoked automatically by the Intel compiler, either as substitutes for the standard memset() and memcpy() functions or when it recognizes a loop (a for() loop in C, a DO loop in Fortran) that performs equivalent work. The optimization report option Qopt-report:4 (/Qopt-report:4 on Windows, -qopt-report=4 on Linux) may flag where loops are replaced by these functions.
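To make the loop-recognition point concrete, here is a minimal C sketch of the kind of loops a compiler may turn into memset/memcpy-style library calls. The function names are made up for illustration, and whether a given compiler version actually performs the substitution depends on its options and heuristics, so treat this as an assumption to check against the optimization report rather than guaranteed behavior.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Zero-fill written as a plain loop. An optimizing compiler may
   recognize this idiom and call a memset-style routine instead. */
void clear_loop(double *dst, size_t n)
{
    assert(dst != NULL || n == 0);
    for (size_t i = 0; i < n; ++i)
        dst[i] = 0.0;
}

/* Element-wise copy. 'restrict' promises that the buffers do not
   overlap, which is what allows a memcpy-style substitution. */
void copy_loop(double *restrict dst, const double *restrict src, size_t n)
{
    assert((dst != NULL && src != NULL) || n == 0);
    for (size_t i = 0; i < n; ++i)
        dst[i] = src[i];
}

/* The explicit library calls, for comparison. All-zero bytes
   represent 0.0 for IEEE 754 doubles. */
void clear_lib(double *dst, size_t n)
{
    if (n > 0)
        memset(dst, 0, n * sizeof *dst);
}

void copy_lib(double *dst, const double *src, size_t n)
{
    if (n > 0)
        memcpy(dst, src, n * sizeof *dst);
}
```

Both variants of each pair produce the same result; the difference only shows up in the profile, where time spent in the loops can be attributed to the library routines rather than to your own source lines.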

With ifort, temporary arrays generated for array section operations may also give rise to fast_memcpy calls.

Such an "optimization" can be counter-productive (assuming the copy can't be avoided in the first place) when the strings are too short for the library functions to come up to speed and the compiler can't determine that at compile time. Short strings might be better handled with for loops plus alignment and length assertions, not to mention arranging the application to reduce the copying of data.
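As a rough illustration of the short-copy case, the sketch below uses only portable C. SHORT_LEN and both function names are hypothetical. The first function gives the compiler a compile-time trip count, so it can unroll the copy inline instead of calling a library routine. The second only documents and checks a length bound through assert(); that is not an optimization hint by itself, and compiler-specific alignment or loop-count directives would be needed to pass the information to the optimizer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical upper bound on the copy length, known at compile time. */
enum { SHORT_LEN = 8 };

/* Fixed-length copy: the trip count is a constant, so the compiler
   can fully unroll the loop rather than call a library routine. */
void copy_short_fixed(double *restrict dst, const double *restrict src)
{
    for (size_t i = 0; i < SHORT_LEN; ++i)
        dst[i] = src[i];
}

/* Variable-length copy with a stated bound. The assert checks the
   assumption in debug builds and vanishes when NDEBUG is defined. */
void copy_short_bounded(double *restrict dst, const double *restrict src,
                        size_t n)
{
    assert(n <= SHORT_LEN);
    for (size_t i = 0; i < n; ++i)
        dst[i] = src[i];
}
```

Whether either form actually beats the library call for your data sizes is something to measure in the profiler; the larger win is usually restructuring the code so that less data is copied at all.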
