Jacques_d_

Help tuning icc output for large metaprogram

Hi
 
We have code that uses C++11 features to execute what is essentially a large metaprogram.  The program uses both classic template metaprogramming and constexpr functions to help the compiler process a huge type object and instantiate from it what we hope is very efficient code.  There is a lot of redundancy in the tree, and if this is pointed out to the compiler, there is hope that it can collapse the whole thing into very efficient code.
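
To make the pattern concrete, here is a minimal sketch of what is meant by mixing template metaprogramming with constexpr functions over a redundant tree. It is not our actual code: the names Leaf, Add, Eval, Tree, pow2 and tree_value are hypothetical and chosen only for illustration. The tree nominally has 1024 leaves, but both children of every node are the same type, so the compiler only has to instantiate one Eval per level.

```cpp
// Hypothetical node types for a compile-time expression tree (illustration only).
template <int V> struct Leaf {};
template <typename L, typename R> struct Add {};

// Classic template metaprogramming: evaluate a tree by partial specialisation.
template <typename T> struct Eval;
template <int V> struct Eval<Leaf<V> > {
    static constexpr int value = V;
};
template <typename L, typename R> struct Eval<Add<L, R> > {
    static constexpr int value = Eval<L>::value + Eval<R>::value;
};

// A full binary tree of depth D whose two subtrees are the *same type*.
// This is the redundancy: each distinct subtree type is instantiated once.
template <int D> struct Tree {
    typedef Add<typename Tree<D - 1>::type, typename Tree<D - 1>::type> type;
};
template <> struct Tree<0> {
    typedef Leaf<1> type;
};

// C++11 constexpr function (single return statement) used as a cross-check.
constexpr int pow2(int n) { return n == 0 ? 1 : 2 * pow2(n - 1); }

// The whole tree should fold to a constant at compile time.
static_assert(Eval<Tree<10>::type>::value == pow2(10), "tree must fold to 2^10");

// At run time this should reduce to returning a literal.
int tree_value() { return Eval<Tree<10>::type>::value; }
```

In a toy like this every compiler folds the result to a constant; the question below is about what happens when the tree is orders of magnitude larger.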
 
We've tested the code on gcc 4.9, nvcc 7.0, clang 3.5 and icc 15.2.164, and currently only clang produces the correct optimised output, i.e. only clang follows the metaprogram "correctly" and optimises the tree fully.  To achieve this we had to use the following compiler flags:
 
             -std=c++11 -ftemplate-depth-512 -fconstexpr-steps=2000000 -fconstexpr-depth=10000 -O3
 
In particular, the -fconstexpr-steps flag was crucial (the value given is sufficient, but not necessarily the minimum required).
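
As a rough illustration of why the step limit matters separately from the depth limit, consider the sketch below. The names sum_range, N and sum_at_compile_time are hypothetical and not taken from our code. The recursion splits the range in half, so its depth grows only logarithmically with N while the total number of evaluation steps grows linearly. N is kept small here so that the example should compile with default limits; I would expect a much larger N to run into clang's step limit well before the depth limit, which is the situation we hit.

```cpp
// Divide-and-conquer sum of the integers in [lo, hi), written as a C++11
// constexpr function (a single return statement, no loops).
// Recursion depth is about log2(hi - lo); total work is about 2 * (hi - lo) calls.
constexpr unsigned long long sum_range(unsigned lo, unsigned hi) {
    return hi - lo == 0 ? 0ULL
         : hi - lo == 1 ? static_cast<unsigned long long>(lo)
         : sum_range(lo, lo + (hi - lo) / 2) +
           sum_range(lo + (hi - lo) / 2, hi);
}

// Small enough to stay within default compile-time evaluation limits (assumption).
constexpr unsigned N = 4096;

// Cross-check against the closed form N * (N - 1) / 2.
static_assert(sum_range(0, N) == static_cast<unsigned long long>(N) * (N - 1) / 2,
              "compile-time sum must match the closed form");

// Forcing the call into a constexpr variable guarantees compile-time evaluation.
unsigned long long sum_at_compile_time() {
    constexpr unsigned long long v = sum_range(0, N);
    return v;
}
```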
 
icc 15.2.164 produces huge executables (15 MB compared with 63 KB for clang), and the resulting code runs about 3x slower than clang's.
 
I know icc has a lot of knobs and switches, not all of which are documented.  Could you suggest some flags/options we could try to coax icc into doing what we want?  Our current icc compiler flags are:
 
            -O3 -std=c++11
 
Regards
Jacques
Kittur_G_Intel

Hi,
Here's a related article on code size optimization with icc: https://software.intel.com/sites/default/files/managed/f4/1d/code-size-optimization-using-icc.pdf
Could you check whether any of the methods described there help?
_Kittur 
