Hi
We have a code that uses C++11 features to essentially execute a large metaprogram. It uses both classic template metaprogramming and constexpr functions to help the compiler process a huge type object and instantiate from it (hopefully) very efficient code. There is a lot of redundancy in the tree, and if this is pointed out to the compiler, there is hope that it can collapse it all into very efficient code.
We've tested the code with gcc 4.9, nvcc 7.0, clang 3.5 and icc 15.2.164. Currently only clang produces the correct optimised output, i.e. only clang follows the metaprogram "correctly" and optimises the tree fully. To achieve this we had to use the following compiler flags:
-std=c++11 -ftemplate-depth-512 -fconstexpr-steps=2000000 -fconstexpr-depth=10000 -O3
In particular, the -fconstexpr-steps flag was crucial (the value given is sufficient, though not necessarily the minimum required).
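To illustrate the two styles mentioned above, here is a toy sketch, not the actual code: the names `sum_to` and `SumTo` are made up for this post. Both compute the same value at compile time, one through a recursive C++11 constexpr function and one through recursive template instantiation. The recursion depth of 200 is an arbitrary choice, kept small so that it should stay within the usual default limits; the real metaprogram goes far deeper, which is why the depth and step limits had to be raised.

```cpp
// Toy sketch only; sum_to and SumTo are hypothetical names, not from the real code.

// C++11 constexpr style: a single return statement, recursion instead of loops.
// Each recursive call counts against the compiler's constexpr depth/step limits.
constexpr unsigned long sum_to(unsigned n) {
    return n == 0 ? 0UL : n + sum_to(n - 1);
}

// Classic template metaprogramming style: one class instantiation per level.
// Each level counts against the template instantiation depth limit.
template <unsigned N>
struct SumTo {
    static constexpr unsigned long value = N + SumTo<N - 1>::value;
};

// Base case that terminates the template recursion.
template <>
struct SumTo<0> {
    static constexpr unsigned long value = 0;
};

// Both forms are forced to be evaluated by the compiler, not at runtime.
static_assert(sum_to(200) == 20100, "constexpr evaluation failed");
static_assert(SumTo<200>::value == 20100, "template evaluation failed");
```

This has no main; compiling it as a translation unit with -std=c++11 -c is enough, since the static_asserts are checked at compile time.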
icc 15.2.164 produces a huge executable (15 MB, compared with 63 KB for clang) that runs about 3x slower than the clang build.
Now I know icc has a lot of bells and whistles, not all of which are documented. Could you suggest some flags/options we could try to coax icc into doing what we want? The current icc compiler flags are:
-O3 -std=c++11
Regards
Jacques
1 Reply
Hi,
Here's a related article: https://software.intel.com/sites/default/files/managed/f4/1d/code-size-optimization-using-icc.pdf
See if any of the methods described there help.
_Kittur
