Worse performance with latest compiler on Mac OS

stx · ‎09-05-2013

Hi!

We have been using Intel C++ Compiler version 11 for a long time (on Mac OS), and have been very happy with its performance on our DSP code. We are now upgrading our build system, and version 11 is no longer supported on our new Mac.

Unfortunately, upgrading to the latest compiler version (version 13) has quite a bad impact on performance. After some digging, it seems that part of this is due to changes in inlining. We have a lot of loops with a bunch of function calls in them, and those calls are no longer inlined, even if I declare the functions "static inline". It seems I can get around this by passing the "-ip" flag to the compiler, but performance is still not as good as before.

I used to have the following flags:
-Wno-multichar -Wno-trigraphs -Wall -x c++ -fmessage-length=0 -pipe -fpascal-strings -fasm-blocks -O3 -funroll-loops -funroll-all-loops -fp-model fast -fPIC

I can't use the "-ipo" flag, as I link with other code compiled with other compilers.

Any other things that have changed which can affect performance?

Cheers

SergeyKostrov · ‎09-05-2013

Please try to create a small reproducer. I suspect that there are some issues with inlining, because this is the 2nd post related to it (detected on Windows platforms as well).

stx · ‎09-05-2013

Alright, interesting.

I have been trying to reproduce this in a shortish piece of code, but I can't. It behaves very well when I pull my inner loop out of that big mess of other code. Any ideas about what aspects of the code could trigger this kind of problem?

stx · ‎09-05-2013

After some more experimenting with my complicated file which fails, I can conclude that "static inline" doesn't work, but "__attribute__((always_inline))" does!

However, the resulting binary is still significantly slower than the same file compiled using icc version 11, so I guess this only solves part of the problem.
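
For anyone following along, here is a minimal sketch of the two spellings being compared. The function names and the gain loop are made up for illustration and are not from my real code. As far as I understand it, "static inline" is only a hint that the compiler's heuristics may ignore, while the GCC-style attribute asks for inlining regardless of those heuristics.

```cpp
#include <cstddef>

// Hypothetical helper: plain "static inline" is only a hint,
// so the compiler may still emit a real call here.
static inline float scale_hint(float x, float gain)
{
    return x * gain;
}

// Hypothetical helper: the GCC-style always_inline attribute
// requests inlining even when the heuristics would say no.
static inline __attribute__((always_inline))
float scale_forced(float x, float gain)
{
    return x * gain;
}

// Illustrative inner loop that calls the helper once per sample.
void apply_gain(float* buf, std::size_t n, float gain)
{
    for (std::size_t i = 0; i < n; ++i) {
        buf[i] = scale_forced(buf[i], gain);
    }
}
```

Swapping scale_forced for scale_hint in the loop gives the "static inline" variant. In a file this small both will most likely be inlined, which matches what I saw when trying to build a reproducer.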

stx · ‎09-05-2013

Some more info...

If I remove enough random code from my big for-loop, eventually the "static inline" starts to work again! So there seems to be a problem where the compiler turns off inlining (and maybe other optimizations?) if the current scope is too complex or something like that.

stx · ‎09-12-2013

Never mind. Just tried gcc 4.8, and it actually outperforms icc on my code base. I'll go with that instead.