Intel® C++ Compiler
Community support and assistance for creating C++ code that runs on platforms based on Intel® processors.

ICC 10.1 vs GCC 4.1

nate_u
Beginner
497 Views
I am encountering a performance issue. The same source produces code that is 15-20% slower with ICC 10.1.13 than with GCC 4.1.0.

My main loop is basically one switch statement that handles 64-bit data. It is an emulator of some old hardware.
The benchmarking expert gave me some help with emon. He said that the CPI (cycles per instruction) with icc was much lower, but it required a lot more instructions to do the same work, so the net result was a marked loss of performance.

ICC -O3 -ipo -xT -static -no-ansi-alias
GCC -O3 -fno-strict-aliasing

What else can I try? Am I missing something?

0 Kudos
4 Replies
TimP
Honored Contributor III
Your result is consistent with mine. gcc performs code straightening optimizations on switch with normal optimization. I think the best chance with icc is to change the code so as to favor the frequently taken branch (put it first, maybe with if..else.. ), or try profile guided optimization (prof_gen/prof_use profiling).
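
To illustrate the "put it first, maybe with if..else" idea, here is a minimal sketch in C++11 or later (newer than the compilers discussed in this thread). The opcode names, the `step` function, and the choice of `OP_ADD` as the hot case are all hypothetical, since the emulator's source was not posted. Whether this helps depends on how skewed the case frequencies really are.

```cpp
#include <cstdint>

// Hypothetical opcodes for illustration only; the real emulator's
// instruction set is not shown in the thread.
enum Opcode : std::uint8_t { OP_ADD, OP_SUB, OP_SHL, OP_NOT };

// Executes one instruction on a 64-bit accumulator and returns the new value.
std::uint64_t step(Opcode op, std::uint64_t acc, std::uint64_t operand) {
    // Assumption: OP_ADD is the most frequently executed case, so it is
    // tested first and never reaches the general switch dispatch.
    if (op == OP_ADD) {
        return acc + operand;
    }
    // All remaining, less frequent cases go through the ordinary switch.
    switch (op) {
        case OP_SUB: return acc - operand;
        case OP_SHL: return acc << (operand & 63);  // mask keeps the shift defined
        case OP_NOT: return ~acc;
        default:     return acc;                    // unknown opcode: no change
    }
}
```

If no single case dominates, the extra comparison in front of the switch is pure overhead for every other case, so it is worth measuring before and after.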

0 Kudos
Dale_S_Intel
Employee
As Tim suggested, using profile guided optimization could be quite helpful for a case like this (-prof-gen, -prof-use). This would help ensure that the most frequent cases are handled most efficiently. If you would like to post a sample I'd be happy to look at it and see if there's anything else to do.

Dale

0 Kudos
nate_u
Beginner
I tried profile-guided optimization, and it narrows the gap but still results in slower code. The code is now about 10% slower, which is an improvement over the 15%.

I collected some logs on my major switch statement. There are about 30 cases, and they are fairly evenly distributed, each taking between 1% and 10% of the executions. Would that make a difference?
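
For anyone wanting to gather the same kind of per-case statistics, one possible way is a simple counter array that is bumped at the top of the dispatch loop. This is only a sketch in C++11 or later, not the logging actually used here; `OpcodeHistogram`, `kNumOpcodes`, and the assumption that opcodes are small integers from 0 to 29 are made up for illustration.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <cstdio>

// Assumption: opcodes are dense integers in [0, kNumOpcodes).
// 30 matches the "about 30 cases" mentioned above.
constexpr std::size_t kNumOpcodes = 30;

// Counts how often each switch case is executed.
struct OpcodeHistogram {
    std::array<std::uint64_t, kNumOpcodes> counts{};  // zero-initialized
    std::uint64_t total = 0;

    // Call once per dispatched instruction; out-of-range values are ignored.
    void record(std::size_t opcode) {
        if (opcode < kNumOpcodes) {
            ++counts[opcode];
            ++total;
        }
    }

    // Fraction of all recorded dispatches that hit this opcode (0.0 to 1.0).
    double share(std::size_t opcode) const {
        if (total == 0 || opcode >= kNumOpcodes) return 0.0;
        return static_cast<double>(counts[opcode]) / static_cast<double>(total);
    }

    // Prints one line per opcode that was seen at least once.
    void print() const {
        for (std::size_t i = 0; i < kNumOpcodes; ++i) {
            if (counts[i] != 0) {
                std::printf("case %2zu: %6.2f%%\n", i, 100.0 * share(i));
            }
        }
    }
};
```

Calling `record(op)` just before the switch and `print()` at exit gives the distribution. The counting itself perturbs timing, so it should be compiled out of the build that is benchmarked.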
0 Kudos
TimP
Honored Contributor III
Yes, I suspect the even distribution would account for loop straightening coming out more effective than the use of PGO to sort the cases by priority.
0 Kudos