Intel® C++ Compiler
Community support and assistance for creating C++ code that runs on platforms based on Intel® processors.

ICC Executable Performance Drop (9.1.049 vs 10.0.023)

gordan
Beginner
Hi,

I just downloaded a copy of the v10 compiler, and performance seems to have dropped quite noticeably, by about 15%. My test program (7000 iterations of relatively simple sine curve fitting) gives these timings:

Compiled using v9.1.049: runs in 55 seconds
Compiled using v10.0.023: runs in 63 seconds
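
For anyone wanting to reproduce a comparison of this kind, here is a minimal sketch of a timing harness. It is not the original test program: `sine_residual`, `time_fit`, `slowdown_percent`, and the sample data are all hypothetical stand-ins, written in C++98 style so that compilers of this era can build it.

```cpp
#include <cmath>
#include <cstddef>
#include <ctime>
#include <vector>

// Hypothetical stand-in for the workload: sum of squared residuals between
// the samples y and the sine model a*sin(w*t + p). Not the original program.
double sine_residual(const std::vector<float>& t, const std::vector<float>& y,
                     float a, float w, float p) {
    double sum = 0.0;
    for (std::size_t i = 0; i < t.size(); ++i) {
        const float r = y[i] - a * std::sin(w * t[i] + p);
        sum += r * r;
    }
    return sum;
}

// Runs `iterations` residual evaluations over synthetic data and returns the
// CPU time in seconds. The accumulated value is written to *checksum so the
// optimizer cannot discard the loop.
double time_fit(std::size_t samples, int iterations, double* checksum) {
    std::vector<float> t(samples), y(samples);
    for (std::size_t i = 0; i < samples; ++i) {
        t[i] = 0.01f * static_cast<float>(i);
        y[i] = 2.0f * std::sin(1.5f * t[i] + 0.3f);
    }
    const std::clock_t start = std::clock();
    double acc = 0.0;
    for (int k = 0; k < iterations; ++k) {
        // Vary the frequency slightly per pass, as a fitting loop would.
        acc += sine_residual(t, y, 2.0f, 1.5f + 1e-4f * static_cast<float>(k), 0.3f);
    }
    const std::clock_t stop = std::clock();
    *checksum = acc;
    return static_cast<double>(stop - start) / CLOCKS_PER_SEC;
}

// Relative slowdown of the new build versus the old one, in percent.
double slowdown_percent(double old_seconds, double new_seconds) {
    return (new_seconds - old_seconds) / old_seconds * 100.0;
}
```

Building the same source with each compiler version and identical options, then passing the two timings to `slowdown_percent`, gives the relative difference; for the figures above, `slowdown_percent(55.0, 63.0)` is about 14.5.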

The compile stage output is reporting the same loops being vectorized.

Is there any reason why v10 would be slower?

The compiler options I am using are:
-fpic -O3 -fomit-frame-pointer -march=pentium3 -mcpu=pentium3 -msse -funroll-loops -mtune=pentium3 -fp-model fast=2 -rcd -xK -ipo -w1 -vec-report3

The machine in question is a Pentium 3, as the options indicate. Is there something new in the way v10 behaves? Is there another optimizer flag I have to add somewhere to get those 15% back?

Thanks.
2 Replies
JenniferJ
Moderator
Did you see any loop that is vectorized with 9.1 but not vectorized with 10.0? Try with /Qparallel too (-parallel on Linux).
gordan
Beginner
No, this is what surprised me. The compiler output is similar, and in terms of vectorized loops, all the same loops get vectorized. But the code then runs about 15% slower.

I haven't had a chance to test this on more than one machine yet (only have a P3 handy). I'll try it on a Core 2 soon.

I don't think the program I am playing with at the moment would benefit from parallel options. The overhead of spawning threads would likely outweigh any benefits.
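
The overhead argument can be put as a back-of-envelope model. This is only a sketch under a simplifying assumption (perfectly divisible work plus a fixed per-loop threading cost); `parallel_seconds` and `parallel_pays_off` are hypothetical helpers, and any numbers fed to them would have to be measured on the target machine.

```cpp
// Back-of-envelope model, not a measurement. Assumes the loop's work splits
// evenly across threads and that threading adds a fixed overhead per loop.
double parallel_seconds(double serial_seconds, int threads, double overhead_seconds) {
    return serial_seconds / threads + overhead_seconds;
}

// True only when the modeled parallel time beats the serial time.
bool parallel_pays_off(double serial_seconds, int threads, double overhead_seconds) {
    return parallel_seconds(serial_seconds, threads, overhead_seconds) < serial_seconds;
}
```

Under this model a short loop loses as soon as the overhead exceeds the time saved by splitting it, and with a single hardware thread any nonzero overhead makes the parallel version slower.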