Intel® C++ Compiler
Community support and assistance for creating C++ code that runs on platforms based on Intel® processors.

performance of gcc and icc

saranyaselvamani
Beginner
Can anyone explain how the icc compiler achieves better optimization than gcc?
When it comes to parallelization, do icc + OpenMP and gcc + OpenMP differ a lot?
How does the icc compiler split work into threads with OpenMP?
TimP
Honored Contributor III

gcc and icc show good compatibility in their support of OpenMP. In the legacy vectors benchmark http://sites.google.com/site/tprincesite/levine-callahan-dongarra-vectors icc out-performs gcc by a factor of 2 or more on just 2 of 11 OpenMP parallelized cases, when running against the same (Intel) OpenMP run-time library for both compilers. In 1 of those, icc vectorizes where gcc doesn't, thanks to support of vectorized math libraries. In the other case, both gcc and icc depend on non-repeatable effects to determine whether the Nehalem (NHM) Loop Stream Detector will work.
The gcc option -mtune=barcelona depends on support for unaligned 16-byte loads, so gcc requires this option to optimize for Nehalem CPUs.
In the 8 cases where both gcc and icc achieve combined vector and parallel optimization, icc out-performs gcc by up to 20% on Core i7.
Both compilers require a fairly long list of options for good performance. For Core i7:
icc -O3 -ansi-alias -std=c99 -openmp -xSSE4.2
gcc -O3 -std=c99 -fopenmp -funroll-loops --param max-unroll-times=4 -ffast-math -mtune=barcelona -msse4
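To illustrate what "combined vector and parallel optimization" refers to, here is a minimal sketch of the kind of loop involved. It is not taken from the benchmark; the function names saxpy_sketch and dot_sketch are hypothetical. The OpenMP pragma divides the iterations among threads, and each thread's share is a simple unit-stride loop that the compiler may then vectorize.

```c
/* Hypothetical kernels for illustration only; not part of the benchmark. */

/* y[i] = a * x[i] + y[i]
 * restrict (C99) promises the compiler that x and y do not overlap,
 * which removes one common obstacle to vectorization. */
void saxpy_sketch(int n, float a, const float *restrict x, float *restrict y)
{
    int i;
    /* Iterations are divided among the OpenMP threads; the pragma is
     * ignored when the file is compiled without OpenMP support. */
    #pragma omp parallel for
    for (i = 0; i < n; ++i)
        y[i] = a * x[i] + y[i];
}

/* Dot product: each thread accumulates a private partial sum, and the
 * partial sums are combined when the parallel loop ends. */
float dot_sketch(int n, const float *restrict x, const float *restrict y)
{
    float sum = 0.0f;
    int i;
    #pragma omp parallel for reduction(+:sum)
    for (i = 0; i < n; ++i)
        sum += x[i] * y[i];
    return sum;
}
```

Either option list above should build this once a source file name is appended. Vectorizing the sum in dot_sketch reorders floating-point additions, which is one reason -ffast-math appears in the gcc list.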

Default settings of icc are far more aggressive than those of gcc (with the exception of -ansi-alias), but you shouldn't be comparing them on the basis of defaults.

SPEC 2006 benchmark performance is highly dependent on auto-parallelization, which is not present in released versions of gcc. With a lot of tuning specific to that benchmark, compilers such as icc and Sun C99 achieve big gains which don't correlate to many real applications. As you say, explicit OpenMP is more satisfactory in practice.

The profile feedback of gcc requires profiling for each choice of compiler options, while the icc prof_gen option produces data files which are independent of compiler switches.
The Intel OpenMP profiling library shows performance statistics for each parallel region when used with icc, while it produces only an overall summary for gcc.

BTW, lcc is a completely different compiler, which I don't think gives any attention to optimizations for current CPUs. icc doesn't switch to C++ mode by capitalizing; instead, the spelling changes to icpc.