Intel® C++ Compiler
Community support and assistance for creating C++ code that runs on platforms based on Intel® processors.

compiling gsl 1.15 with intel 11.1

afylot1
Beginner
I am trying to compile gsl 1.15 with the Intel compiler 11.1.
I tried the optimization flags suggested in

http://software.intel.com/en-us/forums/showthread.php?t=75549

CFLAGS="-O2 -m64 -march=core2 -mtune=core2 -Wpointer-arith -fno-strict-aliasing "

but the test in specfunc fails:

make[2]: Entering directory `/my/path/gsl-1.15/specfunc'
/bin/sh: line 5: 11112 Illegal instruction (core dumped) ${dir}$tst
FAIL: test
==================
1 of 1 test failed
==================

Any ideas?
1 Solution
TimP
Honored Contributor III
There's little point in setting -march or -mtune to core2 for x86_64, even with gcc. The defaults (compatibility with nocona and opteron) are fine, and -mtune is redundant once -march is set to the same value. Does icc work when those options are omitted?
For complex numbers, gsl uses its own struct definitions, which compilers are unlikely to recognize as candidates for optimization under -msse3.
Perhaps icc misinterprets this unusual combination of options. -march=core2 would imply -mssse3.
It might be safer to add -msse3 to the option list explicitly rather than rely on each compiler to derive the instruction set from -march=core2. Recent production Core 2 CPUs also support -msse4.1, but early ones (like mine) don't. It should be safe to set -xssse3 if you want code that is compatible with all Core 2 CPUs but not with AMD.
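
As a rough sketch of that suggestion, the build could be configured with the architecture options dropped and SSE3 named directly. The CC=icc setting and the make targets are assumptions about a standard autoconf build of gsl; the remaining flags are taken from the original post.

```shell
# Run from the top of the gsl-1.15 source tree.
# -march/-mtune are omitted; -msse3 names the instruction set explicitly.
./configure CC=icc \
    CFLAGS="-O2 -m64 -msse3 -Wpointer-arith -fno-strict-aliasing"
make
make check   # runs the test suite, including the specfunc test
```

If the specfunc test still dies with an illegal instruction, the target CPU most likely lacks the requested instruction set.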

7 Replies
Georg_Z_Intel
Employee
Hello,

Which processors are you compiling and running the test on? An illegal instruction error suggests the code was optimized for a processor generation newer than the one you are actually running on.

Best regards,

Georg Zitzlsberger
afylot1
Beginner
I am compiling on

Intel Xeon CPU 3.00GHz
Georg_Z_Intel
Employee
Hello,

I'm sorry, but that's not enough information. There are many generations of Intel Xeon CPUs running at 3 GHz, and some of them predate the Core 2 era by a wide margin.

On Linux* you can run

$ cat /proc/cpuinfo

or look up your processor in our products database here:
http://ark.intel.com/
and send us the link.
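
To pull out just the instruction-set entries instead of posting the whole file, something like the following could be used. This is a minimal sketch: the function name simd_flags is made up for illustration, and the list of flag names it looks for is an assumption about which ones matter here.

```shell
# Print the SSE-family entries from the first "flags" line of
# /proc/cpuinfo-style text read on stdin, joined on one line.
simd_flags() {
    grep -m1 '^flags' \
        | tr ' \t' '\n\n' \
        | grep -E '^(sse|sse2|pni|ssse3|sse4_1|sse4_2)$' \
        | paste -sd ' ' -
}

# Usage on a Linux machine:
#   simd_flags < /proc/cpuinfo
```

Linux reports SSE3 as "pni", so a CPU with SSE3 but without SSSE3 shows pni and no ssse3 entry.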

Best regards,

Georg Zitzlsberger
afylot1
Beginner
processor : 0
vendor_id : GenuineIntel
cpu family : 15
model : 4
model name : Intel Xeon CPU 3.00GHz
stepping : 3
cpu MHz : 3000.000
cache size : 2048 KB
physical id : 0
siblings : 1
core id : 0
cpu cores : 1
apicid : 0
initial apicid : 0
fpu : yes
fpu_exception : yes
cpuid level : 5
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm constant_tsc pebs bts pni dtes64 monitor ds_cpl cid cx16 xtpr
bogomips : 5985.01
clflush size : 64
cache_alignment : 128
address sizes : 36 bits physical, 48 bits virtual
power management:

processor : 1
vendor_id : GenuineIntel
cpu family : 15
model : 4
model name : Intel Xeon CPU 3.00GHz
stepping : 3
cpu MHz : 3000.000
cache size : 2048 KB
physical id : 3
siblings : 1
core id : 0
cpu cores : 1
apicid : 6
initial apicid : 6
fpu : yes
fpu_exception : yes
cpuid level : 5
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm constant_tsc pebs bts pni dtes64 monitor ds_cpl cid cx16 xtpr
bogomips : 5985.25
clflush size : 64
cache_alignment : 128
address sizes : 36 bits physical, 48 bits virtual
power management:
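
The flags lines above list pni (SSE3) but neither ssse3 nor sse4_1, which is consistent with code built for core2 hitting an illegal instruction on this machine. A small sketch of how such a flags line could be mapped to an icc -x option follows; the helper name pick_x_option is hypothetical, and the mapping only covers SSE2 through SSE4.1.

```shell
# Read /proc/cpuinfo-style text on stdin and print the highest icc -x
# option (among SSE2, SSE3, SSSE3, SSE4.1) that the flags line supports.
# Prints an empty line if none of them is present.
pick_x_option() {
    # Pad with spaces so each flag can be matched as a whole word.
    flags=" $(grep -m1 '^flags' | cut -d: -f2 | tr '\t' ' ') "
    case "$flags" in
        *" sse4_1 "*) echo "-xSSE4.1" ;;
        *" ssse3 "*)  echo "-xSSSE3" ;;
        *" pni "*)    echo "-xSSE3" ;;   # Linux reports SSE3 as "pni"
        *" sse2 "*)   echo "-xSSE2" ;;
        *)            echo "" ;;
    esac
}

# Usage on a Linux machine:
#   pick_x_option < /proc/cpuinfo
```

For a flags line like the one posted here, the helper selects the SSE3 level. When building on the same machine that will run the code, icc's -xHost option is an alternative that leaves the choice to the compiler.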

afylot1
Beginner
I removed -mtune and -march and it worked. Now I am compiling it with -xsse3, just to test.

By the way, do you think more aggressive optimization is possible?

---------------------------------------------

EDIT

-xsse3 works

TimP
Honored Contributor III
The most obvious more aggressive compiler option would be -O3, which may enable additional optimizations on nested loops. Even so, you may not find any way to improve performance, even with a meaningful measurement to go by. Your CPU appears to be one that doesn't support SSE4.1; besides, I've run into several cases where compiling for that architecture reduced performance even though the CPU supported it.