Beginner

Building Python

Hi everyone,
I just wanted to ask if anyone's got any experience building Python with icc / icpc?
I'm going to try it and will post here if I run into trouble of any kind, along with what I had to do to get it to work.
I'll try Python 3.1.1 with computed gotos ;) Btw, I'm running Ubuntu 9.10 amd64 on a Core2Duo T7200 notebook.
Cheers,
nuku
36 Replies
New Contributor I

Hi

The best way is to try it yourself and ask here afterwards if a problem occurs.

Kind regards
Beginner

I was intending to do that anyway ;)
I used the official CPython 3.1.1 tarball without any customizations.
My machine: Intel Core2Duo T7200, 2GB Ram, in a Dell Inspiron 9400 running Ubuntu 9.10 amd64, Kernel 2.6.30-20
So here's what I ran into.
The command lines I used were:
./configure --with-computed-gotos --without-gcc CC=icc CXX=icpc CCFLAGS="-O3"
make CC=icc CXX=icpc CCFLAGS="-O3"
make test CC=icc CXX=icpc CCFLAGS="-O3"
failed (basically, none of the modules built). Same if I leave out all parameters except for CC=icc. So I scanned through the roughly 13,000 lines of output and two things occurred to me:
1.) It can't find libimf.so, so I need to pass LD_LIBRARY_PATH manually. (for me, it was /opt/intel/Compiler/11.1/069/lib/intel64/)
2.) _ctypes module uses libffi which uses __int128_t which is not included in 11.1.069.
I decided to fix libimf first. So I did a make again.
Result: everything built except for _ctypes, and it missed some necessary bits to build _dbm, _gdbm, _sqlite3, _tkinter, and readline. I decided not to care about the latter, so if you need them you'll need to find out yourself ;)
Then, I patched ./Modules/_ctypes/libffi/src/x86/ffi64.c exactly as described in step 7 here: http://software.intel.com/en-us/articles/build-firefox-35-with-intel-c-compiler/
So I did a make again, and all it did was build libffi and ctypes, so this one went really quick.
Then, I did a make test to see if it was all right.
There are some tests failing, but it looks like 95% of the failures are due to NaN being treated incorrectly... If you know how to fix this, please drop me a line here. Thanks.
So, basically what you need to do is:
- of course, pass CC=icc and whatever else you want to have
- pass LD_LIBRARY_PATH=/your/path/to/libimf.so/ when configuring (I don't know whether it's necessary to pass it to make, so I did and it didn't do any harm...)
- patch your libffi as described above
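The steps above could be collected into a small script. This is only a sketch: the library path is the one quoted in this post, `prepend_ld_path` and `build_python_with_icc` are hypothetical helper names, and it assumes icc and icpc are already on PATH and the libffi patch has been applied. One deliberate change: the original command lines pass CCFLAGS, which is not one of the variables autoconf documents (CFLAGS is), so the -O3 may never have reached the compiler. The sketch uses CFLAGS on the assumption that this configure script honors it, and the grep is there to check that assumption.

```shell
# Library directory reported in this thread; adjust to your installation.
ICC_LIB=/opt/intel/Compiler/11.1/069/lib/intel64

# Prepend a directory to LD_LIBRARY_PATH without leaving a stray colon.
prepend_ld_path() {
    if [ -n "${LD_LIBRARY_PATH:-}" ]; then
        LD_LIBRARY_PATH="$1:$LD_LIBRARY_PATH"
    else
        LD_LIBRARY_PATH="$1"
    fi
    export LD_LIBRARY_PATH
}

# Configure, build and test CPython with icc.
# Run from the top of the unpacked (and patched) source tree.
build_python_with_icc() {
    prepend_ld_path "$ICC_LIB"
    ./configure --with-computed-gotos --without-gcc \
        CC=icc CXX=icpc CFLAGS="-O3" || return 1
    # Assumption check: did the optimization flag reach the Makefile?
    grep -n -e '-O3' Makefile || echo "warning: -O3 not found in Makefile" >&2
    make && make test
}
```

Calling `build_python_with_icc` runs the whole sequence; `prepend_ld_path` can also be used on its own in an interactive shell.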
Black Belt

The compiler library /lib/intel64/ will be on LD_LIBRARY_PATH if you set up correctly, e.g. by 'source /opt/intel/Compiler/11.1/069/bin/iccvars.sh' (before running configure).
It's unfortunate that icc doesn't accept the point of view of gcc on __int128_t. If you don't want to hack around this, you could perform that part of the build with gcc.
If you want icc to treat NaN "correctly," you must start by setting appropriate options (-fp-model precise, or possibly -fp-model source).
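A sketch of the setup described here, for a bash session. Assumptions: the install path is the one quoted above; the `intel64` argument reflects that iccvars.sh in this compiler generation takes the target architecture as a parameter, so if sourcing it fails a missing argument is one thing to check; whether CFLAGS reaches the compiler in this Python version is untested; and `setup_icc_env`, `configure_precise` and `check_nan` are hypothetical names. The last helper is a quick way to see whether a freshly built interpreter treats NaN the way IEEE 754 expects, before running the whole test suite.

```shell
# Path quoted in this thread; adjust to your installation.
ICC_VARS=/opt/intel/Compiler/11.1/069/bin/iccvars.sh

# Put icc, icpc and the compiler libraries on PATH / LD_LIBRARY_PATH.
# Passing arguments to "." is a bash feature, so use bash for this.
setup_icc_env() {
    . "$ICC_VARS" intel64
}

# Configure with value-safe floating point, as suggested above.
configure_precise() {
    ./configure --without-gcc CC=icc CXX=icpc \
        CFLAGS="-O3 -fp-model precise"
}

# Ask the given interpreter whether NaN compares unequal to itself
# and is unordered against ordinary numbers.
check_nan() {
    "$1" -c 'nan = float("nan"); assert nan != nan; assert not (nan < 1.0 or nan > 1.0); print("NaN OK")'
}
```

After a build, `check_nan ./python` prints `NaN OK` if the comparisons behave; an AssertionError would point at the floating-point model rather than at Python itself.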
Beginner

Ok, I did some benchmarking with pybench.
Compiler flags used: -O3 -msse3
Comparing gcc and icc with exactly the same flags, GCC is about 5 percent faster! See http://bpaste.net/show/4371/
Compared to the standard Ubuntu Python 2.6.4, it's about a 10% performance increase, but well, that isn't a fair comparison...
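For anyone wanting to repeat the comparison, here is a hedged sketch of how two builds might be benchmarked against each other. It assumes pybench lives at Tools/pybench/pybench.py in the source tree and that its `-f` (save results) and `-c` (compare against saved results) options behave as in the pybench shipped with this Python generation. `compare_builds` and `percent_faster` are hypothetical helper names, and the numbers in the usage note are made up for illustration, not measurements from this thread.

```shell
# Run pybench with the gcc build, save the result, then run the icc
# build and let pybench print the comparison.
# $1 = path to gcc-built python, $2 = path to icc-built python
compare_builds() {
    "$1" Tools/pybench/pybench.py -f gcc.pybench &&
    "$2" Tools/pybench/pybench.py -c gcc.pybench
}

# How much faster (in percent) is a run taking $2 ms than one taking $1 ms?
percent_faster() {
    awk -v base="$1" -v new="$2" \
        'BEGIN { printf "%.1f\n", (base - new) * 100 / base }'
}
```

For example, `percent_faster 1000 950` prints `5.0`: a run that takes 950 ms is 5% faster than one that takes 1000 ms.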
New Contributor I

Hi
About the 5% difference:
try the -fast flag with icc, if the build still works with it.
best regards
Employee

Hey Nuku, I've got a couple of questions. First, can I assume that once you sourced the iccvars file (as Tim described) you were able to build without problems? Also, what kind of machine were you running on, i.e. Linux, Windows, what type of CPU? Were you comparing two Pythons that you built yourself (i.e. same tarball, one built with gcc and one with icc), or was the gcc one one of the published executables?

I'm interested in trying it myself and seeing if I can reproduce the performance you're seeing.

Thanks!

Dale
Beginner

Hey Dale,
My machine: Intel Core2Duo T7200, 2GB Ram, in a Dell Inspiron 9400 running Ubuntu 9.10 amd64, Kernel 2.6.30-20
I used the official CPython 3.1.1 tarball without any customizations.
Somehow, the iccvars file failed. I tried it multiple times, but this may be a result of me running an unsupported OS... I just added symlinks to all the libs in that dir to /usr/lib and added the bin dir to my PATH, and everything worked just fine then.
And I compiled it myself with GCC, from the same tarball, compiler flags see above.
I'll try bustaf's idea in a minute.
It would be cool if we could share some flags or whatever it takes to make it faster so we can build the ultimately fast python :D
Nuku
Black Belt

It might help if you would tell us your ideas on how you want to speed it up. Without a plan, it seems unlikely that the superior auto-vectorization of icc would contribute, or that we could expect you to apply OpenMP to take advantage of the Intel library.
Beginner

I read through the icc help and chose quite a lot of optimization options.
IPO seems to fail: http://bpaste.net/show/4402/
So the -fast option doesn't work.
The flags I chose are:
-O3 -mssse3 -no-prec-div -static -xSSSE3 -vec -fomit-frame-pointer -unroll-agressive -opt-multi-version-aggressive -vec-guard-write -opt-malloc-options=1 -opt-calloc -mkl -openmp -fp-model=precise -fp-speculation=safe -inline-level=2 -w
Please don't beat me if there's something idiotic in there, I'm a real newbie concerning optimization ;)
Thoughts behind those flags: I need precise floating-point arithmetic and maximum speed, executable size doesn't matter.
I noticed I have to do a make clean when changing flags; running ./configure again with updated flags wasn't enough.
Leaving out ipo gives me another error: http://bpaste.net/show/4403/
I really don't know what causes this and would be glad if you or someone else could help me ;)
New Contributor I

Hi nuku
It's strange that icc is slower than GNU, but it's normal for a build to break with aggressive optimization flags.
I have a Fedora 12 (64-bit) machine at hand with the Intel compiler installed (an Intel 64 machine).
When I have time I'll run a test to verify, but I doubt I'll get better results than you.
Also, a question about your netbook model:
does the touchpad work well, or is it catastrophic (too sensitive), like on some other netbooks?
Is the wireless module a Ralink? Or could you please
show me the output of lspci (lspci > file)?
I'm considering this model.
Best regards

Beginner

Hi bustaf,
Yeah, I know that optimizations often break code, but it would take years to find out which flag or combination of flags breaks it in this case ;) So basically what I meant is: if someone around here has come across a similar situation, what was their approach and/or solution? ;)
well the notebook I'm using for compiling is over three years old by now and of course isn't sold any more.
It's a Dell Inspiron 9400, Core2Duo T7200, 2GB RAM, 200GB HD, GeForce Go 7900GS, Intel 3945ABG Wireless.
Everything's ok with that machine^^
But as you are referring to netbooks (the Dell is 17", I wouldn't call that a netbook :D), I do also have an Asus UL30A, Core2Duo SU7300, 4GB RAM, with an Atheros wireless card. That one has a weird touchpad which takes some getting used to, but works fine if you're used to it^^ I don't really know what you want, so it's a bit hard^^ Hope I could be of help anyway.
nuku
New Contributor I

Hi nuku
Thanks for your answers about the hardware.
The essential point, whether with GNU or icc, is that you gained 10% by building from source,
which is already good compared to the distro's default binary.
Best regards
Beginner

Well, 10% compared to an old version (2.6.4) is nothing that counts as a success. It may be that Python 3 is just 10% (or more) faster than 2.6 ;)
Cheers
New Contributor I

Hi
Agreed, that may be the case here.
In most of my work I'm obliged to build from verified sources, for every library or utility that my own programs depend on.
In 99% of cases I see results 5% to 15% (or more) better than the original distro binary, but that's not true for all of them.
Note that it's rare for an old version to be slower than a new one; I've rather observed the opposite, with all the new functionality being added.
Luckily, processors keep evolving.
Best regards

The IPO error is explained in the first IPO diagnostic.

ipo: warning #11053: libpython3.1.a is an archive, but has no symbols (this can happen if ar is used where xiar is needed)

You need to get the makefiles to use the Intel-provided xiar and xild for the archiver and linker instead of ar and ld. I would assume the configure script has options to override those.

The other error you posted is a symbol that comes from our OpenMP* runtime library libiomp5.so. You mentioned you had LD_LIBRARY_PATH set, so that should be picked up if you're picking up libimf.so, so not sure why there's a problem there.

Finally, I would mention that you want to be a bit discriminating, if you can, about where you use -fp-model=precise. This does eliminate some optimizations, so I would only use it where needed. I might play around with different levels of -O as well; sometimes -O1 or -O2 can be better than -O3, depending on the application. I might also try removing the -inline-level option and letting the compiler's default inlining heuristic take over, unless you have a good reason for adding it. And just FYI, -vec and -fomit-frame-pointer are already enabled by default, so you don't need to specify them.
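One possible way to act on the archiver advice is sketched below. It assumes the configure script picks up AR from the environment and writes it into an `AR=` line in the generated Makefile; that is an assumption about this Python version, not something confirmed in this thread, which is why the second helper checks the result. `configure_for_ipo` and `check_archiver` are hypothetical names. The linker is not overridden here, on the assumption that linking goes through the icc driver.

```shell
# Configure so that static archives are created with Intel's xiar,
# which keeps the IPO information that plain ar would not index.
configure_for_ipo() {
    AR=xiar ./configure --with-computed-gotos --without-gcc \
        CC=icc CXX=icpc
}

# Show which archiver the generated Makefile will use.
# $1 = Makefile to inspect (defaults to ./Makefile)
check_archiver() {
    grep -E '^AR[[:space:]]*=' "${1:-Makefile}"
}
```

If `check_archiver` still reports plain `ar` after configuring, editing that line in the Makefile by hand is a fallback; a `make clean` before rebuilding avoids reusing the old archive.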
New Contributor I

Hi Nuku and All ...
I downloaded the Python-3.1.2 sources and did a quick test build with -fast (ipo).
The result is OK, but other parameters still need to be changed.
I'm not familiar with Python; to take this further properly I would need to read part of the source or
ask some friends who know it.

LANG=C;
export LANG
LD_LIBRARY_PATH=/opt/intel/Compiler/11.1/064/lib/intel64;
export LD_LIBRARY_PATH
CC="/opt/intel/Compiler/11.1/069/bin/intel64/icc"
export CC
CXX="/opt/intel/Compiler/11.1/069/bin/intel64/icpc"
export CXX
LD="/opt/intel/Compiler/11.1/069/bin/intel64/xild"
export LD
AR="/opt/intel/Compiler/11.1/069/bin/intel64/xiar"
export AR
Remark:
(xild is not used, and the -shared-intel flag is not used.)
(LD is not an environment variable that the Makefile reads, or does it need to be passed with -f (file)?)

1] run ./configure
2] open the Makefile and change line 61 to:
OPT= -DNDEBUG -g -O3 -Wall -Wstrict-prototypes -fPIC -shared -fast

make
ipo is OK, but I think you have the experience to add the other parameters that are still missing.
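A possible explanation for the module failures listed below, offered as a guess rather than a diagnosis: on Linux, -fast in this compiler generation is shorthand for a group of options that includes -static, and -shared is a link-time option that does not belong in the per-file OPT flags. Both can get in the way of building shared extension modules. The sketch below rewrites a flag list accordingly; `rewrite_opt` is a hypothetical helper, and the expansion of -fast is stated from memory of the 11.x documentation, so check `icc -help` before relying on it.

```shell
# Rewrite a list of compiler flags:
#   -fast    -> its components, leaving out -static
#   -shared  -> dropped (it is a link-time option)
# Everything else is passed through unchanged.
rewrite_opt() {
    out=
    for flag in "$@"; do
        case $flag in
            -fast)   flag="-ipo -O3 -no-prec-div -xHost" ;;
            -shared) continue ;;
        esac
        out="${out:+$out }$flag"
    done
    printf '%s\n' "$out"
}
```

Applied to the OPT line above, `rewrite_opt -DNDEBUG -g -O3 -Wall -Wstrict-prototypes -fPIC -shared -fast` prints `-DNDEBUG -g -O3 -Wall -Wstrict-prototypes -fPIC -ipo -O3 -no-prec-div -xHost`, which could then be pasted into the Makefile's OPT line.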


Python build finished, but the necessary bits to build these modules were not found:
_dbm _gdbm _sqlite3
_tkinter bz2
To find the necessary bits, look in setup.py in detect_modules() for the module's name.


Failed to build these modules:
_bisect _codecs_cn _codecs_hk
_codecs_iso2022 _codecs_jp _codecs_kr
_codecs_tw _collections _csv
_ctypes _ctypes_test _curses
_curses_panel _elementtree _hashlib
_heapq _json _lsprof
_multibytecodec _multiprocessing _pickle
_random _socket _ssl
_struct _testcapi array
atexit audioop binascii
cmath crypt datetime
fcntl grp itertools
math mmap nis
operator ossaudiodev parser
pyexpat readline resource
select spwd syslog
termios time unicodedata
zlib

running build_scripts
creating build/scripts-3.1
copying and adjusting /usr/src/download/Python-3.1.2/Tools/scripts/pydoc3 -> build/scripts-3.1
copying and adjusting /usr/src/download/Python-3.1.2/Tools/scripts/idle3 -> build/scripts-3.1
copying and adjusting /usr/src/download/Python-3.1.2/Tools/scripts/2to3 -> build/scripts-3.1
changing mode of build/scripts-3.1/pydoc3 from 644 to 755
changing mode of build/scripts-3.1/idle3 from 644 to 755
changing mode of build/scripts-3.1/2to3 from 644 to 755



make install etc ....
The machine is a dual-core Intel.
kernel is: 2.6.31.12-174.2.22.fc12.x86_64
Operating system is Fedora 12.

Good luck getting it finished properly.
Kind regards

Beginner

Looks like you're missing some library. I had the same problem until I correctly integrated the icc libraries (this should be in your PATH or something similar, or in your LD_LIBRARY_PATH).
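One way to check for a missing library is sketched here. It relies on the assumption that Python's setup.py keeps extension modules that compiled but could not be imported, renamed with a `_failed` suffix, somewhere under the build directory. `find_missing_libs` is a hypothetical helper name, and ldd is the standard Linux tool for listing a shared object's dependencies.

```shell
# List shared libraries that the failed extension modules cannot resolve.
# $1 = build directory to search (defaults to ./build)
find_missing_libs() {
    find "${1:-build}" -name '*_failed.so' -exec ldd {} \; 2>/dev/null |
        grep 'not found' |
        sort -u
}
```

If this prints lines such as a libimf or libiomp5 entry marked "not found", the directory holding that library is what needs to be on LD_LIBRARY_PATH; no output means the failures have some other cause.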
New Contributor I

Hi Nuku
You wrote:
(So the -fast option doesn't work.)
I ran a test with -fast on my side just to help you, as a reference.
It's interesting, but I don't currently have any work that uses Python...
(Also, I have removed icc from the customer's machine...)
Best regards.

Beginner

I'm sorry, bustaf, but I don't quite get what you are trying to tell me...
So you tried it with the "-fast" option, and the above is what you got?