Porting scripting languages to MIC

Anwar_Ludin · ‎06-25-2013

I would be interested in using scripting languages such as Python and R on the MIC platform. To do that I would need to compile binaries for the MIC architecture. I believe the easiest path would be to compile from source using MPSS/gcc. However, from my understanding, Intel recommends not using the MPSS gcc to build "application software" because vector instructions are not supported by the MIC port of gcc. My question is: how much of an impact would compiling with gcc have on building executables of a scripting language such as Python or R?

Thanks!

Anwar

TimP · ‎06-25-2013

I haven't heard of any testing of Python on MIC with use of vector instructions. Those would certainly be advantageous if you had a vector floating point intensive workload, but such a workload would usually be run under Fortran/C/C++. Note that MIC requires both vectorization and threaded parallelization for competitive performance, and these are not characteristics of most scripting languages.

According to my limited understanding of R, we wouldn't expect a native MIC build to perform competitively regardless of whether it were built with icc or gcc. Web references indicate that user floating point intensive functions are usually compiled into a shared library. For those wishing to use MKL on MIC for an R application running on host, the MKL offload capabilities would seem worthy of consideration.

Anwar_Ludin · ‎06-25-2013

Hi Tim,

Thanks for your answer. Yes indeed, scripting languages are not for high performance computing :). What I want to do is to apply a user defined function to data in parallel, in the functional programming sense. So for example a user would pass a function/data pair written in Python or R to a native process running on the Xeon Phi. The data can be located on the host's filesystem. The native MIC process would then evaluate the function in parallel using the many threads available and write the result back to the host filesystem. What I want to implement is a form of MapReduce. To achieve this, I need to recompile the Python/R interpreter and libraries for MIC so that they can evaluate the function in parallel.

Regards,

Anwar
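
The pattern described in this post (apply a function to each piece of data in parallel, then combine the results) can be illustrated at toy scale in shell. This is only a sketch of the control flow, run on the host, and says nothing about performance on the coprocessor. The function name map_reduce_squares and the squaring step are hypothetical placeholders for a user-supplied function, and xargs -P is a GNU/BSD extension rather than POSIX.

```shell
#!/bin/sh
# Toy map/reduce: square each argument in parallel, then sum the squares.
# "Squaring" is a stand-in for the user-defined function.
map_reduce_squares() {
    # Map: one input per line, up to 4 worker processes at a time.
    printf '%s\n' "$@" |
        xargs -P 4 -n 1 sh -c 'echo $(( $1 * $1 ))' _ |
        # Reduce: sum the per-item results (order does not matter for a sum).
        awk '{ sum += $1 } END { print sum }'
}

map_reduce_squares 1 2 3 4   # prints 30
```

A real implementation along the lines of the post would replace the worker processes with threads inside the native process and the squaring step with the evaluated Python or R function.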

Anwar_Ludin · ‎06-25-2013

Another question: I want to install the gcc compiler tools only. Which rpm package should I use in the MPSS distribution?

James_C_Intel2 · ‎06-25-2013

Before deciding that you need to use gcc, have you tried using "icc -mmic"? It should be able to handle gcc extensions (such as inline asm), and it is certainly the compiler that has the most tuning for KNC...

Anwar_Ludin · ‎06-25-2013

James,

Yes, ideally I should use icc -mmic, but I was being lazy and opted for the easy path :) OK, I'm gonna bite the bullet and try to build a native R interpreter using the Intel toolchain, but then I will need some help! :)

R, like most open source tools, comes with a configure script. So I tried something like:

source /opt/intel/composer_xe_2013/bin/compilervars.sh intel64

./configure --enable-R-static-lib CXX=icpc CC=icc CFLAGS=-mmic CXXFLAGS=-mmic LDFLAGS=-mmic

And I get...

configure: error: cannot run C compiled programs.
If you meant to cross compile, use `--host'.

Obviously I'm cross compiling here and the Xeon Phi host does not exist. As the configure script has been generated by the GNU autoconf tools, I suspect you need to update the tools in order to add the Xeon Phi as a host.

Cross compiling R using the Intel compiler tools would completely make my day! :)

Regards,

Anwar

TimP · ‎06-25-2013

I'd be surprised if there's more than one gcc in the MPSS rpm set. It's present by default in the host side of the installation. I'd guess it's in the gpl .rpm.

Anwar_Ludin · ‎06-25-2013

OK, I think I've messed things up big time. I installed the intel-mic-gpl-2.1.6720-13.el6.x86_64.rpm on the Xeon but just realized that the package is for cross compiling from host to Xeon and not for native compiling... How do I uninstall it, as there does not seem to be an option for that?

TimP · ‎06-25-2013

You got a convenient hint from the R build framework; it looks like their suggestion to add --host to your configure options is applicable (e.g. --host=<whatever your host gcc says it was built for>) as icc -mmic et al. are in fact cross compilers.
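
As a sketch of the --host suggestion: when configure is given both --build and --host and the two differ, autoconf treats the run as a cross-compile and does not try to execute the test programs it builds. The script below only prints such a command (a dry run). The triplet x86_64-k1om-linux is an assumption, taken from the prefix of the MPSS gcc cross tools, and an older config.sub shipped with a package may reject it. The helper name mic_configure_cmd and the MIC_HOST and BUILD variables are hypothetical.

```shell
#!/bin/sh
# Dry run: print a cross-compile configure command instead of running it.
# ASSUMPTION: x86_64-k1om-linux is the coprocessor triplet; the package's
# config.sub may not recognise it.
MIC_HOST="${MIC_HOST:-x86_64-k1om-linux}"
# Build triplet as reported by a native configure run on the host.
BUILD="${BUILD:-x86_64-unknown-linux-gnu}"

mic_configure_cmd() {
    # Every compiler and linker flag variable carries -mmic so that C, C++
    # and Fortran test objects are all built for the same target.
    echo ./configure \
        "--build=$BUILD" "--host=$MIC_HOST" \
        CC=icc CXX=icpc F77=ifort FC=ifort \
        CFLAGS=-mmic CXXFLAGS=-mmic FFLAGS=-mmic FCFLAGS=-mmic LDFLAGS=-mmic \
        "$@"
}

# Extra configure options are appended unchanged.
mic_configure_cmd --with-readline=no
```

Passing only --host, as the error message suggests, also works on many configure scripts, but autoconf then decides on cross mode by trying to run a test binary, which its manual describes as fragile.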

The MPSS readme instructions tell you the rpm install and removal commands. You probably do want to keep that rpm installed until you go to a new MPSS.
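
The MPSS readme remains the authoritative source for install and removal. As a generic sketch, an installed RPM is queried with rpm -q and erased with rpm -e, both of which take the package name rather than the .rpm file name. The helper rpm_pkg_from_file below is hypothetical, and the script only prints the commands rather than running them.

```shell
#!/bin/sh
# Derive the installed package name from an .rpm file name by dropping
# any directory part and the .rpm suffix.
rpm_pkg_from_file() {
    name=$(basename "$1")
    echo "${name%.rpm}"
}

pkg=$(rpm_pkg_from_file intel-mic-gpl-2.1.6720-13.el6.x86_64.rpm)

# Dry run: show the commands instead of executing them.
echo "rpm -q $pkg    # check whether the package is installed"
echo "rpm -e $pkg    # erase it (needs root)"
```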

James_C_Intel2 · ‎06-26-2013

" just realized that the package is for cross compiling from host to xeon and not for native compiling". AFAIK I know there are no compilers that run on the Xeon Phi. All the compilers (including the gcc used for the kernel build) are cross compilers. So you have to confront the issues of doing a cross-build no matter which compiler you use. (The gcc compiler doesn't magically avoid them :-().

There is a discussion at http://software.intel.com/en-us/forums/topic/391645 on the issue of cross-building where the build system tries to execute binaries that it has just built. If the build system isn't set up for doing a cross-build, some of the tips there may be useful.

Anwar_Ludin · ‎06-26-2013

James,

Thanks for the link... Indeed, I'm basically running into the same kind of issues as the ones mentioned in the other discussion. In some sense this is reassuring :) But I also think Intel should consider providing a step by step guide to cross compiling software based on the GNU autotools toolchain (using the Intel compilers), as most open source software relies on them. A big plus would also be to provide a repository of open source packages ported to the Xeon Phi. I know that the boost libraries have been ported, but things like XML parsers, libcurl, etc. would also be quite useful when building native MIC apps.

Regards,

Anwar

Kevin_D_Intel · ‎06-26-2013

Here are some related discussions. The first may offer a method for Python and R.

Cross-compilation Challenges references a forum thread containing a customer provided method
Cross-compilation for Intel® Xeon Phi™ Coprocessor with CMake

Kevin_D_Intel · ‎06-26-2013

Here's another excellent and too often overlooked resource (with apologies to Michael): Configuring Intel® Xeon Phi™ coprocessors inside a cluster.

Look to sections:

Native compiler for Intel Xeon Phi Coprocessor
Compiling native GNU tools

Anwar_Ludin · ‎06-26-2013

Kevin, James, Tim,

Thanks for the links to the docs... I'm gonna need to read all of that stuff so that I get a better idea of how to proceed. I also got in touch with the r-dev mailing list and they gave me a couple of pointers on how R should be compiled from source. Perhaps it would be a good idea to set up a GitHub repository with software ported to the Intel Xeon Phi? This could benefit the whole community.

Anwar_Ludin · ‎06-26-2013

Realized that R also requires a Fortran compiler in order to build. Decided to take a step backwards and first make sure I could build R using the Intel compilers on the intel64 architecture:

./configure CC=icc F77=ifort CXX=icpc FC=ifort

------

The result of the configure script is:

R is now configured for x86_64-unknown-linux-gnu

Source directory: .
Installation directory: /usr/local

C compiler: icc -std=gnu99 -g -O2 -std=c99
Fortran 77 compiler: ifort -g -O2

C++ compiler: icpc -g -O2
Fortran 90/95 compiler: ifort -g
Obj-C compiler:

Interfaces supported: X11, tcltk
External libraries: readline
Additional capabilities: PNG, JPEG, NLS, cairo
Options enabled: shared BLAS, R profiling

Recommended packages: yes

-------

make
cd bin
./R
-------

./R

R version 3.0.1 (2013-05-16) -- "Good Sport"
Copyright (C) 2013 The R Foundation for Statistical Computing
Platform: x86_64-unknown-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.
>
-----

OK, at least I can build R using the Intel compilers. Now trying to cross compile:

./configure CC=icc F77=ifort CXX=icpc FC=ifort CFLAGS=-mmic FLAGS=-mmic FCFLAGS=-mmic LDFLAGS=-mmic SHLIB_CXXLD=icpc --host=x86_64-unknown-linux --with-readline=no

------

checking for dummy main to link with Fortran 77 libraries... none
checking for Fortran 77 name-mangling scheme... unknown
configure: WARNING: unknown Fortran name-mangling scheme
checking whether ifort appends underscores to external names... unknown
configure: error: cannot use Fortran

-----

Hmm, one step forwards... two steps backwards...
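
One possible reading of this failure, offered as an assumption rather than a confirmed diagnosis: the command sets FLAGS=-mmic, which is not one of the standard autoconf variables, and sets neither FFLAGS (used with F77) nor CXXFLAGS. If ifort then builds its test objects for the host while icc builds for MIC, the name-mangling check cannot link the two and reports "unknown". The hypothetical helper below lists which of the standard flag variables on a configure command line lack -mmic.

```shell
#!/bin/sh
# Report which standard autoconf flag variables on a configure command
# line do not carry -mmic. Prints an empty line if none are missing.
missing_mmic_vars() {
    missing=""
    for var in CFLAGS CXXFLAGS FFLAGS FCFLAGS LDFLAGS; do
        found=no
        for arg in "$@"; do
            case "$arg" in
                # Match only arguments that start with exactly VAR=
                "$var"=*-mmic*) found=yes ;;
            esac
        done
        [ "$found" = yes ] || missing="$missing $var"
    done
    # Drop the leading space before printing.
    echo "${missing# }"
}

# The cross-compile attempt from the post, one word per argument.
missing_mmic_vars CC=icc F77=ifort CXX=icpc FC=ifort \
    CFLAGS=-mmic FLAGS=-mmic FCFLAGS=-mmic LDFLAGS=-mmic \
    SHLIB_CXXLD=icpc --host=x86_64-unknown-linux --with-readline=no
# prints: CXXFLAGS FFLAGS
```

If this reading is right, replacing FLAGS=-mmic with FFLAGS=-mmic and adding CXXFLAGS=-mmic would be the first thing to try. config.log records the exact compile and link commands behind the "unknown" result and would confirm or rule it out.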

Vladimir_Dergachev · ‎07-10-2013

TimP (Intel) wrote:

I haven't heard of any testing of Python on MIC with use of vector instructions. Those would certainly be advantageous if you had a vector floating point intensive workload, but such a workload would usually be run under Fortran/C/C++. Note that MIC requires both vectorization and threaded parallelization for competitive performance, and these are not characteristics of most scripting languages.

According to my limited understanding of R, we wouldn't expect a native MIC build to perform competitively regardless of whether it were built with icc or gcc. Web references indicate that user floating point intensive functions are usually compiled into a shared library. For those wishing to use MKL on MIC for an R application running on host, the MKL offload capabilities would seem worthy of consideration.

R is an interesting language. Under the hood it is LISP-like but with vector extensions. You are right that the sequential code will run particularly slowly, but the vector operations should be easy to vectorize. It would, of course, perform best if someone went in and put #pragma omp for in all the right places (including the garbage collector).

best

Vladimir Dergachev