
Porting TBB onto Alpha Platform

chenxuhao

This manual is a guide to porting Intel Threading Building Blocks (TBB) [1] to the Alpha ISA [2].

1. Background

1.1. TBB

Intel Threading Building Blocks (TBB) [1] is a C++ multithreading runtime library developed by Intel.

1.2. Alpha ISA

Alpha [2] is a 64-bit RISC architecture introduced by Digital Equipment Corporation (DEC).

1.3. gem5 simulator

gem5 [3, 4, 5] is an execution-driven full-system architecture simulator. It supports the Alpha, x86, ARM, MIPS, and SPARC ISAs, of which the Alpha version is the most stable.

2. Porting TBB onto Alpha

Key issue: if you try to compile TBB for Alpha directly, the build fails because some atomic primitives, such as __TBB_CompareAndSwap4() and __TBB_CompareAndSwap8(), are not implemented for this architecture. One option is to implement these macros for the Alpha architecture yourself, but that is costly. Fortunately, we can use the built-in Generic GCC* Atomics Support instead. It requires a gcc cross-compiler for Alpha of at least v4.3.6 and TBB 3.0 Update 7 or later; in practice, you had better use TBB 4.0 Update 3 or later. I use tbb40_20120201oss. To enable the built-in Generic GCC* Atomics Support, add the -DTBB_USE_GCC_BUILTINS option when compiling TBB.
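To make the missing primitives concrete, here is a minimal sketch of 4-byte and 8-byte compare-and-swap written on top of GCC's __sync_val_compare_and_swap builtin, which gcc has provided since the 4.x series. The names cas4, cas8, and cas_self_check are hypothetical and only mirror the roles of __TBB_CompareAndSwap4() and __TBB_CompareAndSwap8(); this is an illustration of the idea, not TBB's actual source.

```cpp
#include <cassert>
#include <stdint.h>

// Hypothetical helper: atomically store `value` into *ptr if *ptr still
// equals `comparand`. Returns the value seen in *ptr before the operation.
inline int32_t cas4(volatile int32_t* ptr, int32_t value, int32_t comparand) {
    // GCC builtin argument order is (ptr, expected old value, new value).
    return __sync_val_compare_and_swap(ptr, comparand, value);
}

// Same operation for 8-byte operands.
inline int64_t cas8(volatile int64_t* ptr, int64_t value, int64_t comparand) {
    return __sync_val_compare_and_swap(ptr, comparand, value);
}

// Single-threaded self-check of the semantics (hypothetical name).
inline void cas_self_check() {
    volatile int32_t a = 1;
    assert(cas4(&a, 5, 1) == 1);    // comparand matches: swap happens
    assert(a == 5);
    assert(cas4(&a, 9, 1) == 5);    // comparand differs: no swap
    assert(a == 5);

    volatile int64_t b = 10;
    assert(cas8(&b, 20, 10) == 10); // comparand matches: swap happens
    assert(b == 20);
}
```

Calling cas_self_check() from a small main() is enough to confirm that the compiler in use expands the builtins for both operand sizes.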

2.1. Obtaining Cross-compiler

My host OS is Ubuntu 10.04.

Download crosstool-NG and install it. http://ymorin.is-a-geek.org/projects/crosstool

./configure --prefix=/location/to/install/crosstool-NG

make

make install

Then generate the Alpha cross-compiler using crosstool-NG:

/location/to/install/crosstool-NG/bin# ./ct-ng menuconfig

Configure the options like this:

Target options
    Target Architecture - alpha
    Variant - ev67

Operating System
    Target OS - linux

Binary utilities
    binutils version - 2.19.1

C compiler
    gcc version - 4.3.6
    Additional supported languages: C++

C-library
    C library - glibc
    glibc version - 2.9
    Threading implementation to use - nptl

Keep the other options at their defaults. After configuring, save and exit.

Next build the cross-compiler:

/location/to/install/crosstool-NG/bin# ./ct-ng build

Note that if your gcc version is 4.4.3, the build may fail. The failure is caused by gcc 4.4.3 itself, so simply switch to another version of gcc.

The build takes about one hour. By default, the generated cross-compiler is placed in the x-tools directory under the home directory of the user who ran the build; on my system, it is in /root/x-tools.

So far, we have obtained an appropriate cross-compiler.
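As a quick sanity check of the new toolchain, the following sketch reports the target architecture and gcc version using GCC's predefined macros (__alpha__, __GNUC__, __GNUC_MINOR__, __GNUC_PATCHLEVEL__). The function names are hypothetical, and the 4.3.6 threshold is simply the minimum version stated in Section 2.

```cpp
#include <cassert>
#include <string>

// Architecture this translation unit was compiled for (hypothetical helper).
inline std::string target_arch() {
#if defined(__alpha__)
    return "alpha";
#elif defined(__x86_64__) || defined(__i386__)
    return "x86";
#else
    return "other";
#endif
}

// Encode the gcc version as a single integer, e.g. 4.3.6 -> 40306.
// Returns 0 when the compiler does not define the GCC version macros.
inline long gcc_version() {
#if defined(__GNUC__)
    return __GNUC__ * 10000L + __GNUC_MINOR__ * 100L + __GNUC_PATCHLEVEL__;
#else
    return 0;
#endif
}

// True when the compiler meets the minimum version named in this guide.
inline bool toolchain_ok_for_builtins() {
    return gcc_version() >= 40306;
}
```

Built with the cross-compiler, target_arch() should return "alpha"; built with the host g++, it should return "x86". Since an Alpha binary cannot run on the host, printing these values is best done from a program that runs inside the simulator.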

2.2. Cross-compile TBB Library

Next, let's cross-compile the TBB library.

First, unpack the TBB source into TBB_HOME=/home/tbb40_20120201oss.

To specify the location of the cross-compiler, modify line 44 of linux.gcc.inc in TBB_HOME/build:

#CPLUS = g++

CPLUS = /root/x-tools/alphaev67-unknown-linux-gnu/bin/alphaev67-unknown-linux-gnu-g++

#CONLY = gcc

CONLY = /root/x-tools/alphaev67-unknown-linux-gnu/bin/alphaev67-unknown-linux-gnu-gcc

AR = /root/x-tools/alphaev67-unknown-linux-gnu/bin/alphaev67-unknown-linux-gnu-ar

RANLIB = /root/x-tools/alphaev67-unknown-linux-gnu/bin/alphaev67-unknown-linux-gnu-ranlib

Since we will run the binary in a simulator, it is better to build TBB as a statically linked library. Modify line 118 of Makefile.tbb in TBB_HOME/build:

$(TBB.DLL): $(TBB.OBJ) $(TBB.RES) tbbvars.sh $(TBB_NO_VERSION.DLL)
#	$(LIB_LINK_CMD) $(LIB_OUTPUT_KEY)$(TBB.DLL) $(TBB.OBJ) $(TBB.RES) $(LIB_LINK_LIBS) $(LIB_LINK_FLAGS)
	$(AR) rcs libtbb.a $(TBB.OBJ)
	$(RANLIB) libtbb.a

Compile the TBB library in TBB_HOME:

# make arch=alpha64 compiler=gcc CXXFLAGS="-DTBB_USE_GCC_BUILTINS" runtime=cc4.3.6_libc2.9

Note that arch must be alpha64, not alpha, or errors will occur.

Test whether the compilation succeeded:

# make arch=alpha64 compiler=gcc CXXFLAGS="-DTBB_USE_GCC_BUILTINS" runtime=cc4.3.6_libc2.9 test

If errors about execution occur, that is OK: the generated binaries are for the Alpha ISA, so they cannot execute on an x86 platform.

2.3. Cross-compile TBB application

Let's write a test application, Matrix Multiplication (matrixMul_tbb.cpp), and compile it:

# /root/x-tools/alphaev67-unknown-linux-gnu/bin/alphaev67-unknown-linux-gnu-g++ -o mm_tbb matrixMul_tbb.cpp -static -static-libgcc -I/home/tbb40_20120201oss/include/ -L/home/tbb40_20120201oss/build/linux_alpha64_gcc_cc4.3.6_libc2.9_release/ -ltbb -ldl -lrt -lpthread

This generates an Alpha binary, mm_tbb. Running it on the host produces an error:

# ./mm_tbb

bash: ./mm_tbb: cannot execute binary file

Of course, the Alpha binary cannot execute on an x86 platform. This is expected; we will run it on an Alpha platform next.

2.4. Run the Binary in Simulator

Download the gem5 simulator and install it in /GEM5_HOME. Copy the Alpha binary mm_tbb to the disk image linux-parsec.img; see [6] for details.

Start the simulator in /GEM5_HOME:

# ./build/ALPHA_FS/gem5.opt configs/example/fs.py -n 2

Open another terminal and watch the simulator boot:

/GEM5_HOME/util/term# ./m5term 3456

After booting the system, let us run the application:

...

mounting filesystems...

loading script...

Script from M5 readfile is empty, starting bash shell...

# source tbbvars.sh

# ls

benchmarks lib mm_tbb splash usr

bin libtbb.so mnt sys var

dev libtbb.so.2 modules tbb

etc linuxrc parsec tbbvars.sh

hello lost+found proc test

iscsi sbin tmp

# ./mm_tbb

Using Matrix Sizes: A(100 x 100), B(100 x 100), C(100 x 100)

start timing

TBB matrixMul, Throughput = 0.0758 GFlop/s, Time = 0.02638 s, NumOps = 2000000

Test passed!

We have now successfully cross-compiled the TBB version of Matrix Multiplication and run it on the Alpha ISA.
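The figures printed by mm_tbb are consistent with counting two floating-point operations (one multiply and one add) per inner-loop iteration: 2 x 100 x 100 x 100 = 2000000 operations, and 2000000 / 0.02638 s is about 0.0758 GFlop/s. The sketch below reproduces that arithmetic. The function names are hypothetical, and the counting rule is inferred from the printed numbers, not taken from the source of matrixMul_tbb.cpp.

```cpp
#include <cassert>
#include <cmath>

// Assumed operation count for C(m x n) = A(m x k) * B(k x n):
// one multiply and one add per inner-loop iteration.
inline double num_ops(double m, double k, double n) {
    return 2.0 * m * k * n;
}

// Throughput in GFlop/s for `ops` operations completed in `seconds`.
inline double gflops(double ops, double seconds) {
    return ops / seconds / 1e9;
}
```

For the run above, num_ops(100, 100, 100) gives the reported NumOps value, and gflops applied to that count and the reported time agrees with the reported throughput to the printed precision.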

Acknowledgement

Much gratitude goes to Raf Schietekat, Vladimir Polin (Intel), Sergey Kostrov, Anton Potapov (Intel), Alexey Kukanov (Intel), and Anton Malakhov (Intel) for their valuable advice.

References

[1] Intel Threading Building Blocks, http://threadingbuildingblocks.org/

[2] Alpha Architecture Handbook, http://www.compaq.com/cpq-alphaserver/technology/literature/alphaahb.pdf

[3] Binkert, N., Beckmann, B., Black, G., et al. The gem5 simulator. SIGARCH Comput. Archit. News 39, 2 (2011), 1-7.

[4] Binkert, N. L., Dreslinski, R. G., Hsu, L. R., Lim, K. T., Saidi, A. G., and Reinhardt, S. K. The M5 Simulator: Modeling Networked Systems. IEEE Micro 26, 4 (Jul/Aug 2006), 52-60.

[5] Martin, M. M. K., Sorin, D. J., Beckmann, B. M., Marty, M. R., Xu, M., Alameldeen, A. R., Moore, K. E., Hill, M. D., and Wood, D. A. Multifacet's general execution-driven multiprocessor simulator (GEMS) toolset. SIGARCH Comput. Archit. News 33, 4 (2005), 92-99.

[6] Mark Gebhart et al., Running PARSEC 2.1 on M5, The University of Texas at Austin, Technical Report TR-09-32
