This manual is a guide for porting Intel Threading Building Blocks (TBB) [1] to the Alpha ISA [2].
Intel Threading Building Blocks (TBB) [1] is a C++ multithreading runtime library developed by Intel.
The Alpha processor [2] is a 64-bit RISC processor introduced by Digital Equipment Corporation (DEC).
gem5 [3, 4, 5] is an execution-driven full-system architecture simulator. It supports the Alpha, x86, ARM, MIPS, and SPARC ISAs; of these, the Alpha version is the most stable.
Key issue: if you try to compile TBB for Alpha directly, the build fails with errors about atomic primitives that are not implemented for this architecture, such as __TBB_CompareAndSwap4() and __TBB_CompareAndSwap8(). One possible approach is to implement these primitives for the Alpha architecture by hand, but that is costly. Fortunately, we can use TBB's built-in Generic GCC* Atomics Support instead. For this, the gcc compiler for Alpha must be at least v4.3.6, and the minimum TBB version is TBB 3.0 Update 7. In practice, you had better use TBB 4.0 Update 3 or later; I use tbb40_20120201oss. To enable the built-in Generic GCC* Atomics Support, add the -DTBB_USE_GCC_BUILTINS option when compiling TBB.
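To make the idea concrete, here is a minimal sketch of how 4-byte and 8-byte compare-and-swap can be expressed with GCC's __sync builtins instead of hand-written Alpha assembly. The helper names cas4, cas8, and fetch_and_increment are hypothetical and are not TBB's internal definitions; the argument order (pointer, new value, comparand) is only assumed to mirror the TBB primitives named above.

```cpp
#include <cassert>
#include <stdint.h>

// Hypothetical helpers for illustration only; these are NOT TBB's own code.
// Each returns the value that was stored in *ptr before the call, which is
// the convention of GCC's __sync_val_compare_and_swap builtin.

// 4-byte compare-and-swap: if *ptr == comparand, store value into *ptr.
inline int32_t cas4(volatile int32_t* ptr, int32_t value, int32_t comparand) {
    assert(ptr != 0);
    return __sync_val_compare_and_swap(ptr, comparand, value);
}

// 8-byte compare-and-swap with the same semantics.
inline int64_t cas8(volatile int64_t* ptr, int64_t value, int64_t comparand) {
    assert(ptr != 0);
    return __sync_val_compare_and_swap(ptr, comparand, value);
}

// An atomic increment built on top of cas4 with a retry loop.
// Returns the value held before the increment.
inline int32_t fetch_and_increment(volatile int32_t* ptr) {
    int32_t old_value;
    do {
        old_value = *ptr;  // snapshot the current value
    } while (cas4(ptr, old_value + 1, old_value) != old_value);
    return old_value;
}
```

Because the compiler expands the builtin into the target's own atomic instruction sequence, the same source builds for both the host and the Alpha cross-compiler, which is why no Alpha-specific assembly needs to be written.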
My host OS is Ubuntu 10.04.
Download crosstool-NG and install it. http://ymorin.is-a-geek.org/projects/crosstool
Then generate the Alpha cross-compiler using crosstool-NG:
/location/to/install/crosstool-NG/bin# ./ct-ng menuconfig
Configure the options like this:
Target Architecture - alpha
Variant - ev67
Target OS - linux
binutils version - 2.19.1
gcc version - 4.3.6
Additional supported languages: C++
C library - glibc
glibc version - 2.9
Threading implementation to use - nptl
Keep the other options at their defaults. After configuration, save and exit.
Next build the cross-compiler:
/location/to/install/crosstool-NG/bin# ./ct-ng build
Note that if your gcc version is 4.4.3, errors may occur during the build. They are caused by gcc 4.4.3 itself; switching to another version of gcc avoids them.
The build takes about one hour. By default, the generated cross-compiler is placed in the x-tools directory under the home directory of the user who ran the build; on my system it is in /root/x-tools.
So far, we have obtained an appropriate cross-compiler.
Next, let's cross-compile the TBB library.
First, unpack the TBB source to TBB_HOME=/home/tbb40_20120201oss.
To specify the location of the cross-compiler, modify line 44 of linux.gcc.inc in TBB_HOME/build:
#CPLUS = g++
CPLUS = /root/x-tools/alphaev67-unknown-linux-gnu/bin/alphaev67-unknown-linux-gnu-g++
#CONLY = gcc
CONLY = /root/x-tools/alphaev67-unknown-linux-gnu/bin/alphaev67-unknown-linux-gnu-gcc
AR = /root/x-tools/alphaev67-unknown-linux-gnu/bin/alphaev67-unknown-linux-gnu-ar
RANLIB = /root/x-tools/alphaev67-unknown-linux-gnu/bin/alphaev67-unknown-linux-gnu-ranlib
Since we will run the binary in a simulator, it is better to build TBB as a statically linked library. Modify line 118 of Makefile.tbb in TBB_HOME/build:
$(TBB.DLL): $(TBB.OBJ) $(TBB.RES) tbbvars.sh $(TBB_NO_VERSION.DLL)
# $(LIB_LINK_CMD) $(LIB_OUTPUT_KEY)$(TBB.DLL) $(TBB.OBJ) $(TBB.RES) $(LIB_LINK_LIBS) $(LIB_LINK_FLAGS)
	$(AR) rcs libtbb.a $(TBB.OBJ)
Compile the TBB library in TBB_HOME:
# make arch=alpha64 compiler=gcc CXXFLAGS="-DTBB_USE_GCC_BUILTINS" runtime=cc4.3.6_libc2.9
Note that arch must be alpha64, not alpha, or errors will occur.
Test whether the compilation is successful:
# make arch=alpha64 compiler=gcc CXXFLAGS="-DTBB_USE_GCC_BUILTINS" runtime=cc4.3.6_libc2.9 test
If errors about execution occur, that is expected: the generated binaries target the Alpha ISA and cannot execute on an x86 platform.
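The reason the host refuses to run these binaries is recorded in the ELF header, whose e_machine field names the target architecture. The following is a minimal sketch of reading that field from the first 20 bytes of a file. The function names elf_machine and is_alpha_binary are hypothetical; the numeric constants are the e_machine values from the Linux <elf.h> header.

```cpp
#include <cassert>
#include <cstddef>

// e_machine values as defined in <elf.h> on Linux.
const int EM_386_ID    = 3;       // Intel 80386
const int EM_X86_64_ID = 62;      // AMD x86-64
const int EM_ALPHA_ID  = 0x9026;  // Alpha

// Returns the e_machine field of an ELF header, or -1 if the buffer is too
// short, is not an ELF file, or uses an unknown byte order.
// 'header' must point to at least the first 20 bytes of the file.
int elf_machine(const unsigned char* header, std::size_t length) {
    if (header == 0 || length < 20) return -1;
    // Magic number: 0x7f 'E' 'L' 'F'.
    if (header[0] != 0x7f || header[1] != 'E' ||
        header[2] != 'L'  || header[3] != 'F') {
        return -1;
    }
    // e_ident[5] is EI_DATA: 1 = little-endian, 2 = big-endian.
    // e_machine is the 16-bit field at byte offset 18.
    if (header[5] == 1) return header[18] | (header[19] << 8);
    if (header[5] == 2) return (header[18] << 8) | header[19];
    return -1;
}

// True if the header describes an Alpha binary.
bool is_alpha_binary(const unsigned char* header, std::size_t length) {
    return elf_machine(header, length) == EM_ALPHA_ID;
}
```

To use it, read the first 20 bytes of a generated binary into a buffer and pass the buffer to is_alpha_binary; a result of true means the file can only run on Alpha hardware or in an Alpha simulator.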
Let's write a test application, Matrix Multiplication (matrixMul_tbb.cpp), and compile it:
# /root/x-tools/alphaev67-unknown-linux-gnu/bin/alphaev67-unknown-linux-gnu-g++ -o mm_tbb matrixMul_tbb.cpp -static -static-libgcc -I/home/tbb40_20120201oss/include/ -L/home/tbb40_20120201oss/build/linux_alpha64_gcc_cc4.3.6_libc2.9_release/ -ltbb -ldl -lrt -lpthread
This generates an Alpha binary, mm_tbb. Running it on the host produces an error:
bash: ./mm_tbb: cannot execute binary file
Of course, the Alpha binary cannot execute on an x86 platform. This is fine; we will run it on an Alpha platform next.
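The source of matrixMul_tbb.cpp is not listed in this manual. As an illustration only, a kernel of this kind could be sketched as follows. Everything here is an assumption rather than the original code: the names MatMulBody, RowRange, and matrix_multiply are hypothetical, and so is the USE_TBB switch, which selects tbb::parallel_for when building against TBB and a serial fallback otherwise. The body is written as a function object rather than a lambda because gcc 4.3.6 does not support lambdas.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

#ifdef USE_TBB  // hypothetical switch: define it only when building against TBB
#include "tbb/blocked_range.h"
#include "tbb/parallel_for.h"
#endif

// Serial stand-in offering the same begin()/end() interface as
// tbb::blocked_range<std::size_t>, so the body can run without TBB.
struct RowRange {
    std::size_t first, last;
    RowRange(std::size_t f, std::size_t l) : first(f), last(l) {}
    std::size_t begin() const { return first; }
    std::size_t end() const { return last; }
};

// Loop body: computes rows [r.begin(), r.end()) of C = A * B for square
// n x n matrices stored in row-major order.
class MatMulBody {
    const float* a_;
    const float* b_;
    float* c_;
    std::size_t n_;
public:
    MatMulBody(const float* a, const float* b, float* c, std::size_t n)
        : a_(a), b_(b), c_(c), n_(n) {}

    template <typename Range>
    void operator()(const Range& r) const {
        for (std::size_t i = r.begin(); i != r.end(); ++i) {
            for (std::size_t j = 0; j < n_; ++j) {
                float sum = 0.0f;
                for (std::size_t k = 0; k < n_; ++k) {
                    sum += a_[i * n_ + k] * b_[k * n_ + j];
                }
                c_[i * n_ + j] = sum;
            }
        }
    }
};

// Multiplies two n x n matrices; rows are distributed across threads
// when USE_TBB is defined, and processed in one serial pass otherwise.
void matrix_multiply(const std::vector<float>& a, const std::vector<float>& b,
                     std::vector<float>& c, std::size_t n) {
    assert(n > 0 && a.size() == n * n && b.size() == n * n);
    c.assign(n * n, 0.0f);
    MatMulBody body(&a[0], &b[0], &c[0], n);
#ifdef USE_TBB
    tbb::parallel_for(tbb::blocked_range<std::size_t>(0, n), body);
#else
    body(RowRange(0, n));
#endif
}
```

Splitting the work by rows means each task writes a disjoint slice of C, so the body needs no locking.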
Download the gem5 simulator and install it in /GEM5_HOME. Copy the Alpha binary mm_tbb to the disk image linux-parsec.img. See [6] for details.
Start the simulator in /GEM5_HOME:
# ./build/ALPHA_FS/gem5.opt configs/example/fs.py -n 2
Open another terminal and watch the simulator's boot process:
/GEM5_HOME/util/term# ./m5term 3456
After booting the system, let us run the application:
Script from M5 readfile is empty, starting bash shell...
# source tbbvars.sh
benchmarks lib mm_tbb splash usr
bin libtbb.so mnt sys var
dev libtbb.so.2 modules tbb
etc linuxrc parsec tbbvars.sh
hello lost+found proc test
iscsi sbin tmp
Using Matrix Sizes: A(100 x 100), B(100 x 100), C(100 x 100)
TBB matrixMul, Throughput = 0.0758 GFlop/s, Time = 0.02638 s, NumOps = 2000000
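The reported figures are consistent with each other. Assuming the benchmark counts one multiply and one add per inner-loop step (an assumption, but one that reproduces NumOps = 2000000 for 100 x 100 matrices), the throughput follows from dividing the operation count by the elapsed time. The function names below are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Floating-point operations for C(m x n) = A(m x k) * B(k x n),
// assuming one multiply and one add per inner-loop step.
double matmul_num_ops(double m, double k, double n) {
    return 2.0 * m * k * n;
}

// Throughput in GFlop/s for a given operation count and elapsed time.
double gflops(double num_ops, double seconds) {
    assert(seconds > 0.0);
    return num_ops / seconds / 1.0e9;
}
```

With the values from the run above, matmul_num_ops(100, 100, 100) gives 2000000 operations, and gflops(2000000, 0.02638) gives about 0.0758 GFlop/s, matching the printed output.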
So far, we have successfully cross-compiled the TBB version of Matrix Multiplication to run on Alpha ISA.
Much gratitude goes to Raf Schietekat, Vladimir Polin (Intel), Sergey Kostrov, Anton Potapov (Intel), Alexey Kukanov (Intel), and Anton Malakhov (Intel) for their valuable advice.
[1] Intel Threading Building Blocks, http://threadingbuildingblocks.org/
[2] Alpha Architecture Handbook, http://www.compaq.com/cpq-alphaserver/technology/literature/alphaahb.pdf
[3] The gem5 Simulator, SIGARCH Computer Architecture News, CAN11
[4] Binkert, N. L., Dreslinski, R. G., Hsu, L. R., Lim, K. T., Saidi, A. G., and Reinhardt, S. K. The M5 Simulator: Modeling Networked Systems. IEEE Micro 26, 4 (Jul/Aug 2006), 52-60.
[5] Martin, M. M. K., Sorin, D. J., Beckmann, B. M., Marty, M. R., Xu, M., Alameldeen, A. R., Moore, K. E., Hill, M. D., and Wood, D. A. Multifacet's general execution-driven multiprocessor simulator (GEMS) toolset. SIGARCH Comput. Archit. News 33, 4 (2005), 92-99.
[6] Mark Gebhart et al., Running PARSEC 2.1 on M5, The University of Texas at Austin, Technical Report TR-09-32