Intel® oneAPI Threading Building Blocks

Port to PowerPC + Linux (Cell B.E.)

Ryan_B_1
Beginner
After two unsuccessful attempts to port TBB to PowerPC (64-bit) running Linux, first with tbb20_20080207, then tbb21_20080622 (both sometimes hanging/spinning, nondeterministically, in one of the tests), I applied Raf Schietekat's "Atomic" mods (Ver. 8.0) to tbb21_20080605, made some straightforward changes to the Makefiles and a couple of changes to his file gcc_power.h (one non-essential), and it seems to work. This is on a Cell B.E. processor. If there's any interest I'll upload the diffs.

My impression, supported by a report from a colleague who apparently hit the same problem with an unmodified version of TBB on PowerPC + MacOS, is that Raf fixed a memory access/mutual exclusion issue on PowerPC.
RafSchietekat
Valued Contributor III
I would like to forget all about version "8.0" (2008-07-14) myself (it should work, but it's likely to be slower). If you add your diffs here, I can integrate them on my side (next version before the end of next week at the latest).
Alexey-Kukanov
Employee

keidavis,

Yes, we are interested. Please use the TBB contribution page to submit your work to the project. You might as well send it to Raf or attach it to the forum, but it is important that you, the author, contribute it via the official way. Besides the legal aspects, it will also allow us to do proper bookkeeping.

Thanks a lot!

Ryan_B_1
Beginner
Alexey: okay, I'll put the diffs to Raf's diffs on the contribution page. Truth to tell, Raf did the real work here. Also following your suggestion, here's the summary:

Hello Raf,

Here's the summary. It was a quick hack as you'll see.

I created a file gcc_power_linux.h. It's just a copy of your gcc_power.h with register r0 replaced with register 3. For whatever reason, it seems that the assembler does not recognize "r0" (or "r3") as a valid register name. I had no particular basis for choosing register 3; I know next to nothing about the PowerPC ABI or the g++ conventions with respect to the PowerPC (1 is the stack pointer?).

./include/tbb/machine/gcc_power_linux.h

126,127c126,127
< mnemonic " 3,%1,%3 " /* perform operation */
< "st" X "cx. 3,0,%4 " /* store new_value */
---
> mnemonic " r0,%1,%3 " /* perform operation */
> "st" X "cx. r0,0,%4 " /* store new_value */
131c131
< : "3", "cr0");
---
> : "r0", "cr0");

In tbb_machine.h, if __PPC__ use the new file gcc_power_linux.h.

./include/tbb/tbb_machine.h

495,496d494
< #elif __PPC__
< #include "tbb/machine/gcc_power_linux.h"

In linux.inc, set the architecture.

./build/linux.inc

41,43d40
< ifeq ($(shell uname -m),ppc64)
< export arch:=ppc64
< endif

This is probably a crude hack. With the Cell SDK, the g++ compiler to use for the PPU is ppu-g++. I also want 64-bit.

./build/linux.gcc.inc

42,46c42
< ifeq (ppc64,$(arch))
< CPLUS = ppu-g++
< else
< CPLUS = g++
< endif
---
> CPLUS = g++
78,82d73
< ifeq (ppc64,$(arch))
< CPLUS_FLAGS += -m64
< LIB_LINK_FLAGS += -m64
< endif

I think that's it. It's certainly simpler than what I was trying to do with the stock distribution without your patches. And, it passes the tests.

RafSchietekat
Valued Contributor III
Thanks for the information. I had already considered letting the compiler choose a register, so this seems a good time to make that change: it will also isolate us from the problem you observed about 3 vs. r3, avoiding any need for a new file or even conditional code. (Note that r0 seemed to work for Mac OS X, before I changed it to r22 for symmetry with Alpha, where I thought I saw a problem related to r0.)

Is there some test with "uname" or anything that specifically detects a Cell processor? I imagine that a Mac G5 also prints ppc64 (?), so that seems inappropriate. I would then use a different "arch" value to specify "-b ppu". Are you sure the compiler was appropriately ./configure'd, though, or whatever this requires? It seems strange that the PPU isn't the default machine for g++ (I would imagine a default configuration targeting the PPU, next to a cross-compiling version for the SPUs), so it would be nice to know whether/why exporting that issue to TBB (and other software) is actually required. Even with a prebuilt SDK it seems more appropriate to ask its providers to provide a default g++ targeting the PPU, or it may be just a matter of setting a symbolic link yourself.

Ryan_B_1
Beginner
* Certainly having the register be implicitly chosen seems to be the right approach. (What's the syntax for referring to such a register?) And certainly having the near-identical copy of the file was a heavy-handed approach, just proof of concept that the port would work with your patches.

* Regarding ppu-g++ vs. g++, there's an issue that could be factored here. For a plain PowerPC w/Linux port, g++ is what you'd want, and in fact works on this machine: the setup is Fedora 7 for PowerPC with the Cell SDK on top, g++ refers to the Fedora 7 supplied compiler. ppu-g++ comes with the SDK and is `Cell aware'. Whether I actually need this remains to be determined--it will depend on how I rig things to get work onto the SPEs, which is early work in progress.

* Regarding how to determine that one is on Cell rather than just plain PowerPC, uname is no help, indicating only that it's ppc64. /proc/cpuinfo contains the string "cpu: Cell Broadband Engine, altivec supported"
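
A minimal sketch of the /proc/cpuinfo check from the last point, assuming that a substring match on "Cell Broadband Engine" is sufficient. The function names are hypothetical and are not part of TBB or the Cell SDK, and the non-Cell sample line in the self-check is made up for illustration.

```cpp
#include <cassert>
#include <fstream>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: returns true if any line of a cpuinfo-style stream
// mentions the Cell Broadband Engine.
inline bool looks_like_cell(std::istream& cpuinfo) {
    std::string line;
    while (std::getline(cpuinfo, line)) {
        if (line.find("Cell Broadband Engine") != std::string::npos) {
            return true;
        }
    }
    return false;
}

// Convenience wrapper for the running machine; returns false if the file
// cannot be opened (for example on a non-Linux system).
inline bool running_on_cell() {
    std::ifstream cpuinfo("/proc/cpuinfo");
    return cpuinfo && looks_like_cell(cpuinfo);
}

// Usage check against the string quoted above and a made-up non-Cell line.
inline void looks_like_cell_selfcheck() {
    std::istringstream cell("cpu: Cell Broadband Engine, altivec supported\n");
    std::istringstream other("cpu: some other PowerPC, altivec supported\n");
    assert(looks_like_cell(cell));
    assert(!looks_like_cell(other));
}
```

A build script could run a check like this once and pick the "arch" value from the result, rather than relying on uname.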
RafSchietekat
Valued Contributor III
I suppose that an otherwise unused C++ variable provided as an __asm__ argument with constraint "=&r" will provide a suitable work register.
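
To make the idea concrete, here is a rough sketch of a 32-bit fetch-and-add in which the work register is an otherwise unused C++ variable bound with "=&r", so no register is named in the template or in the clobber list. This is an illustration, not the code in gcc_power.h: the function name is hypothetical, a GCC-compatible compiler is assumed, no lwsync/isync fences are included (so the ordering is relaxed), and the non-PowerPC branch falls back to a compiler builtin only so that the sketch builds elsewhere.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper, not TBB's actual implementation.
// Atomically adds `addend` to *ptr and returns the previous value.
inline int32_t fetch_and_add32(volatile int32_t* ptr, int32_t addend) {
#if defined(__powerpc__) || defined(__PPC__)
    int32_t old_value;
    int32_t scratch;  // work register: the compiler picks it via "=&r"
    __asm__ __volatile__(
        "1: lwarx  %0,0,%3\n\t"   // load old value and set the reservation
        "   add    %1,%0,%4\n\t"  // scratch = old value + addend
        "   stwcx. %1,0,%3\n\t"   // store only if the reservation still holds
        "   bne-   1b\n\t"        // reservation lost: retry
        : "=&r"(old_value), "=&r"(scratch), "+m"(*ptr)
        : "r"(ptr), "r"(addend)
        : "cr0", "memory");
    return old_value;
#else
    // Fallback for other targets (GCC/Clang builtin).
    return __sync_fetch_and_add(ptr, addend);
#endif
}

// Usage check: the helper returns the old value and updates the counter.
inline void fetch_and_add32_selfcheck() {
    volatile int32_t counter = 5;
    int32_t previous = fetch_and_add32(&counter, 3);
    assert(previous == 5);
    assert(counter == 8);
}
```

The early-clobber modifier "&" matters here: it tells the compiler that the output registers are written before all inputs have been consumed, so they must not share a register with `ptr` or `addend`.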

After some analysis, I'm leaning toward requiring "make arch=ppu" to have TBB choose ppu-g++ over g++. Would that work for you?

Ryan_B_1
Beginner
Raf_Schietekat:
I suppose that an otherwise unused C++ variable provided as an __asm__ argument with constraint "=&r" will provide a suitable work register.



I'd thought of that but imagined there was a better way; I wasn't able to find an example of one.

Raf_Schietekat:

After some analysis, I'm leaning toward requiring "make arch=ppu" to have TBB choose ppu-g++ over g++. Would that work for you?



Sounds good to me.
RafSchietekat
Valued Contributor III
Actually, what prevents you from linking g++ and ppu-g++ object code? Unless there's a problem with that, TBB itself can probably be built with g++.