Intel® oneAPI Threading Building Blocks

TBB Future with respect to cluster and possibly GPU

shachris23
New Contributor I
This question is aimed towards TBB Intel developers.

Is it one of the eventual goals to have TBB utilize not only multi-core, but also multi-computer (clusters) and perhaps GPU cards (Larrabee, Tesla, etc.)? I'm more interested in utilizing multiple computers than GPU cards, since GPUs have a different architecture and programming API than a regular PC (at least based on all the GPUs on the market right now).

It would be like TBB + MPI, except that the syntax and usage would stay quite similar to how a TBB app is written and run today. In my opinion, the app developer shouldn't have to use two different syntaxes (one for TBB and one for MPI) to utilize his computing resources; that follows the same philosophy as TBB, where I don't have to know how many cores the app is going to be running on.
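To make that concrete, here is a minimal sketch of what the shared-memory side already looks like in plain TBB: the loop body is written once and the scheduler decides how many cores to use. Any cluster-aware equivalent is hypothetical, which is exactly what I'm asking about.

```cpp
// Minimal sketch of today's shared-memory TBB usage: the work is expressed
// once over a range, and the scheduler maps chunks onto however many cores
// the machine happens to have.
#include <tbb/parallel_for.h>
#include <tbb/blocked_range.h>
#include <cstddef>
#include <vector>

void scale_in_place(std::vector<float>& data, float factor) {
    tbb::parallel_for(
        tbb::blocked_range<std::size_t>(0, data.size()),
        [&](const tbb::blocked_range<std::size_t>& r) {
            for (std::size_t i = r.begin(); i != r.end(); ++i)
                data[i] *= factor;   // this chunk of the range runs on one worker
        });
}
```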

Any thoughts?

AJ13
New Contributor I
I don't work for Intel, but I have been doing research on this problem.

I have a demonstration of my work next week in Toronto, but basically I have TBB + GPU working just fine. I am currently putting together a demo of my work extended to clusters.
shachris23
New Contributor I
Quoting - AJ

Wow AJ...that would be a great thing if you could make it happen. I hope to see your successful progress getting utilized or at least considered by Intel!!

S.
robert_jay_gould
Beginner
Quoting - shachris23

Nevertheless, you must consider that the kind of work a GPU can do is limited; you'll have a hard time parsing XML on a GPU, for example (although I guess someone with too much time on their hands has already done this, or will).

The other alternative is that future unified hardware (if that trend catches on), such as Intel's Larrabee (CPU==GPU), would automatically work with TBB and other parallel libraries, since the chip is basically a bunch of CPUs joined together for GPU-style programming (in barbaric terms).


knmaheshy2k
Beginner
Quoting - AJ

What platform is the GPU working on?
vu64
Beginner

Scientific computing has been using MPI and OpenMP for a long time (though neither is easy, of course). There are libraries such as STAPL or CHARM++ which support both shared-memory and distributed architectures. You can have a look.
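For concreteness, a hybrid job today typically ends up looking something like the sketch below (MPI between nodes, TBB inside each node). It is deliberately oversimplified, with the real data decomposition and error handling omitted, just to show the two programming models sitting side by side:

```cpp
// Sketch of the hybrid style: MPI distributes ranks across nodes,
// TBB parallelizes each rank's work across that node's cores.
#include <mpi.h>
#include <tbb/parallel_reduce.h>
#include <tbb/blocked_range.h>
#include <cstdio>
#include <functional>
#include <vector>

int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);
    int rank = 0;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    // Each rank owns its slice of the problem (here just some local data).
    std::vector<double> local(1000000, 1.0 + rank);

    // Node-local parallelism: a TBB reduction over this rank's slice.
    double local_sum = tbb::parallel_reduce(
        tbb::blocked_range<std::size_t>(0, local.size()), 0.0,
        [&](const tbb::blocked_range<std::size_t>& r, double acc) {
            for (std::size_t i = r.begin(); i != r.end(); ++i) acc += local[i];
            return acc;
        },
        std::plus<double>());

    // Cross-node parallelism: MPI combines the per-rank results.
    double global_sum = 0.0;
    MPI_Reduce(&local_sum, &global_sum, 1, MPI_DOUBLE, MPI_SUM, /*root=*/0, MPI_COMM_WORLD);

    if (rank == 0)
        std::printf("global sum = %f\n", global_sum);

    MPI_Finalize();
    return 0;
}
```

This is exactly the "two syntaxes" situation the original post wants to avoid; a hypothetical cluster-aware TBB would hide the MPI layer the same way TBB already hides the thread count.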

jimdempseyatthecove
Honored Contributor III

>>would automatically work with TBB

I wouldn't go so far as to say that. TBB will need revisions.

Consider that TBB is built for SMP. The first implementation of Larrabee (my guess) will be as a GPU card. The cores within the Larrabee GPU card, although sharing much the same instruction set, are not "processors" in the SMP set of processors as viewed by the O/S. Initially I would expect the entire Larrabee GPU card to require: allocate resource, use resource (call), release resource, i.e. to follow the model developed for ATI, nVidia, and others.

The Larrabee GPU card (my guess) will have to be functionally close enough to the other GPU cards, i.e. serve as the PC's video card concurrently with any other shenanigans we programmers want to use the card for. As such, this will require the allocate resource, use resource (call), release resource programming model (at least initially). This card will likely have an internal O/S: likely an EPROM that supports basic VGA/EVGA from hard boot, then is capable of pulling in extensions via driver load from the PC's booting O/S.
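To spell out what that allocate / use (call) / release pattern looks like from the application's side, here is a purely illustrative sketch; the device_* names are invented placeholders (not any real driver API), stubbed out only to show the shape of the model:

```cpp
// Illustrative only: the allocate resource / use resource (call) /
// release resource model, with invented, stubbed-out device_* functions.
#include <cstdio>

struct device_handle { int id; };                 // opaque handle to the card (hypothetical)

device_handle* device_acquire() {                 // allocate resource
    static device_handle card{0};
    return &card;
}

void device_submit(device_handle* dev, const char* kernel_name) {   // use resource (call)
    std::printf("submitting %s to card %d\n", kernel_name, dev->id);
}

void device_release(device_handle* /*dev*/) {     // release resource back to the driver
}

int main() {
    device_handle* dev = device_acquire();        // bind the card to this process
    device_submit(dev, "scale_kernel");           // run work inside the card's address space
    device_release(dev);                          // card is available to other clients again
    return 0;
}
```

A host-side TBB and a card-side TBB would both have to live inside that acquire/release bracket, at least until the O/S can treat the card's cores as ordinary processors.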

One of these extensions will be a shell to manage a task pool for use by applications. This shell could include a targeted version of TBB that runs bound to (within) the Larrabee GPU's dedicated address space. You would still have a barrier between what runs inside the Larrabee GPU and the rest of the application, and it will (may) continue to require allocate/deallocate as a resource (to bind/map Virtual Memory page tables). Potentially, individual cores within the Larrabee could be allocated (and mapped to the VA of the app), but it is not entirely clear (yet) how this will interrelate with the high-speed message rings interconnecting the Larrabee cores. This allocation of individual Larrabee cores could be analogous to hot swap of CPUs (some servers support this, but it will need a new kernel for Larrabee transmogrification). Two cooperating TBBs (one on the host, one targeted at the card) could handle this better (and be implemented sooner).

When I get my hands on a Larrabee GPU I will incorporate this feature into QuickThread.

At some point in the future, someone will build a motherboard without a socket for a CPU (e.g. without socket LGA 1366), with Larrabee being the only "processor". But this is a chicken-and-egg thing, since it will require a change to the O/S. It wouldn't surprise me if Intel has a few such prototypes.

Jim Dempsey
AJ13
New Contributor I

Quoting - jimdempseyatthecove

At some point in the future, someone will build a motherboard without a socket for a CPU... with Larrabee being the only "processor".

I really doubt that. AFAIK Larrabee has no TLB... no protection levels... I doubt it will handle interrupts, etc. Running virtual machines on each core, maybe. If you start to add a TLB and all that stuff, you're going to get back to an ordinary processor. I tend to think of Larrabee the same way I think of a GPU, just with a different instruction set than our current GPUs... it's just a nice co-processor. Perhaps one day a minimalistic subset of the features an OS needs will be incorporated into Larrabee. At least, I don't remember any of these features being planned for Larrabee.

What I think is most likely to happen long term is something like Cell's architecture: a full processor to run the OS, side by side with the fancy Larrabee or whatever else. But that's a long way away :-)
AJ13
New Contributor I
Quoting - knmaheshy2k


What platform is the GPU working on?

It was CUDA, but moving to OpenCL slowly.
knmaheshy2k
Beginner
Quoting - AJ


Thanks AJ. I would like to discuss and share some of my experiences with another platform. If you are interested, do mail me on Gmail (knmaheshy2k).