Does TBB follow the task farming (master/worker) paradigm?
If yes, then how? If not, can you suggest a parallel programming language that follows it, other than Cilk?
Task Farming is a method to achieve high concurrency in a computation by isolating the individual tasks on separate processors so that they don't interfere with each other. It is basically a big single-threaded batch environment with a task initiator that varies some aspect of the individual tasks such as experimental parameters so as to explore a complex computational space in parallel. The technique does not deal with multi-threading or thread interactions in a process so it has little to do with the methods available in Intel Threading Building Blocks.
That doesn't sound very inviting... Shouldn't the answer instead be: sure, this is trivial for TBB, if you don't need separate processes, and if you take responsibility for ordering tasks by expected runtime where necessary?
If you "only" need the power of a single computer (and you don't mean task farming to imply the use of a server farm, as seems to be done often but not always, but then the original question would seem rather strange), and unless I/O issues interfere, to me TBB seems easier for master/slave parallelism than inter-process communication.
"However I think the OP would benefit if he tried the task handling used by TBB to make his application threaded :)"
Absolutely: with recursive parallelism, there's less need to worry about starting the biggest top-level tasks first.
Thank you. I got it.
Can you suggest a parallel programming language that follows the task farming paradigm? i.e., one where a single-threaded master delegates big tasks to other processors.
I've already worked with ZPL which follows the SPMD (Single program multiple data) paradigm.
Have you looked at using MPI (OpenMPI)? This might be suitable.
Creating your own threads and then using something like WaitForSingleObject and a context queue would be easy to implement. You could even use OpenMP as an easy way to initialize the thread pool, then use WaitForSingleObject or another synchronization technique:
#pragma omp parallel
if (omp_get_thread_num() == 0)
    DoMaster();   // thread 0 runs the master loop
else
    DoWorker();   // the remaining threads do the farming
Depending on the wait times in DoMaster, you may want to oversubscribe by one thread (while the master is waiting for I/O, all the cores stay busy farming). You might also want to enable nested parallelism and experiment with the number of farmers versus available threads. A tasking system might provide better core utilization than the more-threads-than-cores route with messaging events.
Are you using a Windows SMP platform? If so I can give you a beta copy of QuickThread to review/experiment with. It is relatively easy to integrate into existing applications. (email me at firstname.lastname@example.org current info) or Google search "QuickThread site:intel.com" for an older reference document.
QuickThread is a task pooling system like TBB, but it also has a characteristic similar to an event messaging system. The master thread loop would, depending on input, determine what to do next and throw a task request. These tasks can throw additional task requests and use parallel constructs such as for loops. Tasks can have completion events. The master can throw a "do this, and when you are done, do this other thing (report back to the master loop)".
At the risk of continued criticism about not sounding inviting in my response to the "task farming paradigm," I will continue in my line of reasoning and insist that it sounds like we're talking about two different beasties here. The references I could find for "task farming" were pretty specific about its nature as a scheduling technique for distributing a collection of single-task invocations of a program over a server farm or cluster, each invocation running in isolation, not interfering with the others. In some sense, Google's MapReduce operates in this space, running the same query across a mass of machines, each processing different data. I have personal experience using such a technique; at one point in my work life I studied particular algorithms running on a proposed architecture family by "net-batching" a set of simulations, all using the same input data but varying parameters that controlled the architectural model; for all intents and purposes these were independent runs that fit squarely in the "task farming paradigm."
I also did a quick scan of ZPL. This looks like a superset of Pascal (Modula, maybe?) with some added syntax to support data parallelism. Though you might be able to use this language to implement "task farming," it looks more closely aligned with cluster data parallel programming techniques for describing large array problems intended to be divided across a cluster. The two code samples I saw were the Jacobi operator and matrix multiplication. ZPL relies on an inherent cluster communications fabric (MPI and RDMA are both mentioned), and at least for the Jacobi example, the interaction model is anything but "separate, independent tasks." Each task has to communicate with "neighbor" tasks at some array partitioning granularity in order to exchange the intermediate values in the relaxation method across the array, until the error is minimized to an acceptable epsilon. In ZPL this is all hidden in the implementation below the level of the language (though the new Grid construct seems to offer a window for controlling the underlying computational fabric).
Though MapReduce provides a form of task farming, it's not at a language level. If what you're looking for is a data-parallel language that works in a way like ZPL (the omega to the alpha of APL?), it looks like most of the work was done back in the 1990s, with such contributions as NESL, C-HELP, C// (a special purpose language for the GFLOPS machine), etc. The advent of GPGPU revitalized the data parallel world around the narrow-width vector processing characteristic of modern Graphics Processing Units, with languages as obscure as 81/2 and the more mainstream Brook, RapidMind (which grew out of Sh, another GPGPU "language") and others. Intel has its own entry in this burgeoning "data-parallel" market, Ct (which I've heard uses TBB under the hood). How's that for bringing the story full circle?
:-) That was just for effect, not criticism, but in hindsight it does look like I was on a "different" track.
In my defence, however, I did wonder how "task farming" was meant here; but because it's not even in Wikipedia yet, because of one of the first things I did find, and mainly because of the way the question was formulated ("master/worker", the reference to Cilk), I thought it was useful to point out (in a suitably qualified manner) that the same program structure (if not deployment structure) can easily be applied using TBB.