Help in understanding parallel_scan

balaji_ram · ‎05-03-2009

Hello,

I have the read the Intel TBB documentation regarding parallel_scan.
But I still havent figured out how exactly this works. For me it seems quite impossible to parallelize
an alogorithm that depends on the result of the previous computation (for ex : fibonacci series).
But parallel_scan seems to be useful in exactly such scenarios.

Can you please guide me some explanation (or) lead me to some link which can give me a clear picture of how parallelization is done in suchalgorithms ?

Thanks very much.

Balaji Ram.

RafSchietekat · ‎05-04-2009

Fibonacci is probably a bad example because it can be computed in wildly disparate ways, with a serial loop somewhere in-between as far as efficiency goes, but parallel_scan does not seem to be applicable, nor would it necessarily be very efficient compared to what a bit of mathematics can do for you (see the Tutorial about that). The first example I would think of instead is integral approximation: conceptually, each subgroup's contribution can be independently computed (concurrency), and then the results can be redistributed (alas, some serial dependencies) and locally merged (more concurrency).

I see that the word "parallel_scan" does not occur in "Tutorial (Open Source).pdf", current revision 1.13, but it does in "Reference (Open Source).pdf", current revision also 1.13, and there it has been documented rather nicely, I would say (see "Problems using parallel_scan" about that).

balaji_ram · ‎05-04-2009

Quoting - Raf Schietekat

Fibonacci is probably a bad example because it can be computed in wildly disparate ways, with a serial loop somewhere in-between as far as efficiency goes, but parallel_scan does not seem to be applicable, nor would it necessarily be very efficient compared to what a bit of mathematics can do for you (see the Tutorial about that). The first example I would think of instead is integral approximation: conceptually, each subgroup's contribution can be independently computed (concurrency), and then the results can be redistributed (alas, some serial dependencies) and locally merged (more concurrency).

I see that the word "parallel_scan" does not occur in "Tutorial (Open Source).pdf", current revision 1.13, but it does in "Reference (Open Source).pdf", current revision also 1.13, and there it has been documented rather nicely, I would say (see "Problems using parallel_scan" about that).

Thanks very much. I also have the parallel_scan.pdf by Arch Robison from the above link. This will surely help.