Intel® oneAPI Threading Building Blocks
Ask questions and share information about adding parallelism to your applications when using this threading library.
This community is designed for sharing of public information. Please do not share Intel or third-party confidential information here.

Bodies that are Being Joined

I have an array with many elements. Each element is operated on, and then transformed. The end result is a new array of a different size. I am trying to use parallel_reduce. When the join function is called, is it guaranteed that that which is being joined that it is a tree structure? that a left leaf is being joined to a right leaf to form a new leaf? I'm worried about a leaf with, say elements 0-10000 (in a 5000 blocked_range grain size) transformed into a new array being joined with another leaf that has, say elements 110,000-120,000 (again, transformed into that new array) ?
Would a parallel_scan perhaps be more useful?
0 Kudos
2 Replies
Valued Contributor II
I think the answer to your question is contained within this statement regarding parallel_reduce: "The reduction operation should be associative, but does not have to be commutative." That is, (A ? B) ? C must equal A ? (B ? C) for reduction operation ? (whatever it is), but you don't have to worry whether A ? B is equal to B ?A. Commutation will never be a requirementbecause the order of elements and their compositions are never changed.
Black Belt
From the reference manual: "The reduction operation should be associative, but does not have to be commutative." That implies that parallel_reduce suits your purpose concerning consecutiveness.

However, because you say the output is an array with a different size, parallel_scan might be more useful if you can have it compute output array indexes in a pre-scan (this computation actually needs to be done in both scans, and sometimes there won't even be a pre-scan), and compute output array contents in the final scan. If you know a safe and manageable upper limit of the destination array's size, an actual array should be suitable, otherwise you'll need a tbb::concurrent_vector, which is slower, and you would do well to use its iterator type's ++ and -- operators as much as possible in preference to random access (the iterator can be copied).

(Added) No, I didn't edit this, that's just your imagination (not a very safe thing in this particular context, BTW)!