I'm wondering how TBB relates to the new concurrency features of C++ 11.
Can they be freely mixed and matched without problems, or would it be better to stick to one or the other throughout? Is the TBB implementation of
So how does TBB fit in with C++ 11? Are there some general guidelines?
Raf,
Regarding the side discussion about what constitutes a compiler product: you claim it's a fact that compilers generally come without a runtime environment. This is wrong, I'm afraid. The vast majority of compiler products contain everything you need to produce a running program. But if your point is that the standard allocator has no part in compilation per se, then I agree.
Well, one way to resolve conflicts à la over/under-subscription would be for the C++ standard to provide an interface that would expose the status of, and even control over, the C++ concurrency. It could be used by TBB and other third-party libraries to ensure optimal coexistence. Hopefully something like that is in the pipeline, because this must be a known and urgent issue.
And yes, policies and guidelines must be provided by an official body to be of any value. I think it's overdue, because with C++ 11 there's a brand new concurrency landscape that strongly affects TBB.
Thanks everyone.
Raf,
I don't understand your last reply.
Are you seriously claiming that the typical compiler product comes without the necessary components to produce a running program out of the box?
Do you really think C++ compiler manufacturers can afford to presume every buyer has a compatible standard allocator already installed? And do you really think they would even want that? The memory allocator is so crucial to performance that most manufacturers would insist that exactly their designated allocator be used together with their compiler product.
People who buy a C++ compiler expect to get a complete C++ implementation. They expect to be able to produce runnable programs. If the standard allocator is missing, it's like buying a car that is delivered without an engine, and when you complain the seller tells you to use the engine everybody is expected to have lying around in their garage.
Anybody with just a little computing experience realizes the above, but that's not the real issue here, is it? Both you and I know why you started this sidetrack discussion. It was a cheap attempt to induce doubt in my competence. Well, it misfired and now you're the one who's looking silly. Better luck next time.
uj wrote:
1. Don't use the TBB allocators. The built-in allocators that come with the C++11 compilers are designed to handle concurrent memory allocations and generally are both faster and more reliable. The TBB allocators are now obsolete.
I don't agree with this. I don't think libstdc++ or libc++ reimplement a heap manager, so their allocators fall through to malloc and free, which on Linux machines (and possibly others) means glibc's ptmalloc. TBB's allocator is a high-performance alternative to ptmalloc. Other allocators include tcmalloc from Google Perftools and jemalloc, both of which you can read about online.
It is very hard to say which heap manager is best, but I know many codes that use TBB malloc instead of glibc ptmalloc and find that it helps a lot with multithreaded C++ codes.
You can look at https://gcc.gnu.org/onlinedocs/libstdc++/manual/memory.html and the many StackOverflow posts about how new is implemented for details.
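To make the point about the default allocator concrete, here is a minimal sketch using only standard C++11. It replaces the global operator new with a version that counts calls and forwards to std::malloc, then checks that a std::vector allocation passes through it. The names g_new_calls, g_sink and count_vector_allocations are made up for this illustration, and std::malloc is only a stand-in: the same hook is where a different heap manager (for instance TBB's scalable_malloc) could be called instead.

```cpp
#include <atomic>
#include <cstddef>
#include <cstdlib>
#include <new>
#include <vector>

// Hypothetical counter and sink, for illustration only.
static std::atomic<std::size_t> g_new_calls{0};
static void* volatile g_sink = nullptr;

// Replace the global allocation function. Forwarding to std::malloc is an
// assumption that mirrors what common standard libraries do by default.
void* operator new(std::size_t size) {
    g_new_calls.fetch_add(1, std::memory_order_relaxed);
    if (void* p = std::malloc(size ? size : 1)) return p;
    throw std::bad_alloc();
}

// Matching replacement so memory from std::malloc is released by std::free.
void operator delete(void* p) noexcept { std::free(p); }

// Returns how many times operator new ran while a vector obtained storage.
std::size_t count_vector_allocations() {
    const std::size_t before = g_new_calls.load();
    {
        std::vector<int> v;
        v.reserve(1000);     // std::allocator<int> asks ::operator new for memory
        g_sink = v.data();   // let the pointer escape so the allocation is kept
    }
    return g_new_calls.load() - before;
}
```

Calling count_vector_allocations() from main should report at least one call, which shows that std::allocator itself adds no heap manager of its own. Whatever sits behind operator new decides the multithreaded allocation performance.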
uj wrote:
2. Prefer standard C++ 11 over TBB. Minimal use of TBB substantially reduces the risk of conflicts with the built-in C++ concurrency mechanism. Use TBB only as a last resort and avoid it altogether if possible.
TBB has a lot more features than C++11 or the C++17 parallel STL (PSTL). Intel's PSTL is implemented on top of TBB. PSTL performs the same as TBB with the default options, but TBB performs much better when using multidimensional blocked ranges, for example. TBB provides concurrent queues, which a C++11 user would have to implement on their own. TBB flowgraph provides a rich set of features that aren't available from C++ on its own.
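As a rough idea of what "implement on their own" means for the queue case, here is a minimal sketch of a thread-safe queue in plain C++11. LockedQueue and produce_and_drain are hypothetical names, not TBB or standard classes, and a single mutex around std::queue is the simplest possible design. It is not how TBB's concurrent queue works internally and it will not scale the same way under contention.

```cpp
#include <cstddef>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical class for illustration: one mutex guards a std::queue.
template <typename T>
class LockedQueue {
public:
    void push(T value) {
        std::lock_guard<std::mutex> lock(mutex_);
        items_.push(std::move(value));
    }

    // Non-blocking pop: returns false when the queue is empty.
    bool try_pop(T& out) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (items_.empty()) return false;
        out = std::move(items_.front());
        items_.pop();
        return true;
    }

private:
    std::mutex mutex_;
    std::queue<T> items_;
};

// Pushes per_thread items from each of `threads` producer threads, then
// drains the queue on the calling thread and returns the number of items.
std::size_t produce_and_drain(unsigned threads, int per_thread) {
    LockedQueue<int> queue;
    std::vector<std::thread> producers;
    for (unsigned t = 0; t < threads; ++t) {
        producers.emplace_back([&queue, per_thread] {
            for (int i = 0; i < per_thread; ++i) queue.push(i);
        });
    }
    for (auto& p : producers) p.join();

    std::size_t drained = 0;
    int value = 0;
    while (queue.try_pop(value)) ++drained;
    return drained;
}
```

For example, produce_and_drain(4, 1000) should return 4000. Even this small version has to get locking and move semantics right, and it still lacks bounded capacity, blocking pops and any lock-free fast path, which is the kind of work a library container saves you.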
In any case, I've recently written code that implements the same algorithm in TBB, C++17 PSTL, OpenMP 4, and many other models, in order to make scientific comparisons of their features and performance. Please see https://github.com/ParRes/Kernels/tree/master/Cxx11. I recently removed the Cilk C++ implementation but the C one is still there. You might find this code useful to understand the merits of different threading models. If you have any questions or problems, please create GitHub issues rather than posting them here.