Hello,
I'm currently learning Intel TBB and wrote a short, simple program using parallel_reduce.
However, the parallel version is 10 times slower than the sequential one...
I really can't understand why the parallel version is so much slower. Could you help me figure out why?
Here are both of my programs:
I compiled them with g++ on Fedora 12 Linux, and my computer has a dual-core processor.
Thank you for your help.
3 Replies
And what version of TBB?
"And what version of TBB?"
Dmitriy probably means (sorry...) that before TBB 2.2 the default partitioner was simple_partitioner. To get acceptable performance with it, you have to set an appropriate grainsize (the third parameter of blocked_range, e.g., 1000); otherwise you must explicitly pass auto_partitioner as the third argument of parallel_reduce() (it has been the default since TBB 2.2).
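To illustrate why the grainsize matters so much with simple_partitioner, here is a rough model in plain C++, not actual TBB code. It assumes the splitting rule of blocked_range: a range is cut in half for as long as its size exceeds the grainsize, and each leftover piece becomes one chunk of work. The helper name `count_chunks` is hypothetical. In real code the two fixes mentioned above would look roughly like `tbb::blocked_range<size_t>(0, n, 1000)` for the grainsize, or `tbb::parallel_reduce(range, body, tbb::auto_partitioner())` for the partitioner.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper, not part of TBB: counts the leaf subranges produced
// when a range of `size` iterations is halved for as long as it is larger
// than `grainsize` (the divisibility rule assumed for blocked_range).
std::size_t count_chunks(std::size_t size, std::size_t grainsize) {
    assert(grainsize > 0);
    if (size <= grainsize) {
        return 1;  // no longer divisible: this piece runs as one chunk
    }
    std::size_t left = size / 2;      // first half
    std::size_t right = size - left;  // second half takes the remainder
    return count_chunks(left, grainsize) + count_chunks(right, grainsize);
}
```

Under this model, a range of 1,000,000 iterations with the default grainsize of 1 ends up as 1,000,000 single-iteration chunks, while a grainsize of 1000 gives 1024 chunks of 976 or 977 iterations each. For a loop body as cheap as an addition, the per-chunk scheduling overhead in the first case can easily outweigh the useful work, which would be consistent with the slowdown reported above.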
Indeed, I am using version 2.1, and I specified neither a partitioner nor a grainsize.
It works very well now :)
Thank you very much for your help :D