Hello,
I have tried to implement Dr. Dobb's example with parallel_for. It performs very poorly, with a scaling of 0.03! Can anyone help fix the problem, please? The file is attached.
bad.parallel_for__sum.cpp
TIA,
-S
2 Replies
As written, the implementation uses one task per array element, which is obviously too much overhead. Using auto_partitioner (which is not the default merely for historical reasons) with parallel_for, instead of the default simple_partitioner, should provide instant relief.
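To make the "one task per array element" point concrete, here is a minimal sketch in standard C++. It is not TBB code and not the attached file: `count_leaf_tasks` is a hypothetical helper that only models how a range is halved recursively until each piece is no larger than the grain size, which is what simple_partitioner does with a blocked_range. In the real program, the suggested change amounts to passing `tbb::auto_partitioner()` as the third argument to `tbb::parallel_for`.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical model, not part of TBB: counts the leaf subranges (tasks)
// produced when [begin, end) is split in half repeatedly until a piece
// holds at most `grainsize` elements.
std::size_t count_leaf_tasks(std::size_t begin, std::size_t end,
                             std::size_t grainsize) {
    assert(grainsize >= 1);  // a grain size of 0 would never stop splitting
    const std::size_t size = end - begin;
    if (size <= grainsize) {
        return 1;  // small enough: this piece is executed as one task
    }
    const std::size_t middle = begin + size / 2;
    return count_leaf_tasks(begin, middle, grainsize) +
           count_leaf_tasks(middle, end, grainsize);
}
```

Under this model, a 1,000,000-element range with a grain size of 1 ends up as 1,000,000 leaf tasks, while a grain size of 10,000 gives 128. auto_partitioner picks chunk sizes adaptively rather than using a fixed grain size, so the model shows only why the per-element case is so expensive, not what auto_partitioner will actually choose.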
Quoting - Raf Schietekat
As written, the implementation uses one task per array element, which is obviously too much overhead. Using auto_partitioner (which is not the default merely for historical reasons) with parallel_for, instead of the default simple_partitioner, should provide instant relief.
Thanks Raf.