Hi,
I am trying to do some testing with parallel_sort. I wrote a simple test program, and parallel_sort is actually 50% slower than regular std::sort. I tried this on Linux 2.6.28 with the gcc 4.3.3 compiler, on two different machines (one with 2 cores and one with 4 cores). Looking at CPU consumption, parallel_sort engages all CPUs at 100%, while regular std::sort uses only one CPU. So I am not sure what's happening. Is there something wrong with my program?
I am pasting the code here. Let me know, thanks.
#include "tbb/parallel_sort.h"
#include "tbb/task_scheduler_init.h"
#include "tbb/tick_count.h"
#include <boost/lexical_cast.hpp>
#include <algorithm>
#include <cstdlib>
#include <ctime>
#include <iostream>
using namespace tbb;
using namespace std;
// argv[1] == 'r' selects std::sort, anything else parallel_sort;
// argv[2] optionally overrides the number of elements.
int main(int argc, char * argv[])
{
    task_scheduler_init init;  // default: TBB picks the thread count
    srand(time(NULL));
    ulong num_ele = 10000000;
    if (argc > 2)
        num_ele = boost::lexical_cast<ulong>(argv[2]);
    cout << " Sorting " << num_ele << " longs " << endl;
    long *elements = new long[num_ele];
    cout << "Initializing random data. " << endl;
    for (ulong i = 0; i < num_ele; ++i)
    {
        // Combine two rand() calls into one value (assumes a 64-bit long).
        elements[i] = ((ulong)rand() << 32) + rand();
    }
    tick_count t0 = tick_count::now();
    if (argc > 1 && *(argv[1]) == 'r')
    {
        cout << "Doing single threaded std::sort" << endl;
        std::sort(elements, elements + num_ele);
    }
    else
    {
        cout << "Doing Parallel Sort" << endl;
        parallel_sort(elements, elements + num_ele);
    }
    tick_count t1 = tick_count::now();
    cout << "Time took to sort " << (t1 - t0).seconds() << " secs " << endl;
    delete [] elements;
    return 0;
}
5 Replies
You might be using an older TBB version. Try updating to a newer one, e.g. the latest commercial-aligned release, which corresponds to TBB 2.1 Update 4.
Here are the results on my Intel Core 2 Quad Q6600 and the stable release (I added the number of threads when initializing the task scheduler):
serial: 2.545 secs
parallel, 1 thread: 3.68352 secs
parallel, 2 threads: 1.93752 secs
parallel, 4 threads: 1.07267 secs
So it's clearly faster, but the speedup is not stunning.
There seems to be a large overhead for starting the threads. For a larger problem (say, one lasting a few minutes), the difference should be more encouraging.
Best regards
Quoting - Alexey Kukanov (Intel)
You might be using an older TBB version. Try updating to a newer one, e.g. the latest commercial-aligned release, which corresponds to TBB 2.1 Update 4.
I was using the latest release. Anyway, it's working great now: I get ~50% better performance. I am not sure what was wrong before; I am completely puzzled. I most likely did something stupid in my old runs (like linking to the debug library). Anyway, thanks for the prompt response.
While searching the forums, I found a similar issue reported by another user. It seems you identified it as Windows-specific. Do you know if there are any fixes for that issue?
Thanks,
mSR
Quoting - bartlomiej
Here are the results on my Intel Core 2 Quad Q6600 and the stable release (I added the number of threads when initializing the task scheduler):
serial: 2.545 secs
parallel, 1 thread: 3.68352 secs
parallel, 2 threads: 1.93752 secs
parallel, 4 threads: 1.07267 secs
So it's clearly faster, but the speedup is not stunning.
There seems to be a large overhead for starting the threads. For a larger problem (say, one lasting a few minutes), the difference should be more encouraging.
Best regards
I tested with an array of 500M longs on a quad-core machine, and there is a substantial improvement (66 sec vs. 26 sec).
On a dual-core machine with 40M elements, I get around a 50% improvement (45 sec vs. 30 sec).
Quoting - msr999
While searching the forums, I found a similar issue reported by another user. It seems you identified it as Windows-specific. Do you know if there are any fixes for that issue?
Yes, there was an earlier problem, and we fixed it in an update to TBB 2.1. That's why I thought you might have an older version and recommended that you update. If you use the latest 2.1 update, you have the fix.