Intel® oneAPI Threading Building Blocks

parallel_sort

msr999
Beginner
Hi,

I am trying to do some testing with parallel_sort. I tried a simple test program, and parallel_sort is actually 50% slower than regular std::sort. I tried this on Linux 2.6.28 with the gcc 4.3.3 compiler, and ran the test on two different machines (one with 2 cores and one with 4 cores). Looking at CPU consumption, parallel_sort engages all CPUs at 100%, while std::sort uses only one CPU. So I am not sure what's happening. Is there something wrong with my program?

I am pasting the code here. Please let me know. Thanks.

#include "tbb/parallel_sort.h"
#include "tbb/task_scheduler_init.h"
#include "tbb/tick_count.h"

#include <boost/lexical_cast.hpp>

#include <algorithm>
#include <cstdlib>
#include <ctime>
#include <iostream>

using namespace tbb;
using namespace std;

// Arguments: argv[1] starting with 'r' selects std::sort, anything else
// selects parallel_sort; argv[2] is the optional element count.
int main(int argc, char * argv[])
{
    // Start the TBB task scheduler with the default number of threads.
    task_scheduler_init init;

    srand(time(NULL));

    ulong num_ele = 10000000;
    if (argc > 2)
        num_ele = boost::lexical_cast<ulong>(argv[2]);

    cout << " Sorting " << num_ele << " longs " << endl;

    long *elements = new long[num_ele];
    cout << "Initializing random data. " << endl;

    // Combine two rand() results into one value (assumes a 64-bit long).
    for (ulong i = 0; i < num_ele; ++i)
    {
        elements[i] = ((ulong)rand() << 32) + rand();
    }

    // Time only the sort itself, not the data generation.
    tick_count t0 = tick_count::now();

    if (argc > 1 && *(argv[1]) == 'r')
    {
        cout << "Doing single threaded std::sort" << endl;
        std::sort(elements, elements + num_ele);
    }
    else
    {
        cout << "Doing Parallel Sort" << endl;
        parallel_sort(elements, elements + num_ele);
    }

    tick_count t1 = tick_count::now();

    cout << "Time took to sort " << (t1 - t0).seconds() << " secs " << endl;
    delete [] elements;
    return 0;
}
5 Replies
Alexey-Kukanov
Employee
You might be using an older TBB version. Try updating to a newer one, e.g. the last commercial-aligned release, which corresponds to TBB 2.1 Update 4.
Bartlomiej
New Contributor I
Here are the results on my Intel Core 2 Quad Q6600 with the stable release (I added the number of threads when initializing the task scheduler):

serial: 2.545 secs
parallel, 1 thread: 3.68352 secs
parallel, 2 threads: 1.93752 secs
parallel, 4 threads: 1.07267 secs

So, it's clearly faster, but the speedup is not stunning.
There seems to be a large overhead for starting the threads. For a larger problem (say, one lasting a few minutes) the difference should be more encouraging.
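
To put numbers on "faster, but not stunning", the timings above can be turned into speedup and parallel efficiency. This is only a minimal sketch: the function names `speedup`, `efficiency`, and `report` are made up for illustration and are not part of TBB.

```cpp
#include <cassert>
#include <cstdio>

// Speedup of a parallel run relative to the serial baseline
// (values below 1.0 mean the parallel run was slower).
double speedup(double serial_secs, double parallel_secs) {
    assert(serial_secs > 0.0 && parallel_secs > 0.0);
    return serial_secs / parallel_secs;
}

// Parallel efficiency: speedup divided by the number of threads used
// (1.0 would be perfect linear scaling).
double efficiency(double serial_secs, double parallel_secs, int threads) {
    assert(threads > 0);
    return speedup(serial_secs, parallel_secs) / threads;
}

// Hypothetical helper: print one line of the comparison.
void report(int threads, double serial_secs, double parallel_secs) {
    std::printf("%d thread(s): speedup %.2fx, efficiency %.0f%%\n",
                threads,
                speedup(serial_secs, parallel_secs),
                100.0 * efficiency(serial_secs, parallel_secs, threads));
}
```

With the measurements above (serial 2.545 s), the four-thread run at 1.07267 s works out to a speedup of roughly 2.4x, i.e. a bit under 60% efficiency, while the one-thread parallel run at 3.68352 s has a speedup below 1.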

Best regards

msr999
Beginner
Quoting - Alexey-Kukanov
You might be using an older TBB version. Try updating to a newer one, e.g. the last commercial-aligned release, which corresponds to TBB 2.1 Update 4.

I was using the latest release. Anyway, it's working great now: I get ~50% better performance. I am not sure what was wrong before, and I am completely puzzled. I am quite sure I did something stupid in my old runs (like linking to the debug lib). Anyway, thanks for the prompt response.

While searching the forums, I found a similar issue reported by another user. It seems you identified it as Windows-specific. Do you know if there are any fixes for that issue?


Thanks
mSR
msr999
Beginner
Quoting - bartlomiej
Here are the results on my Intel Core 2 Quad Q6600 with the stable release (I added the number of threads when initializing the task scheduler):

serial: 2.545 secs
parallel, 1 thread: 3.68352 secs
parallel, 2 threads: 1.93752 secs
parallel, 4 threads: 1.07267 secs

So, it's clearly faster, but the speedup is not stunning.
There seems to be a large overhead for starting the threads. For a larger problem (say, one lasting a few minutes) the difference should be more encouraging.

Best regards



I tested with an array of 500M longs on the quad-core machine and there is a substantial improvement (66 sec vs. 26 sec).
On the dual-core machine with 40M elements I get around a 50% improvement (45 sec vs. 30 sec).
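
A figure like "around 50%" can be read two ways, so here is a small sketch that separates the fraction of wall-clock time saved from the gain in throughput. The function names are invented for this illustration, and the timings in the note below are simply the ones quoted in this post.

```cpp
#include <cassert>

// Percentage of wall-clock time saved by the parallel run.
double time_saved_percent(double serial_secs, double parallel_secs) {
    assert(serial_secs > 0.0 && parallel_secs > 0.0);
    return 100.0 * (serial_secs - parallel_secs) / serial_secs;
}

// Percentage gain in throughput (elements sorted per second),
// which equals (speedup - 1) expressed as a percentage.
double throughput_gain_percent(double serial_secs, double parallel_secs) {
    assert(serial_secs > 0.0 && parallel_secs > 0.0);
    return 100.0 * (serial_secs / parallel_secs - 1.0);
}
```

For 45 sec vs. 30 sec, the run takes about a third less time, which is the same thing as 50% more throughput (a 1.5x speedup). For 66 sec vs. 26 sec, the time saved is about 61% and the speedup is about 2.5x.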
Alexey-Kukanov
Employee
Quoting - msr999
While searching the forums, I found a similar issue reported by another user. It seems you identified it as Windows-specific. Do you know if there are any fixes for that issue?

Yes, there was an earlier problem, and we fixed it in an update to TBB 2.1. That's why I supposed you might have had an older version and recommended that you update. If you use the last 2.1 update, you have the fix.