Good day, colleagues!
I'm a newbie in concurrent programming and I've encountered a problem with parallel_sort. I'm writing a small program that has to sort big binary files with a limited amount of memory.
As the first step, I read the file to be sorted, split it into chunks (for example, 10 MB each), and sort each chunk. The problem is that when I apply parallel_sort to a chunk, it runs more than 3 times slower than std::sort. Could you advise me what I'm doing wrong? Thank you in advance.
Code is attached.
My machine is a Core i7 860; the compiler is Visual Studio 2010.
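For readers following along, the chunking step described above might look roughly like the sketch below. This is not the attached code: the name `sort_chunks`, the `Sink` callback, and the assumption that the file holds raw `int` values are all hypothetical. It uses plain std::sort so that it stays independent of TBB; the sort call is the line one would swap for tbb::parallel_sort.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>  // std::istringstream, used in the usage note to stand in for a file
#include <vector>

// Hypothetical helper (not from the attached code): reads raw ints from a
// binary stream, at most chunk_elems at a time, sorts each chunk, and hands
// it to `sink`. One buffer is reused, so memory stays bounded by one chunk.
// Returns the number of chunks processed.
template <typename Sink>
std::size_t sort_chunks(std::istream& in, std::size_t chunk_elems, Sink sink) {
    assert(chunk_elems > 0);
    std::vector<int> chunk;
    std::size_t chunk_count = 0;
    while (in) {
        chunk.resize(chunk_elems);
        in.read(reinterpret_cast<char*>(chunk.data()),
                static_cast<std::streamsize>(chunk_elems * sizeof(int)));
        // The last read may be short; keep only the complete ints that arrived.
        const std::size_t got = static_cast<std::size_t>(in.gcount()) / sizeof(int);
        if (got == 0) break;
        chunk.resize(got);
        std::sort(chunk.begin(), chunk.end());  // candidate for tbb::parallel_sort
        sink(chunk);  // e.g. write the sorted chunk to a temporary file
        ++chunk_count;
    }
    return chunk_count;
}
```

With a real file one would pass a `std::ifstream` opened with `std::ios::binary`; a `std::istringstream` over a byte string works the same way for a quick check. A 10 MB chunk of 4-byte ints corresponds to `chunk_elems` of about 2.6 million.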
4 Replies
Colleagues,
Maybe my code was too tangled. I've written a new, simpler program that just creates a vector and a concurrent_vector (both 1M integers) and sorts them with std::sort and tbb::parallel_sort respectively. The running times are 1500 and 8000 clock() ticks respectively, so std::sort is about 5 times faster.
What is the problem in my code?
#include <algorithm>
#include <cstdlib>
#include <ctime>
#include <iostream>
#include <vector>
#include "tbb/parallel_sort.h"
#include "tbb/concurrent_vector.h"

using std::vector;
using tbb::concurrent_vector;
using tbb::parallel_sort;

const int SIZE = 1000000;

// Fill *target with `size` pseudo-random integers.
void Generate_Vector (int size, vector<int> * target) {
    target->resize(size);
    for (int index = 0; index < size; ++index) {
        target->at(index) = rand();
    }
}

int main () {
    srand (300);
    vector<int> serial;
    Generate_Vector(SIZE, &serial);
    // Copy the same data into a concurrent_vector for the parallel run.
    concurrent_vector<int> parallel (serial.begin(), serial.end());
    clock_t start, finish;
    // Time the serial sort.
    start = clock();
    std::sort(serial.begin(), serial.end());
    finish = clock();
    std::cout << "std::sort time is " << finish - start << std::endl;
    // Time the parallel sort.
    start = clock();
    tbb::parallel_sort (parallel.begin(), parallel.end());
    finish = clock();
    std::cout << "parallel sort time is " << finish - start << std::endl;
    return 0;
}
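A side note on the measurement itself: the C standard defines clock() as processor time, and on platforms that implement it that way the time spent by all worker threads can be added together, which makes a parallel sort look worse than it is. A wall-clock timer avoids that ambiguity. The sketch below is one possible way to do it; `elapsed_ms` and `time_std_sort` are hypothetical names, and std::chrono needs a C++11 standard library, which is newer than the Visual Studio 2010 toolchain used in this thread.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <vector>

// Hypothetical helper: runs `work` once and returns the elapsed wall-clock
// time in milliseconds, measured with a monotonic clock.
template <typename Work>
double elapsed_ms(Work work) {
    const std::chrono::steady_clock::time_point start = std::chrono::steady_clock::now();
    work();
    const std::chrono::steady_clock::time_point finish = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(finish - start).count();
}

// Times std::sort on a private copy (made when the argument is passed, so the
// copy is not part of the measurement) and leaves the caller's data untouched.
double time_std_sort(std::vector<int> data) {
    const double ms = elapsed_ms([&data] { std::sort(data.begin(), data.end()); });
    assert(std::is_sorted(data.begin(), data.end()));
    return ms;
}
```

The same `elapsed_ms` wrapper could be placed around a tbb::parallel_sort call, so that both sorts are timed the same way on identical input.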
What is the reason for using tbb::concurrent_vector?
At first I tried std::vector, but it was even slower, and the CPU load was only 20-40%, while with concurrent_vector it was 100%.
Important update: all the results above were obtained in the Debug configuration. When I switched to Release and used std::vector, everything became OK: the times were 78 for std::sort and 26 for tbb::parallel_sort.
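Since the whole slowdown came from measuring a Debug build, a benchmark can guard against that mistake. The following is only a sketch under stated assumptions: `build_configuration` and `safe_to_benchmark` are hypothetical names, `_DEBUG` is the macro MSVC defines when compiling against the debug runtime (/MDd, /MTd), and `NDEBUG` is assumed to be defined by the Release project settings, as Visual Studio projects do by default.

```cpp
#include <string>

// Hypothetical helper: reports which configuration this translation unit was
// compiled in, judged only by the _DEBUG and NDEBUG macros (a heuristic).
std::string build_configuration() {
#if defined(_DEBUG)
    return "Debug";    // MSVC debug runtime selected
#elif defined(NDEBUG)
    return "Release";  // assertions disabled, as in typical Release settings
#else
    return "Unknown";  // neither macro set; inspect the compiler flags
#endif
}

// Timings are only worth reporting when the build is not a Debug build.
bool safe_to_benchmark() {
    return build_configuration() != "Debug";
}
```

A benchmark's main() could print `build_configuration()` next to the timings, or refuse to run when `safe_to_benchmark()` returns false.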
Debug versions of the STL have a LOT of additional non-scalable checks. For example, an STL container can keep a mutex-protected sub-container of all iterators into it; since it is mutex-protected, it is non-scalable.
If you are using MSVC, try defining the following before including any standard headers:
#define _SECURE_SCL 0
#define _HAS_ITERATOR_DEBUGGING 0
#define _ITERATOR_DEBUG_LEVEL 0
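To show where these definitions would go, here is a minimal sketch. The helpers `iterator_debug_level` and `sorted_copy` are hypothetical and exist only for illustration. The macros are MSVC-specific and have no effect on other standard libraries; the assumption is that every translation unit (and every library linked in) uses the same values, since mixing settings within one program is not supported.

```cpp
// MSVC-specific: these must appear before the first standard library header
// (or be set in the project's preprocessor definitions).
#define _SECURE_SCL 0
#define _HAS_ITERATOR_DEBUGGING 0
#define _ITERATOR_DEBUG_LEVEL 0

#include <algorithm>
#include <vector>

// Hypothetical helper: reports the iterator debug level this translation unit
// was compiled with, or -1 if the macro is not defined at all.
int iterator_debug_level() {
#ifdef _ITERATOR_DEBUG_LEVEL
    return _ITERATOR_DEBUG_LEVEL;
#else
    return -1;
#endif
}

// Hypothetical helper: sorts a copy of the input. With the macros above, the
// vector iterators used here carry no debug bookkeeping on MSVC.
std::vector<int> sorted_copy(std::vector<int> data) {
    std::sort(data.begin(), data.end());
    return data;
}
```

Setting the macros in the project-wide preprocessor definitions rather than in a source file is usually the safer choice, because it keeps all translation units consistent.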