- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content

Another is that I have a problem.

I have this 2d vector of type int.

Let's say

9 23 5 3 4 55 6

3 12 456 56

34 7 2

1 4 6

and i want to sort each row. I have implemented using parallel for it but it turned out to be slower than the serial implementation. Maybe the problem is my STL. Can anyone help?

here is class template

class ApplySort {

vector

public:

void operator( )(const blocked_range

for( long row=r.begin(); row

int key;

for(int j=1;j

key=bucket[row]

i=j-1;

while(i>=0 && bucket[row]

*>key )*

{

bucket[row][i+1]=bucket[row]

{

bucket[row][i+1]=bucket[row]

*;*

i--;

}

bucket[row][i+1]=key;

}

}

}

ApplySort(vector >::iterator &buck) :

bucket(buck)

{}

};

//here is my template function

ApplySort myClass(bucket.begin());

parallel_for(blocked_range(0,buck_num,(buck_num-1)/2), myClass);

where buck_num is the number of rows...

Would I try using blocked_range2d and chage my implementation in the template class?? i--;

}

bucket[row][i+1]=key;

}

}

}

ApplySort(vector

bucket(buck)

{}

};

//here is my template function

ApplySort myClass(bucket.begin());

parallel_for(blocked_range

where buck_num is the number of rows...

Would I try using blocked_range2d and chage my implementation in the template class??

Link Copied

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content

I would recommend touse a library sort function in each bucket, and to make your grain size in terms of number of buckets or of total number of elements in those buckets. Currently you're putting a fixed number of tasks at work whatever the problem size, which is not the purpose of specifying a grain size. Try different values to find an optimum, and don't be shy exploring them exponentially (your problem will need to be many times bigger before you'll see any benefit).

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content

I have a question myself: is vector thread safe on all current platforms for this kind of access? Did I overlook any promises to that effect in C++0x? Or am I just too paranoid?

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content

I would recommend touse a library sort function in each bucket, and to make your grain size in terms of number of buckets or of total number of elements in those buckets. Currently you're putting a fixed number of tasks at work whatever the problem size, which is not the purpose of specifying a grain size. Try different values to find an optimum, and don't be shy exploring them exponentially (your problem will need to be many times bigger before you'll see any benefit).

[cpp]class MySort { vector< vector> &vec; public: MySort(vector< vector >& vec_) : vec(vec_) {} void operator() (const tbb::blocked_range &r) const { for(size_t i = r.begin(); i < r.end(); ++i) { tbb::parallel_sort(vec .begin(), vec.end()); } } }; void Sort2D(vector &vec) { tbb::parallel_for(tbb::blocked_range(0, vec.size()), MySort(vec), tbb::auto_partitioner() ); } [/cpp]

- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page