Turn on suggestions

Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Mute
- Printer Friendly Page

timelzayus

Beginner

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

12-01-2010
05:16 AM

161 Views

break in parallel_for

I'd like to 'scan' a vector and stop on the **first** corresponding data.

In serial programming, it would give :

int matchIndex = -1;

for (int i = 0; i < myVector.size(); ++i) {

if (elementMatch(myVector*)) {*

matchIndex = i;

break ;

}

}

I don't know how to do this in a parallel_for, can someone help ???

Thank you very much.

Link Copied

11 Replies

RafSchietekat

Black Belt

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

12-01-2010
07:30 AM

161 Views

ARCH_R_Intel

Employee

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

12-01-2010
07:46 AM

161 Views

The expected average number of matches strongly influences what kind of algorithm to use. Do you expect many matches, a few, or less than one in the average case?

RafSchietekat

Black Belt

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

12-01-2010
08:05 AM

161 Views

ARCH_R_Intel

Employee

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

12-01-2010
09:09 AM

161 Views

If the match is expected to come early in the sequence, then the threads should concentrate on the leftmost portion of the sequence first. Here's stab at doing so. The key idea is to use an atomic counter to pick up the next chunk. The downside is that fetch_and_add has limited scalability, unless you have a vintage NYU Ultracomputer :-) For for small core counts it might work out okay.

[cpp]// Do "y=x" if it decreases the value of y. templatevoid write_min( tbb::atomic & y, T x ) { for(;;) { T snapshot = y; if( snapshot int search( int n, F pred ) { size_t chunk_size = 10000; tbb::atomic i; i = 0; tbb::atomic result; result = n; tbb::parallel_for( tbb::blocked_range (0,n,chunk_size), [&](tbb::blocked_range r) { int begin = i.fetch_and_add(r.size()); int end = begin+r.size(); // Quick rejection test if( begin

smasherprog

Beginner

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

12-01-2010
09:24 AM

161 Views

The exact value of the bool doesnt matter, only that it has a value other than 0 is what matters, so this should work no problem.

timelzayus

Beginner

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

12-01-2010
09:58 AM

161 Views

I'll go with that kind of solution.

I don't think parallel_reduce is a solution because it forces to scan all the vector while I wish to stop at the first match (in order)...

RafSchietekat

Black Belt

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

12-01-2010
11:25 AM

161 Views

smasherprog

Beginner

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

12-01-2010
12:09 PM

161 Views

RafSchietekat

Black Belt

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

12-01-2010
12:29 PM

161 Views

timelzayus

Beginner

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

12-01-2010
12:57 PM

161 Views

nor why you useintbegin=i.fetch_and_add(r.size());intend=begin+r.size();

instead of begin = r.begin(); end = r.end();

Could you explain ???

Thank you.

ARCH_R_Intel

Employee

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

12-01-2010
03:15 PM

161 Views

The idea is that the parallel_for hands out the total number of iterations that we want, but the iteration values themselves are issued from variable i. The parallel_for is good for handing out iterations in parallel, but tends to hand out iteration subranges [r.begin(),r.end()) that are scattered all over the iteration space. If matches are frequent enough (and this is problem dependent) all work done to the right of the leftmost match is wasted work. So the intuition is to make threads grab iterations from left to right. The variable i is the counter from which they grab those iterations.

For example, suppose n=1001 and chunk_size=300, and there are two threads.The parallel_for with simple_partitioner will subdivide the iteration space into:

- [0,250)
- [250,500)
- [500,750)
- [750,1001)

The peculiar (and non-deterministic) order that parallel_for hands out iterations is the reason that cancellation tricks (e.g. a boolean flag or using TBB's task cancellation features) are not safe to use. If we find a match for index k, we must be sure that we checked all lower values of k for a match too. There's likely a way to modify the collapsing range hack to do the cancellation correctly. The modification would have to replace my_stop with code to report the range as indivisible if it extends past the lowest k found so far.

As Raf pointed out, there are crafty ways to use parallel_do and input iterators. The iterators must not be random-access iterators, otherwise parallel_do acts like parallel_for, to get similar left-to-right bias when processing the iteration space. Though using parallel_do like that is depending somewhat on internal implementation details. I went with the parallel_for approach because it can be adapted to work with Cilk Plus' cilk_for too.

Topic Options

- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page

For more complete information about compiler optimizations, see our Optimization Notice.