Intel® oneAPI Threading Building Blocks
Ask questions and share information about adding parallelism to your applications when using this threading library.

parallel_do problems

sinedie
Beginner
457 Views
Hello,
I am trying to learn how to traverse a tree using parallel_do. The following is the code that I currently have, and I am would like to fix a few problems that I see with it. The problems are:

a. This program requires global variables. Is there an alternative to using globals? In other words, is it possible to send some extra parameters to the overloaded operator()? When I tried, it gave too many errors that I could not fix.

b. Doing multiple types of parallel operations using the following code looks impossible as the class Node itself is overloaded with the operator (). I tried moving it into the body of BST, but ended up with many errors. Is there a way out?

c. How to terminate parallel_do prematurely?

Here is the code:
[cpp]#include 
#include "tbb/task_scheduler_init.h"
#include "tbb/parallel_do.h"
#include "tbb/concurrent_queue.h"

#define N 12

using namespace std;
using namespace tbb;

int X;
concurrent_queue CQ;

class Node {
	public:
		int data;
		Node *left;
		Node *right;

		Node() : left(NULL), right (NULL) {}

		typedef Node *argument_type;

		void operator()( argument_type c_Node, parallel_do_feeder &append_task ) const
		{
			/*
			 * Uncomment the following return to print (not find?) address of only one node
			 */
			if ( ! CQ.empty() ) {
				//return;
			}

			if ( c_Node == NULL ) {
				return;
			}

			if ( c_Node->data == X ) {
				printf ("		 %pn", (void *) c_Node);
				if ( CQ.empty() ) {	// concurrent_queue is used for thread-safe handling
					CQ.push(true);
				}
			}

			append_task.add( c_Node->left );
			append_task.add( c_Node->right );
		}
};

class BST {
	private:
		Node *head;

	public:
		BST(int x);

		void ins(int x);

		bool trav_find(int x);
};

BST::BST(int x)
{
	head = new Node;
	head->data = x;
	head->left = NULL;
	head->right = NULL;

	printf("New BST (head = %p, data = %d)n", (void *) head, x);
}

void BST::ins(int x)
{
	Node *c_Node = head;

	bool inserted = false;

	while ( ! inserted ) {
		if (x <= c_Node->data) {
			if ( c_Node->left == NULL ) {
				Node *n_Node = new Node;
				n_Node->data = x;
				c_Node->left = n_Node;
				printf ("   %d is left of %dn", x,c_Node->data);
				inserted = true;
			} else {
				c_Node = c_Node->left;
			}
		} else {
			if ( c_Node->right == NULL ) {
				Node *n_Node = new Node;
				n_Node->data = x;
				c_Node->right = n_Node;
				printf ("   %d is right of %dn", x,c_Node->data);
				inserted = true;
			} else {
				c_Node = c_Node->right;
			}
		}
	}
}

bool BST::trav_find(int x)
{
	X = x; // sets the value to be searched as a global
	CQ.clear();

	Node *ary[] = { head };

	parallel_do(ary, ary+1, Node());

	return (! CQ.empty());
}

int main()
{
	task_scheduler_init init(4);

	BST my_tree(N/2);

	for (unsigned int i=0; i<< endl;

	for ( int i=0; i<10; i++ ) {
		if ( my_tree.trav_find(i) ) {
			printf ("   %3d : presentnn", i);
		} else {
			printf ("   %3d : absentnn", i);
		}
	}

	return 0;
}

[/cpp]

0 Kudos
10 Replies
sinedie
Beginner
457 Views
Ok, I had been able to fix almost all the issues, bar one. Is there a way by which I can terminate the parallel_do the moment any thread satisfies a condition? For instance, in the above code, if any thread finds a node with a particular value ( in trav_find() ), is it possible for it to send a a signal to all active and pending tasks to quit/not start?

The modified code is:
[cpp]#include 
#include
#include "tbb/task_scheduler_init.h"
#include "tbb/parallel_do.h"
#include "tbb/concurrent_queue.h"

#define N 10

using namespace std;
using namespace tbb;

class Node {
public:
int data;
Node *left;
Node *right;

Node() : left(NULL), right (NULL) {}
};

class BST {
private:
Node *head;
int sX; // we shall search this number
concurrent_queue *CQ;

public:
BST(int x);

void ins(int x);

bool trav_find(BST &bst, int x);

typedef Node *argument_type;

void operator()( argument_type c_Node, parallel_do_feeder &append_task ) const
{
if ( c_Node == NULL ) {
return;
}

if ( c_Node->data == sX ) {
printf (" %pn", (void *) c_Node);
if ( (*CQ).empty() ) {
(*CQ).push(true);
}
}

append_task.add( c_Node->left );
append_task.add( c_Node->right );
}

};

BST::BST(int x)
{
head = new Node;
head->data = x;
head->left = NULL;
head->right = NULL;

CQ = new concurrent_queue;

printf("New BST (head = %p, data = %d)n", (void *) head, x);
}

void BST::ins(int x)
{
Node *c_Node = head;

bool inserted = false;

while ( ! inserted ) {
if (x <= c_Node->data) {
if ( c_Node->left == NULL ) {
Node *n_Node = new Node;
n_Node->data = x;
c_Node->left = n_Node;
printf (" %d is left of %dn", x,c_Node->data);
inserted = true;
} else {
c_Node = c_Node->left;
}
} else {
if ( c_Node->right == NULL ) {
Node *n_Node = new Node;
n_Node->data = x;
c_Node->right = n_Node;
printf (" %d is right of %dn", x,c_Node->data);
inserted = true;
} else {
c_Node = c_Node->right;
}
}
}
}

bool BST::trav_find(BST &bst, int x)
{
sX = x; // sets the value to be searched
(*CQ).clear(); // if this is not empty, then at least once the element was found

Node *ary[] = { head };

parallel_do(ary, ary+1, BST(bst));

return (! (*CQ).empty());
}

int main()
{
task_scheduler_init init(4);

BST my_tree(N/2);

for (unsigned int i=0; i<< endl;

for ( int i=0; i<10; i++ ) {
if ( my_tree.trav_find(my_tree, i) ) {
printf (" %3d : presentnn", i);
} else {
printf (" %3d : absentnn", i);
}
}

return 0;
}

[/cpp]
0 Kudos
Bartlomiej
New Contributor I
457 Views
Quoting - sinedie
Ok, I had been able to fix almost all the issues, bar one. Is there a way by which I can terminate the parallel_do the moment any thread satisfies a condition? For instance, in the above code, if any thread finds a node with a particular value ( in trav_find() ), is it possible for it to send a a signal to all active and pending tasks to quit/not start?
A shared flag seems like a good solution. You can either have a common inserted variable or an additional variable like that - which would be checked not in each iteration, but once a few iterations (and maintained by atomic operations). Myself, I'd recommend the second possibility, but itmight depend...
0 Kudos
sinedie
Beginner
457 Views
Quoting - bartlomiej
A shared flag seems like a good solution. You can either have a common inserted variable or an additional variable like that - which would be checked not in each iteration, but once a few iterations (and maintained by atomic operations). Myself, I'd recommend the second possibility, but itmight depend...

Yes, I did use a similar solution (as in the commented return in the original version of the program). But the problem is that all the added tasks scheduled to run will first have to run to terminate. Some overhead that I don't know if be can avoided...

I had similar problem with parallel_while too. Even after finding and even with explicit entry level check of the success it would still execute all the threads. A more fundamental way to control spawning of threads might have been better but parallel_while was depreicated anyway. It's just that such a memory makes me feel very cautious about setting up an extra counter, shared may be, as the overhead of starting a thread itself does not appear to be all that negligible.
0 Kudos
Bartlomiej
New Contributor I
457 Views
Well, I'm only learning TBB now, but I consider myself quite experienced in POSIX threads. There is a pthread_cancel() function that allows terminating a thread by another thread. And - it is considered a very bad practice to use this function.
When you cancel a thread you don't know in what state it is. It might be holding a lock or other resources.
Obviously, there can be some specific situations where using the pthread_cancel() function is safe and justified, but in general it is rude to kill someone just because you don't need his work anymore. ;-) It's better to just let him know, he can finish his job and let him make the things up on his own, e.g. deallocating the memory or releasing other resources.
So, by analogy, I don't suppose there is or will be an analog of the cancel() function for TBB - it would encourage bad programming practices.

0 Kudos
sinedie
Beginner
457 Views
Quoting - bartlomiej
Well, I'm only learning TBB now, but I consider myself quite experienced in POSIX threads. There is a pthread_cancel() function that allows terminating a thread by another thread. And - it is considered a very bad practice to use this function.
When you cancel a thread you don't know in what state it is. It might be holding a lock or other resources.
Obviously, there can be some specific situations where using the pthread_cancel() function is safe and justified, but in general it is rude to kill someone just because you don't need his work anymore. ;-) It's better to just let him know, he can finish his job and let him make the things up on his own, e.g. deallocating the memory or releasing other resources.
So, by analogy, I don't suppose there is or will be an analog of the cancel() function for TBB - it would encourage bad programming practices.


I guess you are right there, the same thought was bothering me too. I was thinking in terms of a suicide-destroyer. But that's just some crazy thought. :P
0 Kudos
robert-reed
Valued Contributor II
457 Views
Quoting - sinedie
I guess you are right there, the same thought was bothering me too. I was thinking in terms of a suicide-destroyer. But that's just some crazy thought. :P

Sinedie, I think what you're looking for in TBB is called the task_group_context, which will provide a group of tasks that can be cancelled together. Tasks that are currently executing would continue but there's so little work done in the BST functor that there should be no big delays there. It does seem to be rather creative overkill to use a concurrent_queue to hold the termination token when a simple global bool would suffice. However, this whole code snippet seems incomplete (what's "for(unsignedinti=0;i<

But definitely something on the order of a pthread_cancel() is out of the question: we're working with tasks here, not threads; we want to reuse our threads as much as possible since it's much more expensive to create and destroy them than it is for tasks.

0 Kudos
sinedie
Beginner
457 Views

Sinedie, I think what you're looking for in TBB is called the task_group_context, which will provide a group of tasks that can be cancelled together. Tasks that are currently executing would continue but there's so little work done in the BST functor that there should be no big delays there. It does seem to be rather creative overkill to use a concurrent_queue to hold the termination token when a simple global bool would suffice. However, this whole code snippet seems incomplete (what's "for(unsignedinti=0;i<

But definitely something on the order of a pthread_cancel() is out of the question: we're working with tasks here, not threads; we want to reuse our threads as much as possible since it's much more expensive to create and destroy them than it is for tasks.


Oops, the cut-pasting went wrong. The working code that I have is here:http://sites.google.com/site/samplecodesproject/tbb/algorithms/2-6-parallel_do---part-3

It is true that only the search operation used parallel_do() and that the purpose is too simple. as this example was to be used only as a demo to use for more complicated data structures. A collection of such simple codes that are primarily meant as working examples are here:http://sites.google.com/site/samplecodesproject/system/app/pages/sitemap/hierarchy. So yeah, I am open to simplifying the example even further if it can be done on a tree. I guess if I use the concurrent_queue to tore the addresses of the nodes it will look a bit more genuinely parallel. :) I am on vacation so I'll make these changes once I am back in office.

Regpthread_cancel(), so is it that the N threads created will continue to exist once their inital tasks are over by taking new tasks and this happens not just in the case of task stealing?

Thanks for the ref regtask_group_context. Some how I missed this topic...

0 Kudos
jimdempseyatthecove
Honored Contributor III
457 Views

Why not simply change your iterator?

[cpp]intptr_t searching; // ~0==searching, 0==found/stop searching
Node*    found;     // 0==not found, x==found node
...
if ( (c_Node & searching) == NULL ) {   
    return;   
   }   
  
if ( c_Node->data == X ) { 
     found = c_node;
     searching = NULL;
     return;  
   }   
[/cpp]

Remove the concurrent queue to hold the return (found) node. Return it back through the iterator.
Note, if multiple hits occur during the narrow window of finding X and setting searching=NULL then wouldn't the subsequent find be equally important as the first thread to find?

Jim
0 Kudos
Bartlomiej
New Contributor I
457 Views
Quoting - sinedie

Oops, the cut-pasting went wrong. The working code that I have is here:http://sites.google.com/site/samplecodesproject/tbb/algorithms/2-6-parallel_do---part-3

I haven't read the code carefully, before. Wouldn't it be better to add only one son of the node by the feeder (say, the right one) and process the other one in the same task - in a while or do...while loop ?

0 Kudos
jimdempseyatthecove
Honored Contributor III
457 Views
Quoting - bartlomiej

I haven't read the code carefully, before. Wouldn't it be better to add only one son of the node by the feeder (say, the right one) and process the other one in the same task - in a while or do...while loop ?


This would be the better route, as you traverse one leg direction spawn task(s) for the other leg direction (not both directions). This will halve the app code to TBB requests.

JD
0 Kudos
Reply