Intel® oneAPI Threading Building Blocks
Ask questions and share information about adding parallelism to your applications when using this threading library.
2472 Discussions

Using TBB Task Scheduler to calculate determinant of matrix

Sevic__Djordje
Beginner
491 Views

Having a problem with understanding why task scheduler makes program slower than its serial version is...

Here is serial version of my program which should calculate determinant of given matrix:

double serial(vector<vector<double>> matrix)
{
	double matrix_ord = matrix.size();
	double det = 0;

	if (matrix_ord == 1)
		det = matrix[0][0];
	else
	{
		for (int i = 0; i < matrix_ord; i++)
		{
			vector<vector<double>> minor = get_minor(matrix, i);

			bool sign = (i % 2 == 0) ? 1 : -1;
			det += matrix[0] * serial(minor) * sign;
		};
	}

	return det;
}

Because it is reccursive function, using TBB tasks i made Recursive chain reaction in parallel version, just like Fibonacci numbers example in TBB documentation:

First i am creating root task:

double parallel(vector<vector<double>> matrix)
{

	double det = 0;

	MyTask& my_task= *new(task::allocate_root()) MyTask(&det, matrix);
	task::spawn_root_and_wait(my_task);

	return det;
}

MyTask class with execute() function looks like this:

class MyTask: public task {
public:
	vector<vector<double>> matrix;
	double* det;

	MyTask(double* det_, vector<vector<double>> matrix_) : 
		det(det_), matrix(matrix_)  {}

	tbb::task* execute() 
	{
		int matrix_ord = matrica.size();

		if (matrix_ord == 1)
			*det= matrix[0][0]; 
		else
		{
			vector<double> determinants_of_minors(matrix_ord, 0);
			
			set_ref_count(matrix_ord+ 1);

			MyTask& first_task= *new (allocate_child()) MyTask(&determinants_of_minors.at(0), get_minor(matrix,0));
			
			for (int i = 1; i < matrix_ord; i++)
			{
				vector<vector<double>> minor = get_minor(matrix, i);

				MyTask& task_ = *new (allocate_child()) MyTask(&determinants_of_minors.at(i), minor);
				spawn(task_);
			}

			spawn_and_wait_for_all( first_task );

			for (int i = 0; i < matrix_ord; i++)
			{
				bool sign = (i % 2 == 0) ? 1 : -1;
				*det += matrix[0] * determinants_of_minors.at(i) * sign;
			}
		}

		return NULL;
	}
};

As you can see, the idea is to create subtasks for every minor of current matrix till matrix size is 1x1 -> then the task is returning the only element of matrix to its successor.

Where the problem could be?

0 Kudos
1 Reply
Aleksei_F_Intel
Employee
491 Views

Hi Djordje,

Fibonacci is an illustrative example. However, the explanation of it contains an essential part - "CutOff" parameter. I am sure that because of the program you provided does has anything to do with this parameter, it performs slower then the serial counterpart. The link to the right place in the docs: https://www.threadingbuildingblocks.org/docs/help/hh_goto.htm?index.htm#tbb_userguide/Simple_Example_Fibonacci_Numbers.html

Regards, Aleksei

0 Kudos
Reply