- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Hi. I wrote some code to compute scalar product(dot product) of two vectors:
When i run this code it sometimes return good results and sometimes bad :( Sometimes after 2 or 3 iterations alpha is computed wrong. Its random.
Those vectors are size eg. a[0].
Am i doing something wrong?
Is this can be caused because i use normal allocators (new , delete)?
I couldn't find proper use of scalable allocators :( can You post some exapmple please ?
Thank You
[cpp]class Scalar {
Matrix &my_a, &my_b;
double & result;
public:
void operator()( const tbb::blocked_range& r ) const {
Matrix& a = my_a;
Matrix& b = my_b;
double& c = result;
for( size_t i=r.begin(); i!=r.end(); ++i ){
c+=a.element[0]*b.element[0];
}
//std::cout << "c: " << c << std::endl;
result = c;
}
Scalar( double& r, Matrix& a, Matrix& b ) :
my_a(a), my_b(b), result
{}
};
void ParallelScalar( double& result, Matrix a, Matrix b){
parallel_for( tbb::blocked_range(0, a.rows, 16),
Scalar(result,a,b) );
}
int main( int argc, char* argv[] ) {
int M;
std::cout << "please enter size: ";
std::cin >> M; //size of matrix
tbb::task_scheduler_init init;
create_and_fill_matrixes(M, M); //
tbb::tick_count t0 = tbb::tick_count::now();
for(int i=0; i<5; ++i)
{
alfa=0;
ParallelScalar(alpha, a,c);
std::cout << "Alpha:" << alpha << std::endl;
}
tbb::tick_count t1 = tbb::tick_count::now();
std::cout << "Time: " << (t1-t0).seconds() << std::endl;
std::cout << "" << std::endl;
std::cout << "Beta:" << beta << std::endl;
return 0;
}
[/cpp]
When i run this code it sometimes return good results and sometimes bad :( Sometimes after 2 or 3 iterations alpha is computed wrong. Its random.
Those vectors are size eg. a
Am i doing something wrong?
Is this can be caused because i use normal allocators (new , delete)?
I couldn't find proper use of scalable allocators :( can You post some exapmple please ?
Thank You
Link Copied
4 Replies
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
You have a race condition where the result could be updated by multiple threads. You'll want to protect access to the shared reference to alpha (result).
Instead of grabing the local copy of the global variable, modifying it, and then setting the global variable to the local variable's value you could just compute the local value and then update the global value atomically.
double c = 0;
// [compute c]
{
// update the global value
tbb::scoped_lock lock(global_mutex);
result += c;
}
You might also want to look at parallel_reduce. I like parallel_reduce for accumulating values because you don't need the lock. You modify local variables and then "join" the local value with another instance's local value. At the end you have the accumulated result. TBB takes care of the locking for you.
Instead of grabing the local copy of the global variable, modifying it, and then setting the global variable to the local variable's value you could just compute the local value and then update the global value atomically.
double c = 0;
// [compute c]
{
// update the global value
tbb::scoped_lock lock(global_mutex);
result += c;
}
You might also want to look at parallel_reduce. I like parallel_reduce for accumulating values because you don't need the lock. You modify local variables and then "join" the local value with another instance's local value. At the end you have the accumulated result. TBB takes care of the locking for you.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Quoting - adunsmoor
You have a race condition where the result could be updated by multiple threads. You'll want to protect access to the shared reference to alpha (result).
Quoting - adunsmoor
Instead of grabing the local copy of the global variable, modifying it, and then setting the global variable to the local variable's value you could just compute the local value and then update the global value atomically.
I did that because i need to sum up scalar product 5 times ^^ (more complicated example that i'm writing)Quoting - adunsmoor
You might also want to look at parallel_reduce. I like parallel_reduce for accumulating values because you don't need the lock. You modify local variables and then "join" the local value with another instance's local value. At the end you have the accumulated result. TBB takes care of the locking for you.
I tried parallel_reduce earlier to compute this scalar product but it works strange ^^
I did something like that:
[cpp]class Scalar { Matrix &my_a, &my_b; public: int c; void operator()( const tbb::blocked_rangethe result is wrong...& r ) { Matrix & a = my_a; Matrix & b = my_b; int tmp = c; for( size_t i=r.begin(); i!=r.end(); ++i ){ tmp+=a.element[0]*b.elementy[0]; } c = tmp; } Scalar( Matrix & a, Matrix & b ) : my_a(a), my_b(b) {} Scalar(Scalar& x, tbb::split) : my_a(x.my_a), my_b(x.my_b), c(0) {} void join(Scalar& y) { c +=y.c;} }; int ParallelScalar( Matrix a, Matrix b){ Scalar total(a,b); parallel_reduce( tbb::blocked_range (0, a.m, 16), total ); return total.c; } int main( int argc, char* argv[] ) { int M; std::cout << "please enter size: "; std::cin >> M; tbb::task_scheduler_init init(1); create_and_fill_matrixes(M, M); tbb::tick_count t0 = tbb::tick_count::now(); for(int i=0; i<5; ++i) { alpha=0; alpha = ParallelScalar(a,c); std::cout << "Alpha:" << alpha << std::endl; } tbb::tick_count t1 = tbb::tick_count::now(); std::cout << "Time: " << (t1-t0).seconds() << std::endl; std::cout << "" << std::endl; return 0; } [/cpp]
Could You help me?
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
One problem with your 2nd example is that c isn't initialized in the normal constructor. I compiled your example (with a little tweaking to fill in/remove the missing pieces). Before initializing c to 0 I got random results. Afterwards I got consistent results.
I changed the constructor as follows:
Scalar( Matrix & a, Matrix & b ) :
my_a(a), my_b(b), c(0)
{}
Also, you don't need to make a temporary variable in your operator(). You should be able to safely modify c directly.
Good luck. Happy to help if you have more specific questions.
I changed the constructor as follows:
Scalar( Matrix & a, Matrix & b ) :
my_a(a), my_b(b), c(0)
{}
Also, you don't need to make a temporary variable in your operator(). You should be able to safely modify c directly.
Good luck. Happy to help if you have more specific questions.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Quoting - adunsmoor
One problem with your 2nd example is that c isn't initialized in the normal constructor. I compiled your example (with a little tweaking to fill in/remove the missing pieces). Before initializing c to 0 I got random results. Afterwards I got consistent results.
I changed the constructor as follows:
Scalar( Matrix & a, Matrix & b ) :
my_a(a), my_b(b), c(0)
{}
Also, you don't need to make a temporary variable in your operator(). You should be able to safely modify c directly.
Good luck. Happy to help if you have more specific questions.
I changed the constructor as follows:
Scalar( Matrix & a, Matrix & b ) :
my_a(a), my_b(b), c(0)
{}
Also, you don't need to make a temporary variable in your operator(). You should be able to safely modify c directly.
Good luck. Happy to help if you have more specific questions.
i'm overworked... sorry for bother You with that...
Thx for help :) now its perfect ^^
Reply
Topic Options
- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page