I am using parallel_for to measure the performance gain relative to a plain for loop, but I get the correct result only when I use simple_partitioner with a grainsize of 1, and that takes twice as long.
When I don't explicitly provide a partitioner and grainsize, I get the expected value of count up to n = 70; beyond that, it gives different values across runs. I also tried removing the inner loop, but that didn't help either. Can anyone tell me what I am missing here?
#include "tbb/tbb.h"
#include <iostream>
#include <string>
//#include <chrono>
#include <sstream>
#include <ctime>
#include <atomic>
#include <utility>
using namespace tbb;
using namespace std;
std::atomic<int> count(0);
void foo(const tbb::blocked_range<int>& range ){
    for (int i = 0 ; i < 10000; ++i)
    {
        string l_czTempStr;
        std::ostringstream oss;
        oss << "Test data1";
        oss << "Test data2";
        oss << "Test data3";
        l_czTempStr = oss.str();
        ::count++;
        // ::count.fetch_add(1,memory_order_release);
    }
}
int main()
{
    cout <<"hello" <<std::endl;
    int n = 1000;
    clock_t tStart = clock(); //clock start time
    tick_count t0 = tick_count::now();
    for(int j=1;j<=n;j++) {
        ::count = 0;
        tick_count t2 = tick_count::now();
        tbb::parallel_for(tbb::blocked_range<int>(0,n,j), [&](const tbb::blocked_range<int>& range){
            foo(range);
        },tbb::simple_partitioner());
        tick_count t3 = tick_count::now();
        cout<< "grainsize: "<< j << " count:" <<::count << " time: "<< (t3-t2).seconds() <<endl;
    }
    cout << "gs done" <<endl;
    // parallel_for<size_t>( 1, 10, 1, foo );
    tick_count t1 = tick_count::now();
    printf("work took %g seconds\n",(t1-t0).seconds());
    cout<<(double)(clock() - tStart)/CLOCKS_PER_SEC*1000<<endl; //wall time total
    cout << "count - " << ::count <<endl;
    cout << "is lock free - " << ::count.is_lock_free() <<endl;
    return 0;
}
Can anyone please help me with this?
Hello,
Because of dynamic load balancing, you do not know how many times foo() is called. Each call runs the inner loop 10000 times regardless of the subrange it receives, so N calls of foo() give count = N*10000.
To get the same result on every run, you need to actually use the blocked_range that is passed in, not just declare it as a parameter. Instead of
void foo(const tbb::blocked_range<int>& range ){ for (int i = 0 ; i < 10000; ++i) { ... } }
try
void foo(const tbb::blocked_range<int>& range ){ for (int i = range.begin() ; i < range.end(); ++i) { ... } }
Vladimir