Intel® oneAPI Threading Building Blocks
Ask questions and share information about adding parallelism to your applications when using this threading library.

Strange reported statistics

Alexandru_I_
Beginner
525 Views

I am rather new to TBB and this might be a silly problem, but here goes. I have a very simple application (see code below) that creates a task tree recursively. I can control the number of tasks to be created (the depth of the tree) and the amount of work the leaf nodes execute. My problem is that the number of tasks I know this application creates and the number of tasks TBB reports (in the statistics.txt file) don't match. For instance, when I run the application with a tree depth of 5 (resulting in 32 tasks), the statistics.txt file records 63 total tasks. I even added a counter in the scheduler.cpp/generic_scheduler::local_spawn method, and for the above example this method is called 31 times (one extra call goes to generic_scheduler::local_spawn_root_and_wait). I tried to understand how the TBB statistics are counted, but I failed. Can anybody tell me why there is such a big difference between the two counters?

 

#include <cstdio>
#include <cstdlib>
#include "tbb/task.h"
#include "tbb/task_scheduler_init.h"
#include "tbb/tick_count.h"

using namespace std;
using namespace tbb;

typedef long long value;

// Busy work executed by each leaf task: sums the integers 0..n-1.
value loop( value n )
{
    value s = 0;
    for( value i=0; i<n; i++ )
    {
        s += i;
    }
    return s;
}

struct StressTask: public task {
    value n;  // amount of work done by each leaf
    int d;    // remaining tree depth below this task
    // task arguments
    StressTask( value n_, int d_ ) :
        n(n_), d(d_)
    {}
    //! Execute task
    task* execute() {
        if( d > 0 ) {
            StressTask& a = *new( allocate_child() ) StressTask( n, d-1 );
            StressTask& b = *new( allocate_child() ) StressTask( n, d-1 );
            set_ref_count(3);  // two children plus one for the wait
            spawn( a );
            spawn_and_wait_for_all( b );
        }
        else
            loop( n );
        return NULL;
    }
};

void stress(value n, int depth, int itr)
{
    int i;
    for( i=0; i<itr; i++)
    {
        // One root task (and therefore one full task tree) per iteration
        StressTask& a = *new(task::allocate_root()) StressTask(n, depth);
        task::spawn_root_and_wait(a);
    }
    printf( "DONE\n" );
}

int main(int argc, char* argv[])
{
    value n;
    int depth,itr,nth;
    static tick_count t0;
    // Four arguments are required, so argc must be at least 5
    if (argc<5)
        printf("\nUsage: ./stress.tbb <number_of_operations> <depth_of_tree> <number_of_iterations> <number_of_threads>\n");
    else{
        n = strtol(argv[1],0,0);
        depth = (int)strtol(argv[2],0,0);
        itr = (int)strtol(argv[3],0,0);
        nth = (int)strtol(argv[4],0,0);
        task_scheduler_init scheduler_init(nth);
        t0 = tick_count::now();
        stress(n,depth,itr);
        printf("\nExecution time: %f sec\n",(tick_count::now() - t0).seconds());
    }
    return 0;
}

11 Replies
RafSchietekat
Valued Contributor III
1+2+4+8+16+32=63?
Alexandru_I_
Beginner
I am not sure what you mean by this. Are those the numbers of tasks at each level of the task tree? If so, you assume a tree depth of 6, while my example used a tree depth of 5. Also, as I stated in my first post, I counted the number of times the scheduler.cpp/generic_scheduler::local_spawn method was called, and that result matched what I was expecting. The problem is that the statistics reported in the statistics.txt file do not agree with the other results. Can you account for this difference?
SergeyKostrov
Valued Contributor II
** Statement 1 **
>>...My problem is that the number of tasks I know this application creates and the number of tasks TBB reports (in the statistics.txt file) don't match...

** Statement 2 **
>>...The problem is that the statistics reported in the statistics.txt file are not in accordance with the other results...

Could you attach your 'statistics.txt' file and a file with 'the other results'? You're talking about some differences in results and nobody could see or analyze it.
RafSchietekat
Valued Contributor III
The number of tasks created by this program is always 2^(max(0,depth)+1)-1 (the closed form of the sum illustrated above), so 63 is what you should expect when providing 5 as the argument. If you want "depth" to include the leaf tasks, you should probably pass depth-1 in stress(), but the total number of tasks will never be a (nontrivial) power of 2 like 32.
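The arithmetic behind that formula can be checked with a small standalone sketch. It does not use TBB at all; the function names tasks_by_levels, tasks_closed_form and leaf_tasks are made up for this illustration, and the depth limit in the assertions is only there to keep the 64-bit arithmetic from overflowing.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: sum the tasks level by level. Level 0 is the root,
// and every level below it holds twice as many tasks as the one above.
std::uint64_t tasks_by_levels(int depth) {
    assert(depth < 62);                 // keep the counters inside 64 bits
    if (depth < 0) depth = 0;           // mirrors max(0, depth): a lone leaf task
    std::uint64_t total = 0;
    std::uint64_t level = 1;            // number of tasks on the current level
    for (int d = 0; d <= depth; ++d) {
        total += level;
        level *= 2;
    }
    return total;
}

// Hypothetical helper: the closed form 2^(max(0,depth)+1) - 1.
std::uint64_t tasks_closed_form(int depth) {
    assert(depth < 62);
    if (depth < 0) depth = 0;
    return (std::uint64_t(1) << (depth + 1)) - 1;
}

// Hypothetical helper: only the leaves, 2^max(0,depth) of them.
std::uint64_t leaf_tasks(int depth) {
    assert(depth < 62);
    if (depth < 0) depth = 0;
    return std::uint64_t(1) << depth;
}
```

For a depth argument of 5, both tasks_by_levels(5) and tasks_closed_form(5) give 63, while leaf_tasks(5) gives 32, which is where a figure of 32 can come from if only the leaves are counted.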
Alexandru_I_
Beginner
As Sergey requested, I am attaching the two outputs that I was mentioning. output.txt is the result of some printfs that I added in the scheduler.cpp/generic_scheduler::local_spawn to see how many times that method is called. It was created for an execution with a tree depth of 5 and you can see that the local_spawn method was called 31 times and the generic_scheduler::local_spawn_root_and_wait was called once. statistics.txt contains the results TBB provides through some counters. For the same example with tree depth of 5, this file reports 63 total executed tasks. This is the difference that I can't account for.
RafSchietekat
Valued Contributor III
You're not supposed to hack the scheduler without intimate knowledge of its operation (and why would you?). Meanwhile, you might try tracing before and after the calls in your program to see which internal TBB functions you forgot to log (probably ones triggered from spawn_and_wait_for_all()). Still, I would recommend taking this as a sign to stop trying, at least until you are no longer "rather new to TBB" (never mind "knowing" that 32 tasks were spawned when the actual number is 63), and preferably beyond that point unless you have a very good reason. Don't take it as an invitation to apply ad-hoc but possibly incomplete corrections that may break down in other programs or newer TBB releases.
SergeyKostrov
Valued Contributor II
>>     tasks executed    stealing attempts                       arena                                      market
>>ID   total  w/o spawn  succeeded  failed  conflicts  backoffs  switches  roundtrips  avg.conc  avg.allot  roundtrips
>>W1   28     0          1          30      0          7         0         1           2.00      1.00       0
>>M    35     0          1          2       0          0         1         0           1.80      0.97       0

>>... the results TBB provides through some counters...

I will take a look at your test-case. What did you use to get data from TBB counters?
RafSchietekat
Valued Contributor III
Instead of fixing the logging, what you also could do is split up spawn_and_wait_for_all() (into spawn() and wait_for_all()). This is not as efficient, strictly speaking, but it should get rid of the erroneous 32.
Alexandru_I_
Beginner
I think I finally understand what is happening. The total number of tasks that are EXECUTED is 63, but the number of tasks that are SPAWNED and made available for stealing is 31. As Raf suggested, the key is the spawn_and_wait_for_all method. Essentially, it executes the task without making it available for stealing. So in the end, what I was counting and what TBB was reporting were two different things, and both were correct. I am sorry if I wasted your time (and nerves) with this problem.

Regarding Sergey's last post, what exactly are you asking? Do you want to know how to make TBB report those statistics? If so, you need to make sure that __TBB_STATISTICS is defined as 1 in src/tbb/tbb_statistics.h. The same file contains some other variables that control how the statistics are reported, so you may want to change their values as well. Then rebuild the library and recompile your application.
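The two counts discussed above can be reproduced with a rough standalone model of the recursion in StressTask::execute(). This is a sketch and makes no TBB calls: the Counts struct and the visit and count_calls functions are invented for this illustration, and it simply assumes, as concluded above, that a task passed to spawn_and_wait_for_all() does not go through the instrumented local_spawn path.

```cpp
#include <cassert>

// Hypothetical tally of what the program does, not of TBB internals.
struct Counts {
    long created;         // every task object the program allocates
    long plain_spawn;     // tasks handed to spawn()
    long spawn_and_wait;  // tasks handed to spawn_and_wait_for_all()
    long root_spawn;      // tasks handed to spawn_root_and_wait()
};

// Mirrors StressTask::execute(): a task with d > 0 creates two children,
// passes one to spawn() and the other to spawn_and_wait_for_all().
void visit(int d, Counts& c) {
    if (d > 0) {
        c.created += 2;
        c.plain_spawn += 1;
        c.spawn_and_wait += 1;
        visit(d - 1, c);  // child a
        visit(d - 1, c);  // child b
    }
    // d <= 0: a leaf, which only runs loop(n) and creates nothing
}

// Counts for one iteration of stress(): one root plus its whole tree.
Counts count_calls(int depth) {
    assert(depth < 30);   // keeps the recursion and the counters small
    Counts c = {1, 0, 0, 1};
    visit(depth, c);
    return c;
}
```

For a depth of 5, count_calls(5) yields 63 created tasks, 31 passed to spawn(), 31 passed to spawn_and_wait_for_all() and 1 root. Under the stated assumption, that lines up with 31 local_spawn calls plus one local_spawn_root_and_wait call on one side and 63 executed tasks on the other.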
RafSchietekat
Valued Contributor III
As the function name "spawn_and_wait_for_all" indicates, its argument is still "spawned". That it happens to be executed immediately by the current thread is an implementation detail that is probably not guaranteed. (Added 2012-11-23) I should have stopped after the first sentence, because the Reference Manual says: "Furthermore, it guarantees that task is executed by the current thread. This constraint can sometimes simplify synchronization."
SergeyKostrov
Valued Contributor II
>>...Do you want to know how to make TBB report those statistics? If that is the case, you need to make sure the __TBB_STATISTICS is defined as 1 in the src/tbb/tbb_statistics.h...

This is what I wanted to know.