Intel® oneAPI Threading Building Blocks
Ask questions and share information about adding parallelism to your applications when using this threading library.

loop speed up

What is the best way to improve the speedup of a loop?

I have parallelized a loop, but the speedup is < 1; in fact, the serial version is faster than the parallel version.

Could someone show an example that demonstrates the efficiency of parallel_for?

This is my code:

task_scheduler_init ( 5 ); // maybe the error is here?
num_threads = 5;
chunk_inner = int((nel + Ndelay) / num_threads); // maybe the error is here?
parallel_for(blocked_range<int>(0, Ndelay + nel, chunk_inner), First_Loop(i, nel, direct_i, direct_q), simple_partitioner());

Thanks a lot
1 Reply
The first line creates a temporary task_scheduler_init object and destroys it immediately; TBB may forget the 5 as a result, but I'm not sure. Instead, give the object a name (task_scheduler_init is a type, not a function) and pass no argument, so that TBB works out the number of threads by itself. Or simply omit the line and rely on implicit initialisation.
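
To illustrate the lifetime point, here is a minimal sketch that does not use TBB at all. ScopedInit is a hypothetical stand-in for an initializer type such as task_scheduler_init; it only records when it is constructed and destroyed, so the difference between an unnamed temporary and a named object becomes visible. The function names are likewise made up for the illustration.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for an RAII initializer type; it only logs
// its own construction and destruction.
struct ScopedInit {
    std::vector<std::string>& log;
    ScopedInit(std::vector<std::string>& l, int threads) : log(l) {
        log.push_back("init " + std::to_string(threads));
    }
    ~ScopedInit() { log.push_back("destroy"); }
};

// Unnamed temporary: it is destroyed at the end of its own statement,
// that is, before the work starts.
std::vector<std::string> run_with_temporary() {
    std::vector<std::string> log;
    {
        ScopedInit(log, 5);        // temporary, gone at the semicolon
        log.push_back("work");
    }
    return log;
}

// Named object: it stays alive until the enclosing scope ends,
// that is, until after the work has finished.
std::vector<std::string> run_with_named() {
    std::vector<std::string> log;
    {
        ScopedInit init(log, 5);   // lives until the closing brace
        log.push_back("work");
    }
    return log;
}
```

With the temporary, the recorded order is init, destroy, work; with the named object it is init, work, destroy. Whether the real library keeps the requested thread count after the temporary is gone is a separate question that this sketch does not answer.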

chunk_inner is a misnomer, because it is used as a grainsize. Letting it depend on the number of worker threads is also bad practice (more small tasks lead to more parallel overhead); instead, make it a constant size.
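
To get a feel for what the grainsize does, here is a rough sketch in plain C++ (no TBB involved). It assumes the splitting rule as I understand it for blocked_range with simple_partitioner: a range is cut in half as long as its size is larger than the grainsize. count_chunks is a hypothetical helper written only for this illustration, not a TBB function.

```cpp
#include <cassert>
#include <cstddef>

// Counts the leaf chunks obtained when [begin, end) is halved repeatedly
// for as long as a piece is still larger than grainsize.
// Assumption: this mirrors the halving rule described above; it is a
// model for reasoning, not the library's implementation.
std::size_t count_chunks(std::size_t begin, std::size_t end, std::size_t grainsize) {
    assert(grainsize > 0 && begin <= end);
    if (end - begin <= grainsize) {
        return 1;                                    // small enough: one chunk
    }
    std::size_t middle = begin + (end - begin) / 2;  // split at the midpoint
    return count_chunks(begin, middle, grainsize) +
           count_chunks(middle, end, grainsize);
}
```

Under this model, 1000 iterations with a grainsize of 1000 / 5 = 200 end up as 8 chunks of 125 iterations each, while 1000 / 4 = 250 gives 4 chunks of 250, so the decomposition changes whenever the thread count changes. A constant grainsize, say 100, gives 16 chunks regardless of how many threads are available.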