Intel® oneAPI Threading Building Blocks

loop speed up

What is the best way to improve the speedup of a loop?

I have parallelized a loop, but the speedup is below 1; in fact, the serial version is faster than the parallel version.

Can someone give an example that shows parallel_for being efficient?

This is my code:

task_scheduler_init ( 5 );   // maybe the error is here?
num_threads = 5;
chunk_inner = int((nel+Ndelay)/num_threads);   // maybe the error is here?
parallel_for (blocked_range(0,Ndelay+nel,chunk_inner), First_Loop (i,nel,direct_i,direct_q),simple_partitioner());

Thanks a lot
1 Reply
The first line creates a temporary task_scheduler_init instance and immediately destroys it, so the 5 has probably been forgotten by the time parallel_for runs, though I'm not sure. Instead, give the object a name (task_scheduler_init is a type, not a function) and no argument, so that TBB works out the number of threads by itself. Or simply omit the line and rely on implicit initialisation.
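
To make the lifetime point concrete, here is a minimal sketch in plain C++ with no TBB involved. ScopedInit is a hypothetical stand-in for a scoped initialisation object such as task_scheduler_init, and the log it writes to exists only for illustration. It shows that an unnamed object is destroyed at the end of its own statement, before the work that follows, while a named object lives until the end of its scope.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for a scoped "init" object (not a TBB class).
// It records its construction and destruction in a log.
struct ScopedInit {
    int threads;                       // requested thread count, unused in this sketch
    std::vector<std::string>& log;

    ScopedInit(int n, std::vector<std::string>& l) : threads(n), log(l) {
        log.push_back("init");
    }
    ~ScopedInit() { log.push_back("destroy"); }

    ScopedInit(const ScopedInit&) = delete;
    ScopedInit& operator=(const ScopedInit&) = delete;
};

// Same shape as `task_scheduler_init ( 5 );` in the question:
// an unnamed temporary that dies at the end of the statement.
std::vector<std::string> run_with_temporary() {
    std::vector<std::string> log;
    ScopedInit(5, log);                // constructed and destroyed right here
    log.push_back("work");             // the object is already gone
    return log;
}

// Named object: it stays alive until the closing brace of its scope.
std::vector<std::string> run_with_named_object() {
    std::vector<std::string> log;
    {
        ScopedInit init(5, log);
        log.push_back("work");         // the object is still alive
    }
    return log;
}
```

run_with_temporary() yields the order init, destroy, work, whereas run_with_named_object() yields init, work, destroy. Whether TBB then falls back to its default thread count is a separate question that this sketch does not answer.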

chunk_inner is a misnomer, because it is used as a grainsize. Letting it depend on the number of worker threads is also bad practice (more small tasks lead to more parallel overhead); instead, make it a constant size.
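
To see what a given grainsize does, the following sketch models the splitting in plain C++ rather than calling TBB. It assumes the documented blocked_range behaviour: a range is divisible while its size is greater than the grainsize, and a split cuts it in half; simple_partitioner keeps splitting until no piece is divisible. count_chunks is a hypothetical helper, and the sizes used below are illustrative, not taken from the question.

```cpp
#include <cstddef>

// Model of recursive range splitting (not TBB code):
// a range [begin, end) is split in half while its size exceeds grainsize.
// Returns the number of indivisible chunks, i.e. the number of tasks
// that would each run the loop body once.
std::size_t count_chunks(std::size_t begin, std::size_t end, std::size_t grainsize) {
    if (end - begin <= grainsize) {
        return 1;                                  // small enough: one chunk
    }
    std::size_t middle = begin + (end - begin) / 2; // split point
    return count_chunks(begin, middle, grainsize)
         + count_chunks(middle, end, grainsize);
}
```

For 1,000,000 iterations and 5 threads, a grainsize of 1000000/5 = 200000 gives count_chunks(0, 1000000, 200000) == 8, so 8 coarse chunks have to be shared among 5 threads and some threads get twice the work of others. A constant grainsize of 1000 gives count_chunks(0, 1000000, 1000) == 1024 chunks, which leaves the scheduler room to balance the load. The right constant depends on how expensive one iteration is, so it is worth measuring.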