Intel® oneAPI Threading Building Blocks

Poor tbb::task performance on an Intel Xeon(R) CPU E5-2648L 0 @ 1.80GHz


In one of my projects, I spawn multiple TBB tasks to do a job and wait for them to complete. The application performs far better on my development machine (Intel Core(TM) i5-4590 CPU @ 3.30GHz, 4 cores) than on the deployment machine (Intel Xeon(R) CPU E5-2648L 0 @ 1.80GHz, 32 cores). I expected the many-core deployment machine to give a performance boost, but the results are quite the opposite.

According to this article, at a very high level, a simple rule applies:

More cores = more multitasking

Higher clock speed = faster task completion

On the deployment machine, the CPU is mostly idle overall, and the processing time is almost triple that of my development machine. As I understand it, the task scheduler does the load balancing and does not create a thread for each task.

So, how do I put that idle CPU capacity to use and get the processing done in less time?

1 Reply

The simplest supposition is that the application does not have enough work to utilize the bigger machine. How many tasks do you have, and how big are they?
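To make that supposition concrete, here is a rough back-of-the-envelope sketch in plain C++ (no TBB calls). It assumes equal-sized, independent tasks and that run time scales only with clock speed, which ignores turbo boost, per-core architecture differences, memory bandwidth, and NUMA effects. The function names `rounds`, `utilization`, and `estimated_time` are made up for illustration and are not part of any library.

```cpp
#include <cassert>
#include <cstddef>

// Number of sequential "rounds" needed when `tasks` equal-sized tasks
// are spread over `cores` worker threads (ceiling division).
std::size_t rounds(std::size_t tasks, std::size_t cores) {
    assert(cores > 0);
    return (tasks + cores - 1) / cores;
}

// Fraction of the available core time that does useful work.
// A low value means most cores sit idle, as described in the question.
double utilization(std::size_t tasks, std::size_t cores) {
    if (tasks == 0) return 0.0;
    return static_cast<double>(tasks) /
           static_cast<double>(rounds(tasks, cores) * cores);
}

// Estimated wall time in arbitrary units under the simplifying assumption
// that one task costs `work_per_task` units at 1 GHz and scales with clock.
double estimated_time(std::size_t tasks, std::size_t cores,
                      double ghz, double work_per_task) {
    return static_cast<double>(rounds(tasks, cores)) * work_per_task / ghz;
}
```

With only 4 tasks, for example, `utilization(4, 32)` is 0.125, and `estimated_time(4, 32, 1.8, 1.0)` is larger than `estimated_time(4, 4, 3.3, 1.0)`: the extra cores have nothing to do, and each task runs on a slower clock. With 64 tasks the comparison flips in favour of the 32-core machine. This simple model alone would not account for a full 3x slowdown, so the actual task count and task size still need to be measured.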
