Intel® oneAPI Threading Building Blocks

Poor tbb::task performance on an Intel Xeon(R) CPU E5-2648L 0 @ 1.80GHz


In one of my projects, I spawn multiple TBB tasks to do a job and wait for it to complete. The application performs far better on my development machine (Intel Core(TM) i5-4590 CPU @ 3.30GHz, 4 cores) than on the deployment machine (Intel Xeon(R) CPU E5-2648L 0 @ 1.80GHz, 32 cores). I expected a performance boost from the many cores on the deployment machine, but the results are quite the opposite.

According to this article, at a very high level, a simple rule applies:

More cores = more multitasking

Higher clock speed = faster task completion
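
As a rough illustration of that rule, here is a minimal sketch in standard C++. The function `estimated_wall_seconds` and its parameters are hypothetical names, not part of TBB. It assumes equal-sized, independent tasks and the same amount of work per clock cycle on both CPUs, which real hardware will not match exactly.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical back-of-the-envelope model, not a TBB API.
// Assumes equal-sized independent tasks and identical work per clock cycle.
double estimated_wall_seconds(std::uint64_t task_count,
                              double cycles_per_task,
                              unsigned cores,
                              double clock_ghz) {
    assert(cores > 0 && clock_ghz > 0.0);
    if (task_count == 0) return 0.0;
    // At most `cores` tasks run at once, so the tasks complete in "rounds".
    const std::uint64_t rounds = (task_count + cores - 1) / cores;
    // Higher clock speed shortens each individual task.
    const double seconds_per_task = cycles_per_task / (clock_ghz * 1e9);
    return static_cast<double>(rounds) * seconds_per_task;
}
```

Under these assumptions, 4 tasks finish in a single round on either machine, so the 1.80GHz machine takes about 1.8 times as long as the 3.30GHz one and 28 of its 32 cores stay idle. The extra cores only pay off once there are many more tasks than the smaller machine has cores.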

On the deployment machine, the CPU as a whole is idle most of the time, and the processing time is almost triple that of my development machine. As I understand it, the task scheduler does the load balancing and does not create a thread for each task.

So, how do I put that idle CPU capacity to use and get the processing done in less time?


The simple supposition is that the application does not have enough work to utilize the bigger machine. How many tasks do you have? How big are they?
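
One way to answer both questions is to instrument the task bodies. The following is a minimal sketch in standard C++; `TaskStats` and `timed_run` are hypothetical helpers, not TBB APIs, and the sketch assumes the work of each task can be wrapped in a callable.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <cstdint>

// Hypothetical helper, not part of TBB: accumulates how many tasks ran
// and how much time they took in total. Safe to update from many threads.
struct TaskStats {
    std::atomic<std::uint64_t> count{0};
    std::atomic<std::uint64_t> total_ns{0};

    // Average task duration in microseconds (0 if nothing was recorded).
    double mean_microseconds() const {
        const std::uint64_t n = count.load();
        if (n == 0) return 0.0;
        return static_cast<double>(total_ns.load()) / static_cast<double>(n) / 1000.0;
    }
};

// Runs `body` once and records its duration in `stats`.
template <typename Body>
void timed_run(TaskStats& stats, Body&& body) {
    const auto start = std::chrono::steady_clock::now();
    body();
    const auto stop = std::chrono::steady_clock::now();
    const auto ns =
        std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start).count();
    assert(ns >= 0);  // steady_clock is monotonic
    stats.count.fetch_add(1, std::memory_order_relaxed);
    stats.total_ns.fetch_add(static_cast<std::uint64_t>(ns), std::memory_order_relaxed);
}
```

Wrapping the work inside each task in `timed_run(stats, [&] { /* task work */ });` and printing `stats.count` and `stats.mean_microseconds()` at the end gives the task count and the average task size. If the count is not well above the number of cores, or the tasks are very short, the scheduler has little to distribute and most cores will sit idle.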