Inefficiencies with higher thread counts of tbb:parallel_reduce and my own implementation of it

Menu
Novice
875 Views

Hi All,

 

I am trying to run a parallel sum over a rather large array of uint8_t's (around 50 GB, although the behavior holds at both larger and smaller sizes as well). I want to compare the performance of a handwritten parallel sum against TBB's parallel_reduce, and to see how each scales as threads are added. To do this I measured the "speedup factor" (the time to reduce on 1 thread / the time to reduce on n threads) of each reduce at each thread count. Theoretically, this value should be n, since a reduce is very parallel. I then plotted the number of threads made available to the reduce against the "speedup factor" of each reduce. If we scaled perfectly, we would expect this to be exactly 1:1; however, it is not. It looks like this:
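For readers who want to reproduce the measurement, here is a minimal sketch of a handwritten chunked parallel sum and the speedup calculation described above. It is not the code from this post: the names `parallel_sum`, `time_sum`, and `speedup_factor` are hypothetical, it uses plain `std::thread` rather than TBB, and it assumes the whole array fits in memory.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <thread>
#include <vector>

// Hypothetical helper: sum `data` with `num_threads` workers, each reducing
// one contiguous chunk into its own partial sum.
std::uint64_t parallel_sum(const std::vector<std::uint8_t>& data,
                           unsigned num_threads) {
    if (num_threads == 0) num_threads = 1;
    std::vector<std::uint64_t> partial(num_threads, 0);
    std::vector<std::thread> workers;
    // Ceiling division so the last chunk absorbs the remainder.
    const std::size_t chunk = (data.size() + num_threads - 1) / num_threads;
    for (unsigned t = 0; t < num_threads; ++t) {
        const std::size_t begin = std::min(data.size(), t * chunk);
        const std::size_t end = std::min(data.size(), begin + chunk);
        workers.emplace_back([&data, &partial, t, begin, end] {
            // Accumulate in a local so threads do not repeatedly write to
            // adjacent slots of `partial`.
            std::uint64_t local = 0;
            for (std::size_t i = begin; i < end; ++i) local += data[i];
            partial[t] = local;
        });
    }
    for (auto& w : workers) w.join();
    // Final sequential combine of the per-thread partial sums.
    std::uint64_t total = 0;
    for (std::uint64_t p : partial) total += p;
    return total;
}

// Hypothetical helper: wall-clock seconds for one reduction.
double time_sum(const std::vector<std::uint8_t>& data, unsigned num_threads) {
    const auto start = std::chrono::steady_clock::now();
    volatile std::uint64_t sink = parallel_sum(data, num_threads);
    (void)sink;  // keep the result alive so the call is not optimized away
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}

// Speedup factor: time on 1 thread divided by time on n threads.
// Perfect scaling would give a value of n.
double speedup_factor(double seconds_on_1, double seconds_on_n) {
    assert(seconds_on_n > 0.0);
    return seconds_on_1 / seconds_on_n;
}
```

Calling `speedup_factor(time_sum(data, 1), time_sum(data, n))` for each thread count n gives the points on a speedup curve like the one plotted below. Note that this sketch creates and joins its threads inside the timed region, which a thread pool such as TBB's would avoid.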

 

[Image: Screen Shot 2020-11-05 at 3.53.43 PM.png, a plot of number of threads vs. speedup factor for each reduce]

 

See 'tbb no vectorization' and 'my no vectorization'. This is running on a dual-socket machine with two Intel(R) Xeon(R) Gold 5218R CPUs @ 2.10GHz.

This seems to imply that there is some sort of bottleneck at higher core counts. It may also be worth noting that my implementation performs very similarly to TBB's in raw performance, not just in speedup factor. I was suspicious that some hyperthreading, prefetching, or vectorization effect might be "roiding up" the single-threaded performance of both implementations, artificially making the curves look less linear. So I tried toggling each of those on and off; while it made a small difference in raw performance, for the most part it didn't affect the speedup curves much, which is suspicious. I also checked in VTune to see what the bottleneck might be, and this didn't turn up anything :(. Since both implementations produce the same shaped curves, this makes me think the bottleneck is occurring in the hardware. Are there any ideas?

7 Replies
IntelSupport
Community Manager
849 Views

Hello Menu, 


Thank you for posting your question on this Intel® Community.


We understand that you are running some tests with handwritten parallel sum vs. TBB's (Intel® Threading Building Blocks) parallel reduce and scalability.


Since your tests are related to Intel® Threading Building Blocks, we would like to know if you are a software developer. Could you please provide more details about this?


Are you currently writing an application and using a specific OS version?


Wanner G.

Intel Server Specialist


Menu
Novice
843 Views

I am currently a student researcher at CMU (idk if that counts as a software developer), and yes, I am currently writing an application primarily for Ubuntu, but it also needs to work on Mac for development purposes.

 

I'm not sure if you need more info on my role or the work I am doing, but I can provide my code here.

IntelSupport
Community Manager
837 Views

Hello Menu, 


Thank you for your response.


Please allow us to review this information.


We will update this thread soon.


Wanner G.

Intel Server Specialist


IntelSupport
Community Manager
827 Views

Hello Menu,


Thank you for waiting.


Support for this request is provided through our online web support. Please open a web ticket for Intel® Threading Building Blocks for Linux* at https://supporttickets.intel.com/servicecenter?lang=en-US to reach the correct support team.


Here you can review the steps to open a web portal ticket:

https://software.intel.com/content/www/us/en/develop/articles/how-to-create-a-support-request-at-online-service-center.html


Hope this helps.


Regards,

Leonardo C.


Intel Customer Support Technician


IntelSupport
Community Manager
811 Views

Hello Menu,


I would like to know if you have been able to follow the steps to create a web ticket for your question.


Regards,

Leonardo C.


Intel Customer Support Technician


Menu
Novice
804 Views

Yes, thanks. I submitted a ticket.

IntelSupport
Community Manager
797 Views

Hello Menu,


Thank you for confirming that you have followed the steps to open the case.


Regards,

Leonardo C.


Intel Customer Support Technician

