Intel® oneAPI Threading Building Blocks
Ask questions and share information about adding parallelism to your applications when using this threading library.

3.0 performance

normrubin
Beginner
I'm trying to compare the performance of a clustering code that I wrote in TBB with a widely used OpenMP
version. Currently the OpenMP version appears faster, and I'm trying to figure out why.
As part of this investigation I switched from TBB 2.2 to 3.0 (on the theory that newer is always better).
Oddly, the 3.0 version seems to be slower.
I'm running Vista 32-bit (for reasons that I care not to explain).
On 2.2 the 8-core machine shows 100% CPU busy, while on 3.0 it seems to drop to around 50%.
I tried playing around with grain sizes and such, which did move the CPU utilization up to around 80%, but that is still less than with 2.2.
Does all this mean I should go back to 2.2, because everyone knows 3.0 is not yet tuned for performance, or
(more likely?) did I mess up the install?
Has anyone else seen this performance drop? I'm using prebuilt binaries, so at least I didn't build it wrong.
Thanks in advance for any help.
1 Reply
Anton_Pegushin
New Contributor II
Hi, could you provide details on how TBB is used in the application: what types of parallel algorithms are used and if they're nested, how much work does each function object perform and if there's any I/O, synchronization, etc.? In case we're talking about a simple parallel_for, could you provide a reproducer for this 50% CPU utilization case? Thanks.
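As a rough aid for reasoning about the grain-size experiments mentioned in the question, here is a small sketch that counts how many chunks a range yields for a given grain size. It uses only the standard library and does not call TBB. `count_chunks` is a hypothetical helper, and it assumes the range is halved recursively while its size exceeds the grain size, which is how `tbb::blocked_range` behaves under `simple_partitioner`. Other partitioners, such as `auto_partitioner`, may stop splitting earlier, so treat the result as an upper bound rather than a prediction.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not part of TBB): counts the leaf chunks obtained when
// the half-open range [begin, end) is split in half repeatedly until no piece
// is larger than `grain`. This models tbb::blocked_range, which reports itself
// as divisible only while size() > grainsize.
std::size_t count_chunks(std::size_t begin, std::size_t end, std::size_t grain) {
    assert(grain > 0);     // a grain size of zero would never stop splitting
    assert(begin <= end);  // the range must be well formed

    const std::size_t size = end - begin;
    if (size <= grain) {
        return 1;  // not divisible any more: this piece is executed as one chunk
    }

    // Split at the midpoint, as the blocked_range splitting constructor does.
    const std::size_t middle = begin + size / 2;
    return count_chunks(begin, middle, grain) + count_chunks(middle, end, grain);
}
```

Under these assumptions, a 1000-iteration range with a grain size of 125 yields 8 chunks, which is exactly one per core on an 8-core machine and leaves no slack for load balancing, while a grain size of 100 yields 16 chunks. If the chunk count is close to or below the core count, or the chunks differ a lot in cost, some cores can sit idle, which is one possible (unconfirmed) explanation for utilization well under 100%.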