The best evaluation would be something that resembles the application which you plan to parallelize.
Also, unless you have a lot of short parallel regions, the time spent in the parallel runtime library (its overhead) will likely be negligible compared to the real computation. What matters more for performance are high-level properties of the implementation, such as data locality, amount of synchronization, and load balancing.
I also suggest reading the article "Intel TBB, OpenMP, or native threads?" if you have not already.
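To check whether runtime overhead matters for a given program, one rough approach is to time many nearly empty parallel regions and divide by their count. The sketch below assumes the compiler has OpenMP enabled (for example with -fopenmp or /Qopenmp); the function name seconds_per_empty_region is made up for illustration. Without OpenMP the pragmas are ignored and it only measures the plain loop.

```cpp
#include <cassert>
#include <chrono>

// Average wall-clock seconds spent entering and leaving one parallel region
// that does almost no work. This approximates the per-region runtime overhead.
double seconds_per_empty_region(int regions) {
    assert(regions > 0);
    using clock = std::chrono::steady_clock;
    volatile int sink = 0;  // keeps the compiler from removing the region body

    const auto start = clock::now();
    for (int r = 0; r < regions; ++r) {
        #pragma omp parallel
        {
            // Exactly one thread writes; the others wait at the implicit barrier.
            #pragma omp single
            sink = r;
        }
    }
    const auto stop = clock::now();

    (void)sink;
    return std::chrono::duration<double>(stop - start).count() / regions;
}
```

If the value returned for, say, 10000 regions is tiny compared to the time one real parallel region of the application takes, the library overhead is unlikely to be what separates TBB from OpenMP for that application.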
Thank you. I was contemplating building a scaled down version of the actual program, but peers thought it'd be better to try out small programs. Your reply gives me a better general direction to proceed.
I had seen the article earlier. Although it says that OpenMP performs better for applications with a lot of array operations, my little array addition program did not confirm that: TBB was faster.
This led me to conclude that I'm evaluating it incorrectly.
I can't just rely on claims on a website. I need to see numbers that prove to me that one program is really faster than another.
Even if data locality etc. govern performance, is there a way to know for sure (by measuring) that TBB's or OpenMP's optimization for load balancing or data locality really works?
Right now the only measurement tool I have is the program's running time. Is there some other way to compare, or am I heading in the wrong direction in the first place?
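Even with running time as the only tool, a single run of a small array addition can be dominated by thread start-up and first-touch page faults rather than by the addition itself. A minimal sketch of a more repeatable measurement follows, assuming OpenMP is enabled at compile time; add_arrays and best_time_seconds are illustrative names, not taken from any library. It does one untimed warm-up run and then reports the fastest of several timed runs.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <limits>
#include <vector>

// Element-wise c = a + b. The pragma only takes effect when OpenMP is enabled.
void add_arrays(const std::vector<double>& a, const std::vector<double>& b,
                std::vector<double>& c) {
    assert(a.size() == b.size() && b.size() == c.size());
    const long n = static_cast<long>(a.size());
    #pragma omp parallel for
    for (long i = 0; i < n; ++i) {
        c[i] = a[i] + b[i];
    }
}

// One untimed warm-up run, then `repeats` timed runs; returns the fastest in seconds.
double best_time_seconds(std::size_t n, int repeats) {
    assert(n > 0 && repeats > 0);
    using clock = std::chrono::steady_clock;
    std::vector<double> a(n, 1.0), b(n, 2.0), c(n, 0.0);

    add_arrays(a, b, c);  // warm-up: creates the thread team, touches the pages

    double best = std::numeric_limits<double>::infinity();
    for (int r = 0; r < repeats; ++r) {
        const auto start = clock::now();
        add_arrays(a, b, c);
        const auto stop = clock::now();
        const double t = std::chrono::duration<double>(stop - start).count();
        if (t < best) best = t;
    }

    volatile double keep = c[n / 2];  // read a result so the work is not optimized away
    (void)keep;
    return best;
}
```

Calling best_time_seconds with an array that does not fit in cache and with one that does, and repeating the same procedure for the TBB version, gives numbers that are easier to compare than one total program running time.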
Below I've listed many documented examples of OpenMP knobs that affect performance.
Search for these pages using the button in the upper left corner of the page:
OpenMP* Environment Variables
and in particular: KMP_AFFINITY, KMP_BLOCKTIME, KMP_LIBRARY, OMP_SCHEDULE, OMP_NUM_THREADS, OMP_DYNAMIC, OMP_NESTED, OMP_THREAD_LIMIT, OMP_MAX_ACTIVE_LEVELS
Thread Affinity Interface
This one addresses thread placement for multi-core and NUMA architectures (like Core i7 and Nehalem) and is very extensively documented.
OpenMP* Run-time Library Routines
Execution Environment Routines section in particular
Intel Extension Routines to OpenMP*
In particular see these sections: Execution Environment Routines, Memory Allocation, & Thread Sleep Time
OpenMP* Options Quick Reference
Note that all of these options have additional links to more complete descriptions.
OpenMP* Support Libraries
In particular the section marked Execution Modes, but the whole page may be useful if you want inter-compiler compatibility.
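As a rough illustration of how these knobs can be tied to a measurement, the sketch below uses schedule(runtime), so that the loop schedule is taken from the OMP_SCHEDULE environment variable without recompiling. The function names triangular_work and threads_available are made up for this example, and the loop is deliberately unbalanced (iteration i does about i units of work) so that the schedule choice has something to act on.

```cpp
#include <cassert>
#ifdef _OPENMP
#include <omp.h>
#endif

// Unbalanced loop: iteration i performs i + 1 additions, so late iterations
// cost far more than early ones. The schedule is chosen at run time.
double triangular_work(int n) {
    assert(n >= 0);
    double total = 0.0;
    #pragma omp parallel for schedule(runtime) reduction(+ : total)
    for (int i = 0; i < n; ++i) {
        double local = 0.0;
        for (int j = 0; j <= i; ++j) {
            local += 1.0;
        }
        total += local;
    }
    return total;  // equals n * (n + 1) / 2
}

// Number of threads the next parallel region may use (1 without OpenMP).
int threads_available() {
#ifdef _OPENMP
    return omp_get_max_threads();
#else
    return 1;
#endif
}
```

Timing triangular_work with OMP_SCHEDULE=static and again with OMP_SCHEDULE=dynamic,16, for the same OMP_NUM_THREADS, is one way to see whether load balancing makes a measurable difference on a given machine.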
For more detailed information about the OpenMP standard environment variables, API routines, and directives and clauses, and many examples of usage, see the OpenMP specification v3.0 here:
Hope that helps.