OpenMP yields a fairly structured threaded execution. This structure leads to a certain methodology for analyzing, identifying, and correcting performance problems. With explicit threads, no structure is enforced: threads interact however the programmer designs them, and there are almost as many ways for threads to interact as there are programmers. With the release of the 2.0 Beta versions of the Intel Threading Tools, I wonder what features "real world" engineers would like to see in these tools.
This thread is devoted to examining the kinds of analysis that programmers of threaded code find useful, and which tools provide (or promise, but don't easily provide) the data needed to measure and improve execution performance. To get started, consider the following questions:
-- Which interactions of threads with other threads, with synchronization objects, with processors, and with other parts of the execution platform do you think are important to measure when analyzing performance?
-- Have you tried other tools to capture these metrics? Were they successful?
-- If you've tried the Intel Threading Tools, what was your experience with using them? Are there things that you would like to see added or modified?