This question was asked during the webcast, An Introduction to High Performance Computing: Parallel Computing Issues. The presenter, Tom Lehmann, is the Advanced Projects Manager for the Enterprise Systems Group training organization at Intel. Here is Tom's answer to the question: Use the Intel Trace Collector and Intel Trace Analyzer to optimize MPI parallel performance. Use the Intel Threading Tools to debug and tune threaded applications. Use the Intel VTune package and the Intel performance libraries to optimize serial performance.
Message Edited by hagabb on 11-03-2004 01:38 PM
Does anyone else have suggestions?
Here is another question and answer from the webcast.
Q: What tools are available to help parallelize algorithms and software and analyze their performance?
A: One of the things that's built into all of the Intel compilers is support for OpenMP. This allows you to parallelize a serial program one chunk at a time, wherever there is something to be parallelized. For instance, one of the classic ways of making a program run faster is to take a loop and break it up into pieces. If I've got a loop that goes from 1 to 100 and I've got ten processors, I would do 1 to 10 on the first processor, 11 to 20 on the next processor, and so forth for the rest of the processors. The OpenMP capabilities of the Intel compilers can do that for you more or less automatically.

If you are looking at a multiprocessor system such as a cluster, the tool that you use to parallelize things is MPI, the Message Passing Interface, implementations of which are available on the Web just about anywhere. You can go to the Argonne National Lab site and download a package called MPICH, which allows you to build programs that can be spread out amongst the members of the cluster.

Once the program is put together and running, there are some tools available for analyzing it. The ones that I'm most familiar with, of course, are the Intel Cluster Tools, which include the Intel Trace Collector and the Intel Trace Analyzer. The Trace Collector gathers all of the information as to how the individual pieces of the program spend their time on the computer itself and in the communication between computers. The Trace Analyzer gives you a visual display of what that MPI program is doing. It allows you to find communication bottlenecks in MPI programs very easily. So I would suggest you look into tools like that. The old names for these programs were Vampir and Vampirtrace, from a company called Pallas in Germany. Intel recently acquired the Pallas organization that was developing Vampir and Vampirtrace.
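To make the loop-splitting idea in the answer concrete, here is a minimal sketch in Python. It is not OpenMP and not anything from the webcast; it only illustrates the same decomposition (1 to 10 on the first worker, 11 to 20 on the next, and so on) that an OpenMP `parallel for` directive would typically arrange for you in C or Fortran. The function names `chunk_bounds`, `partial_sum`, and `parallel_sum_of_squares` are made up for this example, and the sum of squares is just a stand-in for a real loop body.

```python
from concurrent.futures import ThreadPoolExecutor


def chunk_bounds(first, last, workers):
    """Split the inclusive range first..last into contiguous chunks.

    Returns a list of (start, end) pairs, one per worker. If the range
    does not divide evenly, the earliest chunks get one extra iteration.
    """
    total = last - first + 1
    base, extra = divmod(total, workers)
    bounds = []
    start = first
    for w in range(workers):
        size = base + (1 if w < extra else 0)
        if size == 0:
            continue  # more workers than iterations: nothing left to hand out
        bounds.append((start, start + size - 1))
        start += size
    return bounds


def partial_sum(bounds):
    """Loop body for one chunk: sum of squares over start..end (inclusive)."""
    start, end = bounds
    return sum(i * i for i in range(start, end + 1))


def parallel_sum_of_squares(first, last, workers):
    """Run each chunk on its own worker thread and combine the partial results."""
    chunks = chunk_bounds(first, last, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(partial_sum, chunks))


if __name__ == "__main__":
    print(chunk_bounds(1, 100, 10)[:2])          # [(1, 10), (11, 20)]
    print(parallel_sum_of_squares(1, 100, 10))   # 338350
```

Note that threads in standard CPython generally do not speed up CPU-bound loops like this one, so the sketch shows how the work is divided and recombined rather than an actual performance gain. With OpenMP, the compiler and runtime handle both the division of the iteration range and the combination of the partial results.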
Message Edited by hagabb on 01-21-2005 09:40 AM
Besides MPICH from Argonne, another popular and effective open source MPI implementation is LAM/MPI. Intel also has a fabric-independent MPI implementation, based on MPICH2, called the Intel MPI Library.
Message Edited by hagabb on 01-21-2005 09:44 AM