I am developing an MPI application for a cluster of SMP systems. It is built around the MKL cluster FFT
routines. Can these routines take advantage of OpenMP threads, or should I start multiple MPI processes
on each SMP system?
"Can these routines take advantage of OpenMP threads?"
Not at this moment. We are going to implement this in an upcoming version of MKL.
"Should I start multiple MPI processes on each SMP system?"
Which MPI implementation are you going to use? If it is Intel MPI, we recommend version 3.2.1, which properly supports hybrid (MPI + OpenMP) mode.
As Gennady mentioned, the current implementation performs best when only MPI parallelization is used. Support for MPI+OpenMP parallelization will be added in the near future.
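Since the replies point to pure MPI parallelization for now, one way to lay out a job is one MPI rank per core with threading switched off in each rank. The following is only a rough sketch of building such a launch line, not an official recipe. The helper name `pure_mpi_launch`, the application name `./fft_app`, and the node and core counts are hypothetical. The `-n`, `-ppn`, and `-genv` options are those of recent Intel MPI launchers and may differ in other MPI implementations or older versions, so check your launcher's documentation.

```python
def pure_mpi_launch(nodes, cores_per_node, app="./fft_app"):
    """Build a launch command that places one MPI rank on every core.

    Hypothetical helper: the flag names follow recent Intel MPI launchers
    and are an assumption about the reader's environment.
    """
    total_ranks = nodes * cores_per_node
    # Keep each rank single-threaded so ranks do not oversubscribe cores.
    env = {"OMP_NUM_THREADS": "1", "MKL_NUM_THREADS": "1"}
    cmd = ["mpirun", "-n", str(total_ranks), "-ppn", str(cores_per_node)]
    for name, value in env.items():
        cmd += ["-genv", name, value]
    cmd.append(app)
    return cmd


if __name__ == "__main__":
    # Example: 4 SMP nodes with 8 cores each gives 32 single-threaded ranks.
    print(" ".join(pure_mpi_launch(nodes=4, cores_per_node=8)))
```

For 4 nodes with 8 cores each, this yields 32 ranks, 8 per node, each limited to a single OpenMP/MKL thread. Once hybrid support is available in the cluster FFT routines, the same layout could be changed to fewer ranks per node with more threads per rank.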