LAPACK vs ScaLAPACK

rabbitsoft — Tue, 30 Mar 2010 07:13:02 GMT

Hello,

I would like to ask which approach, in your opinion, will perform better:

1) Use a grid tool to create a virtual supercomputer from networked desktops, with LAPACK functions from MKL (does LAPACK automatically scale the code to n processors/cores?)

2) Use a cluster built from networked desktops, with ScaLAPACK functions and MPI

Thank you for your answer, and best wishes.

Sangamesh_B_ — Mon, 05 Apr 2010 05:40:20 GMT

For Option (1)

MKL parallelizes LAPACK with threads. If the machine is an SMP box, the job will scale to the number of cores that machine has. Ultimately, your job will run on one machine with number of threads = number of cores, so clustering does not help you get better performance here.

For Option (2)

Since you are using ScaLAPACK and MPI here, the job will be spread across multiple machines of the cluster. If the machines are SMP boxes, you can combine MPI with threading to get optimal performance. For example, if the cluster has 5 machines and each of them is a quad-core SMP, then you can run the job with

(a) 5 MPI processes on 5 machines and 4 MKL threads per MPI process. Each machine will have 1 MPI process and 4 MKL threads.

(b) 10 MPI processes on 5 machines and 2 MKL threads per MPI process. Each machine will have 2 MPI processes and 4 MKL threads in total.

Use the -machinefile option of mpirun/mpiexec to control the number of MPI processes on each machine, and set MKL_NUM_THREADS=<4 or 2> to control the number of MKL threads per process.
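To make the arithmetic behind the two layouts concrete, here is a rough Python sketch that works out the thread count per MPI process and prints a machinefile for each case. The host names (node1 ... node5) and the helper plan_layout are made up for illustration, and the "host:count" line format is an assumption: machinefile syntax differs between MPI implementations, so check the documentation of the one you use.

```python
def plan_layout(hosts, cores_per_host, ranks_per_host):
    """Return (machinefile_text, total_ranks, threads_per_rank).

    Hypothetical helper: assumes every host has the same core count and
    that the machinefile uses one "host:count" line per machine.
    """
    if ranks_per_host < 1 or cores_per_host % ranks_per_host != 0:
        raise ValueError("ranks_per_host must divide cores_per_host evenly")
    # Give each MPI process an equal share of the cores on its machine.
    threads_per_rank = cores_per_host // ranks_per_host
    lines = [f"{host}:{ranks_per_host}" for host in hosts]
    total_ranks = len(hosts) * ranks_per_host
    return "\n".join(lines) + "\n", total_ranks, threads_per_rank


if __name__ == "__main__":
    hosts = [f"node{i}" for i in range(1, 6)]  # hypothetical host names
    # (a) 1 MPI process per machine, (b) 2 MPI processes per machine
    for label, ranks_per_host in (("a", 1), ("b", 2)):
        machinefile, total_ranks, threads = plan_layout(hosts, 4, ranks_per_host)
        print(f"({label}) {total_ranks} MPI processes, MKL_NUM_THREADS={threads}")
        print(machinefile, end="")
```

Either way each machine ends up with 4 busy threads; what changes is how the work is split between MPI processes and MKL threads.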

So I suggest you go with Option (2) and evaluate the performance of both (a) and (b).
