We are running an application that currently uses the non-cluster PARDISO under Windows Server 2012 SP1. The machine is an HP DL980 with 160 logical cores (80 physical cores with HT turned on) and 2 TB of RAM. We are having trouble utilizing all the cores, most likely because of the Windows "core group" limit of 64 cores per group. I am happy to see that Intel now offers a cluster version of PARDISO in MKL. Can this new version treat multiple core groups as if they were a cluster of computers? Is any special configuration required to use the new PARDISO, or do we have to reconfigure the machine with clustering software? Do I need to upgrade my Fortran Composer license to the Cluster edition, or will an update of the current Professional edition do?
- For such big systems, try the new version of PARDISO for distributed systems. Its official name is Parallel Direct Sparse Solver for Clusters, and it has been available since MKL 11.2.
- You don't need to upgrade your Fortran Composer to the Cluster Edition, because Parallel Direct Sparse Solver for Clusters is included in the Fortran Composer Edition too. However, it works on distributed systems only if MPI is available. Parallel Direct Sparse Solver for Clusters works with Intel MPI, MPICH, OpenMPI, or MVAPICH2.
- To launch the Sparse Solver for Clusters, use the ordinary approach for any MPI-based application. The command line would look like the following: mpiexec.exe -hosts <# of hosts> <host1_name> <host1 # of processes> <host2_name> <host2 # of processes> ... test.exe
- What is the typical size of this application?
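For anyone scripting the launch, the mpiexec line described above can be assembled from a list of hosts. This is only a minimal Python sketch: the `-hosts` layout is taken from the reply above, while the helper name `build_mpiexec_args` and the host names `node1`/`node2` are made up for illustration. Check the documentation of your own MPI launcher before relying on this argument order.

```python
from typing import List, Tuple


def build_mpiexec_args(exe: str, hosts: List[Tuple[str, int]]) -> List[str]:
    """Assemble: mpiexec.exe -hosts <n> <host1> <procs1> <host2> <procs2> ... <exe>.

    `hosts` is a list of (host_name, number_of_processes) pairs.
    The helper name and argument layout are illustrative assumptions.
    """
    if not hosts:
        raise ValueError("at least one host is required")
    args = ["mpiexec.exe", "-hosts", str(len(hosts))]
    for name, nprocs in hosts:
        if nprocs < 1:
            raise ValueError(f"host {name!r} needs at least one process")
        args += [name, str(nprocs)]
    args.append(exe)
    return args


if __name__ == "__main__":
    # Placeholder host names; replace with the real machine names.
    print(" ".join(build_mpiexec_args("test.exe", [("node1", 2), ("node2", 2)])))
```

On a single machine there is only one host, so the list would hold one entry, or the simpler `-n <processes>` form can be used instead.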
- The only way I can get the MPI functions is to upgrade the Fortran Composer to a cluster version. Is there any way I can use the cluster PARDISO without invoking MPI?
- When Windows Server separates the logical cores into "core groups", it does so internally, so to an outside user the host1_name ... values are unknown. In that case, how do I enter the command? I assume you have included examples of how to use the new functionality. Have you or your team ever tried the cluster solver on a Windows server with more than 64 cores (in which case the OS automatically splits the cores into core groups)?
- The application size varies with the details of the physical model we are trying to simulate, but it usually needs 100 to 700 GB of RAM per process, and the run time ranges from 2 hours to 7 days.
- It's not possible to use the cluster version without invoking MPI. You don't need to buy Intel MPI; just try MPICH, which is available for Windows and is free.
- Yes, we have released examples that show how to call the Sparse Solver for Clusters. You can find them in <mkl_root>\examples, where there are two folders: cluster_sparse_solverc and cluster_sparse_solverf.
- Set OMP_NUM_THREADS=40 and then try to launch mpiexec.exe -n 4 test.exe.
- What is the number of rows in your typical problem, the number of nonzeros (NNZ), and the type of the matrices?
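One plausible reading of the 4 x 40 suggestion above is that each MPI process gets an equal share of the 160 logical cores and that share stays under the 64-core group limit mentioned in the question. The reply does not state this reasoning, so the Python sketch below is an assumption. The function name `split_cores` is hypothetical, and the sketch does not pin processes to core groups; actual placement depends on the MPI launcher and the OS.

```python
def split_cores(total_logical: int, group_limit: int = 64) -> tuple:
    """Return (mpi_ranks, threads_per_rank).

    Picks the smallest number of ranks such that the threads per rank
    divide the machine evenly and do not exceed `group_limit`.
    The 64-core limit comes from the question; the rest is an assumption.
    """
    if total_logical < 1 or group_limit < 1:
        raise ValueError("core counts must be positive")
    ranks = -(-total_logical // group_limit)  # ceiling division
    while total_logical % ranks != 0:
        ranks += 1  # try the next rank count until the split is even
    return ranks, total_logical // ranks


if __name__ == "__main__":
    ranks, threads = split_cores(160)
    # Windows cmd syntax, matching the commands suggested in the thread.
    print(f"set OMP_NUM_THREADS={threads}")
    print(f"mpiexec.exe -n {ranks} test.exe")
```

For 160 logical cores, three ranks would not divide evenly, so the sketch settles on 4 ranks of 40 threads, which matches the numbers suggested above.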