Hello,
I am using the parallel PARDISO solver from Intel MKL (cluster_sparse_solver) and I am having trouble achieving good parallel efficiency. The computations run on Intel Xeon E5-2650 v2 CPUs (8 cores, 2.6 GHz); co-processors are not used. The sparse system has 888 195 DOF. I do not go to larger systems because of the specifics of the problem: solving the sparse system is a small part of the whole algorithm, which involves both sparse and dense matrices. Because the algorithm solves several sparse systems with cluster_sparse_solver, its overall scalability ends up poor because of the MKL solver.
I would like to ask whether anyone knows if all of cluster_sparse_solver is implemented for parallel runs. From the solver output I see that the CPU time of the first phase of the algorithm does not change as the number of processes increases:
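Strong scalability of cluster_sparse_solver, mesh with 296 065 nodes, system with 888 195 DOF (CPU time per phase)

| Phase \ number of processes      | 1       | 2      | 4      | 8      | 16     | 32     | 64     |
|----------------------------------|---------|--------|--------|--------|--------|--------|--------|
| 11 Analysis                      | 5.9624  | 5.8961 | 5.8318 | 5.8405 | 5.8652 | 5.9099 | 6.0160 |
| 22 Numerical factorization       | 14.5922 | 7.7329 | 4.6724 | 2.9844 | 1.9513 | 1.5424 | 1.2885 |
| 33 Solve, iterative refinement   | 1.3819  | 0.8207 | 0.5122 | 0.3700 | 0.3089 | 0.2710 | 0.5799 |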
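To make the trend easier to read, the timings in the table above can be turned into speedup and parallel efficiency per phase. The following is only a minimal Python sketch: the names `procs`, `times`, `speedup` and `efficiency` are illustrative, and the numbers are copied from the table.

```python
# Timings copied from the table above (888 195 DOF case).
procs = [1, 2, 4, 8, 16, 32, 64]
times = {
    "analysis":      [5.9624, 5.8961, 5.8318, 5.8405, 5.8652, 5.9099, 6.0160],
    "factorization": [14.5922, 7.7329, 4.6724, 2.9844, 1.9513, 1.5424, 1.2885],
    "solve":         [1.3819, 0.8207, 0.5122, 0.3700, 0.3089, 0.2710, 0.5799],
}


def speedup(series):
    """Speedup relative to the single-process run: S(p) = T(1) / T(p)."""
    return [series[0] / t for t in series]


def efficiency(series, procs):
    """Parallel efficiency: E(p) = S(p) / p."""
    return [s / p for s, p in zip(speedup(series), procs)]


for phase, series in times.items():
    print(phase)
    for p, s, e in zip(procs, speedup(series), efficiency(series, procs)):
        print(f"  p={p:2d}  speedup={s:6.2f}  efficiency={e:5.2f}")
```

With these numbers the factorization speedup on 64 processes is roughly 11, while the analysis phase stays at about 1 for every process count.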
I have contacted developers from Intel, but unfortunately I did not get an answer from them. I would be grateful if someone could share their experience with the scalability of cluster_sparse_solver.
Regards,
Stanley
Stanley, we don't know whom you contacted previously, but this is the right place to contact us.
Which MKL version have you tried?
Hello Gennady, the version of MKL I used to get these results is 11.3.2.
Stanislav, fully distributed reordering in CPardiso has been available since MKL 2017 Update 1 (released Nov 1st, 2016). You may take the evaluation version and check how it works on your side with those workloads.
Hello Gennady,
Thank you for your reply. My colleagues installed the new version of MKL (MKL 2017 Update 1) and I checked the scalability of cluster_sparse_solver again. Unfortunately the results are similar to those with MKL 2016. I generated a sparse matrix of higher dimension; currently I have 3.5 million DOF. Here are the strong-scalability results for cluster_sparse_solver (1 OpenMP thread and several MPI processes).
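| Phase \ number of processes      | 1       | 2       | 4       | 8       | 16      |
|----------------------------------|---------|---------|---------|---------|---------|
| 11 Analysis                      | 26.4461 | 26.2817 | 25.8782 | 26.5270 | 26.0985 |
| 22 Numerical factorization       | 123.822 | 65.1132 | 38.7532 | 23.6410 | 16.1666 |
| 33 Solve, iterative refinement   | 4.7206  | 3.7750  | 2.5918  | 1.9842  | 1.9570  |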
Again it seems that the first phase of the algorithm is not implemented for parallel computations.
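A rough way to see what this costs is Amdahl's law, S(p) = 1 / (f + (1 - f) / p), where f is the serial fraction of the single-process run. The sketch below assumes, purely as a simplification, that the analysis phase is entirely serial and that the other two phases could scale perfectly; the timings are those of the 3.5 million DOF runs above, and all names are illustrative.

```python
# Timings copied from the 3.5 million DOF runs above.
procs = [1, 2, 4, 8, 16]
analysis = [26.4461, 26.2817, 25.8782, 26.5270, 26.0985]
factorization = [123.822, 65.1132, 38.7532, 23.6410, 16.1666]
solve = [4.7206, 3.7750, 2.5918, 1.9842, 1.9570]

# Total time per run is the sum of the three phases.
total = [a + f + s for a, f, s in zip(analysis, factorization, solve)]

# Assumption: the analysis phase is the only serial part of the 1-process run.
serial_fraction = analysis[0] / total[0]


def amdahl(f, p):
    """Upper bound on speedup with serial fraction f on p processes."""
    return 1.0 / (f + (1.0 - f) / p)


# Measured speedup of the whole solve relative to one process.
measured = [total[0] / t for t in total]

for p, m in zip(procs, measured):
    bound = amdahl(serial_fraction, p)
    print(f"p={p:2d}  measured={m:5.2f}  Amdahl bound={bound:5.2f}")
print(f"limit for p -> infinity: {1.0 / serial_fraction:5.2f}")
```

Under this assumption the analysis phase accounts for about 17% of the single-process time, which caps the overall speedup near 5.9 no matter how many processes are used.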
I would like to ask whether you can provide me with an example in which the first phase scales. Then I can modify my code to obtain scalability in my algorithm as well.
Best regards,
Stanley
Hi,
Can you set iparm[1] to 10 and rerun the test on the latest version of MKL?
Thanks,
Alex
I have a similar problem: the analysis step does not scale.
I also tried the advice given by Alex above, with the same result.
I hope someone can help solve this problem.