<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Cluster Sparse Solver with Distributed Matrix - reordering/memory problem in Intel® oneAPI Math Kernel Library</title>
    <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Cluster-Sparse-Solver-with-Distributed-Matrix-reordering-memory/m-p/1249632#M30754</link>
    <description>&lt;P&gt;I am trying to solve a large symmetric matrix problem (n = 647296, non-zeros = 343145604) with Cluster Sparse Solver and a distributed matrix (iParam[39] = 1). I built my test program with OpenMP threading and ILP64 (icc 20.4). The workflow is very simple: each rank reads its part of the matrix from test files, then performs reordering, factorization, back substitution, and memory release, and reports the final error.&lt;/P&gt;
&lt;P&gt;The solution looks correct, as the results can be interpreted visually. However, I have three observations:&lt;/P&gt;
&lt;OL&gt;
&lt;LI&gt;Although I set iParam[34] = 1, I still get this output:&lt;LI-CODE lang="none"&gt;=== CPARDISO: solving a symmetric indefinite system ===
1-based array indexing is turned ON
CPARDISO double precision computation is turned ON
Scaling is turned ON
Matching is turned ON
&lt;/LI-CODE&gt;&lt;/LI&gt;
&lt;LI&gt;Once the reordering starts, a huge amount of memory is allocated on rank 0, 32GB (50GB at peaks), whereas the other ranks use only 4-5GB. This is strange because DSS requires 40GB of memory for this matrix. I would expect the distributed matrix approach to use about 40GB / 8 = 5GB per rank (iParam[1] = 10) if the matrix elements were distributed uniformly. In my case there are 41806 elements on rank 0 and 228854 elements on rank 7 (the size rises with the rank number), so the first rank has the smallest portion of the matrix.&lt;/LI&gt;
&lt;LI&gt;The other strange thing is that the report shows 85% of the total reordering time being spent on memory allocation. It looks similar for the cluster distributed matrix run and for DSS.&lt;LI-CODE lang="none"&gt;Summary: ( reordering phase )
================

Times:
======
Time spent in calculations of symmetric matrix portrait (fulladj): 5.879387 s
Time spent in reordering of the initial matrix (reorder)         : 0.004014 s
Time spent in symbolic factorization (symbfct)                   : 7.478648 s
Time spent in data preparations for factorization (parlist)      : 0.037476 s
Time spent in allocation of internal data structures (malloc)    : 263.670537 s
Time spent in additional calculations                            : 33.522760 s
Total time spent                                                 : 310.592822 s
&lt;/LI-CODE&gt;&lt;/LI&gt;
&lt;/OL&gt;
&lt;P&gt;The test case is run on 8 Linux (RHEL7) machines using Intel MPI 2019.9.&lt;/P&gt;
&lt;P&gt;Please check attached cpp for other settings. Unfortunately I can not upload matrix definitions as the size of zipped files is&amp;nbsp;&lt;/P&gt;
&lt;P&gt;This is one of the smallest cases I am working with. Another case with 610561374 non-zeros (still n = 647296, the same matrix but denser) requires 110GB on rank 0, 5GB on the other ranks, and 75GB for DSS, so this time the cluster run consumes much more memory. The case with 1179580274 non-zeros crashes with an allocation problem on a 250GB machine.&lt;/P&gt;
&lt;P&gt;The question is whether I am doing something wrong or there is a bug in the libraries.&lt;/P&gt;</description>
    <pubDate>Mon, 25 Jan 2021 11:45:10 GMT</pubDate>
    <dc:creator>Milosz</dc:creator>
    <dc:date>2021-01-25T11:45:10Z</dc:date>
    <item>
      <title>Cluster Sparse Solver with Distributed Matrix - reordering/memory problem</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Cluster-Sparse-Solver-with-Distributed-Matrix-reordering-memory/m-p/1249632#M30754</link>
      <description>&lt;P&gt;I am trying to solve a large symmetric matrix problem (n = 647296, non-zeros = 343145604) with Cluster Sparse Solver and a distributed matrix (iParam[39] = 1). I built my test program with OpenMP threading and ILP64 (icc 20.4). The workflow is very simple: each rank reads its part of the matrix from test files, then performs reordering, factorization, back substitution, and memory release, and reports the final error.&lt;/P&gt;
&lt;P&gt;The solution looks correct, as the results can be interpreted visually. However, I have three observations:&lt;/P&gt;
&lt;OL&gt;
&lt;LI&gt;Although I set iParam[34] = 1, I still get this output:&lt;LI-CODE lang="none"&gt;=== CPARDISO: solving a symmetric indefinite system ===
1-based array indexing is turned ON
CPARDISO double precision computation is turned ON
Scaling is turned ON
Matching is turned ON
&lt;/LI-CODE&gt;&lt;/LI&gt;
&lt;LI&gt;Once the reordering starts, a huge amount of memory is allocated on rank 0, 32GB (50GB at peaks), whereas the other ranks use only 4-5GB. This is strange because DSS requires 40GB of memory for this matrix. I would expect the distributed matrix approach to use about 40GB / 8 = 5GB per rank (iParam[1] = 10) if the matrix elements were distributed uniformly. In my case there are 41806 elements on rank 0 and 228854 elements on rank 7 (the size rises with the rank number), so the first rank has the smallest portion of the matrix.&lt;/LI&gt;
&lt;LI&gt;The other strange thing is that the report shows 85% of the total reordering time being spent on memory allocation. It looks similar for the cluster distributed matrix run and for DSS.&lt;LI-CODE lang="none"&gt;Summary: ( reordering phase )
================

Times:
======
Time spent in calculations of symmetric matrix portrait (fulladj): 5.879387 s
Time spent in reordering of the initial matrix (reorder)         : 0.004014 s
Time spent in symbolic factorization (symbfct)                   : 7.478648 s
Time spent in data preparations for factorization (parlist)      : 0.037476 s
Time spent in allocation of internal data structures (malloc)    : 263.670537 s
Time spent in additional calculations                            : 33.522760 s
Total time spent                                                 : 310.592822 s
&lt;/LI-CODE&gt;&lt;/LI&gt;
&lt;/OL&gt;
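&lt;P&gt;As a cross-check of the 85% figure, here is a minimal Python sketch that parses the timing lines of the report above and computes each entry's share of the total. The helper names (parse_times, shares) are made up for this illustration and are not part of MKL; the numbers are copied from the report.&lt;/P&gt;

```python
import re

# Reordering-phase timing lines copied from the report quoted above.
REPORT = """\
Time spent in calculations of symmetric matrix portrait (fulladj): 5.879387 s
Time spent in reordering of the initial matrix (reorder)         : 0.004014 s
Time spent in symbolic factorization (symbfct)                   : 7.478648 s
Time spent in data preparations for factorization (parlist)      : 0.037476 s
Time spent in allocation of internal data structures (malloc)    : 263.670537 s
Time spent in additional calculations                            : 33.522760 s
Total time spent                                                 : 310.592822 s
"""

# One report line has the shape "label : seconds s".
LINE = re.compile(r"^(.*?)\s*:\s*([0-9.]+) s$")


def parse_times(report):
    """Return {label: seconds} for every timing line in the report."""
    times = {}
    for line in report.splitlines():
        match = LINE.match(line.strip())
        if match:
            times[match.group(1)] = float(match.group(2))
    return times


def shares(times, total_label="Total time spent"):
    """Return {label: percent of total} for every entry except the total."""
    total = times[total_label]
    return {
        label: 100.0 * seconds / total
        for label, seconds in times.items()
        if label != total_label
    }


if __name__ == "__main__":
    for label, percent in shares(parse_times(REPORT)).items():
        print(round(percent, 1), "%", label)
```

&lt;P&gt;For this report the (malloc) entry comes out at roughly 85% of the total, which matches observation 3.&lt;/P&gt;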
&lt;P&gt;The test case is run on 8 Linux (RHEL7) machines using Intel MPI 2019.9.&lt;/P&gt;
&lt;P&gt;Please check attached cpp for other settings. Unfortunately I can not upload matrix definitions as the size of zipped files is&amp;nbsp;&lt;/P&gt;
&lt;P&gt;This is one of the smallest cases I am working with. Another case with 610561374 non-zeros (still n = 647296, the same matrix but denser) requires 110GB on rank 0, 5GB on the other ranks, and 75GB for DSS, so this time the cluster run consumes much more memory. The case with 1179580274 non-zeros crashes with an allocation problem on a 250GB machine.&lt;/P&gt;
&lt;P&gt;The question is whether I am doing something wrong or there is a bug in the libraries.&lt;/P&gt;</description>
      <pubDate>Mon, 25 Jan 2021 11:45:10 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Cluster-Sparse-Solver-with-Distributed-Matrix-reordering-memory/m-p/1249632#M30754</guid>
      <dc:creator>Milosz</dc:creator>
      <dc:date>2021-01-25T11:45:10Z</dc:date>
    </item>
    <item>
      <title>Re: Cluster Sparse Solver with Distributed Matrix - reordering/memory problem</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Cluster-Sparse-Solver-with-Distributed-Matrix-reordering-memory/m-p/1249744#M30756</link>
      <description>&lt;P&gt;Milosz,&lt;/P&gt;
&lt;P&gt;We need to have these inputs. You can create a private thread and share them with us.&lt;/P&gt;</description>
      <pubDate>Mon, 25 Jan 2021 18:23:37 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Cluster-Sparse-Solver-with-Distributed-Matrix-reordering-memory/m-p/1249744#M30756</guid>
      <dc:creator>Gennady_F_Intel</dc:creator>
      <dc:date>2021-01-25T18:23:37Z</dc:date>
    </item>
    <item>
      <title>Re: Cluster Sparse Solver with Distributed Matrix - reordering/memory problem</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Cluster-Sparse-Solver-with-Distributed-Matrix-reordering-memory/m-p/1249875#M30759</link>
      <description>&lt;P&gt;Hello Milosz,&lt;/P&gt;
&lt;P&gt;I also suggest that you share your matrix with us; it would help us be more specific in our answers. &lt;BR /&gt;I'll try to answer some of your questions or ask for more details below.&lt;/P&gt;
&lt;P&gt;1. The message you see is a bit strange, and it may very well be an error in how the output message reports the indexing. I believe the functionality itself works fine with both 0- and 1-based indexing.&lt;/P&gt;
&lt;P&gt;2. Unless you use iparm[1] = 10, a non-distributed version of reordering is used, which means that only one MPI process performs it.&lt;BR /&gt;I am not sure I read your post correctly. When you say that most of the elements are on process 0, this is not for iparm[1] = 10, right? And is your second question why the nnz distribution is uneven for iparm[1] = 10?&lt;/P&gt;
&lt;P&gt;Another option that can potentially reduce the reordering time and memory consumption is the VBSR format, see iparm[36]. I am not sure, though, whether it works together with matching (iparm[12]=1).&lt;/P&gt;
&lt;P&gt;3. This is strange. We need to check.&lt;/P&gt;
&lt;P&gt;4. Memory consumption of DSS (I guess you mean the DSS API of PARDISO) vs. Cluster Sparse Solver: this also needs a check, but are your numbers the peak memory consumption for one of the phases or the overall peak over all phases?&lt;/P&gt;
&lt;P&gt;One suggestion from me: could you temporarily turn off the matching (set iparm[12]=0) and see how your observations change? This should change the situation, assuming the rows do not intersect w.r.t. the distribution over MPI processes.&lt;/P&gt;
&lt;P&gt;Last but not least: I officially recommend&lt;STRONG&gt; to stop using the DSS API&lt;/STRONG&gt; for a non-distributed direct sparse solver. Please switch to the PARDISO API. It might not make a difference in many cases, but in many others we have made, and will continue to make, improvements through the PARDISO API that are not, and likely will not be, available via the DSS API.&lt;/P&gt;
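&lt;P&gt;To keep the iparm entries mentioned in this thread in one place, here is a small Python sketch. It is only a plain list that mirrors the 0-based C iparm array and does not call MKL. The function names (thread_iparm, changed_entries) are hypothetical, setting iparm[0] = 1 to request non-default values is an assumption not stated in the thread, and the comments should be checked against the MKL documentation for your version.&lt;/P&gt;

```python
def thread_iparm(matching=True):
    """Return a 64-entry list holding the iparm values named in this thread.

    Indices are 0-based, as in the C interface used by the original poster.
    """
    iparm = [0] * 64
    iparm[0] = 1                       # assumption: use the values below, not the defaults
    iparm[1] = 10                      # distributed (MPI) reordering and symbolic factorization
    iparm[12] = 1 if matching else 0   # matching on or off
    iparm[34] = 1                      # zero-based indexing of the input arrays
    iparm[39] = 1                      # distributed matrix input, as set in the first post
    # iparm[36] selects the storage format (VBSR is mentioned above);
    # no value is given in the thread, so it is left at 0 here.
    return iparm


def changed_entries(a, b):
    """Return the indices at which two iparm lists differ."""
    return [i for i in range(len(a)) if a[i] != b[i]]


if __name__ == "__main__":
    with_matching = thread_iparm(matching=True)
    without_matching = thread_iparm(matching=False)
    # The suggested experiment differs from the original run in one entry only.
    print(changed_entries(with_matching, without_matching))
```

&lt;P&gt;Comparing the two lists makes it explicit that the suggested experiment changes only the matching entry, so any difference in memory use between the two runs can be attributed to it.&lt;/P&gt;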
&lt;P&gt;Best,&amp;nbsp;&lt;BR /&gt;Kirill&lt;/P&gt;</description>
      <pubDate>Tue, 26 Jan 2021 02:27:24 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Cluster-Sparse-Solver-with-Distributed-Matrix-reordering-memory/m-p/1249875#M30759</guid>
      <dc:creator>Kirill_V_Intel</dc:creator>
      <dc:date>2021-01-26T02:27:24Z</dc:date>
    </item>
    <item>
      <title>Re: Cluster Sparse Solver with Distributed Matrix - reordering/memory problem</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Cluster-Sparse-Solver-with-Distributed-Matrix-reordering-memory/m-p/1250006#M30762</link>
      <description>&lt;P&gt;Sorry for my previous post, it was written ad hoc. Here is some additional info:&lt;BR /&gt;2. I was referring to the uneven distribution of elements in the input matrix. This is due to my matrix build process (which is not part of the attached program). The non-uniform distribution of matrix rows over MPI processes usually gives a uniform distribution of non-zeros and workload for each process, but not in this case. However, this non-uniform input is run with iparm[1] = 10, so the reordering should also be distributed.&lt;BR /&gt;4. Sorry for not being specific, I meant of course the PARDISO API. Regarding the memory measurements, I would say these are peaks for phase 11.&lt;/P&gt;
&lt;P&gt;I have also run the test with matching turned off (iparm[12]=0). This time the memory consumption in phase 11 is lower and redistributed: rank 7 - 11GB, rank 3 - 8GB, others 1.2-2.5GB. The sum is 30GB. The sum increases to 42GB in phase 22, but the distribution is still better.&lt;/P&gt;
&lt;P&gt;All runs are done with OMP_NUM_THREADS=2 and 8 MPI processes.&lt;/P&gt;
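&lt;P&gt;For illustration, one way to aim for a uniform non-zero distribution is to cut the rows into contiguous blocks by cumulative nnz. The Python sketch below is not the matrix build process used for these runs; balanced_row_blocks, block_nnz and the toy row counts are hypothetical. Each (first, last) pair is the kind of row range a rank would then pass to the solver as its input domain (iparm[40] and iparm[41], assuming the usual distributed-input convention).&lt;/P&gt;

```python
from bisect import bisect_left
from itertools import accumulate


def balanced_row_blocks(row_nnz, num_ranks):
    """Split rows into num_ranks contiguous blocks with roughly equal nnz.

    row_nnz[i] is the number of stored non-zeros in row i.
    Returns a list of inclusive, 0-based (first_row, last_row) pairs.
    Assumes num_ranks is not larger than the number of rows.
    """
    prefix = list(accumulate(row_nnz))  # prefix[i] = nnz in rows 0..i
    total = prefix[-1]
    num_rows = len(row_nnz)
    blocks = []
    first = 0
    for rank in range(num_ranks):
        if rank == num_ranks - 1:
            last = num_rows - 1  # the last rank takes whatever remains
        else:
            # First row at which the cumulative nnz reaches this rank's target.
            target = total * (rank + 1) / num_ranks
            last = bisect_left(prefix, target)
            # Keep at least one row for this rank and for every later rank.
            last = max(last, first)
            last = min(last, num_rows - (num_ranks - rank))
        blocks.append((first, last))
        first = last + 1
    return blocks


def block_nnz(row_nnz, blocks):
    """Return the number of non-zeros held by each block."""
    return [sum(row_nnz[first:last + 1]) for first, last in blocks]


if __name__ == "__main__":
    # Toy example: upper triangle of a dense 8 x 8 symmetric matrix.
    toy = [8, 7, 6, 5, 4, 3, 2, 1]
    blocks = balanced_row_blocks(toy, 4)
    print(blocks)
    print(block_nnz(toy, blocks))
```

&lt;P&gt;Because the cut is made at the first row that reaches each target, a block can overshoot its share when single rows are heavy, so the balance is only approximate for small or very irregular matrices.&lt;/P&gt;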
</description>
      <pubDate>Tue, 26 Jan 2021 11:44:13 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Cluster-Sparse-Solver-with-Distributed-Matrix-reordering-memory/m-p/1250006#M30762</guid>
      <dc:creator>Milosz</dc:creator>
      <dc:date>2021-01-26T11:44:13Z</dc:date>
    </item>
    <item>
      <title>Re: Cluster Sparse Solver with Distributed Matrix - reordering/memory problem</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Cluster-Sparse-Solver-with-Distributed-Matrix-reordering-memory/m-p/1250301#M30770</link>
      <description>&lt;P&gt;&lt;SPAN&gt;Milosz, &lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;We took the inputs you shared with us and will check the behavior. Which version of MKL do you use?&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;-Gennady&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Wed, 27 Jan 2021 07:05:25 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Cluster-Sparse-Solver-with-Distributed-Matrix-reordering-memory/m-p/1250301#M30770</guid>
      <dc:creator>Gennady_F_Intel</dc:creator>
      <dc:date>2021-01-27T07:05:25Z</dc:date>
    </item>
    <item>
      <title>Re: Cluster Sparse Solver with Distributed Matrix - reordering/memory problem</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Cluster-Sparse-Solver-with-Distributed-Matrix-reordering-memory/m-p/1250312#M30771</link>
      <description>&lt;P&gt;I use MKL 2020 u4.&lt;/P&gt;</description>
      <pubDate>Wed, 27 Jan 2021 07:45:59 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Cluster-Sparse-Solver-with-Distributed-Matrix-reordering-memory/m-p/1250312#M30771</guid>
      <dc:creator>Milosz</dc:creator>
      <dc:date>2021-01-27T07:45:59Z</dc:date>
    </item>
    <item>
      <title>Re: Cluster Sparse Solver with Distributed Matrix - reordering/memory problem</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Cluster-Sparse-Solver-with-Distributed-Matrix-reordering-memory/m-p/1250718#M30774</link>
      <description>&lt;P&gt;Hi Milosz,&lt;/P&gt;
&lt;P&gt;We've received your data. I've quickly checked and can confirm your findings 1 and 3, and partially 2 (I saw the imbalance but haven't carefully estimated it).&lt;/P&gt;
&lt;P&gt;Best,&lt;BR /&gt;Kirill&lt;/P&gt;</description>
      <pubDate>Thu, 28 Jan 2021 05:29:18 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Cluster-Sparse-Solver-with-Distributed-Matrix-reordering-memory/m-p/1250718#M30774</guid>
      <dc:creator>Kirill_V_Intel</dc:creator>
      <dc:date>2021-01-28T05:29:18Z</dc:date>
    </item>
    <item>
      <title>Re: Cluster Sparse Solver with Distributed Matrix - reordering/memory problem</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Cluster-Sparse-Solver-with-Distributed-Matrix-reordering-memory/m-p/1295744#M31662</link>
      <description>&lt;P&gt;Hi Milosz,&lt;/P&gt;
&lt;P&gt;MKL PARDISO was incorrectly calculating and reporting the "Time spent in allocation of internal data structures (malloc)". Almost all of that time was in fact spent in matching and scaling; a new output line, "Time spent in matching/scaling", has been added to reflect that. The new output displayed to the user will be:&lt;/P&gt;
&lt;LI-CODE lang="none"&gt;Summary: ( reordering phase )
================

Times:
======
Time spent in calculations of symmetric matrix portrait (fulladj): 11.541123 s
Time spent in reordering of the initial matrix (reorder)         : 0.004566 s
Time spent in symbolic factorization (symbfct)                   : 10.948064 s
Time spent in data preparations for factorization (parlist)      : 0.043690 s
Time spent in allocation of internal data structures (malloc)    : 0.069911 s  &amp;lt;==== Notice the updated time
Time spent in matching/scaling                                   : 345.403161 s &amp;lt;==== New output info added
Time spent in additional calculations                            : 40.577674 s
Total time spent                                                 : 408.588189 s
&lt;/LI-CODE&gt;
&lt;P&gt;The fix for this issue is available in the official release of MKL 2021.3.&lt;/P&gt;
&lt;P&gt;Thanks,&lt;BR /&gt;Gennady&lt;/P&gt;</description>
      <pubDate>Sun, 04 Jul 2021 06:08:32 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Cluster-Sparse-Solver-with-Distributed-Matrix-reordering-memory/m-p/1295744#M31662</guid>
      <dc:creator>Gennady_F_Intel</dc:creator>
      <dc:date>2021-07-04T06:08:32Z</dc:date>
    </item>
    <item>
      <title>Re: Cluster Sparse Solver with Distributed Matrix - reordering/memory problem</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Cluster-Sparse-Solver-with-Distributed-Matrix-reordering-memory/m-p/1297156#M31717</link>
      <description>&lt;P&gt;This issue is being closed and we will no longer respond to this thread. If you require additional assistance from Intel, please start a new thread. Any further interaction in this thread will be considered community only.&lt;/P&gt;</description>
      <pubDate>Fri, 09 Jul 2021 02:53:14 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Cluster-Sparse-Solver-with-Distributed-Matrix-reordering-memory/m-p/1297156#M31717</guid>
      <dc:creator>Gennady_F_Intel</dc:creator>
      <dc:date>2021-07-09T02:53:14Z</dc:date>
    </item>
  </channel>
</rss>

