<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: No scaling for mkl_dcsrsymv in Intel® oneAPI Math Kernel Library</title>
    <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/No-scaling-for-mkl-dcsrsymv/m-p/931632#M13790</link>
    <description>&lt;DIV&gt;&lt;/DIV&gt;&lt;P&gt;Hello,&lt;/P&gt;&lt;P&gt;I just learned from Intel Premier Support that the routine mkl_dcsrsymv has not yet been parallelized, so there is no need to look further.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Bernhard&lt;/P&gt;</description>
    <pubDate>Sun, 30 Apr 2006 22:02:39 GMT</pubDate>
    <dc:creator>admin4</dc:creator>
    <dc:date>2006-04-30T22:02:39Z</dc:date>
    <item>
      <title>No scaling for mkl_dcsrsymv</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/No-scaling-for-mkl-dcsrsymv/m-p/931628#M13786</link>
      <description>&lt;DIV&gt;Hello,&lt;BR /&gt;I am wondering whether the function mkl_dcsrsymv actually benefits from threading. Here is a snippet of code that multiplies a sparse symmetric matrix by a dense vector:&lt;BR /&gt;void spblas_multSymm(const int n, const int *const ptr, const int *const ind, const double *const val,&lt;BR /&gt;const double *const x, double *const y)&lt;BR /&gt;{&lt;BR /&gt;const char U = 'U';&lt;BR /&gt;double t0, t1, t_single, t_dual;&lt;BR /&gt;int iii;&lt;BR /&gt;omp_set_num_threads(1);&lt;BR /&gt;t0 = omp_get_wtime();&lt;BR /&gt;for(iii=0;iii&amp;lt;1000;iii++)&lt;BR /&gt;mkl_dcsrsymv(&amp;amp;U, &amp;amp;n, val, ptr, ind, x, y);&lt;BR /&gt;t1 = omp_get_wtime();&lt;BR /&gt;t_single = t1-t0;&lt;BR /&gt;omp_set_num_threads(2);&lt;BR /&gt;t0 = omp_get_wtime();&lt;BR /&gt;for(iii=0;iii&amp;lt;1000;iii++)&lt;BR /&gt;mkl_dcsrsymv(&amp;amp;U, &amp;amp;n, val, ptr, ind, x, y);&lt;BR /&gt;t1 = omp_get_wtime();&lt;BR /&gt;t_dual = t1-t0;&lt;BR /&gt;printf("Time for 1 Thread: %f\n",t_single);&lt;BR /&gt;printf("Time for 2 Threads: %f \t ratio %f\n", t_dual,t_dual/t_single);&lt;BR /&gt;}&lt;BR /&gt;I used the for-loops so that more of the total computation time is spent in the actual call to mkl_dcsrsymv.&lt;BR /&gt;For input data I used an LES of dimension 120 (i.e. a 120x120 matrix with only about 5% nonzero entries). I obtained no speed-up at all, i.e. the ratio was close to 1. Further, according to the Task Manager, the CPU usage was only around 50 percent.&lt;BR /&gt;In a similar setting, I tested the dgemm routine, which resulted in nearly ideal speed-ups with 100% CPU usage. I also obtained good scaling for the PARDISO solver. Finally, I put a call to cblas_dgemm right in front of the calls to mkl_dcsrsymv and obtained nearly the optimal speed-up for dgemm. This leads me to assume that the problem really lies within the mkl_dcsrsymv function.&lt;BR /&gt;Do you have any idea why mkl_dcsrsymv doesn't scale?&lt;BR /&gt;My environment is:&lt;BR /&gt;MKL 8.1&lt;BR /&gt;MS Visual Studio 2003 with Intel Compiler 9&lt;BR /&gt;Windows XP Pro SP2&lt;BR /&gt;Athlon64x2 4400+&lt;BR /&gt;&lt;BR /&gt;Thank you very much for your comments,&lt;BR /&gt;Bernhard&lt;/DIV&gt;</description>
      <pubDate>Wed, 26 Apr 2006 21:22:54 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/No-scaling-for-mkl-dcsrsymv/m-p/931628#M13786</guid>
      <dc:creator>admin4</dc:creator>
      <dc:date>2006-04-26T21:22:54Z</dc:date>
    </item>
    <item>
      <title>Re: No scaling for mkl_dcsrsymv</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/No-scaling-for-mkl-dcsrsymv/m-p/931629#M13787</link>
      <description>Have you tried your experiment with much larger matrices?&lt;BR /&gt;&lt;BR /&gt;120 x 120 is really very small. I would be very surprised to see any speed-up at that size and with so few nonzero elements. For example, I am using the level 3 sparse routines with matrices of size 300,000 x 300,000. I believe I see &amp;gt;50% on my dual CPU machine.</description>
      <pubDate>Thu, 27 Apr 2006 22:44:11 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/No-scaling-for-mkl-dcsrsymv/m-p/931629#M13787</guid>
      <dc:creator>AndrewC</dc:creator>
      <dc:date>2006-04-27T22:44:11Z</dc:date>
    </item>
    <item>
      <title>Re: No scaling for mkl_dcsrsymv</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/No-scaling-for-mkl-dcsrsymv/m-p/931630#M13788</link>
      <description>&lt;DIV&gt;&lt;/DIV&gt;&lt;P&gt;Thanks for your answer,&lt;/P&gt;&lt;P&gt;you're right - this problem size is rather small. Actually, I mixed things up a bit: I have a grid consisting of 40x40 nodes with 3 DOFs each. Hence there are 4800 unknowns, which makes the LES's matrix 4800x4800. This is still much smaller than the problem you mentioned, but I think I should get some speed-up there. Maybe the level 2 Sparse BLAS routines aren't threaded at all?&lt;/P&gt;</description>
      <pubDate>Fri, 28 Apr 2006 14:39:00 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/No-scaling-for-mkl-dcsrsymv/m-p/931630#M13788</guid>
      <dc:creator>admin4</dc:creator>
      <dc:date>2006-04-28T14:39:00Z</dc:date>
    </item>
    <item>
      <title>Re: No scaling for mkl_dcsrsymv</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/No-scaling-for-mkl-dcsrsymv/m-p/931631#M13789</link>
      <description>I suppose what is critical is how many non-zero elements are present. The routine may not have enough NNZ to benefit from threading. Can you boost the problem size up a lot?&lt;BR /&gt;&lt;BR /&gt;I am using this very same routine in eigenvalue extraction (with ARPACK) for an acoustic solid elements vibration problem.&lt;BR /&gt;&lt;BR /&gt;Have a look at CSparse &lt;A href="http://www.cise.ufl.edu/research/sparse/CSparse/" target="_blank"&gt;http://www.cise.ufl.edu/research/sparse/CSparse/&lt;/A&gt; (a package in 'C' by Tim Davis that I thoroughly recommend as a useful toolkit for sparse matrices - it works well with MKL [apart from having 0-based indexing]) and see how it does a sparse matrix/dense vector multiply.</description>
      <pubDate>Fri, 28 Apr 2006 21:24:01 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/No-scaling-for-mkl-dcsrsymv/m-p/931631#M13789</guid>
      <dc:creator>AndrewC</dc:creator>
      <dc:date>2006-04-28T21:24:01Z</dc:date>
    </item>
    <item>
      <title>Re: No scaling for mkl_dcsrsymv</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/No-scaling-for-mkl-dcsrsymv/m-p/931632#M13790</link>
      <description>&lt;DIV&gt;&lt;/DIV&gt;&lt;P&gt;Hello,&lt;/P&gt;&lt;P&gt;I just learned from Intel Premier Support that the routine mkl_dcsrsymv has not yet been parallelized, so there is no need to look further.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Bernhard&lt;/P&gt;</description>
      <pubDate>Sun, 30 Apr 2006 22:02:39 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/No-scaling-for-mkl-dcsrsymv/m-p/931632#M13790</guid>
      <dc:creator>admin4</dc:creator>
      <dc:date>2006-04-30T22:02:39Z</dc:date>
    </item>
  </channel>
</rss>

