<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic mkl_dcsrsymv crashes in parallel mode in Intel® oneAPI Math Kernel Library</title>
    <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/mkl-dcsrsymv-crashes-in-parallel-mode/m-p/791041#M2219</link>
    <description>When I call the mkl_dcsrsymv function from C++ with MKL in parallel mode, my program reports a memory error and crashes. However, if I use MKL in sequential mode, the program works fine.&lt;BR /&gt;&lt;BR /&gt;How can I work around this problem? I really need to run this function in parallel mode.&lt;BR /&gt;&lt;BR /&gt;Thank you very much in advance.&lt;BR /&gt;</description>
    <pubDate>Sat, 26 Jun 2010 22:52:35 GMT</pubDate>
    <dc:creator>Igor_Tsukanov</dc:creator>
    <dc:date>2010-06-26T22:52:35Z</dc:date>
    <item>
      <title>mkl_dcsrsymv crashes in parallel mode</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/mkl-dcsrsymv-crashes-in-parallel-mode/m-p/791041#M2219</link>
      <description>When I call the mkl_dcsrsymv function from C++ with MKL in parallel mode, my program reports a memory error and crashes. However, if I use MKL in sequential mode, the program works fine.&lt;BR /&gt;&lt;BR /&gt;How can I work around this problem? I really need to run this function in parallel mode.&lt;BR /&gt;&lt;BR /&gt;Thank you very much in advance.&lt;BR /&gt;</description>
      <pubDate>Sat, 26 Jun 2010 22:52:35 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/mkl-dcsrsymv-crashes-in-parallel-mode/m-p/791041#M2219</guid>
      <dc:creator>Igor_Tsukanov</dc:creator>
      <dc:date>2010-06-26T22:52:35Z</dc:date>
    </item>
    <item>
      <title>mkl_dcsrsymv crashes in parallel mode</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/mkl-dcsrsymv-crashes-in-parallel-mode/m-p/791042#M2220</link>
      <description>&lt;P&gt;Can you give us an example? This is unexpected behavior. Actually, MKL is &lt;I&gt;thread-safe&lt;/I&gt;, which means that all Intel MKL functions work correctly during simultaneous execution by multiple threads.&lt;/P&gt;&lt;P&gt;--Gennady&lt;/P&gt;</description>
      <pubDate>Sun, 27 Jun 2010 16:43:53 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/mkl-dcsrsymv-crashes-in-parallel-mode/m-p/791042#M2220</guid>
      <dc:creator>Gennady_F_Intel</dc:creator>
      <dc:date>2010-06-27T16:43:53Z</dc:date>
    </item>
    <item>
      <title>mkl_dcsrsymv crashes in parallel mode</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/mkl-dcsrsymv-crashes-in-parallel-mode/m-p/791043#M2221</link>
      <description>Gennady,&lt;BR /&gt;thank you for your response. Below is a modified conjugate gradient solver originally taken from Numerical Recipes.&lt;BR /&gt;The nonzero values of the sparse matrix are stored in the &lt;I&gt;sa[]&lt;/I&gt; array, the &lt;I&gt;ija[]&lt;/I&gt; array stores the column indices of those elements,&lt;BR /&gt;and the &lt;I&gt;p_row[]&lt;/I&gt; array contains the index of the first non-zero element in each row. These are one-based arrays&lt;BR /&gt;(an artifact of Numerical Recipes); the elements at index zero are not used.&lt;BR /&gt;&lt;BR /&gt;This function produces correct results if MKL is used in sequential mode. If I switch to parallel mode,&lt;BR /&gt;the first call of mkl_dcsrsymv causes a memory access violation.&lt;BR /&gt;&lt;BR /&gt;#define USE_INTEL_MKL // use Intel MKL&lt;BR /&gt;#define EPS 1.0e-14&lt;BR /&gt;void matrixclass::linbcg_symmetric(MatrixIndexType n, double b[], double x[], int itol, double tol,&lt;BR /&gt; MatrixIndexType itmax, MatrixIndexType *iter, double *err)&lt;BR /&gt;{&lt;BR /&gt;// FAST VECTORIZED VERSION (tuned for this package) -- does not use A^T&lt;BR /&gt;&lt;BR /&gt; double snrm(MatrixIndexType n, double sx[], int itol);&lt;BR /&gt; double ak,akden,akden1,akden2,akden3,akden4,bk,bkden,bknum,bknum1,bknum2,bknum3,bknum4,bnrm;&lt;BR /&gt; double *p,*r,*z;&lt;BR /&gt; MatrixIndexType j,m,m1;&lt;BR /&gt; MatrixIndexType i;&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;#ifdef USE_INTEL_MKL&lt;BR /&gt; #ifdef FM_SYMMETRIC_STORAGE_SCHEME_LOW_TRIANGLE&lt;BR /&gt;  char uplo = 'L';&lt;BR /&gt; #else&lt;BR /&gt;  char uplo = 'U';&lt;BR /&gt; #endif&lt;BR /&gt;#endif&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;#ifdef USE_INTEL_MKL&lt;BR /&gt; p = (double *)mkl_malloc( n * sizeof(double), MKL_ALLIGNMENT); p--;&lt;BR /&gt; r = (double *)mkl_malloc( n * sizeof(double), MKL_ALLIGNMENT); r--;&lt;BR /&gt; z = (double *)mkl_malloc( n * sizeof(double), MKL_ALLIGNMENT); z--;&lt;BR /&gt;#else&lt;BR /&gt; 
p=dvector(1,n);&lt;BR /&gt; r=dvector(1,n);&lt;BR /&gt; z=dvector(1,n);&lt;BR /&gt;#endif&lt;BR /&gt;&lt;BR /&gt;#ifdef FM_SYMMETRIC_STORAGE_SCHEME_LOW_TRIANGLE&lt;BR /&gt; atimes_symmetric(n,x,r,0);&lt;BR /&gt;#else&lt;BR /&gt; atimes_symmetric(n,x,r,1);&lt;BR /&gt;#endif&lt;BR /&gt;&lt;BR /&gt; m = n % 4;&lt;BR /&gt; m1 = m+1;&lt;BR /&gt; for ( j=1; j&amp;lt;=m; j++) &lt;BR /&gt; {&lt;BR /&gt;  r[j]=b[j]-r[j];&lt;BR /&gt; }&lt;BR /&gt;&lt;BR /&gt; for ( j=m1; j&amp;lt;=n; j+=4) &lt;BR /&gt; {&lt;BR /&gt;  r[j] = b[j] -r[j];&lt;BR /&gt;  r[j+1] = b[j+1]-r[j+1];&lt;BR /&gt;  r[j+2] = b[j+2]-r[j+2];&lt;BR /&gt;  r[j+3] = b[j+3]-r[j+3];&lt;BR /&gt; }&lt;BR /&gt;&lt;BR /&gt;// atimes(n,r,rr,0); // minimal residual algorithm (check compatibility with symmetric matrix)&lt;BR /&gt;&lt;BR /&gt; bnrm=snrm(n,b,itol); &lt;BR /&gt;// For homogeneous systems *****&lt;BR /&gt; if( fabs(bnrm) &amp;lt; EPS )&lt;BR /&gt; {&lt;BR /&gt;  bnrm = 1.0;&lt;BR /&gt; }&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;//******************************&lt;BR /&gt; asolve_new_symmetric(n,r,p);&lt;BR /&gt;//////////////////////////&lt;BR /&gt;&lt;BR /&gt;#ifdef USE_INTEL_MKL&lt;BR /&gt; int incr = 1;&lt;BR /&gt; bknum = ddot((const int *)&amp;amp;n, &amp;amp;p[1], &amp;amp;incr, &amp;amp;r[1], &amp;amp;incr);&lt;BR /&gt;#else&lt;BR /&gt; bknum = 0.0;&lt;BR /&gt; bknum1 = 0.0;&lt;BR /&gt; bknum2 = 0.0;&lt;BR /&gt; bknum3 = 0.0;&lt;BR /&gt; bknum4 = 0.0;&lt;BR /&gt; for ( j=1; j&amp;lt;=m; j++)&lt;BR /&gt; {&lt;BR /&gt;  bknum += p[j]*r[j];  // p instead of z&lt;BR /&gt; }&lt;BR /&gt;&lt;BR /&gt; for ( j=m1; j&amp;lt;=n; j+=4)&lt;BR /&gt; {&lt;BR /&gt;  bknum1 += p[j] *r[j];   // p instead of z&lt;BR /&gt;  bknum2 += p[j+1]*r[j+1];  // p instead of z&lt;BR /&gt;  bknum3 += p[j+2]*r[j+2];  // p instead of z&lt;BR /&gt;  bknum4 += p[j+3]*r[j+3];  // p instead of z&lt;BR /&gt; }&lt;BR /&gt; bknum += bknum1+bknum2+bknum3+bknum4;&lt;BR /&gt;#endif&lt;BR 
/&gt;&lt;BR /&gt; bkden = bknum;&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;#ifdef USE_INTEL_MKL&lt;BR /&gt;&lt;B&gt;// ----------------------- Problem is here ------------------------------------------------------&amp;gt;&lt;BR /&gt; mkl_dcsrsymv((char *)&amp;amp;uplo, (int *)&amp;amp;n, &amp;amp;sa[1], (int *)&amp;amp;p_row[1], (int *)&amp;amp;ija[1], &amp;amp;p[1], &amp;amp;z[1]);&lt;BR /&gt;// &amp;lt;---------------------------------------------------------------------------------------------------&lt;/B&gt;&lt;BR /&gt; akden = ddot((const int *)&amp;amp;n, &amp;amp;z[1], &amp;amp;incr, &amp;amp;p[1], &amp;amp;incr);&lt;BR /&gt; ak=bknum/akden;&lt;BR /&gt;&lt;BR /&gt; daxpy((const int *)&amp;amp;n, &amp;amp;ak, &amp;amp;p[1], &amp;amp;incr, &amp;amp;x[1], &amp;amp;incr);&lt;BR /&gt; ak *= -1.0;&lt;BR /&gt; daxpy((const int *)&amp;amp;n, &amp;amp;ak, &amp;amp;z[1], &amp;amp;incr, &amp;amp;r[1], &amp;amp;incr);&lt;BR /&gt; ak *= -1.0;&lt;BR /&gt;#else&lt;BR /&gt; atimes_new_symmetric(n, p, z); // z and zz are computed. 
the rest of the code is unchanged&lt;BR /&gt; akden = 0.0;&lt;BR /&gt; akden1 = 0.0;&lt;BR /&gt; akden2 = 0.0;&lt;BR /&gt; akden3 = 0.0;&lt;BR /&gt; akden4 = 0.0;&lt;BR /&gt; for (j=1; j&amp;lt;=m; j++)&lt;BR /&gt; {&lt;BR /&gt;  akden += z[j]*p[j];&lt;BR /&gt; }&lt;BR /&gt;&lt;BR /&gt; for (j=m1; j&amp;lt;=n; j+=4)&lt;BR /&gt; {&lt;BR /&gt;  akden1 += z[j]*p[j];&lt;BR /&gt;  akden2 += z[j+1]*p[j+1];&lt;BR /&gt;  akden3 += z[j+2]*p[j+2];&lt;BR /&gt;  akden4 += z[j+3]*p[j+3];&lt;BR /&gt; }&lt;BR /&gt; akden += akden1 + akden2 + akden3 + akden4;&lt;BR /&gt;&lt;BR /&gt; ak=bknum/akden;&lt;BR /&gt;&lt;BR /&gt; for (j=1;j&amp;lt;=m;j++) &lt;BR /&gt; {&lt;BR /&gt;  x[j] += ak*p[j];&lt;BR /&gt;  r[j] -= ak*z[j];&lt;BR /&gt; }&lt;BR /&gt;&lt;BR /&gt; for (j=m1;j&amp;lt;=n;j+=4) &lt;BR /&gt; {&lt;BR /&gt;  x[j] += ak*p [j];&lt;BR /&gt;  r[j] -= ak*z [j];&lt;BR /&gt;&lt;BR /&gt;  x[j+1] += ak*p [j+1];&lt;BR /&gt;  r[j+1] -= ak*z [j+1];&lt;BR /&gt;&lt;BR /&gt;  x[j+2] += ak*p [j+2];&lt;BR /&gt;  r[j+2] -= ak*z [j+2];&lt;BR /&gt;&lt;BR /&gt;  x[j+3] += ak*p [j+3];&lt;BR /&gt;  r[j+3] -= ak*z [j+3];&lt;BR /&gt; }&lt;BR /&gt;#endif&lt;BR /&gt;&lt;BR /&gt; asolve_error_symmetric(n,r,z,err); // 12/12/2005&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;//////////////////////////&lt;BR /&gt; for( i=2; i &amp;lt;= itmax; i++) &lt;BR /&gt; {&lt;BR /&gt;  *iter = i;&lt;BR /&gt;  *err /= bnrm;&lt;BR /&gt;  if( (m_ProgressReportCallBack1 != NULL) &amp;amp;&amp;amp; (i % m_report_frequency == 0) )&lt;BR /&gt;  {&lt;BR /&gt;   m_ProgressReportCallBack1( i, *err );&lt;BR /&gt;  }&lt;BR /&gt;  else if( (m_ProgressReportCallBack2 != NULL) &amp;amp;&amp;amp; (i % m_report_frequency == 0) )&lt;BR /&gt;  {&lt;BR /&gt;   m_ProgressReportCallBack2( i, *err, physical_error_estimator(x, r, b, n) );&lt;BR /&gt;  }&lt;BR /&gt;  if (*err &amp;lt;= tol) break;&lt;BR /&gt;&lt;BR /&gt;#ifdef USE_INTEL_MKL&lt;BR /&gt;  int incr = 1;&lt;BR /&gt;  
bknum = ddot((const int *)&amp;amp;n, &amp;amp;z[1], &amp;amp;incr, &amp;amp;r[1], &amp;amp;incr);&lt;BR /&gt;#else&lt;BR /&gt;  bknum = 0.0;&lt;BR /&gt;  bknum1 = 0.0;&lt;BR /&gt;  bknum2 = 0.0;&lt;BR /&gt;  bknum3 = 0.0;&lt;BR /&gt;  bknum4 = 0.0;&lt;BR /&gt;  for ( j=1; j&amp;lt;=m; j++)&lt;BR /&gt;  {&lt;BR /&gt;   bknum += z[j]*r[j];&lt;BR /&gt;  }&lt;BR /&gt;&lt;BR /&gt;  for ( j=m1; j&amp;lt;=n; j+=4)&lt;BR /&gt;  {&lt;BR /&gt;   bknum1 += z[j]*r[j];&lt;BR /&gt;   bknum2 += z[j+1]*r[j+1];&lt;BR /&gt;   bknum3 += z[j+2]*r[j+2];&lt;BR /&gt;   bknum4 += z[j+3]*r[j+3];&lt;BR /&gt;  }&lt;BR /&gt;  bknum += bknum1+bknum2+bknum3+bknum4;&lt;BR /&gt;#endif&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;  bk=bknum/bkden;&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;  for (j=1;j&amp;lt;=m;j++) &lt;BR /&gt;  {&lt;BR /&gt;   p[j]=bk*p[j]+z[j];&lt;BR /&gt;  }&lt;BR /&gt;&lt;BR /&gt;  for (j=m1;j&amp;lt;=n;j+=4) &lt;BR /&gt;  {&lt;BR /&gt;   p[j] =bk*p [j] +z [j];&lt;BR /&gt;   p[j+1] =bk*p [j+1] +z [j+1];&lt;BR /&gt;   p[j+2] =bk*p [j+2] +z [j+2];&lt;BR /&gt;   p[j+3] =bk*p [j+3] +z [j+3];&lt;BR /&gt;  }&lt;BR /&gt;&lt;BR /&gt;  bkden=bknum;&lt;BR /&gt;&lt;BR /&gt;#ifdef USE_INTEL_MKL&lt;BR /&gt;  mkl_dcsrsymv((char *)&amp;amp;uplo, (int *)&amp;amp;n, &amp;amp;sa[1], (int *)&amp;amp;p_row[1], (int *)&amp;amp;ija[1], &amp;amp;p[1], &amp;amp;z[1]);&lt;BR /&gt;  akden = ddot((const int *)&amp;amp;n, &amp;amp;z[1], &amp;amp;incr, &amp;amp;p[1], &amp;amp;incr);&lt;BR /&gt;  ak=bknum/akden;&lt;BR /&gt;  daxpy((const int *)&amp;amp;n, &amp;amp;ak, &amp;amp;p[1], &amp;amp;incr, &amp;amp;x[1], &amp;amp;incr);&lt;BR /&gt;  ak *= -1.0;&lt;BR /&gt;  daxpy((const int *)&amp;amp;n, &amp;amp;ak, &amp;amp;z[1], &amp;amp;incr, &amp;amp;r[1], &amp;amp;incr);&lt;BR /&gt;  ak *= -1.0;&lt;BR /&gt;#else&lt;BR /&gt;  atimes_new_symmetric(n, p, z);&lt;BR /&gt;&lt;BR /&gt;  akden = 0.0;&lt;BR /&gt;  akden1 = 0.0;&lt;BR /&gt;  akden2 = 0.0;&lt;BR 
/&gt;  akden3 = 0.0;&lt;BR /&gt;  akden4 = 0.0;&lt;BR /&gt;  for (j=1; j&amp;lt;=m; j++)&lt;BR /&gt;  {&lt;BR /&gt;   akden += z[j]*p[j];&lt;BR /&gt;  }&lt;BR /&gt;&lt;BR /&gt;  for (j=m1; j&amp;lt;=n; j+=4)&lt;BR /&gt;  {&lt;BR /&gt;   akden1 += z[j] *p[j];&lt;BR /&gt;   akden2 += z[j+1]*p[j+1];&lt;BR /&gt;   akden3 += z[j+2]*p[j+2];&lt;BR /&gt;   akden4 += z[j+3]*p[j+3];&lt;BR /&gt;  }&lt;BR /&gt;  akden += akden1 + akden2 + akden3 + akden4;&lt;BR /&gt;  ak=bknum/akden;&lt;BR /&gt;  for (j=1;j&amp;lt;=m;j++) &lt;BR /&gt;  {&lt;BR /&gt;   x[j] += ak*p[j];&lt;BR /&gt;   r[j] -= ak*z[j];&lt;BR /&gt;  }&lt;BR /&gt;&lt;BR /&gt;  for (j=m1;j&amp;lt;=n;j+=4) &lt;BR /&gt;  {&lt;BR /&gt;   x[j] += ak*p[j];&lt;BR /&gt;   r[j] -= ak*z[j];&lt;BR /&gt;&lt;BR /&gt;   x[j+1] += ak*p[j+1];&lt;BR /&gt;   r[j+1] -= ak*z[j+1];&lt;BR /&gt;&lt;BR /&gt;   x[j+2] += ak*p[j+2];&lt;BR /&gt;   r[j+2] -= ak*z[j+2];&lt;BR /&gt;&lt;BR /&gt;   x[j+3] += ak*p[j+3];&lt;BR /&gt;   r[j+3] -= ak*z[j+3];&lt;BR /&gt;  }&lt;BR /&gt;#endif&lt;BR /&gt;&lt;BR /&gt;  asolve_error_symmetric(n,r,z,err); // 12/12/2005&lt;BR /&gt; }&lt;BR /&gt;&lt;BR /&gt; *iter -= 1;&lt;BR /&gt;&lt;BR /&gt; m_solution_error_Siemens = physical_error_estimator(x, r, b, n);&lt;BR /&gt;&lt;BR /&gt; if (*err &amp;lt;= tol)&lt;BR /&gt; {&lt;BR /&gt;  if( m_status_report_flag )&lt;BR /&gt;  {&lt;BR /&gt;#ifdef __FIELDMAGIG_64BIT_MATRIX_INDEX&lt;BR /&gt;   printf("%I64d iterations were performed. Accuracy of solution is %17.10le\n",*iter,*err);&lt;BR /&gt;#else&lt;BR /&gt;   printf("%d iterations were performed. 
Accuracy of solution is %17.10le\n",*iter,*err);&lt;BR /&gt;#endif&lt;BR /&gt;   printf("Accuracy of solution (energy criterion) is %17.10le\n",m_solution_error_Siemens);&lt;BR /&gt;  }&lt;BR /&gt; }&lt;BR /&gt; else&lt;BR /&gt; {&lt;BR /&gt;  if( m_status_report_flag )&lt;BR /&gt;  {&lt;BR /&gt;#ifdef __FIELDMAGIG_64BIT_MATRIX_INDEX&lt;BR /&gt;   printf("Slow convergence: maximum number (%I64d) of iterations was performed.\nAccuracy of solution is %17.10le\n",*iter,*err);&lt;BR /&gt;#else&lt;BR /&gt;   printf("Slow convergence: maximum number (%d) of iterations was performed.\nAccuracy of solution is %17.10le\n",*iter,*err);&lt;BR /&gt;#endif&lt;BR /&gt;   printf("Accuracy of solution (energy criterion) is %17.10le\n",m_solution_error_Siemens);&lt;BR /&gt;  }&lt;BR /&gt; }&lt;BR /&gt;&lt;BR /&gt;#ifdef USE_INTEL_MKL&lt;BR /&gt; p++; r++; z++;&lt;BR /&gt; mkl_free( p );&lt;BR /&gt; mkl_free( r );&lt;BR /&gt; mkl_free( z );&lt;BR /&gt; mkl_free_buffers();&lt;BR /&gt;#else&lt;BR /&gt; free_dvector(p,1,n);&lt;BR /&gt; free_dvector(r,1,n);&lt;BR /&gt; free_dvector(z,1,n);&lt;BR /&gt;#endif&lt;BR /&gt;}&lt;BR /&gt;#undef EPS&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;B&gt;And this is the function that I would like to replace with the MKL function mkl_dcsrsymv:&lt;/B&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;void matrixclass::atimes_new_symmetric(MatrixIndexType n, double x[], double b[])&lt;BR /&gt;{&lt;BR /&gt;// performs multiplication A X &lt;BR /&gt;// optimized code for symmetric matrices ----&amp;gt;&lt;BR /&gt; MatrixIndexType i, j, k, k1, m, j1, j2, j3, j4;&lt;BR /&gt;&lt;BR /&gt; double xi1, s1b, s2b, s3b, s4b;&lt;BR /&gt;&lt;BR /&gt; m = n % 4;&lt;BR /&gt;&lt;BR /&gt; for( i=1; i&amp;lt;=n; i++ ) &lt;BR /&gt; {&lt;BR /&gt;  b[i] = 0.0;&lt;BR /&gt; }&lt;BR /&gt;&lt;BR /&gt;#ifdef FM_SYMMETRIC_STORAGE_SCHEME_LOW_TRIANGLE&lt;BR /&gt; for( i=1; i&amp;lt;=n; i++) &lt;BR /&gt; {&lt;BR /&gt;  k1 = p_row[i+1]-2; 
// exclude diagonal element&lt;BR /&gt;  m = p_row[i] + ((k1 - p_row[i] + 1) % 4)-1;&lt;BR /&gt;&lt;BR /&gt;  s1b = 0.0;&lt;BR /&gt;  s2b = 0.0;&lt;BR /&gt;  s3b = 0.0;&lt;BR /&gt;  s4b = 0.0;&lt;BR /&gt;  xi1 = x[i];&lt;BR /&gt;&lt;BR /&gt;  for( k = p_row[i]; k &amp;lt;= m; k++ )&lt;BR /&gt;  {&lt;BR /&gt;   j=ija[k];&lt;BR /&gt;   s1b += sa[k] * x[j]; // A&lt;BR /&gt;   b[j] += sa[k] * xi1;&lt;BR /&gt;  }&lt;BR /&gt;&lt;BR /&gt;  for( k = m+1; k &amp;lt;= k1; k+=4 )&lt;BR /&gt;  {&lt;BR /&gt;   j1=ija[k];&lt;BR /&gt;   j2=ija[k+1];&lt;BR /&gt;   j3=ija[k+2];&lt;BR /&gt;   j4=ija[k+3];&lt;BR /&gt;&lt;BR /&gt;   s1b += sa[k] * x [j1]; // A&lt;BR /&gt;   b [j1] += sa[k] * xi1;&lt;BR /&gt;&lt;BR /&gt;   s2b += sa[k+1] * x [j2]; // A&lt;BR /&gt;   b [j2] += sa[k+1] * xi1;&lt;BR /&gt;&lt;BR /&gt;   s3b += sa[k+2] * x [j3]; // A&lt;BR /&gt;   b [j3] += sa[k+2] * xi1;&lt;BR /&gt;&lt;BR /&gt;   s4b += sa[k+3] * x [j4]; // A&lt;BR /&gt;   b [j4] += sa[k+3] * xi1;&lt;BR /&gt;  }&lt;BR /&gt;  k = k1 + 1;&lt;BR /&gt;  j1=ija[k];&lt;BR /&gt;  s1b += sa[k] * x [j1]; // diagonal element&lt;BR /&gt;  b[i] += s1b+s2b+s3b+s4b;&lt;BR /&gt; }&lt;BR /&gt;#else&lt;BR /&gt; for( i=1; i&amp;lt;=n; i++) &lt;BR /&gt; {&lt;BR /&gt;  k1 = p_row[i+1]-1;&lt;BR /&gt;&lt;BR /&gt;  m = p_row[i] + ((k1 - p_row[i]) % 4);&lt;BR /&gt;&lt;BR /&gt;  xi1 = x[i];&lt;BR /&gt;&lt;BR /&gt;  k = p_row[i];&lt;BR /&gt;  s1b = sa[k] * xi1; // diagonal element&lt;BR /&gt;  s2b = 0.0;&lt;BR /&gt;  s3b = 0.0;&lt;BR /&gt;  s4b = 0.0;&lt;BR /&gt;&lt;BR /&gt;  for( k = p_row[i]+1; k &amp;lt;= m; k++ ) // exclude diagonal element&lt;BR /&gt;  {&lt;BR /&gt;   j=ija[k];&lt;BR /&gt;   s1b += sa[k] * x[j]; // A&lt;BR /&gt;   b[j] += sa[k] * xi1;&lt;BR /&gt;  }&lt;BR /&gt;&lt;BR /&gt;  for( k = m+1; k &amp;lt;= k1; k+=4 )&lt;BR /&gt;  {&lt;BR /&gt;   
j1=ija[k];&lt;BR /&gt;   j2=ija[k+1];&lt;BR /&gt;   j3=ija[k+2];&lt;BR /&gt;   j4=ija[k+3];&lt;BR /&gt;&lt;BR /&gt;   s1b += sa[k] * x [j1]; // A&lt;BR /&gt;   b [j1] += sa[k] * xi1;&lt;BR /&gt;&lt;BR /&gt;   s2b += sa[k+1] * x [j2]; // A&lt;BR /&gt;   b [j2] += sa[k+1] * xi1;&lt;BR /&gt;&lt;BR /&gt;   s3b += sa[k+2] * x [j3]; // A&lt;BR /&gt;   b [j3] += sa[k+2] * xi1;&lt;BR /&gt;&lt;BR /&gt;   s4b += sa[k+3] * x [j4]; // A&lt;BR /&gt;   b [j4] += sa[k+3] * xi1;&lt;BR /&gt;  }&lt;BR /&gt;  b[i] += s1b+s2b+s3b+s4b;&lt;BR /&gt; }&lt;BR /&gt;#endif&lt;BR /&gt;}&lt;BR /&gt;</description>
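<!--
Editorial sketch, not part of the original post. The post above describes a symmetric matrix held as one-based compressed-row arrays (values, column indices, row start indices) with only one triangle stored. The following is a minimal sequential reference product for the upper-triangle case, assuming the diagonal entry comes first in each row as in the poster's non-LOW_TRIANGLE branch. The names sym_csr_upper_matvec and example_product are hypothetical helpers for illustration, use zero-based std::vector storage holding one-based index values, and are not MKL functions.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical reference helper: y = A*x for a symmetric n-by-n matrix whose
// upper triangle is stored row by row. Index values in row_ptr and col are
// one-based; the vectors themselves are ordinary zero-based containers.
std::vector<double> sym_csr_upper_matvec(int n,
                                         const std::vector<double>& val,
                                         const std::vector<int>& row_ptr,
                                         const std::vector<int>& col,
                                         const std::vector<double>& x)
{
    assert(row_ptr.size() == static_cast<std::size_t>(n) + 1);
    std::vector<double> y(n, 0.0);
    for (int i = 0; i < n; ++i) {
        // Row i occupies one-based positions row_ptr[i] .. row_ptr[i+1]-1.
        for (int k = row_ptr[i] - 1; k < row_ptr[i + 1] - 1; ++k) {
            const int j = col[k] - 1;      // zero-based column of this entry
            y[i] += val[k] * x[j];         // stored entry a(i,j)
            if (j != i) {
                y[j] += val[k] * x[i];     // mirrored entry a(j,i)
            }
        }
    }
    return y;
}

// Small worked case:
//     [ 4 1 0 ]
// A = [ 1 3 2 ],  x = (1, 2, 3)
//     [ 0 2 5 ]
std::vector<double> example_product()
{
    const std::vector<double> val = {4.0, 1.0, 3.0, 2.0, 5.0};
    const std::vector<int> col = {1, 2, 2, 3, 3};
    const std::vector<int> row_ptr = {1, 3, 5, 6};
    const std::vector<double> x = {1.0, 2.0, 3.0};
    return sym_csr_upper_matvec(3, val, row_ptr, col, x);
}
```

A product like this can serve as a check: for the same arrays, the sequential and parallel results of the library call would be expected to agree with it up to rounding.
-->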
      <pubDate>Sun, 27 Jun 2010 20:58:21 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/mkl-dcsrsymv-crashes-in-parallel-mode/m-p/791043#M2221</guid>
      <dc:creator>Igor_Tsukanov</dc:creator>
      <dc:date>2010-06-27T20:58:21Z</dc:date>
    </item>
    <item>
      <title>mkl_dcsrsymv crashes in parallel mode</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/mkl-dcsrsymv-crashes-in-parallel-mode/m-p/791044#M2222</link>
      <description>A fix for this was released in Intel MKL 10.2.6. -Todd</description>
      <pubDate>Tue, 26 Oct 2010 20:12:44 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/mkl-dcsrsymv-crashes-in-parallel-mode/m-p/791044#M2222</guid>
      <dc:creator>Todd_R_Intel</dc:creator>
      <dc:date>2010-10-26T20:12:44Z</dc:date>
    </item>
  </channel>
</rss>

