<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Unexpected NaN with clbas_dgemm in Intel® oneAPI Math Kernel Library</title>
    <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Unexpected-NaN-with-clbas-dgemm/m-p/1050849#M21162</link>
    <description>&lt;P&gt;I am calculating a correlation matrix and am getting a weird issue. I have a 15x6 matrix R getting read into a subroutine correlation.c which calculates a 6x6 correlation matrix cor based off of the covariance matrix of R. My code looks like this&lt;/P&gt;

&lt;PRE class="brush:cpp;"&gt;#include &amp;lt;stdio.h&amp;gt;
#include &amp;lt;stdlib.h&amp;gt;
#include &amp;lt;string.h&amp;gt;
#include &amp;lt;math.h&amp;gt;
#include &amp;lt;mkl.h&amp;gt;
#include "params.h"

void show_matrix(REAL *A, int n_dim, int m_dim);

REAL *correlation(REAL *A, int n_dim, int m_dim)
{
   int diag;
   REAL *cvm = (REAL*)_mm_malloc(sizeof(REAL) * n_dim * n_dim, 64);
   REAL *cor = (REAL*)_mm_malloc(sizeof(REAL) * n_dim * n_dim, 64);
   REAL *colavg = (REAL*)_mm_malloc(sizeof(REAL) * n_dim, 64);

   if (cvm == NULL || colavg == NULL)
   {
      printf("\nERROR: Cannot allocate memory for matrices in CORRELATION. Aborting...\n\n");
      exit(EXIT_FAILURE);
   }
   else
   {
      memset(cvm,0.0,sizeof(cvm));
      memset(cor,0.0,sizeof(cor));
      memset(colavg,0.0,sizeof(colavg));
   }

   /* Calculate column averages */
   for (int i = 0; i &amp;lt; n_dim; i++)
   {
      for (int j = 0; j &amp;lt; m_dim; j++)
      {
         colavg&lt;I&gt; += A[j * n_dim + i];
      }
      colavg&lt;I&gt; /= m_dim;
   }

   /* Calculate covariance matrix */
   for (int i = 0; i &amp;lt; m_dim; i++)
   {
      int indx = i * n_dim;
      for (int j = 0; j &amp;lt; n_dim; j++)
      {
         A[indx + j] -= colavg&lt;J&gt;;
      }
   }
   printf("A in\n");
   show_matrix(A, m_dim, n_dim);
   cblas_dgemm(CblasRowMajor, CblasTrans, CblasNoTrans, n_dim, n_dim, m_dim, 1.0, A, n_dim, A, n_dim, 1.0, cvm, n_dim);
   printf("cvm out\n");
   show_matrix(cvm, n_dim, n_dim);
   for (int i = 0; i &amp;lt; n_dim; i++)
   {
      int indx = i * n_dim;
      for (int j = 0; j &amp;lt; n_dim; j++)
      {
         cvm[indx + j] /= (m_dim - 1);
      }
   }
   printf("cvm final\n");
   show_matrix(cvm, n_dim, n_dim);

   /* Calculate correlation matrix */
   diag = n_dim + 1;
   for (int i = 0; i &amp;lt; n_dim; i++)
   {
      int indx = i * n_dim;
      int idiag = i * diag;
      for (int j = 0; j &amp;lt;= i; j++)
      {
         cor[indx + j] = cvm[indx + j] / sqrt(cvm[idiag] * cvm[j * diag]);
      }
   }
   printf("cor\n");
   show_matrix(cor, n_dim, n_dim);

   return cor;&lt;/J&gt;&lt;/I&gt;&lt;/I&gt;&lt;/PRE&gt;

&lt;P&gt;The "A in" debug print gives my expected input of&lt;/P&gt;

&lt;PRE class="brush:bash;"&gt;A in
1.53400  1.53400  -1.53400  -1.53400  0.48900  -0.31900
-0.88700  -0.48900  0.88700  -0.88700  -0.15700  0.67400
-0.48900  0.67400  -0.48900  1.15000  1.53400  -0.48900
0.88700  0.00000  -0.67400  0.31900  -0.00000  -1.53400
1.15000  -0.31900  0.48900  0.67400  0.15700  1.15000
0.15700  -1.53400  -0.88700  -0.67400  -0.31900  0.15700
-1.15000  -0.67400  -0.15700  0.15700  -1.53400  -0.15700
0.00000  -0.88700  0.15700  -0.31900  -0.67400  0.88700
0.31900  -0.15700  0.67400  0.88700  0.67400  1.53400
-0.31900  0.15700  -0.31900  -1.15000  1.15000  -0.88700
-1.53400  0.88700  1.15000  1.53400  -0.48900  -1.15000
-0.15700  -1.15000  1.53400  -0.15700  -1.15000  -0.67400
0.48900  0.48900  -1.15000  0.48900  -0.88700  -0.00000
0.67400  0.31900  0.31900  0.00000  0.88700  0.31900
-0.67400  1.15000  0.00000  -0.48900  0.31900  0.48900&lt;/PRE&gt;

&lt;P&gt;However the "cvm out" debug print gives something a little bit goofy&lt;/P&gt;

&lt;PRE class="brush:bash;"&gt;cvm out
10.56446  1.02348  -4.93052  -2.46705  2.76175  1.84661
1.02348  10.56446  -3.30580  0.74990  -nan  -2.39809
-4.93052  -3.30580  10.56446  3.56732  -2.08079  2.00943
-2.46705  0.74990  3.56732  10.56446  -0.43575  -0.31520
2.76175  5.11138  -2.08079  -0.43575  10.56446  0.55179
1.84661  -2.39809  2.00943  -0.31520  0.55179  10.56446&lt;/PRE&gt;

&lt;P&gt;Why is that NaN there?! The symmetric value across the diagonal is correct! Very weird!&lt;/P&gt;

&lt;P&gt;I am currently working around it by working only on the lower half, which works due to the matrix being SPD, but I'm very curious about this rogue NaN.&lt;/P&gt;

&lt;P&gt;Any ideas?&lt;/P&gt;</description>
    <pubDate>Tue, 23 Jun 2015 14:23:35 GMT</pubDate>
    <dc:creator>Matt_S_</dc:creator>
    <dc:date>2015-06-23T14:23:35Z</dc:date>
    <item>
      <title>Unexpected NaN with clbas_dgemm</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Unexpected-NaN-with-clbas-dgemm/m-p/1050849#M21162</link>
      <description>&lt;P&gt;I am calculating a correlation matrix and am getting a weird issue. I have a 15x6 matrix R getting read into a subroutine correlation.c which calculates a 6x6 correlation matrix cor based off of the covariance matrix of R. My code looks like this&lt;/P&gt;

&lt;PRE class="brush:cpp;"&gt;#include &amp;lt;stdio.h&amp;gt;
#include &amp;lt;stdlib.h&amp;gt;
#include &amp;lt;string.h&amp;gt;
#include &amp;lt;math.h&amp;gt;
#include &amp;lt;mkl.h&amp;gt;
#include "params.h"

void show_matrix(REAL *A, int n_dim, int m_dim);

REAL *correlation(REAL *A, int n_dim, int m_dim)
{
   int diag;
   REAL *cvm = (REAL*)_mm_malloc(sizeof(REAL) * n_dim * n_dim, 64);
   REAL *cor = (REAL*)_mm_malloc(sizeof(REAL) * n_dim * n_dim, 64);
   REAL *colavg = (REAL*)_mm_malloc(sizeof(REAL) * n_dim, 64);

   if (cvm == NULL || colavg == NULL)
   {
      printf("\nERROR: Cannot allocate memory for matrices in CORRELATION. Aborting...\n\n");
      exit(EXIT_FAILURE);
   }
   else
   {
      memset(cvm,0.0,sizeof(cvm));
      memset(cor,0.0,sizeof(cor));
      memset(colavg,0.0,sizeof(colavg));
   }

   /* Calculate column averages */
   for (int i = 0; i &amp;lt; n_dim; i++)
   {
      for (int j = 0; j &amp;lt; m_dim; j++)
      {
         colavg&lt;I&gt; += A[j * n_dim + i];
      }
      colavg&lt;I&gt; /= m_dim;
   }

   /* Calculate covariance matrix */
   for (int i = 0; i &amp;lt; m_dim; i++)
   {
      int indx = i * n_dim;
      for (int j = 0; j &amp;lt; n_dim; j++)
      {
         A[indx + j] -= colavg&lt;J&gt;;
      }
   }
   printf("A in\n");
   show_matrix(A, m_dim, n_dim);
   cblas_dgemm(CblasRowMajor, CblasTrans, CblasNoTrans, n_dim, n_dim, m_dim, 1.0, A, n_dim, A, n_dim, 1.0, cvm, n_dim);
   printf("cvm out\n");
   show_matrix(cvm, n_dim, n_dim);
   for (int i = 0; i &amp;lt; n_dim; i++)
   {
      int indx = i * n_dim;
      for (int j = 0; j &amp;lt; n_dim; j++)
      {
         cvm[indx + j] /= (m_dim - 1);
      }
   }
   printf("cvm final\n");
   show_matrix(cvm, n_dim, n_dim);

   /* Calculate correlation matrix */
   diag = n_dim + 1;
   for (int i = 0; i &amp;lt; n_dim; i++)
   {
      int indx = i * n_dim;
      int idiag = i * diag;
      for (int j = 0; j &amp;lt;= i; j++)
      {
         cor[indx + j] = cvm[indx + j] / sqrt(cvm[idiag] * cvm[j * diag]);
      }
   }
   printf("cor\n");
   show_matrix(cor, n_dim, n_dim);

   return cor;&lt;/J&gt;&lt;/I&gt;&lt;/I&gt;&lt;/PRE&gt;

&lt;P&gt;The "A in" debug print gives my expected input of&lt;/P&gt;

&lt;PRE class="brush:bash;"&gt;A in
1.53400  1.53400  -1.53400  -1.53400  0.48900  -0.31900
-0.88700  -0.48900  0.88700  -0.88700  -0.15700  0.67400
-0.48900  0.67400  -0.48900  1.15000  1.53400  -0.48900
0.88700  0.00000  -0.67400  0.31900  -0.00000  -1.53400
1.15000  -0.31900  0.48900  0.67400  0.15700  1.15000
0.15700  -1.53400  -0.88700  -0.67400  -0.31900  0.15700
-1.15000  -0.67400  -0.15700  0.15700  -1.53400  -0.15700
0.00000  -0.88700  0.15700  -0.31900  -0.67400  0.88700
0.31900  -0.15700  0.67400  0.88700  0.67400  1.53400
-0.31900  0.15700  -0.31900  -1.15000  1.15000  -0.88700
-1.53400  0.88700  1.15000  1.53400  -0.48900  -1.15000
-0.15700  -1.15000  1.53400  -0.15700  -1.15000  -0.67400
0.48900  0.48900  -1.15000  0.48900  -0.88700  -0.00000
0.67400  0.31900  0.31900  0.00000  0.88700  0.31900
-0.67400  1.15000  0.00000  -0.48900  0.31900  0.48900&lt;/PRE&gt;

&lt;P&gt;However the "cvm out" debug print gives something a little bit goofy&lt;/P&gt;

&lt;PRE class="brush:bash;"&gt;cvm out
10.56446  1.02348  -4.93052  -2.46705  2.76175  1.84661
1.02348  10.56446  -3.30580  0.74990  -nan  -2.39809
-4.93052  -3.30580  10.56446  3.56732  -2.08079  2.00943
-2.46705  0.74990  3.56732  10.56446  -0.43575  -0.31520
2.76175  5.11138  -2.08079  -0.43575  10.56446  0.55179
1.84661  -2.39809  2.00943  -0.31520  0.55179  10.56446&lt;/PRE&gt;

&lt;P&gt;Why is that NaN there?! The symmetric value across the diagonal is correct! Very weird!&lt;/P&gt;

&lt;P&gt;I am currently working around it by working only on the lower half, which works due to the matrix being SPD, but I'm very curious about this rogue NaN.&lt;/P&gt;

&lt;P&gt;Any ideas?&lt;/P&gt;</description>
      <pubDate>Tue, 23 Jun 2015 14:23:35 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Unexpected-NaN-with-clbas-dgemm/m-p/1050849#M21162</guid>
      <dc:creator>Matt_S_</dc:creator>
      <dc:date>2015-06-23T14:23:35Z</dc:date>
    </item>
    <item>
      <title>Matt,</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Unexpected-NaN-with-clbas-dgemm/m-p/1050850#M21163</link>
      <description>&lt;P&gt;Matt,&lt;/P&gt;

&lt;P&gt;&lt;SPAN style="font-size: 1em; line-height: 1.5;"&gt;Using the same "A in" matrix and cblas_dgemm() call I was not able to reproduce the NaN. Note that I calculated "cvm" correctly only when "A" and "cvm" were of type double*. Can you provide any more information on the REAL type you are using? You might want try the same dgemm call with "A" and "cvm" as double* to see if it’s a type issue and if the NaN persists. Other than that, it would be helpful to know a little more about your environment: what MKL version are you using, are you using threading, what is your system/OS ... etc.&lt;/SPAN&gt;&lt;/P&gt;

&lt;P&gt;&lt;SPAN style="font-size: 1em; line-height: 1.5;"&gt;Anthony&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Wed, 24 Jun 2015 16:34:52 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Unexpected-NaN-with-clbas-dgemm/m-p/1050850#M21163</guid>
      <dc:creator>Anthony_H_Intel1</dc:creator>
      <dc:date>2015-06-24T16:34:52Z</dc:date>
    </item>
    <item>
      <title>Quote:Anthony H. (Intel)</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Unexpected-NaN-with-clbas-dgemm/m-p/1050851#M21164</link>
      <description>&lt;P&gt;&lt;/P&gt;&lt;BLOCKQUOTE&gt;Anthony H. (Intel) wrote:&lt;BR /&gt;&lt;P&gt;&lt;/P&gt;

&lt;P&gt;Matt,&lt;/P&gt;

&lt;P&gt;Using the same "A in" matrix and cblas_dgemm() call I was not able to reproduce the NaN. Note that I calculated "cvm" correctly only when "A" and "cvm" were of type double*. Can you provide any more information on the REAL type you are using? You might want try the same dgemm call with "A" and "cvm" as double* to see if it’s a type issue and if the NaN persists. Other than that, it would be helpful to know a little more about your environment: what MKL version are you using, are you using threading, what is your system/OS ... etc.&lt;/P&gt;

&lt;P&gt;Anthony&lt;/P&gt;

&lt;P&gt;&lt;/P&gt;&lt;/BLOCKQUOTE&gt;&lt;P&gt;&lt;/P&gt;

&lt;P&gt;Anthony,&lt;/P&gt;

&lt;P&gt;I have REAL defined in params.h (source below)&lt;/P&gt;

&lt;PRE class="brush:cpp;"&gt;#ifndef _params_h
#define _params_h

#define REAL double /* precision of data */
#define DBG 0       /* 1 prints out results */
#define READIN 0    /* 1 reads in white paper data, 0 randomizes data */
#define NVAR 128      /* ncols */
#define NITER 25000    /* nrows */
#define RANDMIN -2
#define RANDMAX 2

#endif
&lt;/PRE&gt;

&lt;P&gt;And the matrix A is originally allocated by (in main)&lt;/P&gt;

&lt;PRE class="brush:cpp;"&gt;REAL *r_data;

...

r_data = (REAL *)mkl_malloc(m_dim * n_dim * sizeof(REAL), 64);

...

memset(r_data,0.0,sizeof(r_data));

... /* setting of r_data values happens in here */

t_data = correlation(r_data, n_dim, m_dim);&lt;/PRE&gt;

&lt;P&gt;and inside of correlation.c is where I get that NaN.&lt;/P&gt;</description>
      <pubDate>Wed, 24 Jun 2015 16:46:28 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Unexpected-NaN-with-clbas-dgemm/m-p/1050851#M21164</guid>
      <dc:creator>Matt_S_</dc:creator>
      <dc:date>2015-06-24T16:46:28Z</dc:date>
    </item>
    <item>
      <title>Hey Matt,</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Unexpected-NaN-with-clbas-dgemm/m-p/1050852#M21165</link>
      <description>&lt;P&gt;Hey Matt,&lt;/P&gt;

&lt;P&gt;We tracked down the issue: the third parameter of memset() needs to specify the size of the matrix in addition to sizeof() so it should be&lt;/P&gt;

&lt;PRE class="brush:cpp;"&gt;memset(cvm, 0.0, sizeof(cvm) * n_dim * n_dim);&lt;/PRE&gt;

&lt;P&gt;Anthony&lt;/P&gt;

&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Wed, 24 Jun 2015 22:14:20 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Unexpected-NaN-with-clbas-dgemm/m-p/1050852#M21165</guid>
      <dc:creator>Anthony_H_Intel1</dc:creator>
      <dc:date>2015-06-24T22:14:20Z</dc:date>
    </item>
    <item>
      <title>Quote:Anthony H. (Intel)</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Unexpected-NaN-with-clbas-dgemm/m-p/1050853#M21166</link>
      <description>&lt;P&gt;&lt;/P&gt;&lt;BLOCKQUOTE&gt;Anthony H. (Intel) wrote:&lt;BR /&gt;&lt;P&gt;&lt;/P&gt;

&lt;P&gt;&lt;SPAN style="font-size: 1em; line-height: 1.5;"&gt;We tracked down the issue: the third parameter of memset() needs to specify the size of the matrix in addition to sizeof() so it should be&lt;/SPAN&gt;&lt;/P&gt;

&lt;PRE class="brush:cpp;"&gt;memset(cvm, 0.0, sizeof(cvm) * n_dim * n_dim);&lt;/PRE&gt;

&lt;P&gt;&lt;SPAN style="font-size: 1em; line-height: 1.5;"&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/BLOCKQUOTE&gt;&lt;P&gt;&lt;/P&gt;

&lt;P&gt;That is still in error, and would cause you problems if you target x32 and x64; sizeof(cvm) and sizeof(*cvm) are not necessarily the same, depending on what REAL is defined to be; note that the count of bytes to set to zero should be computed from the size of each array element and the number of elements in the array, and the size of a pointer does not enter into this calculation.&lt;/P&gt;</description>
      <pubDate>Wed, 24 Jun 2015 22:34:36 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Unexpected-NaN-with-clbas-dgemm/m-p/1050853#M21166</guid>
      <dc:creator>mecej4</dc:creator>
      <dc:date>2015-06-24T22:34:36Z</dc:date>
    </item>
    <item>
      <title>All,</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Unexpected-NaN-with-clbas-dgemm/m-p/1050854#M21167</link>
      <description>&lt;P&gt;All,&lt;/P&gt;

&lt;P&gt;That cleared it up, however this makes memset SLOW! Doing this is much faster and actually vectorizes according to the compiler (-qopt-report)&lt;/P&gt;

&lt;PRE class="brush:cpp;"&gt;      for (int i = 0; i &amp;lt; m_dim; i++)
      {
         int indx = i * n_dim;
         for (int j = 0; j &amp;lt; n_dim; j++)
         {
            matrix[indx + j] = 0.0;
         }
      }
&lt;/PRE&gt;

&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Thu, 25 Jun 2015 13:04:05 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Unexpected-NaN-with-clbas-dgemm/m-p/1050854#M21167</guid>
      <dc:creator>Matt_S_</dc:creator>
      <dc:date>2015-06-25T13:04:05Z</dc:date>
    </item>
  </channel>
</rss>

