<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Thank you Yin. It worked in Intel® oneAPI Math Kernel Library</title>
    <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/CSCMM-for-Armadillo-Sparse-dense-multiplications/m-p/1051323#M21170</link>
    <description></description>
    <pubDate>Wed, 24 Sep 2014 03:09:16 GMT</pubDate>
    <dc:creator>Ramakrishnan_K_</dc:creator>
    <dc:date>2014-09-24T03:09:16Z</dc:date>
    <item>
      <title>CSCMM for Armadillo Sparse dense multiplications</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/CSCMM-for-Armadillo-Sparse-dense-multiplications/m-p/1051321#M21168</link>
      <description>&lt;P style="margin-bottom: 1em; border: 0px; font-size: 14px; vertical-align: baseline; clear: both; color: rgb(0, 0, 0); font-family: Arial, 'Liberation Sans', 'DejaVu Sans', sans-serif; line-height: 18px;"&gt;Environment : armadillo 4.320.0 and 4.400&lt;BR /&gt;
	Compiler : Intel CPP compiler&lt;BR /&gt;
	OS : Ubuntu 12.04&lt;/P&gt;

&lt;P style="margin-bottom: 1em; border: 0px; font-size: 14px; vertical-align: baseline; clear: both; color: rgb(0, 0, 0); font-family: Arial, 'Liberation Sans', 'DejaVu Sans', sans-serif; line-height: 18px;"&gt;I am trying to replace Armadillo's native sparse-dense multiplication with a call to Intel MKL's CSCMM routine. I wrote the following code.&lt;/P&gt;

&lt;PRE style="margin-top: 0px; margin-bottom: 10px; padding: 5px; border: 0px; font-size: 14px; vertical-align: baseline; background-color: rgb(238, 238, 238); font-family: Consolas, Menlo, Monaco, 'Lucida Console', 'Liberation Mono', 'DejaVu Sans Mono', 'Bitstream Vera Sans Mono', 'Courier New', monospace, serif; overflow: auto; width: auto; max-height: 600px; word-wrap: normal; color: rgb(0, 0, 0); line-height: 18px;"&gt;&lt;CODE style="margin: 0px; padding: 0px; border: 0px; font-size: 14px; vertical-align: baseline; font-family: Consolas, Menlo, Monaco, 'Lucida Console', 'Liberation Mono', 'DejaVu Sans Mono', 'Bitstream Vera Sans Mono', 'Courier New', monospace, serif; white-space: inherit;"&gt;#include &amp;lt;mkl.h&amp;gt;  
#define ARMA_64BIT_WORD
#include &amp;lt;armadillo&amp;gt;

using namespace std;
using namespace arma;

int  main(int argc, char *argv[])
{
   long long m = atoi(argv[1]);
   long long k = atoi(argv[2]);
   long long n = atoi(argv[3]);
   float density = 0.3;
   sp_fmat A = sprandn&amp;lt;sp_fmat&amp;gt;(m,k,density);
   fmat B = randu&amp;lt;fmat&amp;gt;(k,n);
   fmat C(m,n);
   C.zeros();
 //C = alpha * A * B + beta * C;
 //mkl_scscmm (char *transa, MKL_INT *m, MKL_INT *n, MKL_INT *k, float *alpha, char *matdescra,       
 //float *val, MKL_INT *indx, MKL_INT *pntrb, MKL_INT *pntre, float *b, MKL_INT *ldb, float *beta, 
//float *c, MKL_INT *ldc);
  char transa = 'N';
  float alpha = 1.0;
  float beta = 0.0;
  char* matdescra = "GUUC";
  long long ldb = k;
  long long ldc = m;
  cout &amp;lt;&amp;lt; "b4 Input A:" &amp;lt;&amp;lt; endl &amp;lt;&amp;lt; A;
  cout &amp;lt;&amp;lt; "b4 Input B:" &amp;lt;&amp;lt; endl &amp;lt;&amp;lt; B;
  mkl_scscmm (&amp;amp;transa,&amp;amp;m,&amp;amp;n,&amp;amp;k,&amp;amp;alpha,matdescra,
              const_cast&amp;lt;float *&amp;gt;(A.values), (long long *)A.row_indices,
             (long long *)A.col_ptrs,(long long *)(A.col_ptrs + 1),
             B.memptr(),&amp;amp;ldb,
             &amp;amp;beta, C, &amp;amp;ldc);
  cout &amp;lt;&amp;lt; "Input A:" &amp;lt;&amp;lt; endl &amp;lt;&amp;lt; A;
  cout &amp;lt;&amp;lt; "Input B:" &amp;lt;&amp;lt; endl &amp;lt;&amp;lt; B;
  cout &amp;lt;&amp;lt; "Input C:" &amp;lt;&amp;lt; endl &amp;lt;&amp;lt; C;
  return 0;
}&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;&lt;SPAN style="color: rgb(0, 0, 0); font-family: Arial, 'Liberation Sans', 'DejaVu Sans', sans-serif; font-size: 14px; line-height: 18px;"&gt;I compiled the above code and ran it as "./testcscmm 10 4 6". I am getting a segmentation fault (core dumped).&lt;/SPAN&gt;&lt;/P&gt;


&lt;PRE style="margin-top: 0px; margin-bottom: 10px; padding: 5px; border: 0px; font-size: 14px; vertical-align: baseline; background-color: rgb(238, 238, 238); font-family: Consolas, Menlo, Monaco, 'Lucida Console', 'Liberation Mono', 'DejaVu Sans Mono', 'Bitstream Vera Sans Mono', 'Courier New', monospace, serif; overflow: auto; width: auto; max-height: 600px; word-wrap: normal; color: rgb(0, 0, 0); line-height: 18px;"&gt;&lt;CODE style="margin: 0px; padding: 0px; border: 0px; font-size: 14px; vertical-align: baseline; font-family: Consolas, Menlo, Monaco, 'Lucida Console', 'Liberation Mono', 'DejaVu Sans Mono', 'Bitstream Vera Sans Mono', 'Courier New', monospace, serif; white-space: inherit;"&gt;(0, 0)         1.1123
 (4, 0)        -0.3453
 (8, 0)         0.6081
 (1, 1)         0.6410
 (4, 1)        -0.7121
 (5, 1)         1.1592
 (9, 1)        -1.7189
 (0, 2)         0.4175
 (2, 2)        -0.4001
 (4, 2)         2.2809
 (4, 3)        -2.2717
 (9, 3)         0.2251

b4 Input B:
0.1567   0.9989   0.6126   0.4936   0.5267   0.2833
0.4009   0.2183   0.2960   0.9728   0.7699   0.3525
0.1298   0.5129   0.6376   0.2925   0.4002   0.8077
0.1088   0.8391   0.5243   0.7714   0.8915   0.9190
Input A:
[matrix size: 13715672716573367337x13744746204899078486; n_nonzero: 12; density: 0.00%]

Segmentation fault (core dumped)&lt;/CODE&gt;&lt;/PRE&gt;


&lt;P style="margin-bottom: 1em; border: 0px; font-size: 14px; vertical-align: baseline; clear: both; color: rgb(0, 0, 0); font-family: Arial, 'Liberation Sans', 'DejaVu Sans', sans-serif; line-height: 18px;"&gt;For some reason the structure of A is getting corrupted. I have the following question and observations.&lt;/P&gt;

&lt;OL style="margin-bottom: 1em; margin-left: 30px; border: 0px; font-size: 14px; vertical-align: baseline; list-style-position: initial; list-style-image: initial; color: rgb(0, 0, 0); font-family: Arial, 'Liberation Sans', 'DejaVu Sans', sans-serif; line-height: 18px;"&gt;
	&lt;LI style="border: 0px; vertical-align: baseline; background-color: transparent;"&gt;Does mkl_scscmm modify its input arrays? If not, why would A get corrupted?&lt;/LI&gt;
	&lt;LI style="border: 0px; vertical-align: baseline; background-color: transparent;"&gt;I changed the matrix C to native float. Still the error persists.&lt;/LI&gt;
	&lt;LI style="border: 0px; vertical-align: baseline; background-color: transparent;"&gt;Valgrind shows some memory errors.&lt;/LI&gt;
&lt;/OL&gt;

&lt;P style="margin-bottom: 1em; border: 0px; font-size: 14px; vertical-align: baseline; clear: both; color: rgb(0, 0, 0); font-family: Arial, 'Liberation Sans', 'DejaVu Sans', sans-serif; line-height: 18px;"&gt;Please let me know how to call Intel MKL routines with Armadillo's matrix data structures, especially for sparse-dense multiplication.&lt;/P&gt;</description>
      <pubDate>Tue, 16 Sep 2014 02:39:54 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/CSCMM-for-Armadillo-Sparse-dense-multiplications/m-p/1051321#M21168</guid>
      <dc:creator>Ramakrishnan_K_</dc:creator>
      <dc:date>2014-09-16T02:39:54Z</dc:date>
    </item>
    <item>
      <title>Hi Ramakrishnan K.</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/CSCMM-for-Armadillo-Sparse-dense-multiplications/m-p/1051322#M21169</link>
      <description>&lt;DIV class="forum-post-author" style="padding: 2px 0px 2px 2px; display: inline-block; color: rgb(153, 153, 153); font-size: 11px; line-height: 16.5px;"&gt;Hi&amp;nbsp;&lt;A href="https://software.intel.com/en-us/user/1080401"&gt;Ramakrishnan K.&lt;/A&gt;&lt;/DIV&gt;


&lt;P&gt;&lt;SPAN style="color: rgb(96, 96, 96); font-size: 11px; line-height: 16.5px; background-color: rgb(238, 238, 238);"&gt;The function does not modify the input array.&lt;/SPAN&gt;&lt;/P&gt;

&lt;P&gt;&lt;STRONG&gt;&lt;SPAN style="color: rgb(96, 96, 96); font-size: 11px; line-height: 16.5px; background-color: rgb(238, 238, 238);"&gt;The problem is in ldb and ldc, which must be the number of columns (C-style arrays, zero-based indexing). You can change them to ldb = n and ldc = n to get the result.&lt;/SPAN&gt;&lt;/STRONG&gt;&lt;/P&gt;

&lt;P&gt;&lt;SPAN style="font-size: 1em; line-height: 1.5;"&gt;ldb: INTEGER. Specifies the leading dimension of b for one-based indexing, and the second dimension of b for zero-based indexing, as declared in the calling (sub)program.&lt;/SPAN&gt;&lt;/P&gt;
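
&lt;P&gt;As a worked example of how I read this definition, take the sizes from the original run (m = 10, k = 4, n = 6), where b is k by n and c is m by n. The offset formulas below are an assumption based on the usual row-major convention for zero-based indexing and the column-major convention for one-based indexing; they are not quoted from the manual.&lt;/P&gt;

&lt;P&gt;\[ \text{zero-based: } \operatorname{offset}(b_{ij}) = i \cdot \mathit{ldb} + j, \qquad \mathit{ldb} = n = 6, \quad \mathit{ldc} = n = 6 \]&lt;BR /&gt;
	\[ \text{one-based: } \operatorname{offset}(b_{ij}) = (i - 1) + (j - 1) \cdot \mathit{ldb}, \qquad \mathit{ldb} = k = 4, \quad \mathit{ldc} = m = 10 \]&lt;/P&gt;

&lt;P&gt;Under this reading, the original code requests zero-based indexing through matdescra "GUUC" but sets ldb = k and ldc = m, which are the one-based values.&lt;/P&gt;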

&lt;P&gt;Best Regards,&lt;/P&gt;

&lt;P&gt;Ying&amp;nbsp;&lt;/P&gt;

&lt;P&gt;(P.S. I had trouble building with -DMKL_ILP64: I built libarmadillo.so with the default options, so it links against the mkl_lp64 libraries, as shown below, which conflicts with -DMKL_ILP64.&lt;/P&gt;

&lt;P&gt;If I change every long long to MKL_INT and build with -mkl (LP64), the program runs fine. Please let me know whether you get any result.)&lt;/P&gt;


&lt;P&gt;[yhu5@snb01 Debug]$ ldd /home/yhu5/armadillo-4.320.2/libarmadillo.so&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; linux-vdso.so.1 =&amp;gt; &amp;nbsp;(0x00007fffef75b000)&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; libmkl_intel_thread.so =&amp;gt; /opt/intel/mkl/lib/intel64/libmkl_intel_thread.so (0x00007ffc58a14000)&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; libmkl_core.so =&amp;gt; /opt/intel/mkl/lib/intel64/libmkl_core.so (0x00007ffc57343000)&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; libmkl_intel_lp64.so =&amp;gt; /opt/intel/mkl/lib/intel64/libmkl_intel_lp64.so (0x00007ffc56a3b000)&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; libstdc++.so.6 =&amp;gt; /usr/lib64/libstdc++.so.6 (0x00007ffc5671f000)&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; libm.so.6 =&amp;gt; /lib64/libm.so.6 (0x00007ffc5649a000)&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; libgcc_s.so.1 =&amp;gt; /lib64/libgcc_s.so.1 (0x00007ffc56284000)&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; libc.so.6 =&amp;gt; /lib64/libc.so.6 (0x00007ffc55f05000)&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; libdl.so.2 =&amp;gt; /lib64/libdl.so.2 (0x00007ffc55d00000)&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; /lib64/ld-linux-x86-64.so.2 (0x0000003b8ea00000)&lt;/P&gt;

&lt;P&gt;icc &amp;nbsp;-I/home/yhu5/armadillo-4.320.2/include -L/home/yhu5/armadillo-4.320.2 -o "testprograms_LP64" &amp;nbsp;../cscmmtest.cpp &amp;nbsp;-larmadillo &amp;nbsp;-mkl -g -O0 -g3 -Wall -Wextra -pedantic&lt;/P&gt;


&lt;P&gt;[yhu5@snb01 Debug]$ ./testprograms_LP64 3 2 1&lt;BR /&gt;
	b4 Input A:&lt;BR /&gt;
	[matrix size: 3x2; n_nonzero: 2; density: 33.33%]&lt;/P&gt;

&lt;P&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp;(0, 0) &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; 1.1123&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp; &amp;nbsp;(2, 1) &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;-0.3453&lt;/P&gt;

&lt;P&gt;b4 Input B:&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp;0.7831&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp;0.7984&lt;/P&gt;

&lt;P&gt;Input A:&lt;BR /&gt;
	[matrix size: 3x2; n_nonzero: 2; density: 33.33%]&lt;/P&gt;

&lt;P&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp;(0, 0) &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; 1.1123&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp; &amp;nbsp;(2, 1) &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;-0.3453&lt;/P&gt;

&lt;P&gt;Input B:&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp;0.7831&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp;0.7984&lt;BR /&gt;
	Output C:&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp;0.8710&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; 0&lt;BR /&gt;
	&amp;nbsp; -0.2757&lt;/P&gt;

&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Mon, 22 Sep 2014 08:25:00 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/CSCMM-for-Armadillo-Sparse-dense-multiplications/m-p/1051322#M21169</guid>
      <dc:creator>Ying_H_Intel</dc:creator>
      <dc:date>2014-09-22T08:25:00Z</dc:date>
    </item>
    <item>
      <title>Thank you Yin. It worked</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/CSCMM-for-Armadillo-Sparse-dense-multiplications/m-p/1051323#M21170</link>
      <description>&lt;P&gt;Thank you, Ying. It worked excellently. I am documenting all the steps so that anyone who refers to this thread in the future has the complete details.&lt;/P&gt;

&lt;P&gt;Building and installing Armadillo:&lt;/P&gt;

&lt;UL&gt;
	&lt;LI&gt;export CXX=icpc&lt;/LI&gt;
	&lt;LI&gt;export CC=icpc&lt;/LI&gt;
	&lt;LI&gt;export PATH=$PATH:/home/ramki/intel/bin&lt;/LI&gt;
	&lt;LI&gt;Edit $armadillo_root/cmake_aux/Modules/ARMA_FindMKL.cmake and make sure PATHS includes your MKL library directories.&lt;/LI&gt;
	&lt;LI&gt;Edit $armadillo_root/cmake_aux/Modules/ARMA_FindMKL.cmake and change mkl_lp64 to mkl_ilp64.&lt;/LI&gt;
	&lt;LI&gt;Edit $armadillo_root/CMakeLists.txt: (1) add the link line suggested by the Intel MKL Link Line Advisor to CMAKE_SHARED_LINKER_FLAGS, and (2) set CMAKE_CXX_FLAGS to the compiler options it suggests.&lt;/LI&gt;
	&lt;LI&gt;Run ./configure and check its output: MKL should be used for BLAS and LAPACK, icpc should be the compiler, and the remaining settings should look correct.&lt;/LI&gt;
	&lt;LI&gt;Run make.&lt;/LI&gt;
	&lt;LI&gt;Verify the linked libraries by running ldd libarmadillo.so. In particular, check that it links against the mkl_ilp64 library and the MKL BLAS and LAPACK libraries.&lt;/LI&gt;
	&lt;LI&gt;Now run make install DESTDIR=local_path, where local_path is your local installation directory.&lt;/LI&gt;
&lt;/UL&gt;

&lt;P&gt;In the C++ program:&lt;/P&gt;

&lt;OL&gt;
	&lt;LI&gt;Don't cast the const pointer A.values to a non-const pointer with const_cast. It seemed to distort the pointer, and I am not sure the right pointer was being passed to cscmm. Instead, copy just the values of A into a separate array.&lt;/LI&gt;
	&lt;LI&gt;Make sure you set ldb and ldc correctly.&lt;/LI&gt;
	&lt;LI&gt;pntrb and pntre can be A.col_ptrs and A.col_ptrs + 1.&lt;/LI&gt;
	&lt;LI&gt;Use MKL_INT in place of long long.&lt;/LI&gt;
&lt;/OL&gt;
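
&lt;P&gt;The points about pntrb, pntre, ldb and ldc can be sketched with a plain loop. This is only a minimal sketch and not MKL code: csc_times_dense and index_t are hypothetical names introduced here, with index_t standing in for MKL_INT. It assumes zero-based indexing in which ldb and ldc are the second dimensions (column counts) of b and c, following the ldb description quoted earlier in the thread, and it computes only C = A * B (alpha = 1, beta = 0).&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;// Hypothetical stand-in for MKL_INT; not part of MKL or Armadillo.
typedef long long index_t;

// Hypothetical reference routine: C = A * B, where A is m-by-k in
// compressed sparse column form and b, c are dense arrays.
// Assumption: element (i, j) of b is b[i * ldb + j], and likewise for c.
void csc_times_dense(index_t m, index_t n, index_t k,
                     const float* val, const index_t* indx,
                     const index_t* pntrb, const index_t* pntre,
                     const float* b, index_t ldb,
                     float* c, index_t ldc)
{
    // Clear the m-by-n result.
    for (index_t i = 0; i != m; ++i)
        for (index_t j = 0; j != n; ++j)
            c[i * ldc + j] = 0.0f;

    // Column col of A owns entries pntrb[col] up to, but excluding, pntre[col].
    // Entry p sits in row indx[p] and multiplies row col of b.
    for (index_t col = 0; col != k; ++col)
        for (index_t p = pntrb[col]; p != pntre[col]; ++p)
            for (index_t j = 0; j != n; ++j)
                c[indx[p] * ldc + j] += val[p] * b[col * ldb + j];
}
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;For the 3x2 example in the earlier reply (values 1.1123 and -0.3453 with row indices 0 and 2, col_ptrs 0, 1, 2, and B holding 0.7831 and 0.7984), calling it with pntrb = col_ptrs, pntre = col_ptrs + 1, n = 1 and ldb = ldc = 1 should give roughly 0.8710, 0 and -0.2757, which agrees with the Output C shown there. Armadillo stores dense matrices column by column, so for n greater than 1 the layout of B and C is worth checking against this convention.&lt;/P&gt;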

&lt;P&gt;Hope this helps everyone.&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Wed, 24 Sep 2014 03:09:16 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/CSCMM-for-Armadillo-Sparse-dense-multiplications/m-p/1051323#M21170</guid>
      <dc:creator>Ramakrishnan_K_</dc:creator>
      <dc:date>2014-09-24T03:09:16Z</dc:date>
    </item>
    <item>
      <title> </title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/CSCMM-for-Armadillo-Sparse-dense-multiplications/m-p/1051324#M21171</link>
      <description>&lt;P&gt;&lt;SPAN style="color: rgb(96, 96, 96); font-size: 11px; line-height: 16.5px; background-color: rgb(238, 238, 238);"&gt;Hi Ramakrishnan,&lt;/SPAN&gt;&lt;/P&gt;

&lt;P&gt;Thank you very much for sharing the technical details. I have marked your answer as the best reply.&lt;/P&gt;

&lt;P&gt;Best Regards,&lt;/P&gt;

&lt;P&gt;Ying&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Thu, 25 Sep 2014 02:20:23 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/CSCMM-for-Armadillo-Sparse-dense-multiplications/m-p/1051324#M21171</guid>
      <dc:creator>Ying_H_Intel</dc:creator>
      <dc:date>2014-09-25T02:20:23Z</dc:date>
    </item>
    <item>
      <title>Hi Ying,</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/CSCMM-for-Armadillo-Sparse-dense-multiplications/m-p/1051325#M21172</link>
      <description>&lt;P&gt;&lt;SPAN style="font-size: 1em; line-height: 1.5;"&gt;Hi Ying,&lt;/SPAN&gt;&lt;/P&gt;

&lt;P&gt;&lt;SPAN style="font-size: 1em; line-height: 1.5;"&gt;I tested my algorithm with cscmm, and it fails because something appears to be wrong with the mkl_cscmm call. I load a sparse identity matrix (smalleye.mm) from a file and multiply it by a random matrix, and I get the wrong output.&lt;/SPAN&gt;&lt;/P&gt;

&lt;P&gt;For your reference, I am attaching the source code along with the sample file. Kindly let me know where the problem is. I tried setting ldb and ldc to both the number of rows (m) and the number of columns.&lt;/P&gt;

&lt;P&gt;&amp;nbsp;./testprograms smalleye.mm 7&lt;BR /&gt;
	LoadMatrixMarketFile for file=smalleye.mm&lt;BR /&gt;
	mm file height=5 width=5 nnz=5&lt;BR /&gt;
	start loading the mm file&lt;BR /&gt;
	location=2x5 VAL=5x1&lt;BR /&gt;
	completed reading the file&lt;BR /&gt;
	CNorm=3.74208&lt;BR /&gt;
	DNorm=3.45878&lt;BR /&gt;
	diffs=35x1&lt;BR /&gt;
	Input A:&lt;BR /&gt;
	[matrix size: 5x5; n_nonzero: 5; density: 20.00%]&lt;/P&gt;

&lt;P&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp;(0, 0) &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; 1.0000&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp; &amp;nbsp;(1, 1) &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; 1.0000&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp; &amp;nbsp;(2, 2) &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; 1.0000&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp; &amp;nbsp;(3, 3) &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; 1.0000&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp; &amp;nbsp;(4, 4) &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; 1.0000&lt;/P&gt;

&lt;P&gt;Input B:&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp;0.8402 &amp;nbsp; 0.1976 &amp;nbsp; 0.4774 &amp;nbsp; 0.9162 &amp;nbsp; 0.0163 &amp;nbsp; 0.4009 &amp;nbsp; 0.5129&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp;0.3944 &amp;nbsp; 0.3352 &amp;nbsp; 0.6289 &amp;nbsp; 0.6357 &amp;nbsp; 0.2429 &amp;nbsp; 0.1298 &amp;nbsp; 0.8391&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp;0.7831 &amp;nbsp; 0.7682 &amp;nbsp; 0.3648 &amp;nbsp; 0.7173 &amp;nbsp; 0.1372 &amp;nbsp; 0.1088 &amp;nbsp; 0.6126&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp;0.7984 &amp;nbsp; 0.2778 &amp;nbsp; 0.5134 &amp;nbsp; 0.1416 &amp;nbsp; 0.8042 &amp;nbsp; 0.9989 &amp;nbsp; 0.2960&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp;0.9116 &amp;nbsp; 0.5540 &amp;nbsp; 0.9522 &amp;nbsp; 0.6070 &amp;nbsp; 0.1567 &amp;nbsp; 0.2183 &amp;nbsp; 0.6376&lt;BR /&gt;
	output C:&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp;1.7924 &amp;nbsp; 0.8045 &amp;nbsp; 0.8391 &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;0 &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;0 &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;0 &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;0&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp;1.3106 &amp;nbsp; 0.3515 &amp;nbsp; 0.6126 &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;0 &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;0 &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;0 &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;0&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp;1.4188 &amp;nbsp; 0.9989 &amp;nbsp; 0.2960 &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;0 &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;0 &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;0 &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;0&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp;1.5157 &amp;nbsp; 0.2183 &amp;nbsp; 0.6376 &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;0 &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;0 &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;0 &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;0&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp;1.0532 &amp;nbsp; 0.5129 &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;0 &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;0 &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;0 &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;0 &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;0&lt;BR /&gt;
	ArmaD:&amp;nbsp;&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp;0.8402 &amp;nbsp; 0.1976 &amp;nbsp; 0.4774 &amp;nbsp; 0.9162 &amp;nbsp; 0.0163 &amp;nbsp; 0.4009 &amp;nbsp; 0.5129&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp;0.3944 &amp;nbsp; 0.3352 &amp;nbsp; 0.6289 &amp;nbsp; 0.6357 &amp;nbsp; 0.2429 &amp;nbsp; 0.1298 &amp;nbsp; 0.8391&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp;0.7831 &amp;nbsp; 0.7682 &amp;nbsp; 0.3648 &amp;nbsp; 0.7173 &amp;nbsp; 0.1372 &amp;nbsp; 0.1088 &amp;nbsp; 0.6126&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp;0.7984 &amp;nbsp; 0.2778 &amp;nbsp; 0.5134 &amp;nbsp; 0.1416 &amp;nbsp; 0.8042 &amp;nbsp; 0.9989 &amp;nbsp; 0.2960&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp;0.9116 &amp;nbsp; 0.5540 &amp;nbsp; 0.9522 &amp;nbsp; 0.6070 &amp;nbsp; 0.1567 &amp;nbsp; 0.2183 &amp;nbsp; 0.6376&lt;/P&gt;

&lt;P&gt;Ramki&lt;/P&gt;</description>
      <pubDate>Sun, 07 Dec 2014 16:42:06 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/CSCMM-for-Armadillo-Sparse-dense-multiplications/m-p/1051325#M21172</guid>
      <dc:creator>Ramakrishnan_K_</dc:creator>
      <dc:date>2014-12-07T16:42:06Z</dc:date>
    </item>
    <item>
      <title>Hi Ramakrishnan K.</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/CSCMM-for-Armadillo-Sparse-dense-multiplications/m-p/1051326#M21173</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;A href="https://software.intel.com/en-us/user/1080401" style="font-size: 12px; line-height: 18px;"&gt;Ramakrishnan K.&lt;/A&gt;&lt;/P&gt;

&lt;P&gt;I tried your code with the LP64 Armadillo build (the one from last time).&lt;/P&gt;

&lt;P&gt;I commented out the line #define ARMA_64BIT_WORD in utils.h and then built with the command below. Everything looks fine, and I get the right result.&lt;/P&gt;

&lt;P&gt;&lt;SPAN style="font-size: 1em; line-height: 1.5;"&gt;I'm not sure whether there is an ILP64 problem; I will check it with plain C code.&lt;/SPAN&gt;&lt;/P&gt;

&lt;P&gt;&lt;SPAN style="font-size: 1em; line-height: 1.5;"&gt;Alternatively, there may be some implicit problem with the long long operations in your code. You may want to check it in detail.&lt;/SPAN&gt;&lt;/P&gt;

&lt;P&gt;&lt;SPAN style="font-size: 1em; line-height: 1.5;"&gt;For example, try changing the pointer (MKL_INT *)(A.col_ptrs + 1) to &lt;/SPAN&gt;&lt;SPAN style="line-height: 19.5120010375977px;"&gt;((MKL_INT *)(A.col_ptrs) + 1).&lt;/SPAN&gt;&lt;/P&gt;

&lt;P&gt;&lt;SPAN style="line-height: 19.5120010375977px;"&gt;Best Regards,&lt;/SPAN&gt;&lt;/P&gt;

&lt;P&gt;&lt;SPAN style="line-height: 19.5120010375977px;"&gt;Ying&lt;/SPAN&gt;&lt;/P&gt;

&lt;P&gt;[yhu5@snb01 cscmmtest]$ icc &amp;nbsp;-I/home/yhu5/armadillo-4.320.2/include -I. -L/home/yhu5/armadillo-4.320.2 -o "cscmmtest" matrix_market_file.cpp cscmmtest.cpp &amp;nbsp;-larmadillo &amp;nbsp; -g -O0 -g3 -Wall -Wextra -pedantic -openmp&lt;BR /&gt;
	[yhu5@snb01 cscmmtest]$ ./cscmmtest smalleye.mm 7&lt;BR /&gt;
	LoadMatrixMarketFile for file=smalleye.mm&lt;BR /&gt;
	mm file height=5 width=5 nnz=5&lt;BR /&gt;
	start loading the mm file&lt;BR /&gt;
	location=2x5 VAL=5x1&lt;BR /&gt;
	completed reading the file&lt;BR /&gt;
	Avaldup&lt;BR /&gt;
	Avaldup&lt;BR /&gt;
	Avaldup&lt;BR /&gt;
	111Avaldup&lt;BR /&gt;
	1Avaldup&lt;BR /&gt;
	1CNorm=3.45878&lt;BR /&gt;
	DNorm=3.45878&lt;BR /&gt;
	diffs=0x1&lt;BR /&gt;
	Input A:&lt;BR /&gt;
	[matrix size: 5x5; n_nonzero: 5; density: 20.00%]&lt;/P&gt;

&lt;P&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp;(0, 0) &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; 1.0000&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp; &amp;nbsp;(1, 1) &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; 1.0000&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp; &amp;nbsp;(2, 2) &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; 1.0000&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp; &amp;nbsp;(3, 3) &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; 1.0000&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp; &amp;nbsp;(4, 4) &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; 1.0000&lt;/P&gt;

&lt;P&gt;Input B:&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp;0.8402 &amp;nbsp; 0.1976 &amp;nbsp; 0.4774 &amp;nbsp; 0.9162 &amp;nbsp; 0.0163 &amp;nbsp; 0.4009 &amp;nbsp; 0.5129&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp;0.3944 &amp;nbsp; 0.3352 &amp;nbsp; 0.6289 &amp;nbsp; 0.6357 &amp;nbsp; 0.2429 &amp;nbsp; 0.1298 &amp;nbsp; 0.8391&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp;0.7831 &amp;nbsp; 0.7682 &amp;nbsp; 0.3648 &amp;nbsp; 0.7173 &amp;nbsp; 0.1372 &amp;nbsp; 0.1088 &amp;nbsp; 0.6126&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp;0.7984 &amp;nbsp; 0.2778 &amp;nbsp; 0.5134 &amp;nbsp; 0.1416 &amp;nbsp; 0.8042 &amp;nbsp; 0.9989 &amp;nbsp; 0.2960&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp;0.9116 &amp;nbsp; 0.5540 &amp;nbsp; 0.9522 &amp;nbsp; 0.6070 &amp;nbsp; 0.1567 &amp;nbsp; 0.2183 &amp;nbsp; 0.6376&lt;BR /&gt;
	output C:&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp;0.8402 &amp;nbsp; 0.1976 &amp;nbsp; 0.4774 &amp;nbsp; 0.9162 &amp;nbsp; 0.0163 &amp;nbsp; 0.4009 &amp;nbsp; 0.5129&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp;0.3944 &amp;nbsp; 0.3352 &amp;nbsp; 0.6289 &amp;nbsp; 0.6357 &amp;nbsp; 0.2429 &amp;nbsp; 0.1298 &amp;nbsp; 0.8391&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp;0.7831 &amp;nbsp; 0.7682 &amp;nbsp; 0.3648 &amp;nbsp; 0.7173 &amp;nbsp; 0.1372 &amp;nbsp; 0.1088 &amp;nbsp; 0.6126&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp;0.7984 &amp;nbsp; 0.2778 &amp;nbsp; 0.5134 &amp;nbsp; 0.1416 &amp;nbsp; 0.8042 &amp;nbsp; 0.9989 &amp;nbsp; 0.2960&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp;0.9116 &amp;nbsp; 0.5540 &amp;nbsp; 0.9522 &amp;nbsp; 0.6070 &amp;nbsp; 0.1567 &amp;nbsp; 0.2183 &amp;nbsp; 0.6376&lt;BR /&gt;
	ArmaD:&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp;0.8402 &amp;nbsp; 0.1976 &amp;nbsp; 0.4774 &amp;nbsp; 0.9162 &amp;nbsp; 0.0163 &amp;nbsp; 0.4009 &amp;nbsp; 0.5129&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp;0.3944 &amp;nbsp; 0.3352 &amp;nbsp; 0.6289 &amp;nbsp; 0.6357 &amp;nbsp; 0.2429 &amp;nbsp; 0.1298 &amp;nbsp; 0.8391&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp;0.7831 &amp;nbsp; 0.7682 &amp;nbsp; 0.3648 &amp;nbsp; 0.7173 &amp;nbsp; 0.1372 &amp;nbsp; 0.1088 &amp;nbsp; 0.6126&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp;0.7984 &amp;nbsp; 0.2778 &amp;nbsp; 0.5134 &amp;nbsp; 0.1416 &amp;nbsp; 0.8042 &amp;nbsp; 0.9989 &amp;nbsp; 0.2960&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp;0.9116 &amp;nbsp; 0.5540 &amp;nbsp; 0.9522 &amp;nbsp; 0.6070 &amp;nbsp; 0.1567 &amp;nbsp; 0.2183 &amp;nbsp; 0.6376&lt;/P&gt;</description>
      <pubDate>Tue, 09 Dec 2014 03:06:12 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/CSCMM-for-Armadillo-Sparse-dense-multiplications/m-p/1051326#M21173</guid>
      <dc:creator>Ying_H_Intel</dc:creator>
      <dc:date>2014-12-09T03:06:12Z</dc:date>
    </item>
    <item>
      <title>And i check the common c code</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/CSCMM-for-Armadillo-Sparse-dense-multiplications/m-p/1051327#M21174</link>
      <description>&lt;P&gt;I also checked with plain C code, and the ILP64 (64-bit integer) interface works too. So the problem is likely in how the data is passed from Armadillo to the MKL call. Please check the pointers one by one.&lt;/P&gt;

&lt;P&gt;Best Regards,&lt;/P&gt;

&lt;P&gt;Ying&lt;/P&gt;

&lt;P&gt;Build it in Visual Studio 2010 with /D "MKL_ILP64" and link with mkl_intel_ilp64.lib mkl_intel_thread.lib mkl_core.lib libiomp5md.lib.&lt;BR /&gt;
	#include &amp;lt;stdio.h&amp;gt;&lt;BR /&gt;
	#include "mkl_types.h"&lt;BR /&gt;
	#include "mkl_spblas.h"&lt;/P&gt;

&lt;P&gt;int main () {&lt;BR /&gt;
	//*******************************************************************************&lt;BR /&gt;
	// &amp;nbsp; &amp;nbsp; Definition arrays for sparse representation of &amp;nbsp;the matrix A in&amp;nbsp;&lt;BR /&gt;
	// &amp;nbsp; &amp;nbsp; the compressed sparse column format:&amp;nbsp;&lt;BR /&gt;
	//*******************************************************************************&amp;nbsp;&lt;BR /&gt;
	#define M 5 &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;&lt;BR /&gt;
	#define NNZ 5 &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;&lt;BR /&gt;
	#define MNEW 3 &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;MKL_INT&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;m = M, nnz = NNZ, mnew = MNEW;&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; float&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;values[NNZ]&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp;= {1.0, 1.0, 1.0, 1.0, 1.0};&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;MKL_INT&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;rows[NNZ]&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp;= {0, 1, 2, &amp;nbsp;3, 4};&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;MKL_INT&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;colIndex[M+1] = {0, 1, &amp;nbsp;2, &amp;nbsp;3, 4, 5};&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; MKL_INT&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;pointerB[M], pointerE[M];&amp;nbsp;&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;MKL_INT&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;i, j, k;&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;&amp;nbsp;&lt;BR /&gt;
	//*******************************************************************************&lt;BR /&gt;
	// &amp;nbsp; &amp;nbsp;Declaration of local variables :&amp;nbsp;&lt;BR /&gt;
	//*******************************************************************************&lt;BR /&gt;
	#define N 7 &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; MKL_INT&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;n = N;&lt;BR /&gt;
	&amp;nbsp; float&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;sol[M][N]&amp;nbsp;&amp;nbsp; &amp;nbsp;= &amp;nbsp;&amp;nbsp; &amp;nbsp;{ 0.8402, &amp;nbsp; 0.1976, &amp;nbsp; 0.4774, &amp;nbsp; 0.9162, &amp;nbsp; 0.0163, &amp;nbsp; 0.4009, &amp;nbsp; 0.5129,&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp;0.3944, &amp;nbsp; 0.3352, &amp;nbsp; 0.6289, &amp;nbsp; 0.6357, &amp;nbsp; 0.2429, &amp;nbsp; 0.1298, &amp;nbsp; 0.8391,&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp;0.7831, &amp;nbsp; 0.7682, &amp;nbsp; 0.3648, &amp;nbsp; 0.7173, &amp;nbsp; 0.1372, &amp;nbsp; 0.1088, &amp;nbsp; 0.6126,&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp;0.7984, &amp;nbsp; 0.2778, &amp;nbsp; 0.5134, &amp;nbsp; 0.1416, &amp;nbsp; 0.8042, &amp;nbsp; 0.9989, &amp;nbsp; 0.2960,&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp;0.9116, &amp;nbsp; 0.5540, &amp;nbsp; 0.9522, &amp;nbsp; 0.6070, &amp;nbsp; 0.1567, &amp;nbsp; 0.2183, &amp;nbsp; 0.6376};&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;float&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;rhs[M][N]&amp;nbsp;&amp;nbsp; &amp;nbsp;= {0.0};&lt;/P&gt;

&lt;P&gt;&amp;nbsp;char transa = 'N';&lt;BR /&gt;
	&amp;nbsp; float alpha = 1.0;&lt;BR /&gt;
	&amp;nbsp; float beta = 0.0;&lt;BR /&gt;
	&amp;nbsp; char* matdescra = "GUNC";&lt;BR /&gt;
	&amp;nbsp; MKL_INT ldb = n;&lt;BR /&gt;
	&amp;nbsp; MKL_INT ldc = n;&lt;/P&gt;

&lt;P&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&lt;/P&gt;

&lt;P&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;printf("\n EXAMPLE PROGRAM FOR COMPRESSED SPARSE ROW FORMAT ROUTINES \n");&lt;/P&gt;

&lt;P&gt;&lt;BR /&gt;
	//*******************************************************************************&lt;BR /&gt;
	//Task 1. &amp;nbsp; &amp;nbsp;Obtain matrix-matrix multiply (L+D)' *sol --&amp;gt; rhs&lt;BR /&gt;
	// &amp;nbsp; &amp;nbsp;and solve triangular system &amp;nbsp; (L+D)' *temp = rhs with multiple right hand sides&lt;BR /&gt;
	// &amp;nbsp; &amp;nbsp;Array temp must be equal to the array sol&amp;nbsp;&lt;BR /&gt;
	//*******************************************************************************&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; printf(" &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; \n");&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; printf(" &amp;nbsp; INPUT DATA FOR MKL_SCSCMM \n");&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; printf(" &amp;nbsp; WITH TRIANGULAR MATRIX &amp;nbsp; &amp;nbsp;\n");&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; printf(" &amp;nbsp; &amp;nbsp; M = %1.1i &amp;nbsp; N = %1.1i\n", (int)m, (int)n);&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; printf(" &amp;nbsp; &amp;nbsp; ALPHA = %4.1f &amp;nbsp;BETA = %4.1f \n", alpha, beta);&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; printf(" &amp;nbsp; &amp;nbsp; TRANS = '%c' \n", 'T');&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; printf(" &amp;nbsp; Input matrix &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;\n");&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;for (i = 0; i &amp;lt; m; i++) {&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;for (j = 0; j &amp;lt; n; j++) {&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;printf("%7.3f", sol[i][j]);&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;};&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;printf("\n");&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;};&lt;/P&gt;

&lt;P&gt;&lt;BR /&gt;
	MKL_INT * colT = colIndex+1;&lt;/P&gt;

&lt;P&gt;mkl_scscmm(&amp;amp;transa, &amp;amp;m, &amp;amp;n, &amp;amp;m, &amp;amp;alpha, matdescra, values, rows, colIndex, colT, &amp;amp;(sol[0][0]), &amp;amp;n, &amp;nbsp;&amp;amp;beta, &amp;amp;(rhs[0][0]), &amp;amp;n);&lt;/P&gt;

&lt;P&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;printf(" &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; \n");&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; printf(" &amp;nbsp; OUTPUT DATA FOR MKL_SCSCMM\n");&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; printf(" &amp;nbsp; WITH TRIANGULAR MATRIX &amp;nbsp; &amp;nbsp;\n");&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;for (i = 0; i &amp;lt; m; i++) {&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;for (j = 0; j &amp;lt; n; j++) {&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;printf("%7.3f", rhs[i][j]);&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;};&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;printf("\n");&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;};&lt;/P&gt;

&lt;P&gt;&amp;nbsp; fflush(stdout);&lt;BR /&gt;
	&amp;nbsp;// sleep(2);&lt;BR /&gt;
	&amp;nbsp; //C.save("out.txt",arma_ascii);&lt;BR /&gt;
	&amp;nbsp;&lt;BR /&gt;
	&amp;nbsp; return 0;&lt;BR /&gt;
	}&lt;/P&gt;

&lt;P&gt;&amp;nbsp;&lt;/P&gt;

&lt;P&gt;&amp;nbsp;EXAMPLE PROGRAM FOR COMPRESSED SPARSE ROW FORMAT R&lt;/P&gt;

&lt;P&gt;&amp;nbsp; &amp;nbsp;INPUT DATA FOR MKL_SCSCMM&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp;WITH TRIANGULAR MATRIX&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp; &amp;nbsp;M = 5 &amp;nbsp; N = 7&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp; &amp;nbsp;ALPHA = &amp;nbsp;1.0 &amp;nbsp;BETA = &amp;nbsp;0.0&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp; &amp;nbsp;TRANS = 'T'&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp;Input matrix&lt;BR /&gt;
	&amp;nbsp; 0.840 &amp;nbsp;0.198 &amp;nbsp;0.477 &amp;nbsp;0.916 &amp;nbsp;0.016 &amp;nbsp;0.401 &amp;nbsp;0.513&lt;BR /&gt;
	&amp;nbsp; 0.394 &amp;nbsp;0.335 &amp;nbsp;0.629 &amp;nbsp;0.636 &amp;nbsp;0.243 &amp;nbsp;0.130 &amp;nbsp;0.839&lt;BR /&gt;
	&amp;nbsp; 0.783 &amp;nbsp;0.768 &amp;nbsp;0.365 &amp;nbsp;0.717 &amp;nbsp;0.137 &amp;nbsp;0.109 &amp;nbsp;0.613&lt;BR /&gt;
	&amp;nbsp; 0.798 &amp;nbsp;0.278 &amp;nbsp;0.513 &amp;nbsp;0.142 &amp;nbsp;0.804 &amp;nbsp;0.999 &amp;nbsp;0.296&lt;BR /&gt;
	&amp;nbsp; 0.912 &amp;nbsp;0.554 &amp;nbsp;0.952 &amp;nbsp;0.607 &amp;nbsp;0.157 &amp;nbsp;0.218 &amp;nbsp;0.638&lt;/P&gt;

&lt;P&gt;&amp;nbsp; &amp;nbsp;OUTPUT DATA FOR MKL_SCSCMM&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp;WITH TRIANGULAR MATRIX&lt;BR /&gt;
	&amp;nbsp; 0.840 &amp;nbsp;0.198 &amp;nbsp;0.477 &amp;nbsp;0.916 &amp;nbsp;0.016 &amp;nbsp;0.401 &amp;nbsp;0.513&lt;BR /&gt;
	&amp;nbsp; 0.394 &amp;nbsp;0.335 &amp;nbsp;0.629 &amp;nbsp;0.636 &amp;nbsp;0.243 &amp;nbsp;0.130 &amp;nbsp;0.839&lt;BR /&gt;
	&amp;nbsp; 0.783 &amp;nbsp;0.768 &amp;nbsp;0.365 &amp;nbsp;0.717 &amp;nbsp;0.137 &amp;nbsp;0.109 &amp;nbsp;0.613&lt;BR /&gt;
	&amp;nbsp; 0.798 &amp;nbsp;0.278 &amp;nbsp;0.513 &amp;nbsp;0.142 &amp;nbsp;0.804 &amp;nbsp;0.999 &amp;nbsp;0.296&lt;BR /&gt;
	&amp;nbsp; 0.912 &amp;nbsp;0.554 &amp;nbsp;0.952 &amp;nbsp;0.607 &amp;nbsp;0.157 &amp;nbsp;0.218 &amp;nbsp;0.638&lt;/P&gt;</description>
      <pubDate>Tue, 09 Dec 2014 03:45:42 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/CSCMM-for-Armadillo-Sparse-dense-multiplications/m-p/1051327#M21174</guid>
      <dc:creator>Ying_H_Intel</dc:creator>
      <dc:date>2014-12-09T03:45:42Z</dc:date>
    </item>
    <item>
      <title>Hi Ying,</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/CSCMM-for-Armadillo-Sparse-dense-multiplications/m-p/1051328#M21175</link>
      <description>&lt;P&gt;&lt;SPAN style="font-size: 1em; line-height: 1.5;"&gt;Hi Ying,&lt;/SPAN&gt;&lt;/P&gt;

&lt;P&gt;&lt;SPAN style="font-size: 1em; line-height: 1.5;"&gt;Very interesting observation. You are right. If I disable&amp;nbsp;&lt;/SPAN&gt;&lt;SPAN style="font-size: 12px; line-height: 18px;"&gt;ARMA_64BIT_WORD and remove -DMKL_ILP64 while compiling, everything works fine. It does not work correctly with both enabled. For its sparse representation, Armadillo stores row_indices, pntrb and pntre as unsigned long long *, and MKL needs const long long *. So type casting should work, but for some reason it is failing.&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;

&lt;P&gt;&lt;SPAN style="font-size: 12px; line-height: 18px;"&gt;I will investigate further and let you know.&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;

&lt;P&gt;&lt;SPAN style="font-size: 12px; line-height: 18px;"&gt;Ramki&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Tue, 09 Dec 2014 16:21:56 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/CSCMM-for-Armadillo-Sparse-dense-multiplications/m-p/1051328#M21175</guid>
      <dc:creator>Ramakrishnan_K_</dc:creator>
      <dc:date>2014-12-09T16:21:56Z</dc:date>
    </item>
    <item>
      <title>Hi Ying,</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/CSCMM-for-Armadillo-Sparse-dense-multiplications/m-p/1051329#M21176</link>
      <description>&lt;P&gt;&lt;SPAN style="font-size: 1em; line-height: 1.5;"&gt;Hi Ying,&lt;/SPAN&gt;&lt;/P&gt;

&lt;P&gt;&lt;SPAN style="font-size: 1em; line-height: 1.5;"&gt;There is one problem in using Armadillo's matrices with MKL: Armadillo uses column-major ordering, whereas MKL expects row-major order. But even after fixing this, there is still a problem when MKL_ILP64 is enabled. I am providing two source files along with this post. You can replace these files in the tar files attached in the previous message. cscmmtest.cpp compares Armadillo's sparse multiplication with the mkl_scscmm implementation. I &lt;/SPAN&gt;&lt;SPAN style="font-size: 1em; line-height: 1.5;"&gt;modified the input matrix&amp;nbsp;of the code&amp;nbsp;you posted in your message of 12/8 and named it testmklcscmm.cpp. I could not upload a&amp;nbsp;&lt;/SPAN&gt;.hpp&lt;SPAN style="font-size: 1em; line-height: 1.5;"&gt;&amp;nbsp;file, so rename the attached MKLSparseRepresentation.h to MKLSparseRepresentation.hpp before compiling. Disable -DMKL_ILP64 and&amp;nbsp;everything works fine. Enable MKL_ILP64 and mkl_scscmm gives wrong results. The interesting observation on the mkl_scscmm output is that some rows match the Armadillo multiplication, one row is the sum of two rows, and the rest are zero. For example, in the case of a 3x3 output matrix, the sum of rows 1&amp;amp;2 appears as row 1, the correct row 3 appears as row 2, and the third row is all zeros.&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;

&lt;P&gt;MKL output C:&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp;0.9857 &amp;nbsp; 0.4139 &amp;nbsp; 0.6839 &amp;lt;----- sum of rows 1&amp;amp;2&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp;0.6253 &amp;nbsp; 0.2625 &amp;nbsp; 0.4338 &amp;lt;------ correct row 3.&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; 0 &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;0 &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;0&lt;BR /&gt;
	ArmaD:&amp;nbsp;&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp;0.6708 &amp;nbsp; 0.2817 &amp;nbsp; 0.4654&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp;0.3149 &amp;nbsp; 0.1322 &amp;nbsp; 0.2185&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp;0.6253 &amp;nbsp; 0.2625 &amp;nbsp; 0.4338&lt;/P&gt;

&lt;P&gt;Please let us know where we are making the mistake. Appreciate your response.&amp;nbsp;&lt;/P&gt;

&lt;P&gt;Ramki&lt;/P&gt;</description>
      <pubDate>Wed, 17 Dec 2014 01:58:42 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/CSCMM-for-Armadillo-Sparse-dense-multiplications/m-p/1051329#M21176</guid>
      <dc:creator>Ramakrishnan_K_</dc:creator>
      <dc:date>2014-12-17T01:58:42Z</dc:date>
    </item>
    <item>
      <title>Hi Ramki, </title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/CSCMM-for-Armadillo-Sparse-dense-multiplications/m-p/1051330#M21177</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;SPAN style="font-size: 12px; line-height: 18px;"&gt;Ramki,&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;

&lt;P&gt;&lt;SPAN style="font-size: 12px; line-height: 18px;"&gt;I rebuilt&amp;nbsp;Armadillo with the MKL ILP64 interface. &amp;nbsp;The process is:&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;

&lt;P&gt;Edit $armadillo_root/&lt;STRONG&gt;build_aux/cmake&lt;/STRONG&gt;/Modules/ARMA_FindMKL.cmake, change mkl_intel_lp64 to mkl_intel_ilp64&lt;/P&gt;

&lt;P&gt;$&amp;gt;source /opt/intel/composer_xe_2013.5.192/bin/compilervars.sh intel64&lt;/P&gt;

&lt;P&gt;&amp;gt;./configure&lt;/P&gt;

&lt;P&gt;(it shows &amp;nbsp;FOUND MKL, &amp;nbsp;Compiler = GNU)&amp;nbsp;&lt;/P&gt;

&lt;P&gt;&amp;gt;make&lt;/P&gt;

&lt;P&gt;Then I built the test code you attached. &amp;nbsp; As you can see below, everything runs fine.&lt;/P&gt;

&lt;P&gt;So the MKL LP64 library works with LP64 code, and the MKL ILP64 library works with ILP64 (&lt;SPAN style="font-size: 12px; line-height: 18px;"&gt;ARMA_64BIT_WORD)&lt;/SPAN&gt; code. &amp;nbsp;&lt;SPAN style="font-size: 1em; line-height: 1.5;"&gt;If I mix them, I get a core dump. &amp;nbsp;For most users, the default build (LP64) should work without problems.&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;

&lt;P&gt;Here is the test process. &amp;nbsp;I have also attached the complete code and executable for your reference.&amp;nbsp;&lt;/P&gt;

&lt;P&gt;&amp;nbsp;vi utils.h &amp;nbsp;(change back the line &amp;nbsp;&lt;SPAN style="font-size: 12px; line-height: 18px;"&gt;#define ARMA_64BIT_WORD in utils.h )&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;

&lt;P&gt;&amp;nbsp;&amp;gt;icc &amp;nbsp;-I/home/yhu5/armd_ilp64/armadillo-4.320.2/include -L/home/yhu5/armd_ilp64/armadillo-4.320.2 -o "mklprograms_iLP64" &amp;nbsp;testmklscscmm.cpp &amp;nbsp;-larmadillo &amp;nbsp;-g -O0 -g3 -Wall -Wextra -pedantic -DMKL_ILP64&lt;/P&gt;

&lt;P&gt;vi cscmmtest.cpp (change&amp;nbsp;int &amp;nbsp;cscmmtest(int argc, char *argv[]) to&amp;nbsp;&lt;SPAN style="line-height: 19.5120010375977px;"&gt;int &amp;nbsp;main(int argc, char *argv[]) for easier compilation)&lt;/SPAN&gt;&lt;/P&gt;

&lt;P&gt;&amp;gt;icc &amp;nbsp;-I/home/yhu5/armd_ilp64/armadillo-4.320.2/include -L/home/yhu5/armd_ilp64/armadillo-4.320.2 -o "Aprograms_iLP64" &amp;nbsp;cscmmtest.cpp matrix_market_file.cpp &amp;nbsp;-larmadillo &amp;nbsp;-g -O0 -g3 -Wall -Wextra -pedantic -DMKL_ILP64&lt;/P&gt;

&lt;P&gt;&amp;gt;&lt;/P&gt;

&lt;P&gt;[yhu5@snb01 armd_ilp64]$ ./mklprograms_iLP64&lt;/P&gt;

&lt;P&gt;&amp;nbsp;EXAMPLE PROGRAM FOR COMPRESSED SPARSE ROW FORMAT ROUTINES&lt;/P&gt;

&lt;P&gt;&amp;nbsp; &amp;nbsp;INPUT DATA FOR MKL_SCSCMM&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp;WITH TRIANGULAR MATRIX&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp; &amp;nbsp;M = 3 &amp;nbsp; N = 3&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp; &amp;nbsp;ALPHA = &amp;nbsp;1.0 &amp;nbsp;BETA = &amp;nbsp;0.0&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp; &amp;nbsp;TRANS = 'T'&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp;Input matrix&lt;BR /&gt;
	&amp;nbsp; 0.798 &amp;nbsp;0.335 &amp;nbsp;0.554&lt;BR /&gt;
	&amp;nbsp; 0.912 &amp;nbsp;0.768 &amp;nbsp;0.477&lt;BR /&gt;
	&amp;nbsp; 0.198 &amp;nbsp;0.278 &amp;nbsp;0.629&lt;/P&gt;

&lt;P&gt;&amp;nbsp; &amp;nbsp;OUTPUT DATA FOR MKL_SCSCMM&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp;WITH TRIANGULAR MATRIX&lt;BR /&gt;
	&amp;nbsp; 0.671 &amp;nbsp;0.282 &amp;nbsp;0.465&lt;BR /&gt;
	&amp;nbsp; 0.315 &amp;nbsp;0.132 &amp;nbsp;0.218&lt;BR /&gt;
	&amp;nbsp; 0.625 &amp;nbsp;0.263 &amp;nbsp;0.434&lt;/P&gt;

&lt;P&gt;&lt;SPAN style="font-size: 1em; line-height: 1.5;"&gt;[yhu5@snb01 armd_ilp64]$ ./Aprograms_iLP64 3 3 3&lt;/SPAN&gt;&lt;/P&gt;

&lt;P&gt;MKL Time : 0.00273395&lt;BR /&gt;
	CNorm=1.24836&lt;BR /&gt;
	Arma time:5.96046e-06&lt;BR /&gt;
	DNorm=1.24836&lt;BR /&gt;
	diffs=0x1&lt;BR /&gt;
	Input A:&lt;BR /&gt;
	[matrix size: 3x3; n_nonzero: 3; density: 33.33%]&lt;/P&gt;

&lt;P&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp;(0, 0) &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; 0.8402&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp; &amp;nbsp;(1, 0) &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; 0.3944&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp; &amp;nbsp;(2, 0) &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; 0.7831&lt;/P&gt;

&lt;P&gt;Input B:&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp;0.7984 &amp;nbsp; 0.3352 &amp;nbsp; 0.5540&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp;0.9116 &amp;nbsp; 0.7682 &amp;nbsp; 0.4774&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp;0.1976 &amp;nbsp; 0.2778 &amp;nbsp; 0.6289&lt;BR /&gt;
	output C:&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp;0.6708 &amp;nbsp; 0.2817 &amp;nbsp; 0.4654&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp;0.3149 &amp;nbsp; 0.1322 &amp;nbsp; 0.2185&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp;0.6253 &amp;nbsp; 0.2625 &amp;nbsp; 0.4338&lt;BR /&gt;
	ArmaD:&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp;0.6708 &amp;nbsp; 0.2817 &amp;nbsp; 0.4654&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp;0.3149 &amp;nbsp; 0.1322 &amp;nbsp; 0.2185&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp;0.6253 &amp;nbsp; 0.2625 &amp;nbsp; 0.4338&lt;BR /&gt;
	Recover A=C*inv(B)&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp;8.4019e-01 &amp;nbsp; 2.9802e-08 &amp;nbsp;-1.7881e-07&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp;3.9438e-01 &amp;nbsp; 1.4901e-08 &amp;nbsp;-8.9407e-08&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp;7.8310e-01 &amp;nbsp; 5.9605e-08 &amp;nbsp;-1.7881e-07&lt;/P&gt;

&lt;P&gt;Recover A=armaD*inv(B)&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp;8.4019e-01 &amp;nbsp; 2.9802e-08 &amp;nbsp;-1.7881e-07&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp;3.9438e-01 &amp;nbsp; 1.4901e-08 &amp;nbsp;-8.9407e-08&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp;7.8310e-01 &amp;nbsp; 5.9605e-08 &amp;nbsp;-1.7881e-07&lt;/P&gt;

&lt;P&gt;&amp;nbsp;&lt;/P&gt;

&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Mon, 22 Dec 2014 03:15:34 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/CSCMM-for-Armadillo-Sparse-dense-multiplications/m-p/1051330#M21177</guid>
      <dc:creator>Ying_H_Intel</dc:creator>
      <dc:date>2014-12-22T03:15:34Z</dc:date>
    </item>
    <item>
      <title>Attached the code and binary</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/CSCMM-for-Armadillo-Sparse-dense-multiplications/m-p/1051331#M21178</link>
      <description>&lt;P&gt;Attached the code and binary file&lt;/P&gt;</description>
      <pubDate>Mon, 22 Dec 2014 03:19:58 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/CSCMM-for-Armadillo-Sparse-dense-multiplications/m-p/1051331#M21178</guid>
      <dc:creator>Ying_H_Intel</dc:creator>
      <dc:date>2014-12-22T03:19:58Z</dc:date>
    </item>
  </channel>
</rss>

