<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Here are the constructors and in Software Archive</title>
    <link>https://community.intel.com/t5/Software-Archive/Seg-fault-when-using-m512-mul-pd/m-p/1037405#M44532</link>
    <description>&lt;P&gt;Here are the constructors and some test code I have. I compiled using&lt;/P&gt;

&lt;PRE class="brush:bash;"&gt;icpc -g -mmic -o Test Test.cpp Matrix.cpp&lt;/PRE&gt;

&lt;P&gt;Let me know how it goes for you.&lt;/P&gt;</description>
    <pubDate>Wed, 11 Jun 2014 14:19:22 GMT</pubDate>
    <dc:creator>Keegan_S_</dc:creator>
    <dc:date>2014-06-11T14:19:22Z</dc:date>
    <item>
      <title>Seg. fault when using _m512_mul_pd</title>
      <link>https://community.intel.com/t5/Software-Archive/Seg-fault-when-using-m512-mul-pd/m-p/1037395#M44522</link>
      <description>&lt;P&gt;I'm an undergrad student working on some code that is supposed to do simple matrix multiplication. The professor I'm working for has purchased a MIC, and I'm trying to get things working using some of the C++ intrinsics. In particular, I'm using _mm512_mul_pd to multiply two vectors together and storing the result in another vector. However, whenever any code accesses the variable that holds the result of the multiplication, I get a seg. fault. Any ideas why this is happening and what I can do to fix it?&lt;/P&gt;

&lt;P&gt;Here are the lines of code I'm talking about:&lt;/P&gt;

&lt;P&gt;__m512d tmp = _mm512_mul_pd(matrix[0].v, m.matrix[0].v); //works fine&lt;/P&gt;

&lt;P&gt;return Matrix(tmp); // causes seg fault&lt;/P&gt;</description>
      <pubDate>Tue, 10 Jun 2014 12:58:39 GMT</pubDate>
      <guid>https://community.intel.com/t5/Software-Archive/Seg-fault-when-using-m512-mul-pd/m-p/1037395#M44522</guid>
      <dc:creator>Keegan_S_</dc:creator>
      <dc:date>2014-06-10T12:58:39Z</dc:date>
    </item>
    <item>
      <title>is your data aligned on 64</title>
      <link>https://community.intel.com/t5/Software-Archive/Seg-fault-when-using-m512-mul-pd/m-p/1037396#M44523</link>
      <description>&lt;P&gt;Is your data aligned on a 64-byte boundary?&lt;/P&gt;

&lt;P&gt;I think you should post more of your code. It is unclear to me what "return Matrix(tmp)" does.&lt;/P&gt;</description>
      <pubDate>Tue, 10 Jun 2014 16:21:25 GMT</pubDate>
      <guid>https://community.intel.com/t5/Software-Archive/Seg-fault-when-using-m512-mul-pd/m-p/1037396#M44523</guid>
      <dc:creator>Patrick_S_</dc:creator>
      <dc:date>2014-06-10T16:21:25Z</dc:date>
    </item>
    <item>
      <title>In the header file I have a</title>
      <link>https://community.intel.com/t5/Software-Archive/Seg-fault-when-using-m512-mul-pd/m-p/1037397#M44524</link>
      <description>&lt;P&gt;In the header file I have a union, defined as follows, which I use to hold the matrix data so that it is 64-byte aligned.&lt;/P&gt;

&lt;PRE class="brush:cpp;"&gt;union dvec{
     __m512d v;
     double d[8];
};&lt;/PRE&gt;

&lt;PRE class="brush:cpp;"&gt;union dvec matrix[5] __attribute__((align(64)));&lt;/PRE&gt;

&lt;P&gt;I then have one constructor that takes doubles and another that takes a __m512d to create a Matrix object; the latter is what Matrix(tmp) in my first post should be calling.&lt;/P&gt;

&lt;P&gt;The function where the seg. fault occurs takes a single Matrix object as a parameter and then tries to multiply the two matrices together.&lt;/P&gt;

&lt;PRE class="brush:cpp;"&gt;Matrix Matrix::Multiply(const Matrix&amp;amp; m){
     union dvec tmp __attribute__((align(64)));
     tmp = _mm512_setzero_pd();
     tmp = _mm512_mul_pd(matrix[0].v, m.matrix[0].v);
     return Matrix(tmp);
}&lt;/PRE&gt;

&lt;P&gt;I realize that as I have it written now it doesn't actually multiply the matrices together by the standard definition of matrix multiplication. This is more of just a test at the moment to make sure I can get the multiply intrinsic to work correctly. Also the reason I declare matrix to be an array of 5 union dvec types is so that I can work with larger matrices than those that just hold 8 elements.&lt;/P&gt;</description>
      <pubDate>Tue, 10 Jun 2014 16:59:40 GMT</pubDate>
      <guid>https://community.intel.com/t5/Software-Archive/Seg-fault-when-using-m512-mul-pd/m-p/1037397#M44524</guid>
      <dc:creator>Keegan_S_</dc:creator>
      <dc:date>2014-06-10T16:59:40Z</dc:date>
    </item>
    <item>
      <title>you should try</title>
      <link>https://community.intel.com/t5/Software-Archive/Seg-fault-when-using-m512-mul-pd/m-p/1037398#M44525</link>
      <description>&lt;P&gt;You should try:&lt;/P&gt;

&lt;PRE class="brush:cpp;"&gt;__declspec(align(64)) union dvec {

        __m512d v;
        double d[8];
};&lt;/PRE&gt;

&lt;P&gt;However, I think you shouldn't use unions. There are better ways to do this, e.g.&lt;/P&gt;

&lt;PRE class="brush:cpp;"&gt;__m512d m[8];&lt;/PRE&gt;

&lt;P&gt;as a private member of the class. The matrix constructor that takes doubles then has to use&lt;/P&gt;

&lt;PRE class="brush:cpp;"&gt;m[0] = _mm512_set_pd( 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0 );&lt;/PRE&gt;

&lt;P&gt;Referring to your "Multiply" method:&lt;/P&gt;

&lt;PRE class="brush:cpp;"&gt;tmp = _mm512_setzero_pd();
&lt;/PRE&gt;

&lt;P&gt;is not needed. Also, the union tmp holds only one __m512d. I guess your constructor needs more than one __m512d as input?&lt;/P&gt;</description>
      <pubDate>Tue, 10 Jun 2014 18:52:45 GMT</pubDate>
      <guid>https://community.intel.com/t5/Software-Archive/Seg-fault-when-using-m512-mul-pd/m-p/1037398#M44525</guid>
      <dc:creator>Patrick_S_</dc:creator>
      <dc:date>2014-06-10T18:52:45Z</dc:date>
    </item>
    <item>
      <title>btw you should have a look</title>
      <link>https://community.intel.com/t5/Software-Archive/Seg-fault-when-using-m512-mul-pd/m-p/1037399#M44526</link>
      <description>&lt;P&gt;By the way, you should have a look at a vector class library, e.g. this one:&lt;/P&gt;

&lt;P&gt;&lt;A href="http://www.agner.org/optimize/vectorclass.zip" rel="noreferrer"&gt;http://www.agner.org/optimize/vectorclass.zip&lt;/A&gt;&lt;/P&gt;

&lt;P&gt;The class Vec8f is located in vectorf256.h. Keep in mind that this is an AVX library, so you can use it only as inspiration.&lt;/P&gt;</description>
      <pubDate>Tue, 10 Jun 2014 19:13:27 GMT</pubDate>
      <guid>https://community.intel.com/t5/Software-Archive/Seg-fault-when-using-m512-mul-pd/m-p/1037399#M44526</guid>
      <dc:creator>Patrick_S_</dc:creator>
      <dc:date>2014-06-10T19:13:27Z</dc:date>
    </item>
    <item>
      <title>If I get rid of the union how</title>
      <link>https://community.intel.com/t5/Software-Archive/Seg-fault-when-using-m512-mul-pd/m-p/1037400#M44527</link>
      <description>&lt;P&gt;If I get rid of the union, how do I access individual elements of the vector for printing and similar things?&lt;/P&gt;

&lt;P&gt;Sorry if this is something basic that I should know. I'm still new to this, and the only example code I have to go on uses unions to access the individual elements.&lt;/P&gt;</description>
      <pubDate>Tue, 10 Jun 2014 19:19:55 GMT</pubDate>
      <guid>https://community.intel.com/t5/Software-Archive/Seg-fault-when-using-m512-mul-pd/m-p/1037400#M44527</guid>
      <dc:creator>Keegan_S_</dc:creator>
      <dc:date>2014-06-10T19:19:55Z</dc:date>
    </item>
    <item>
      <title>You should avoid to access</title>
      <link>https://community.intel.com/t5/Software-Archive/Seg-fault-when-using-m512-mul-pd/m-p/1037401#M44528</link>
      <description>&lt;P&gt;You should avoid accessing individual elements of a vector register. This is a very expensive operation, because there is no corresponding assembler instruction. Use masked vector operations instead, e.g.&lt;/P&gt;

&lt;P&gt;add two __m512d values, but only in the first 4 elements:&lt;/P&gt;

&lt;PRE class="brush:cpp;"&gt;__m512d a, b;

a = _mm512_mask_add_pd( a, 0x0F, a, b);


// equivalent C operation

double a[8], b[8];

a[0] = a[0] + b[0]
a[1] = a[1] + b[1]
a[2] = a[2] + b[2]
a[3] = a[3] + b[3]
a[4] = a[4]
a[5] = a[5]
a[6] = a[6]
a[7] = a[7]&lt;/PRE&gt;

&lt;P&gt;&lt;SPAN style="font-size: 1em; line-height: 1.5;"&gt;If you really want to access one element you can cast a __m512d type to a double pointer (__m512d is a union).&lt;/SPAN&gt;&lt;/P&gt;

&lt;PRE class="brush:cpp;"&gt;__m512d a;

// print last element
std::cout &amp;lt;&amp;lt; ((double*)&amp;amp;a)[0] &amp;lt;&amp;lt; std::endl;
&lt;/PRE&gt;

&lt;P&gt;&lt;SPAN style="font-size: 1em; line-height: 1.5;"&gt;However, you shouldn't do that in&amp;nbsp;&lt;/SPAN&gt;computationally&lt;SPAN style="font-size: 1em; line-height: 1.5;"&gt;&amp;nbsp;intensive parts of your program.&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Tue, 10 Jun 2014 20:46:00 GMT</pubDate>
      <guid>https://community.intel.com/t5/Software-Archive/Seg-fault-when-using-m512-mul-pd/m-p/1037401#M44528</guid>
      <dc:creator>Patrick_S_</dc:creator>
      <dc:date>2014-06-10T20:46:00Z</dc:date>
    </item>
    <item>
      <title>So I'm trying to work with</title>
      <link>https://community.intel.com/t5/Software-Archive/Seg-fault-when-using-m512-mul-pd/m-p/1037402#M44529</link>
      <description>&lt;P&gt;So I'm trying to work without the union, and I'm getting a seg. fault from the _mm512_set_pd function. The code I have is as follows:&lt;/P&gt;

&lt;P&gt;In the header&lt;/P&gt;

&lt;PRE class="brush:cpp;"&gt;private:
     __m512d matrix[5];&lt;/PRE&gt;

&lt;P&gt;and in the .cpp file&lt;/P&gt;

&lt;PRE class="brush:cpp;"&gt;Matrix::Matrix(double d0, double d1, double d2, double d3,
                       double d4, double d5, double d6, double d7, ... more doubles){
     matrix[0] = _mm512_set_pd(d0,d1,d2,d3,d4,d5,d6,d7);  // line where seg. fault occurs
     // try to set values for the other 4 indices but the code doesn't get this far
}&lt;/PRE&gt;

&lt;P&gt;I'm not sure why this would give me a seg. fault. Isn't __m512d 64-byte aligned by definition?&lt;/P&gt;</description>
      <pubDate>Tue, 10 Jun 2014 21:47:03 GMT</pubDate>
      <guid>https://community.intel.com/t5/Software-Archive/Seg-fault-when-using-m512-mul-pd/m-p/1037402#M44529</guid>
      <dc:creator>Keegan_S_</dc:creator>
      <dc:date>2014-06-10T21:47:03Z</dc:date>
    </item>
    <item>
      <title>Insert a sanity test to</title>
      <link>https://community.intel.com/t5/Software-Archive/Seg-fault-when-using-m512-mul-pd/m-p/1037403#M44530</link>
      <description>&lt;P&gt;Insert a sanity test to assert that the things you think are aligned really are aligned. If they aren't, figure out why and how to fix it.&lt;/P&gt;
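
&lt;P&gt;A minimal sketch of such a sanity test, assuming a C++11 compiler. The names is_aligned, Row and MatrixLike are hypothetical stand-ins and not part of the code posted in this thread; the real class would hold __m512d members instead of plain doubles.&lt;/P&gt;

<![CDATA[
```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: true when p sits on an `alignment`-byte boundary.
inline bool is_aligned(const void* p, std::size_t alignment = 64) {
    return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}

// Stand-in for one 512-bit vector: 8 doubles that require 64-byte alignment.
struct alignas(64) Row {
    double d[8];
};

// Stand-in for the Matrix class: five 64-byte rows.
struct MatrixLike {
    Row matrix[5];

    MatrixLike() {
        // Fails loudly if the object was constructed at an address that does
        // not honour the 64-byte requirement (for example, storage obtained
        // from an allocator that only guarantees 8- or 16-byte alignment).
        assert(is_aligned(this));
    }
};
```
]]>

&lt;P&gt;With a check like this in the constructor, a misaligned object is reported at the point of construction rather than at the first aligned vector store.&lt;/P&gt;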

&lt;P&gt;Also: &lt;FONT face="Courier New"&gt;Matrix Matrix::Multiply&lt;/FONT&gt; may end up returning its value on the stack (which may not be aligned).&lt;/P&gt;

&lt;P&gt;Jim Dempsey&lt;/P&gt;</description>
      <pubDate>Tue, 10 Jun 2014 21:54:38 GMT</pubDate>
      <guid>https://community.intel.com/t5/Software-Archive/Seg-fault-when-using-m512-mul-pd/m-p/1037403#M44530</guid>
      <dc:creator>jimdempseyatthecove</dc:creator>
      <dc:date>2014-06-10T21:54:38Z</dc:date>
    </item>
    <item>
      <title>The constructor should like</title>
      <link>https://community.intel.com/t5/Software-Archive/Seg-fault-when-using-m512-mul-pd/m-p/1037404#M44531</link>
      <description>&lt;P&gt;The constructor should work as you posted it.&lt;/P&gt;

&lt;PRE class="brush:cpp;"&gt;Matrix::Matrix(double d0, double d1, double d2, double d3,
                       double d4, double d5, double d6, double d7) {

     matrix[0] = _mm512_set_pd(d0,d1,d2,d3,d4,d5,d6,d7);  
}&lt;/PRE&gt;

&lt;P&gt;I guess you have a problem within your class implementation. Can you post a program that I can compile?&lt;/P&gt;

&lt;P&gt;And yes, you are correct about the alignment of the __m512d type. It is always aligned and should never cause a seg fault, even as a return value. Here is the corresponding Intel implementation:&lt;/P&gt;

&lt;PRE class="brush:cpp;"&gt;#ifdef __INTEL_CLANG_COMPILER

typedef float   __m512  __attribute__((__vector_size__(64)));
typedef double  __m512d __attribute__((__vector_size__(64)));
typedef __int64 __m512i __attribute__((__vector_size__(64)));

#else
#if !defined(__INTEL_COMPILER) &amp;amp;&amp;amp; defined(_MSC_VER)
# define _MM512INTRIN_TYPE(X) __declspec(intrin_type)
#else
# define _MM512INTRIN_TYPE(X) _MMINTRIN_TYPE(X)
#endif


typedef union _MM512INTRIN_TYPE(64) __m512 {
    float       __m512_f32[16];
} __m512;

typedef union _MM512INTRIN_TYPE(64) __m512d {
    double      __m512d_f64[8];
} __m512d;

typedef union _MM512INTRIN_TYPE(64) __m512i {
    int         __m512i_i32[16];
} __m512i;

#endif /* __INTEL_CLANG_COMPILER */&lt;/PRE&gt;</description>
      <pubDate>Tue, 10 Jun 2014 21:58:00 GMT</pubDate>
      <guid>https://community.intel.com/t5/Software-Archive/Seg-fault-when-using-m512-mul-pd/m-p/1037404#M44531</guid>
      <dc:creator>Patrick_S_</dc:creator>
      <dc:date>2014-06-10T21:58:00Z</dc:date>
    </item>
    <item>
      <title>Here are the constructors and</title>
      <link>https://community.intel.com/t5/Software-Archive/Seg-fault-when-using-m512-mul-pd/m-p/1037405#M44532</link>
      <description>&lt;P&gt;Here are the constructors and some test code I have. I compiled using&lt;/P&gt;

&lt;PRE class="brush:bash;"&gt;icpc -g -mmic -o Test Test.cpp Matrix.cpp&lt;/PRE&gt;

&lt;P&gt;Let me know how it goes for you.&lt;/P&gt;</description>
      <pubDate>Wed, 11 Jun 2014 14:19:22 GMT</pubDate>
      <guid>https://community.intel.com/t5/Software-Archive/Seg-fault-when-using-m512-mul-pd/m-p/1037405#M44532</guid>
      <dc:creator>Keegan_S_</dc:creator>
      <dc:date>2014-06-11T14:19:22Z</dc:date>
    </item>
    <item>
      <title>I think you will find:</title>
      <link>https://community.intel.com/t5/Software-Archive/Seg-fault-when-using-m512-mul-pd/m-p/1037406#M44533</link>
      <description>&lt;P&gt;I think you will find:&lt;/P&gt;

&lt;P&gt;Matrix* m1 = new Matrix(1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25);&lt;/P&gt;

&lt;P&gt;did not return a 64-byte-aligned object.&lt;/P&gt;

&lt;P&gt;You could consider using "placement new".&lt;/P&gt;

&lt;P&gt;Matrix* m1 = new(_mm_malloc(sizeof(Matrix), CACHE_LINE_SIZE)) Matrix;&lt;/P&gt;

&lt;P&gt;...&lt;/P&gt;

&lt;P&gt;_mm_free(m1);&lt;/P&gt;

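&lt;P&gt;Caution: do not use delete on an object allocated with _mm_malloc. If Matrix has a non-trivial destructor, call it explicitly (m1-&amp;gt;~Matrix();) before _mm_free. You could also overload new for objects of Matrix type.&lt;/P&gt;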

&lt;P&gt;The above did not test for allocation failure.&lt;/P&gt;
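
&lt;P&gt;The option of overloading new for the class could look roughly like the sketch below. This is only an illustration under assumptions: AlignedMatrix is a hypothetical stand-in for the Matrix class, and posix_memalign/free (POSIX) are used here in place of _mm_malloc/_mm_free so that the sketch builds without the Intel headers. Unlike the placement-new lines above, it does check for allocation failure.&lt;/P&gt;

<![CDATA[
```cpp
#include <cstddef>
#include <new>
#include <stdlib.h>  // posix_memalign, free (POSIX, assumed available)

// Hypothetical stand-in for the Matrix class discussed in this thread.
struct AlignedMatrix {
    double d[5][8];  // in the thread this would be __m512d matrix[5]

    // Class-specific operator new: every `new AlignedMatrix` goes through
    // here and receives storage that starts on a 64-byte boundary.
    static void* operator new(std::size_t size) {
        void* p = nullptr;
        if (posix_memalign(&p, 64, size) != 0) {
            throw std::bad_alloc();  // report allocation failure the usual C++ way
        }
        return p;
    }

    // Matching operator delete, so a plain `delete` releases the storage
    // with the deallocation function that pairs with posix_memalign.
    static void operator delete(void* p) noexcept {
        free(p);
    }
};
```
]]>

&lt;P&gt;With this in place, calling code can keep using ordinary new and delete, and the allocate/free pairing lives in one spot inside the class.&lt;/P&gt;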

&lt;P&gt;Jim Dempsey&lt;/P&gt;</description>
      <pubDate>Wed, 11 Jun 2014 16:56:36 GMT</pubDate>
      <guid>https://community.intel.com/t5/Software-Archive/Seg-fault-when-using-m512-mul-pd/m-p/1037406#M44533</guid>
      <dc:creator>jimdempseyatthecove</dc:creator>
      <dc:date>2014-06-11T16:56:36Z</dc:date>
    </item>
    <item>
      <title>jim is correct.</title>
      <link>https://community.intel.com/t5/Software-Archive/Seg-fault-when-using-m512-mul-pd/m-p/1037407#M44534</link>
      <description>&lt;P&gt;Jim is correct.&lt;/P&gt;

&lt;PRE class="brush:cpp;"&gt;Matrix* m1 = new(_mm_malloc( sizeof(Matrix), 64 ))Matrix(1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25);

&lt;SPAN style="font-size: 1em; line-height: 1.5;"&gt;cout &amp;lt;&amp;lt; "Matrix m1 initialized!" &amp;lt;&amp;lt; endl;&lt;/SPAN&gt;

_mm_free(m1);&lt;/PRE&gt;

&lt;P&gt;This works; I tested it. However, I'm not sure what happens when you allocate a __m512 type on the heap and then access it via your class interface. Does anyone know?&lt;/P&gt;

&lt;P&gt;I would recommend allocating your matrix array like this:&lt;/P&gt;

&lt;PRE class="brush:cpp;"&gt;double *m = (double*)_mm_malloc( 8*5 * sizeof(double), 64 );  // cast needed in C++&lt;/PRE&gt;

&lt;P&gt;and whenever you want to access those elements, use&lt;/P&gt;

&lt;PRE class="brush:cpp;"&gt;__m512d v = _mm512_load_pd( m );     // loads the first 8 doubles
__m512d w = _mm512_load_pd( m + 8 ); // loads the next 8 doubles&lt;/PRE&gt;</description>
      <pubDate>Wed, 11 Jun 2014 17:09:06 GMT</pubDate>
      <guid>https://community.intel.com/t5/Software-Archive/Seg-fault-when-using-m512-mul-pd/m-p/1037407#M44534</guid>
      <dc:creator>Patrick_S_</dc:creator>
      <dc:date>2014-06-11T17:09:06Z</dc:date>
    </item>
    <item>
      <title>Yep that fixed it!</title>
      <link>https://community.intel.com/t5/Software-Archive/Seg-fault-when-using-m512-mul-pd/m-p/1037408#M44535</link>
      <description>&lt;P&gt;Yep that fixed it!&lt;/P&gt;

&lt;P&gt;Thanks for the help!&lt;/P&gt;</description>
      <pubDate>Wed, 11 Jun 2014 18:44:34 GMT</pubDate>
      <guid>https://community.intel.com/t5/Software-Archive/Seg-fault-when-using-m512-mul-pd/m-p/1037408#M44535</guid>
      <dc:creator>Keegan_S_</dc:creator>
      <dc:date>2014-06-11T18:44:34Z</dc:date>
    </item>
  </channel>
</rss>

