<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic If your 3 property vectors in Intel® Moderncode for Parallel Architectures</title>
    <link>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/writing-data-structures-for-parallel-coding/m-p/1052576#M6818</link>
    <description>&lt;P&gt;If your 3 property vectors are preponderantly disorganized, then keeping them as a 3-property vector (your vector3_scalar) is fine.&lt;/P&gt;

&lt;P&gt;However, if you have a large number of points, and they interact with either external forces or internal forces, then consider placing each property of all the particles into its own array (one array per property):&lt;/P&gt;

&lt;P&gt;float px[nParticles];&lt;BR /&gt;
	float py[nParticles];&lt;BR /&gt;
	float pz[nParticles];&lt;/P&gt;

&lt;P&gt;&amp;nbsp;&lt;/P&gt;

&lt;P&gt;While this may seem to require more work when manipulating individual particles, the format is more suitable for vectorization by the compiler. With AVX, whose 256-bit registers hold 8 single-precision floats, some operations can process 8 particle properties in a single instruction.&lt;/P&gt;

&lt;P&gt;Jim Dempsey&lt;/P&gt;</description>
    <pubDate>Sun, 08 Nov 2015 12:58:57 GMT</pubDate>
    <dc:creator>jimdempseyatthecove</dc:creator>
    <dc:date>2015-11-08T12:58:57Z</dc:date>
    <item>
      <title>writing data structures for parallel coding.</title>
      <link>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/writing-data-structures-for-parallel-coding/m-p/1052575#M6817</link>
      <description>&lt;P&gt;&amp;nbsp;&lt;/P&gt;

&lt;P&gt;I initially wrote this question on &lt;A href="http://stackoverflow.com/questions/33570487/pod-structure-of-array"&gt;SO&lt;/A&gt;, but it seems nobody over there understands what I am asking. Searching on the net led me here, so I'll ask here instead.&lt;/P&gt;

&lt;P&gt;&amp;nbsp;&lt;/P&gt;

&lt;P&gt;I am trying to wrap my mind around SOA [structure of arrays] in C programming.&lt;/P&gt;

&lt;P&gt;I have some simple math functions, for which I have written what I believe are pretty decent scalar implementations.&lt;/P&gt;

&lt;P&gt;Here is a simple vector 3 data structure:&lt;/P&gt;

&lt;P&gt;&amp;nbsp; &amp;nbsp; struct vector3_scalar {&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; float p[3];&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp; };&lt;/P&gt;

&lt;P&gt;And here is a typical function that I have written to add two of these vector 3 data structures:&lt;/P&gt;

&lt;P&gt;&amp;nbsp; &amp;nbsp; struct vector3_scalar* vec3_add(struct vector3_scalar* out,&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; const struct vector3_scalar* a,&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; const struct vector3_scalar* b) {&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; out-&amp;gt;p[0] = a-&amp;gt;p[0] + b-&amp;gt;p[0];&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; out-&amp;gt;p[1] = a-&amp;gt;p[1] + b-&amp;gt;p[1];&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; out-&amp;gt;p[2] = a-&amp;gt;p[2] + b-&amp;gt;p[2];&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp;&amp;nbsp;&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; return out;&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp; }&lt;/P&gt;

&lt;P&gt;I know this simple data structure isn't padded correctly, but for the scalar version I just wanted to get something that worked before I started implementing other features.&lt;/P&gt;

&lt;P&gt;Now my question: is that structure, padding issues aside, a good way to set up the data?&lt;/P&gt;

&lt;P&gt;What about these instead?&lt;/P&gt;

&lt;P&gt;&amp;nbsp; &amp;nbsp; struct vector3_scalar {&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; float p[3];&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp; };&lt;/P&gt;

&lt;P&gt;&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp; struct vector3_scalar {&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; float px;&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; float py;&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; float pz;&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp; };&lt;/P&gt;

&lt;P&gt;or any other way that I could lay out the data. I personally don't mind flipping the data structures, since users of this math shouldn't have to go this low and mess with this code once it has been written and optimized. They should only need the higher-level functions, such as:&lt;/P&gt;

&lt;P&gt;&amp;nbsp; &amp;nbsp; vec3 *a = vec3_create(0, 1, 0);&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp; vec3 *b = vec3_create(1, 0, 0);&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp; vec3 *c = vec3_zero();&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp;&amp;nbsp;&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp; vec3* vec3_add(vec3* out, const vec3* in_a, const vec3* in_b);&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp;&amp;nbsp;&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp; c = vec3_add(c, a, b); // c == 1, 1, 0&lt;/P&gt;

&lt;P&gt;so that you can use the function nested inside another call or by itself.&lt;/P&gt;

&lt;P&gt;&amp;nbsp; &amp;nbsp; vec3 *d = vec3_create(10, 10, 10);&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp; vec3 *e = vec3_create(1, 1, 1);&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp; vec3 *f = vec3_zero();&lt;/P&gt;

&lt;P&gt;&amp;nbsp; &amp;nbsp; /* d = c + d = 11, 11, 10 */ &amp;nbsp;/* e = c + e = 2, 2, 1 */&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp; vec3_add(f, vec3_add(d, c, d), vec3_add(e, c, e)); /* f == 13, 13, 11 */&lt;/P&gt;

&lt;P&gt;&amp;nbsp; &amp;nbsp; vec3_free(a);&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp; ...&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp; vec3_free(f);&lt;/P&gt;

&lt;P&gt;So, as you can see from the public API, the underlying structures shouldn't really matter to anyone except the implementer.&lt;/P&gt;
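
&lt;P&gt;To make the point about the public API concrete, here is a minimal sketch of how those four functions might sit on top of the p[3] layout. The vec3 typedef, the heap allocation with malloc, and the NULL-on-failure behaviour are assumptions for illustration; the post only shows the call sites, not the implementation.&lt;/P&gt;

<![CDATA[
```c
#include <assert.h>
#include <stdlib.h>

/* Assumed: vec3 is a typedef for the p[3] struct shown in the post. */
typedef struct vector3_scalar {
    float p[3];
} vec3;

/* Allocate one vector on the heap; returns NULL if malloc fails. */
vec3 *vec3_create(float x, float y, float z)
{
    vec3 *v = malloc(sizeof *v);
    if (v != NULL) {
        v->p[0] = x;
        v->p[1] = y;
        v->p[2] = z;
    }
    return v;
}

/* Convenience constructor for (0, 0, 0). */
vec3 *vec3_zero(void)
{
    return vec3_create(0.0f, 0.0f, 0.0f);
}

/* Release a vector obtained from vec3_create or vec3_zero. */
void vec3_free(vec3 *v)
{
    free(v);
}

/* Component-wise sum written to out; out may alias a or b.
   Returning out is what allows calls to be nested. */
vec3 *vec3_add(vec3 *out, const vec3 *a, const vec3 *b)
{
    assert(out != NULL && a != NULL && b != NULL);
    for (int i = 0; i < 3; ++i) {
        out->p[i] = a->p[i] + b->p[i];
    }
    return out;
}
```
]]>

&lt;P&gt;Because callers only ever hold a vec3 pointer, swapping p[3] for px, py, pz would touch these function bodies and nothing else.&lt;/P&gt;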

&lt;P&gt;I have already written the basic scalar version using this data layout:&lt;/P&gt;

&lt;P&gt;&amp;nbsp; &amp;nbsp; struct vector3_scalar {&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; float p[3];&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp; };&lt;/P&gt;

&lt;P&gt;but I am open to changing that now that it works and seems stable enough for my taste.&lt;/P&gt;</description>
      <pubDate>Sat, 07 Nov 2015 03:44:46 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/writing-data-structures-for-parallel-coding/m-p/1052575#M6817</guid>
      <dc:creator>owen_h_</dc:creator>
      <dc:date>2015-11-07T03:44:46Z</dc:date>
    </item>
    <item>
      <title>If your 3 property vectors</title>
      <link>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/writing-data-structures-for-parallel-coding/m-p/1052576#M6818</link>
      <description>&lt;P&gt;If your 3 property vectors are preponderantly disorganized, then keeping them as a 3-property vector (your vector3_scalar) is fine.&lt;/P&gt;

&lt;P&gt;However, if you have a large number of points, and they interact with either external forces or internal forces, then consider placing each property of all the particles into its own array (one array per property):&lt;/P&gt;

&lt;P&gt;float px[nParticles];&lt;BR /&gt;
	float py[nParticles];&lt;BR /&gt;
	float pz[nParticles];&lt;/P&gt;

&lt;P&gt;&amp;nbsp;&lt;/P&gt;

&lt;P&gt;While this may seem to require more work when manipulating individual particles, the format is more suitable for vectorization by the compiler. With AVX, whose 256-bit registers hold 8 single-precision floats, some operations can process 8 particle properties in a single instruction.&lt;/P&gt;
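
&lt;P&gt;As a rough illustration of why this layout helps, here is a sketch of a position update over separate coordinate arrays. The function names, the velocity arrays and the dt parameter are hypothetical. Whether a given compiler actually vectorizes these loops depends on the compiler and its flags.&lt;/P&gt;

<![CDATA[
```c
#include <assert.h>
#include <stddef.h>

/* y[i] += a * x[i] over one contiguous array. The accesses are unit-stride,
   and restrict promises that x and y do not overlap, which removes a common
   obstacle to auto-vectorization. */
static void scaled_add(float *restrict y, const float *restrict x,
                       float a, size_t n)
{
    for (size_t i = 0; i < n; ++i) {
        y[i] += a * x[i];
    }
}

/* Advance n particle positions by velocity * dt, one property array at a time. */
void soa_advance(float *px, float *py, float *pz,
                 const float *vx, const float *vy, const float *vz,
                 float dt, size_t n)
{
    assert(px && py && pz && vx && vy && vz);
    scaled_add(px, vx, dt, n);
    scaled_add(py, vy, dt, n);
    scaled_add(pz, vz, dt, n);
}
```
]]>

&lt;P&gt;With the p[3]-per-particle layout the same update would step through memory three floats at a time, so eight consecutive x values would not sit next to each other for a single vector load.&lt;/P&gt;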

&lt;P&gt;Jim Dempsey&lt;/P&gt;</description>
      <pubDate>Sun, 08 Nov 2015 12:58:57 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/writing-data-structures-for-parallel-coding/m-p/1052576#M6818</guid>
      <dc:creator>jimdempseyatthecove</dc:creator>
      <dc:date>2015-11-08T12:58:57Z</dc:date>
    </item>
    <item>
      <title>Thank you for the reply,</title>
      <link>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/writing-data-structures-for-parallel-coding/m-p/1052577#M6819</link>
      <description>&lt;P&gt;Thank you for the reply. Currently these scalar vector3 structures are used for rendering, and I have solidified the code to the point where I think it would not be practical to do this just yet. Still, seeing it written out like this does give me some ideas and things to keep in mind as I keep going forward.&lt;/P&gt;

&lt;P&gt;&amp;nbsp;&lt;/P&gt;

&lt;P&gt;For example, let's say I want to deal with 512 of these particles. I would define the structure like this:&lt;/P&gt;

&lt;P&gt;struct particle {&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp; float px[512];&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp; float py[512];&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp; float pz[512];&lt;BR /&gt;
	};&lt;/P&gt;

&lt;P&gt;and use that in my calculations.&lt;/P&gt;

&lt;P&gt;The 512 would be the max value and if I use less I would be wasting memory, is that correct?&lt;/P&gt;
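
&lt;P&gt;A small sketch to check that intuition: the fixed-size struct reserves room for all 512 particles no matter how many are in use. The MAX_PARTICLES macro and the two helper functions are made up for this example. On a platform where float is 4 bytes, the struct takes 6144 bytes, so using only 100 particles (1200 bytes) would leave 4944 bytes idle.&lt;/P&gt;

<![CDATA[
```c
#include <assert.h>
#include <stddef.h>

#define MAX_PARTICLES 512   /* capacity fixed at compile time */

struct particle {
    float px[MAX_PARTICLES];
    float py[MAX_PARTICLES];
    float pz[MAX_PARTICLES];
};

/* Bytes one struct occupies, regardless of how many slots hold live data. */
size_t particle_bytes_reserved(void)
{
    return sizeof(struct particle);
}

/* Bytes that actually hold live data when only `used` particles are active. */
size_t particle_bytes_used(size_t used)
{
    assert(used <= MAX_PARTICLES);
    return 3 * used * sizeof(float);
}
```
]]>

&lt;P&gt;The difference between the two values is the memory that sits unused whenever the particle count is below the maximum.&lt;/P&gt;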

&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Sun, 08 Nov 2015 13:48:42 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/writing-data-structures-for-parallel-coding/m-p/1052577#M6819</guid>
      <dc:creator>owen_h_</dc:creator>
      <dc:date>2015-11-08T13:48:42Z</dc:date>
    </item>
    <item>
      <title>struct ParticleXYZ_t</title>
      <link>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/writing-data-structures-for-parallel-coding/m-p/1052578#M6820</link>
      <description>&lt;P&gt;struct ParticleXYZ_t&lt;BR /&gt;
	{&lt;BR /&gt;
	&amp;nbsp; float* x;&lt;BR /&gt;
	&amp;nbsp; float* y;&lt;BR /&gt;
	&amp;nbsp; float* z;&lt;BR /&gt;
	&amp;nbsp; ParticleXYZ_t() { x = y = z = NULL; }&lt;BR /&gt;
	&amp;nbsp; // allocate one array of n floats per coordinate&lt;BR /&gt;
	&amp;nbsp; void init(size_t n)&lt;BR /&gt;
	&amp;nbsp; {&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp; x = new float[n];&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp; y = new float[n];&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp; z = new float[n];&lt;BR /&gt;
	&amp;nbsp; }&lt;BR /&gt;
	};&lt;/P&gt;

&lt;P&gt;struct ParticleBag_t&lt;BR /&gt;
	{&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp; ParticleXYZ_t pos;&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp; ParticleXYZ_t vel;&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp; ParticleXYZ_t acc;&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp; ... // other properties here&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp; ParticleBag_t(size_t n)&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp; {&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; pos.init(n);&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; vel.init(n);&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; acc.init(n);&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; ... // init other properties here&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp; }&lt;BR /&gt;
	};&lt;/P&gt;
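
&lt;P&gt;Since the original question was about C, here is a rough C counterpart of the structure above, with the array length chosen at run time instead of fixed at 512. The names particle_xyz, particle_xyz_init and particle_xyz_free are invented for this sketch, and calloc is used so the arrays start out zeroed (which reads back as 0.0f on IEEE 754 platforms).&lt;/P&gt;

<![CDATA[
```c
#include <assert.h>
#include <stdlib.h>

/* One heap-allocated array per coordinate, each sized at run time. */
struct particle_xyz {
    float *x;
    float *y;
    float *z;
};

/* Release all three arrays and reset the pointers; safe on partial init. */
void particle_xyz_free(struct particle_xyz *p)
{
    assert(p != NULL);
    free(p->x);
    free(p->y);
    free(p->z);
    p->x = p->y = p->z = NULL;
}

/* Allocate n zeroed floats per coordinate.
   Returns 0 on success, -1 if any allocation fails (nothing is leaked). */
int particle_xyz_init(struct particle_xyz *p, size_t n)
{
    assert(p != NULL && n > 0);
    p->x = calloc(n, sizeof *p->x);
    p->y = calloc(n, sizeof *p->y);
    p->z = calloc(n, sizeof *p->z);
    if (p->x == NULL || p->y == NULL || p->z == NULL) {
        particle_xyz_free(p);
        return -1;
    }
    return 0;
}
```
]]>

&lt;P&gt;A bag of properties would then hold one particle_xyz each for position, velocity and acceleration, all initialised with the same n, so only as much memory as the particle count requires is allocated.&lt;/P&gt;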

&lt;P&gt;Jim Dempsey&lt;/P&gt;</description>
      <pubDate>Sun, 08 Nov 2015 23:47:35 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/writing-data-structures-for-parallel-coding/m-p/1052578#M6820</guid>
      <dc:creator>jimdempseyatthecove</dc:creator>
      <dc:date>2015-11-08T23:47:35Z</dc:date>
    </item>
    <item>
      <title>That code sample was really</title>
      <link>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/writing-data-structures-for-parallel-coding/m-p/1052579#M6821</link>
      <description>&lt;P&gt;That code sample was really helpful. It's quite a lot to take in, but for now I have a place to start writing code and solidifying this in my mind. Thank you.&lt;/P&gt;

&lt;P&gt;&amp;nbsp;&lt;/P&gt;

&lt;P&gt;If I run into complications I'll add to this later, and if I write something I'll add it here as well.&lt;/P&gt;</description>
      <pubDate>Tue, 10 Nov 2015 06:16:52 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/writing-data-structures-for-parallel-coding/m-p/1052579#M6821</guid>
      <dc:creator>owen_h_</dc:creator>
      <dc:date>2015-11-10T06:16:52Z</dc:date>
    </item>
    <item>
      <title>SOA data type organisation is</title>
      <link>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/writing-data-structures-for-parallel-coding/m-p/1052580#M6822</link>
      <description>&lt;P&gt;SOA data organisation, as in your example, is the preferred way to write this kind of code, mainly because of its caching and data prefetching behaviour.&lt;/P&gt;</description>
      <pubDate>Tue, 15 Dec 2015 18:45:10 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/writing-data-structures-for-parallel-coding/m-p/1052580#M6822</guid>
      <dc:creator>Bernard</dc:creator>
      <dc:date>2015-12-15T18:45:10Z</dc:date>
    </item>
  </channel>
</rss>

