Intel® Moderncode for Parallel Architectures

writing data structures for parallel coding.

owen_h_ — Sat, 07 Nov 2015 03:44:46 GMT

I initially wrote this question on SO but it seems nobody over there understands what I am asking. Searching on the net led me here, so I'll ask here instead.

I am trying to wrap my mind around SoA [structure of arrays] in C programming.

I have some simple math functions for which I have written what I believe are pretty decent scalar implementations.

here is a simple vector 3 data structure

struct vector3_scalar {
float p[3];
};

and a typical function that I have written to add two of these vector 3 data structures.

struct vector3_scalar* vec3_add(struct vector3_scalar* out,
                                const struct vector3_scalar* a,
                                const struct vector3_scalar* b) {
    out->p[0] = a->p[0] + b->p[0];
    out->p[1] = a->p[1] + b->p[1];
    out->p[2] = a->p[2] + b->p[2];

    return out;
}

I know this simple data structure isn't padded correctly but for scalar I just wanted to get something that worked first before I started implementing other features.

now my question: is that structure, padding issues aside, a good way to set up the data?

what about these instead?

struct vector3_scalar {
float p[3];
};

struct vector3_scalar {
float px;
float py;
float pz;
};

or any other way that I could lay out the data. I personally don't mind changing the data structures, since users of this math library shouldn't have to go this low and mess with this code once it has been written and optimized. They should only need the higher level functions, such as:

vec3 *a = vec3_create(0, 1, 0);
vec3 *b = vec3_create(1, 0, 0);
vec3 *c = vec3_zero();

vec3* vec3_add(vec3* out, const vec3* in_a, const vec3* in_b);

c = vec3_add(c, a, b); // c == 1, 1, 0

so that you can use the function inline or by itself.

vec3 *d = vec3_create(10, 10, 10);
vec3 *e = vec3_create(1, 1, 1);
vec3 *f = vec3_zero();

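vec3 *t0 = vec3_zero(); /* temporaries for the two intermediate sums */
vec3 *t1 = vec3_zero();

/* c + d = 11, 11, 10 */ /* c + e = 2, 2, 1 */
vec3_add(f, vec3_add(t0, c, d), vec3_add(t1, c, e));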

vec3_free(a);
...
vec3_free(f);

so as you can see from the public api, the underlying structures shouldn't really matter except for the implementer.

I already wrote the basic scalar version using a data layout like this:

struct vector3_scalar {
float p[3];
};

but I am open to changing that now that it works and seems stable enough for my taste.

jimdempseyatthecove — Sun, 08 Nov 2015 12:58:57 GMT

If your 3 property vectors are preponderantly disorganized, then keeping them as a 3-property vector (your vector3_scalar) is fine.

However, if you have a large number of points, and they interact with either external forces or internal forces, then consider placing each property of all the particles into its own array:

float px[nParticles];
float py[nParticles];
float pz[nParticles];

While this may seem to require more work when manipulating individual particles, the format is more suitable for vectorization by the compiler. Some operations on a system with AVX could process 8 particle properties in a single instruction.

Jim Dempsey
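
To make the layout above concrete, here is a minimal C sketch of a vec3-style add over separate component arrays. The function name vec3_add_soa and its parameter names are hypothetical, not from the thread, and the restrict qualifiers assume the output arrays never overlap the inputs. A loop of this shape is the kind a compiler can usually auto-vectorize, though whether it actually does depends on the compiler and flags.

```c
#include <stddef.h>

/* Hypothetical helper (not from the thread): adds n 3D vectors stored in
   SoA form, one array per component. Assumes the output arrays do not
   overlap any of the input arrays. */
void vec3_add_soa(size_t n,
                  float *restrict ox, float *restrict oy, float *restrict oz,
                  const float *ax, const float *ay, const float *az,
                  const float *bx, const float *by, const float *bz)
{
    /* Each statement walks contiguous floats, so with 256-bit AVX
       registers a compiler may handle 8 elements per instruction. */
    for (size_t i = 0; i < n; ++i) {
        ox[i] = ax[i] + bx[i];
        oy[i] = ay[i] + by[i];
        oz[i] = az[i] + bz[i];
    }
}
```

The public vec3 API can stay as it is for one-off vectors; a batch routine like this would only be used where many vectors are processed together.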

owen_h_ — Sun, 08 Nov 2015 13:48:42 GMT

Thank you for the reply. Currently these scalar vector3 structures are used for rendering, and I have solidified the code to the point where I think it would not be practical to change it just yet. Still, seeing it written out like this gives me some ideas and things to keep in mind as I keep going forward.

For example, let's say I want to deal with 512 of these particles. I would define the structure like this:

struct particle {
    float px[512];
    float py[512];
    float pz[512];
};

and use that in my calculations.

The 512 would be the maximum, and if I use fewer I would be wasting memory, is that correct?

jimdempseyatthecove — Sun, 08 Nov 2015 23:47:35 GMT

struct ParticleXYZ_t
{
    float* x;
    float* y;
    float* z;
    ParticleXYZ_t() { x = y = z = NULL; }
    // Allocate one array of n floats per component.
    void init(size_t n)
    {
        x = new float[n];
        y = new float[n];
        z = new float[n];
    }
};

struct ParticleBag_t
{
    ParticleXYZ_t pos;
    ParticleXYZ_t vel;
    ParticleXYZ_t acc;
    ... // other properties here
    ParticleBag_t(size_t n)
    {
        pos.init(n);
        vel.init(n);
        acc.init(n);
        ... // init other properties here
    }
};

Jim Dempsey
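
Since the original question was about C, here is a rough C counterpart of ParticleXYZ_t, sized at run time so that no fixed maximum such as 512 is needed. This is only a sketch: the names particle_xyz, particle_xyz_init, particle_xyz_free and the count field are hypothetical, and n is assumed to be nonzero.

```c
#include <stdlib.h>

/* Hypothetical C counterpart of ParticleXYZ_t: one heap array per component. */
struct particle_xyz {
    float *x;
    float *y;
    float *z;
    size_t count;   /* number of particles the arrays can hold */
};

/* Releases the arrays and resets the struct to an empty state. */
void particle_xyz_free(struct particle_xyz *p)
{
    free(p->x);     /* free(NULL) is a no-op, so partial inits are fine */
    free(p->y);
    free(p->z);
    p->x = p->y = p->z = NULL;
    p->count = 0;
}

/* Allocates n zero-initialised floats per component (n assumed > 0).
   Returns 0 on success, -1 if any allocation fails. */
int particle_xyz_init(struct particle_xyz *p, size_t n)
{
    p->x = calloc(n, sizeof *p->x);
    p->y = calloc(n, sizeof *p->y);
    p->z = calloc(n, sizeof *p->z);
    p->count = n;
    if (!p->x || !p->y || !p->z) {
        particle_xyz_free(p);
        return -1;
    }
    return 0;
}
```

A bag of properties would then hold one particle_xyz each for position, velocity and acceleration, all initialised with the same n.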

owen_h_ — Tue, 10 Nov 2015 06:16:52 GMT

That code sample was really helpful. It's quite a lot to take in, but for now I have a place to start writing code and solidifying this in my mind, thank you.

If I run into complications I'll add to this later, and if I write something I'll add it here as well.

Bernard — Tue, 15 Dec 2015 18:45:10 GMT

SoA data organisation is preferred for code like your example, mainly from the point of view of caching and data prefetching.
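
As a small illustration of the caching point, consider a loop that reads only one component. The sketch below is an assumption-laden example rather than anything from the thread: vec3_aos, sum_x_aos and sum_x_soa are hypothetical names, and the stride remark assumes a 4-byte float with no struct padding.

```c
#include <stddef.h>

/* AoS: one struct per particle, so consecutive x values are typically
   12 bytes apart with y and z in between. */
struct vec3_aos { float x, y, z; };

float sum_x_aos(const struct vec3_aos *v, size_t n)
{
    float s = 0.0f;
    for (size_t i = 0; i < n; ++i)
        s += v[i].x;   /* each cache line fetched also carries unused y, z */
    return s;
}

/* SoA: x values are contiguous, so every float fetched is used and the
   access pattern is a simple forward stream for the prefetcher. */
float sum_x_soa(const float *x, size_t n)
{
    float s = 0.0f;
    for (size_t i = 0; i < n; ++i)
        s += x[i];
    return s;
}
```

Both functions compute the same sum; the difference is only in how much memory traffic is spent on data the loop never uses.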