Intel® Moderncode for Parallel Architectures
Support for developing parallel programming applications on Intel® Architecture.

Writing data structures for parallel coding

owen_h_
Beginner
814 Views

 

I initially posted this question on SO, but it seems nobody over there understands what I am asking. Searching the net led me here, so I'll ask here instead.

 

I am trying to wrap my mind around SOA [structure of arrays] in C programming.

I have some simple math functions for which I have written what I believe are pretty decent scalar implementations.

Here is a simple vector 3 data structure:

    struct vector3_scalar {
        float p[3];
    };

And here is a typical function that I have written to add two of these vector 3 data structures:

    struct vector3_scalar* vec3_add(struct vector3_scalar* out,
                                    const struct vector3_scalar* a,
                                    const struct vector3_scalar* b) {
        out->p[0] = a->p[0] + b->p[0];
        out->p[1] = a->p[1] + b->p[1];
        out->p[2] = a->p[2] + b->p[2];
    
        return out;
    }

I know this simple data structure isn't padded correctly, but for the scalar version I just wanted to get something that worked before I started implementing other features.

Now my question: is that structure, padding issues aside, a good way to set up the data?

What about these instead?

    struct vector3_scalar {
        float p[3];
    };


    struct vector3_scalar {
        float px;
        float py;
        float pz;
    };

Or any other way that I could lay out the data? I personally don't mind flipping the data structures, since users of this math code shouldn't have to go this low and touch it once it has been written and optimized. They would only use the higher-level functions, such as:

    vec3 *a = vec3_create(0, 1, 0);
    vec3 *b = vec3_create(1, 0, 0);
    vec3 *c = vec3_zero();
    
    vec3* vec3_add(vec3* out, const vec3* in_a, const vec3* in_b);
    
    c = vec3_add(c, a, b); // c == 1, 1, 0

Because vec3_add returns its out pointer, you can use the function nested inside another call or on its own.

    vec3 *d = vec3_create(10, 10, 10);
    vec3 *e = vec3_create(1, 1, 1);
    vec3 *f = vec3_zero();

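    /* vec3_add needs an out argument, so the inner results are stored in d and e */
    /* d = c + d = 11, 11, 10 */  /* e = c + e = 2, 2, 1 */
    vec3_add(f, vec3_add(d, c, d), vec3_add(e, c, e)); /* f == 13, 13, 11 */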

    vec3_free(a);
    ...
    vec3_free(f);

So, as you can see from the public API, the underlying structures shouldn't really matter to anyone except the implementer.
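
A minimal sketch of how that API could sit on top of the `float p[3]` layout. The post does not show `vec3_create`, `vec3_zero`, or `vec3_free`, so their bodies here are assumptions (thin wrappers around `malloc`/`free`), as is the idea that `vec3` is a typedef for the struct:

```c
#include <assert.h>
#include <stdlib.h>

/* Assumed: vec3 is an alias for the scalar layout from the post. */
typedef struct vector3_scalar {
    float p[3];
} vec3;

/* Hypothetical allocator: returns NULL if malloc fails. */
vec3 *vec3_create(float x, float y, float z) {
    vec3 *v = malloc(sizeof *v);
    if (v != NULL) {
        v->p[0] = x;
        v->p[1] = y;
        v->p[2] = z;
    }
    return v;
}

/* Hypothetical helper: a vector with all components set to zero. */
vec3 *vec3_zero(void) {
    return vec3_create(0.0f, 0.0f, 0.0f);
}

/* Hypothetical helper: releases a vector from vec3_create or vec3_zero. */
void vec3_free(vec3 *v) {
    free(v);
}

/* Component-wise add; returning out is what allows calls to be nested. */
vec3 *vec3_add(vec3 *out, const vec3 *in_a, const vec3 *in_b) {
    assert(out != NULL && in_a != NULL && in_b != NULL);
    out->p[0] = in_a->p[0] + in_b->p[0];
    out->p[1] = in_a->p[1] + in_b->p[1];
    out->p[2] = in_a->p[2] + in_b->p[2];
    return out;
}
```

In a real library the struct definition would probably live in the .c file, with only `typedef struct vector3_scalar vec3;` in the header, so the layout could change without touching callers.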

I already wrote the basic scalar version using this data layout:

    struct vector3_scalar {
        float p[3];
    };

but I am open to changing that now that it works and seems stable enough for my taste.

0 Kudos
5 Replies
jimdempseyatthecove
Honored Contributor III
814 Views

If your 3-property vectors are preponderantly disorganized, then keeping them as a 3-property vector (your vector3_scalar) is fine.

However, if you have a large number of points, and they interact with either external forces or internal forces, then consider placing each property of all the particles into its own array:

    float px[nParticles];
    float py[nParticles];
    float pz[nParticles];

While this may seem to require more work when manipulating individual particles, the format is more suitable for vectorization by the compiler. Some operations on a system with AVX could process 8 particle properties in a single instruction.
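
A minimal sketch of the kind of loop this layout enables, assuming C99. The helper name `add_property` is hypothetical, and whether the compiler actually emits AVX instructions depends on its flags and optimization level:

```c
#include <assert.h>
#include <stddef.h>

/* Adds one property (for example, all x values) of n particles.
   restrict promises the compiler that out does not overlap a or b,
   which makes the loop easier to auto-vectorize. */
void add_property(size_t n, float *restrict out,
                  const float *restrict a, const float *restrict b) {
    assert(out != NULL && a != NULL && b != NULL);
    for (size_t i = 0; i < n; ++i) {
        out[i] = a[i] + b[i];
    }
}
```

It would be called once per property, for example once each for px, py, and pz. Every iteration reads and writes neighbouring floats, which is the access pattern a vectorizing compiler looks for.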

Jim Dempsey

0 Kudos
owen_h_
Beginner
814 Views

Thank you for the reply. Currently these scalar vector3 structures are used for rendering, and I have solidified the code to the point where I don't think it would be practical to do this just yet. Still, seeing it written out like this does give me some ideas and things to keep in mind as I keep going forward.

 

For example, let's say I want to deal with 512 of these particles. I would define the structure like this:

    struct particle {
        float px[512];
        float py[512];
        float pz[512];
    };

and use that in my calculations.

The 512 would be the maximum, and if I use fewer I would be wasting memory. Is that correct?

 

0 Kudos
jimdempseyatthecove
Honored Contributor III
814 Views

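    #include <cstddef>  // size_t, NULL

    // One dynamically sized array per coordinate (structure of arrays).
    struct ParticleXYZ_t
    {
      float* x;
      float* y;
      float* z;
      ParticleXYZ_t() { x = y = z = NULL; }
      // Allocate room for n particles.
      void init(size_t n)
      {
        x = new float[n];
        y = new float[n];
        z = new float[n];
      }
    };

    // Every per-particle property gets its own set of arrays.
    struct ParticleBag_t
    {
      ParticleXYZ_t pos;
      ParticleXYZ_t vel;
      ParticleXYZ_t acc;
      // ... other properties here
      ParticleBag_t(size_t n)
      {
        pos.init(n);
        vel.init(n);
        acc.init(n);
        // ... init other properties here
      }
    };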
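
Since the original question is about C, here is a rough C counterpart of ParticleXYZ_t, sketched under the assumption that plain `calloc`/`free` are acceptable. The names `particle_xyz`, `particle_xyz_init`, and `particle_xyz_free` are hypothetical. Sizing the arrays at run time means no memory is reserved for a fixed maximum such as 512:

```c
#include <assert.h>
#include <stdlib.h>

/* C counterpart of ParticleXYZ_t: three separately allocated arrays. */
struct particle_xyz {
    float *x;
    float *y;
    float *z;
};

/* Allocates n zero-filled floats per component (all-bits-zero, which is
   0.0f on IEEE 754 platforms). n must be greater than 0.
   Returns 0 on success, -1 if any allocation fails (nothing is leaked). */
int particle_xyz_init(struct particle_xyz *p, size_t n) {
    assert(p != NULL && n > 0);
    p->x = calloc(n, sizeof *p->x);
    p->y = calloc(n, sizeof *p->y);
    p->z = calloc(n, sizeof *p->z);
    if (p->x == NULL || p->y == NULL || p->z == NULL) {
        free(p->x);
        free(p->y);
        free(p->z);
        p->x = p->y = p->z = NULL;
        return -1;
    }
    return 0;
}

/* Releases the arrays and resets the pointers so a double free is harmless. */
void particle_xyz_free(struct particle_xyz *p) {
    assert(p != NULL);
    free(p->x);
    free(p->y);
    free(p->z);
    p->x = p->y = p->z = NULL;
}
```

A bag of particles would then hold one `struct particle_xyz` each for position, velocity, and acceleration, all initialized with the same n.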

Jim Dempsey



0 Kudos
owen_h_
Beginner
814 Views

That code sample was really helpful. It's quite a lot to take in, but for now I have a place to start writing code and solidifying this in my mind. Thank you.

 

If I run into complications I'll add to this later. If I write something, I'll add it here as well.

0 Kudos
Bernard
Valued Contributor I
814 Views

SOA data organisation, as in your example, is the preferred way to write the code, mainly from the point of view of caching and data prefetching.
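
A small sketch of why the layout matters for the cache, assuming a loop that needs only the x values. The names `vec3_aos`, `sum_x_aos`, and `sum_x_soa` are hypothetical:

```c
#include <assert.h>
#include <stddef.h>

/* Array-of-structures element: x, y, and z sit next to each other. */
struct vec3_aos {
    float x, y, z;
};

/* AoS: consecutive x values are sizeof(struct vec3_aos) bytes apart
   (typically 12), so y and z are pulled into cache without being read. */
float sum_x_aos(const struct vec3_aos *v, size_t n) {
    assert(v != NULL);
    float sum = 0.0f;
    for (size_t i = 0; i < n; ++i) {
        sum += v[i].x;
    }
    return sum;
}

/* SoA: x values are contiguous, so every fetched cache line holds only
   data the loop uses, and the access pattern is a simple forward stream. */
float sum_x_soa(const float *x, size_t n) {
    assert(x != NULL);
    float sum = 0.0f;
    for (size_t i = 0; i < n; ++i) {
        sum += x[i];
    }
    return sum;
}
```

Both functions return the same result; the difference is only in how much unused data travels through the cache along the way.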

0 Kudos