
owen_h_

Beginner

11-06-2015
07:44 PM

writing data structures for parallel coding.

I initially wrote this question on SO, but it seems nobody over there understands what I am asking. Searching on the net led me here, so I'll ask here instead.

I am trying to wrap my mind around SOA [structure of arrays] in C programming.

I have some simple math functions for which I have written what I believe are pretty decent scalar implementations.

Here is a simple vector 3 data structure:

struct vector3_scalar {
    float p[3];
};

And here is a typical function that I have written to add two of these vector 3 data structures:

struct vector3_scalar* vec3_add(struct vector3_scalar* out,
                                const struct vector3_scalar* a,
                                const struct vector3_scalar* b) {
    out->p[0] = a->p[0] + b->p[0];
    out->p[1] = a->p[1] + b->p[1];
    out->p[2] = a->p[2] + b->p[2];
    return out;
}

I know this simple data structure isn't padded correctly, but for the scalar version I just wanted to get something that worked before I started implementing other features.
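
To make the padding remark concrete, here is a minimal sketch. It assumes "padded" means rounding the three floats up to a 16-byte, 16-byte-aligned slot, which is one common choice for SSE-style code; the post itself does not say which padding is intended. The name `vector3_padded` and the two helper functions are hypothetical.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>

/* The layout from the post: three packed floats. */
struct vector3_scalar {
    float p[3];
};

/* Hypothetical padded variant (an assumption, not from the post):
   a fourth, unused lane plus 16-byte alignment. */
struct vector3_padded {
    _Alignas(16) float p[4];   /* p[3] is padding */
};

/* Holds on platforms where float is 4 bytes. */
static_assert(sizeof(struct vector3_padded) == 16,
              "padded vec3 should occupy 16 bytes");

/* Size in bytes of the packed layout. */
size_t vec3_packed_size(void) { return sizeof(struct vector3_scalar); }

/* Alignment in bytes of the padded layout. */
size_t vec3_padded_align(void) { return _Alignof(struct vector3_padded); }
```

The packed struct leaves an array of vectors straddling 16-byte boundaries, while the padded one trades a quarter of its memory for aligned elements.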

Now my question: is that structure, padding issues aside, a good way to set up the data structure?

What about these instead?

struct vector3_scalar {
    float p[3];
};

struct vector3_scalar {
    float px;
    float py;
    float pz;
};

Or any other way that I could lay out the data. I personally don't mind flipping the data structures, since users of this math library shouldn't have to go this low and mess with this code once it has been written and optimized; they should only need the higher-level functions, such as:

vec3 *a = vec3_create(0, 1, 0);
vec3 *b = vec3_create(1, 0, 0);
vec3 *c = vec3_zero();

vec3* vec3_add(vec3* out, const vec3* in_a, const vec3* in_b);

c = vec3_add(c, a, b); // c == 1, 1, 0

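The function returns out so that you can use it inline or by itself:

vec3 *d = vec3_create(10, 10, 10);
vec3 *e = vec3_create(1, 1, 1);
vec3 *f = vec3_zero();
vec3 *t0 = vec3_zero(); /* scratch for c + d */
vec3 *t1 = vec3_zero(); /* scratch for c + e */

/* c + d = 11, 11, 10 */ /* c + e = 2, 2, 1 */
vec3_add(f, vec3_add(t0, c, d), vec3_add(t1, c, e));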

vec3_free(a);
...
vec3_free(f);

So, as you can see from the public API, the underlying structures shouldn't really matter except to the implementer.
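
As an illustration of how such an API can hide the layout, here is a minimal sketch of the four functions named above. The signature of `vec3_add` is taken from the post; the heap allocation, the `float p[3]` body, and the argument types of `vec3_create` are assumptions, since the post does not show the implementations.

```c
#include <assert.h>
#include <stdlib.h>

/* Assumed layout; callers never touch p directly. */
typedef struct vec3 { float p[3]; } vec3;

/* Allocates a vector on the heap; returns NULL if allocation fails. */
vec3 *vec3_create(float x, float y, float z)
{
    vec3 *v = malloc(sizeof *v);
    if (v) { v->p[0] = x; v->p[1] = y; v->p[2] = z; }
    return v;
}

vec3 *vec3_zero(void) { return vec3_create(0.0f, 0.0f, 0.0f); }

/* Component-wise sum; returns out so calls can be nested.
   out may be the same object as in_a or in_b. */
vec3 *vec3_add(vec3 *out, const vec3 *in_a, const vec3 *in_b)
{
    assert(out && in_a && in_b);
    for (int i = 0; i < 3; ++i)
        out->p[i] = in_a->p[i] + in_b->p[i];
    return out;
}

void vec3_free(vec3 *v) { free(v); }
```

Because only these functions read `p`, switching to another layout later would mean rewriting their bodies but not the calling code.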

The basic scalar version, which I have already written, uses this data layout:

struct vector3_scalar {
    float p[3];
};

but I am open to changing that now that it works and seems stable enough for my taste.
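
One further layout the question leaves open is a union that exposes both views at once. This is only a sketch: the name `vec3_view` is hypothetical, and the anonymous struct member requires C11.

```c
#include <assert.h>   /* static_assert (C11) */

/* Hypothetical type offering both views of the same three floats. */
union vec3_view {
    float p[3];                  /* indexed access: v.p[i] */
    struct { float x, y, z; };   /* named access: v.x (C11 anonymous struct) */
};

/* The two views coincide only if no padding is added between x, y, z. */
static_assert(sizeof(union vec3_view) == 3 * sizeof(float),
              "named and indexed views must overlap exactly");

/* Writes through the named fields and reads back through the array. */
float vec3_view_demo(void)
{
    union vec3_view v;
    v.x = 1.0f; v.y = 2.0f; v.z = 3.0f;
    return v.p[0] + v.p[1] + v.p[2];
}
```

For a single vector the two spellings describe the same 12 bytes on common compilers, so the choice between them is mostly about readability rather than performance.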

5 Replies

jimdempseyatthecove

Black Belt

11-08-2015
04:58 AM

If your 3-property vectors are preponderantly disorganized, then keeping them as a 3-property vector (your vector3_scalar) is fine.

However, if you have a large number of points, and they interact with either external forces or internal forces, then consider placing the same property of every particle into a single array:

float px[nParticles];
float py[nParticles];
float pz[nParticles];

While this may seem to require more work when manipulating individual particles, the format is more suitable for vectorization by the compiler. Some operations on a system with AVX could process 8 particle properties in a single instruction.
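
A minimal sketch of the kind of loop this layout enables, assuming single-precision data and arrays that do not overlap. The function name `soa_add` is hypothetical, and whether the loop is actually vectorized depends on the compiler and its flags.

```c
#include <assert.h>
#include <stddef.h>

/* out[i] = a[i] + b[i] for one property array (e.g. every particle's x).
   restrict promises the compiler that out does not overlap a or b,
   so callers must not pass the same array as both input and output. */
void soa_add(float *restrict out, const float *restrict a,
             const float *restrict b, size_t n)
{
    assert(out && a && b);
    for (size_t i = 0; i < n; ++i)
        out[i] = a[i] + b[i];
}
```

Calling it once each for px, py and pz adds two whole particle sets. Each call walks contiguous floats, which is the access pattern an auto-vectorizer can map onto 8-wide AVX registers.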

Jim Dempsey

owen_h_

Beginner

11-08-2015
05:48 AM

Thank you for the reply. Currently these scalar vector3 structures are used for rendering, and I have solidified them to the point where I don't think it would be practical to do this just yet. Still, seeing it written out like this does give me some ideas and things to keep in mind as I keep going forward.

For example, let's say I want to deal with 512 of these particles. I would define the structure like this:

typedef struct particle {
    float px[512];
    float py[512];
    float pz[512];
} particle;

and use that in my calculations.

The 512 would be the max value and if I use less I would be wasting memory, is that correct?
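
To put a number on that, here is a small sketch, assuming a `count` field is added to track how many slots are in use. That field and the names `particle_block` and `PARTICLE_CAP` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define PARTICLE_CAP 512   /* the maximum from the post */

struct particle_block {
    float px[PARTICLE_CAP];
    float py[PARTICLE_CAP];
    float pz[PARTICLE_CAP];
    size_t count;          /* hypothetical: number of slots in use */
};

/* Bytes reserved by the three position arrays but not used. */
size_t particle_block_unused_bytes(const struct particle_block *b)
{
    assert(b && b->count <= PARTICLE_CAP);
    return 3 * (PARTICLE_CAP - b->count) * sizeof(float);
}
```

With 4-byte floats the three arrays always occupy 6144 bytes, so a block holding 100 particles leaves 4944 of them idle. That cost is fixed and small, but it grows with every extra property array.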

jimdempseyatthecove

Black Belt

11-08-2015
03:47 PM

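#include <cstddef> // size_t, NULL

// One dynamically sized array per component (structure of arrays).
struct ParticleXYZ_t
{
    float* x;
    float* y;
    float* z;

    ParticleXYZ_t() { x = y = z = NULL; }

    // Allocate room for n particles in each component array.
    void init(size_t n)
    {
        x = new float[n];
        y = new float[n];
        z = new float[n];
    }
};

// All per-particle properties, each stored as its own set of arrays.
struct ParticleBag_t
{
    ParticleXYZ_t pos;
    ParticleXYZ_t vel;
    ParticleXYZ_t acc;
    ... // other properties here

    ParticleBag_t(size_t n)
    {
        pos.init(n);
        vel.init(n);
        acc.init(n);
        ... // init other properties here
    }
};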
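
Since the original question is about C, the same idea can be sketched without constructors. This is an assumed C translation, not code from the reply: the names `particle_xyz`, `particle_xyz_init` and `particle_xyz_free` are hypothetical, and `calloc` is used so the arrays start zeroed.

```c
#include <assert.h>
#include <stdlib.h>

/* One heap-allocated array per component, sized at run time. */
struct particle_xyz { float *x, *y, *z; };

/* Allocates n zeroed floats per component.
   Returns 0 on success, -1 if any allocation fails. */
int particle_xyz_init(struct particle_xyz *p, size_t n)
{
    assert(p);
    p->x = calloc(n, sizeof *p->x);
    p->y = calloc(n, sizeof *p->y);
    p->z = calloc(n, sizeof *p->z);
    if (!p->x || !p->y || !p->z) {
        free(p->x); free(p->y); free(p->z);
        p->x = p->y = p->z = NULL;
        return -1;
    }
    return 0;
}

/* Releases the arrays and resets the pointers. */
void particle_xyz_free(struct particle_xyz *p)
{
    assert(p);
    free(p->x); free(p->y); free(p->z);
    p->x = p->y = p->z = NULL;
}
```

Sizing the arrays at run time avoids the fixed 512-slot maximum, so memory tracks the number of particles actually requested.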

Jim Dempsey

owen_h_

Beginner

11-09-2015
10:16 PM

That code sample was really helpful. It's quite a lot to take in, but for now I have a place to start writing code and solidifying this in my mind. Thank you.

If I run into complications I'll add to this thread later, and if I write something I'll post it here as well.

Bernard

Black Belt

12-15-2015
10:45 AM
