Intel® Moderncode for Parallel Architectures
Support for developing parallel programming applications on Intel® Architecture.

data placement

karimfath
Beginner

Hello,

Can I measure the cost of a read or write operation when a thread accesses data in L2 versus data in RAM? I want to evaluate the difference between the two cases on a Centrino Duo system.

Thanks

Dmitry_Vyukov
Valued Contributor I
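You can do something like this:

- Start 2 threads and bind them to different cores.

- Create the following object:

struct X
{
    char pad1[128]; // padding so that data does not share a cache line with whatever precedes it
    int data;       // the only field the threads touch
    char pad2[128]; // padding against whatever follows
};

X* g_x = new X;

- Create a semaphore for communication between the threads.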
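As a rough illustration of why the padding is there, the layout can be checked in code. This is only a sketch: the 64-byte line size (kAssumedLineSize) is an assumption that is not stated in the thread, and line_of and data_is_isolated are hypothetical helper names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumed cache-line size in bytes; check the value for your own CPU.
constexpr std::size_t kAssumedLineSize = 64;

struct X
{
    char pad1[128];
    int data;
    char pad2[128];
};

// At least one full (assumed) line of padding on each side of data.
static_assert(offsetof(X, data) >= kAssumedLineSize,
              "pad1 is shorter than one cache line");
static_assert(sizeof(X) - (offsetof(X, data) + sizeof(int)) >= kAssumedLineSize,
              "pad2 is shorter than one cache line");

// Index of the (assumed) cache line that contains the given address.
inline std::uintptr_t line_of(std::uintptr_t addr)
{
    return addr / kAssumedLineSize;
}

// True if no byte outside *x can share a cache line with x->data.
inline bool data_is_isolated(const X* x)
{
    const std::uintptr_t begin = reinterpret_cast<std::uintptr_t>(x);
    const std::uintptr_t end = begin + sizeof(X);  // first byte after *x
    const std::uintptr_t first = reinterpret_cast<std::uintptr_t>(&x->data);
    const std::uintptr_t last = first + sizeof(int) - 1;
    return line_of(begin - 1) < line_of(first) && line_of(last) < line_of(end);
}
```

Calling data_is_isolated(g_x) after the allocation should return true for any alignment the allocator picks, because the padding on each side is longer than one assumed line.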

To measure the cost when the data is in L2$ and in the other core's L1$:

1. The first thread writes to g_x->data and signals the semaphore.

2. The second thread wakes up and loads g_x->data, timing the load.

To measure the cost when the data is in RAM, first flush the data from the caches with the clflush instruction, then time the load.
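The whole procedure might be sketched as follows. This is a rough illustration, not a calibrated benchmark. Several things in it are assumptions rather than part of the post: a mutex and condition variable stand in for the semaphore, std::chrono::steady_clock stands in for a cycle counter (so the timer overhead is included in each number), thread-to-core binding is left out because it is platform-specific, and time_load, Result and measure_once are hypothetical names. The clflush step uses the _mm_clflush intrinsic and is only compiled where SSE2 is available.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <cstdint>
#include <mutex>
#include <thread>
#if defined(__SSE2__) || defined(_M_X64)
#include <emmintrin.h>  // _mm_clflush, _mm_mfence
#define HAVE_CLFLUSH 1
#else
#define HAVE_CLFLUSH 0
#endif

struct X
{
    char pad1[128];
    int data;
    char pad2[128];
};

// Times one load of *p in nanoseconds. Coarse: includes the timer overhead.
inline std::int64_t time_load(const volatile int* p, int& out)
{
    const auto t0 = std::chrono::steady_clock::now();
    out = *p;
    const auto t1 = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::nanoseconds>(t1 - t0).count();
}

struct Result
{
    int value_seen;           // value the second thread loaded
    std::int64_t cached_ns;   // load right after the other thread's write
    std::int64_t flushed_ns;  // load after clflush (only valid if flushed)
    bool flushed;             // whether the clflush step was compiled in
};

Result measure_once(int value)
{
    X* g_x = new X();
    std::mutex m;
    std::condition_variable cv;
    bool ready = false;  // stands in for the semaphore
    Result r{};

    std::thread reader([&] {
        {
            std::unique_lock<std::mutex> lock(m);
            cv.wait(lock, [&] { return ready; });
        }
        // Step 2: the line was last written by the other thread.
        r.cached_ns = time_load(&g_x->data, r.value_seen);
#if HAVE_CLFLUSH
        // RAM case: evict the line from the caches, then load again.
        _mm_clflush(&g_x->data);
        _mm_mfence();
        int tmp = 0;
        r.flushed_ns = time_load(&g_x->data, tmp);
        r.flushed = true;
#endif
    });

    std::thread writer([&] {
        g_x->data = value;  // Step 1: write, then signal.
        {
            std::lock_guard<std::mutex> lock(m);
            ready = true;
        }
        cv.notify_one();
    });

    writer.join();
    reader.join();
    delete g_x;
    return r;
}
```

A single timed load is noisy, so in practice one would call measure_once many times, pin the two threads to different cores first, and compare the distributions of cached_ns and flushed_ns rather than individual values.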

TimP
Honored Contributor III
Have you considered to what extent the Intel Performance Tuning Utility (PTU) would help with your investigation?

http://software.intel.com/en-us/articles/intel-performance-tuning-utility-31-update-3

DDd1
Beginner
Nice tip. Does anyone know of a small utility library that packs together tips like this for doing parallel performance measurements and instrumentation in our own applications? Probably something someone has lying around that could be of use to others ;)
