Intel® Moderncode for Parallel Architectures
Support for developing parallel programming applications on Intel® Architecture.

Cache Coherency

jimdempseyatthecove
Honored Contributor III

Hello

Given a cache line is 16 bytes wide (depending on processor)

Given in a multi-processor system (separate caches)

A shared variable cannot reliably be modified using

add dword ptr [edx], eax

without potentially producing the incorrect result.

A means to correct for this problem is to use the LOCK prefix

LOCK add dword ptr [edx], eax
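For readers working in C rather than assembly, here is a minimal sketch of the same locked add using C11 atomics. The function name add_shared is hypothetical, chosen for illustration; on x86, compilers typically lower atomic_fetch_add to a LOCK-prefixed instruction, though the exact instruction depends on the compiler and on whether the return value is used.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical helper: atomically adds `value` to a shared 32-bit counter
 * and returns the value the counter held before the add. */
static int32_t add_shared(_Atomic int32_t *counter, int32_t value)
{
    /* Sequentially consistent read-modify-write; no update is lost even
     * when several threads call this on the same counter. */
    return atomic_fetch_add(counter, value);
}
```

A plain `*counter += value` on a non-atomic variable corresponds to the unlocked add and can lose updates when two processors modify the same variable.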

This is easy to comprehend. (I know cmpxchg is typically used)

Consider

long Count1;

long Count2;

Where Count1 is exclusively used by one thread, and Count2 is exclusively used by a different thread.

The question is:

If Count1 and Count2 lie within the same 16-byte paragraph, would the LOCK be required even though each 4-byte variable is used exclusively by only one thread?

If the LOCK is required, then consider using OpenMP on an array of real(4)'s. Depending on the memory placement of the array and how OpenMP divides the work into stripes of the array, you could potentially have interactions at the ends of each stripe. Is this a concern?
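Whatever the answer, one way to keep neighbouring stripes from touching the same cache line is to place stripe boundaries on cache-line multiples. The sketch below is in C rather than Fortran and rests on assumptions not stated above: a 64-byte line (CACHE_LINE), 4-byte floats, and an array whose first element is itself cache-line aligned. The name stripe_start is hypothetical.

```c
#include <stddef.h>

#define CACHE_LINE 64                                  /* assumed line size in bytes */
#define FLOATS_PER_LINE (CACHE_LINE / sizeof(float))   /* 16 when float is 4 bytes */

/* Hypothetical helper: first index of stripe t when n floats are split
 * among nthreads. Stripe t covers [stripe_start(t), stripe_start(t + 1)).
 * Every interior boundary is a multiple of FLOATS_PER_LINE, so two stripes
 * never share a cache line if the array base is line-aligned. */
static size_t stripe_start(size_t n, size_t nthreads, size_t t)
{
    /* Total number of cache lines the array occupies, rounded up. */
    size_t lines = (n + FLOATS_PER_LINE - 1) / FLOATS_PER_LINE;
    /* Divide whole lines, not elements, among the threads. */
    size_t first_line = (lines * t) / nthreads;
    size_t start = first_line * FLOATS_PER_LINE;
    return start < n ? start : n;                      /* clamp the final boundary */
}
```

With OpenMP, each thread could loop from stripe_start(n, nthreads, t) to stripe_start(n, nthreads, t + 1), where t is its thread number, instead of relying on the default schedule.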

Jim

Daniel_S_9
Beginner

It will work fine, but it will be very slow: after each write, the cache line is invalidated on the other core. That is called false sharing.
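A common way to avoid the slowdown is to give each thread's counter its own cache line. The following is a minimal C11 sketch, not a definitive layout: the 64-byte CACHE_LINE value is an assumption (check your processor), and the struct names and same_cache_line helper are hypothetical.

```c
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE 64   /* assumed line size in bytes; processor dependent */

/* Adjacent counters: both normally land in one cache line (false sharing). */
struct shared_line_counters {
    long Count1;
    long Count2;
};

/* Each counter is aligned to its own cache line, so a write by one thread
 * does not invalidate the line holding the other thread's counter. */
struct padded_counters {
    alignas(CACHE_LINE) long Count1;
    alignas(CACHE_LINE) long Count2;
};

/* Hypothetical check: nonzero if both addresses fall in the same
 * CACHE_LINE-sized block of memory. */
static int same_cache_line(const void *a, const void *b)
{
    return ((uintptr_t)a / CACHE_LINE) == ((uintptr_t)b / CACHE_LINE);
}

/* File-scope instances; the first is line-aligned so its layout is fixed. */
static alignas(CACHE_LINE) struct shared_line_counters adjacent;
static struct padded_counters padded;
```

The cost is memory: the padded struct occupies two full lines instead of a few bytes, which is usually an acceptable trade for per-thread counters that are written frequently.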
