Intel® oneAPI Threading Building Blocks
Ask questions and share information about adding parallelism to your applications when using this threading library.

What is "load-acquire" and "store-release"?

lxconan

Briefly, a store-release instruction ensures, at its completion, that all previous
instructions have completed; a load-acquire instruction correspondingly ensures that
all following instructions complete only after it completes. But these explanations
are far from precise.

Still, there is another explanation, namely that load-acquire semantics is

load;
#loadload | #loadstore

And the store-release semantics is

#loadstore | #storestore
store;

Can anybody explain this? I really don't quite catch it.

And the load_with_acquire implementation in TBB is

template<typename T>
static inline T load_with_acquire(const volatile T& location) {
#if !defined(__INTEL_COMPILER) && _MSC_VER >= 1300
    T to_return = location;
    _ReadWriteBarrier();
    return to_return;
#else
    return location;
#endif
}

I am wondering why _ReadWriteBarrier() is used. This function is not a memory fence, so, going by http://msdn2.microsoft.com/en-us/library/12a04hfd(VS.80).aspx, is the function below equivalent?

template<typename T>
static inline T load_with_acquire(const volatile T& location) {
#if !defined(__INTEL_COMPILER) && _MSC_VER >= 1300
    volatile T to_return = location;
    return to_return;
#else
    return location;
#endif
}

ARCH_R_Intel

I plan to write a blog later this week on why volatile is almost useless for portable multi-threaded programming. It's a point that's only recently become clear to me, though it's been clear to others for a decade :-(

Indeed, _ReadWriteBarrier is a memory barrier, according to http://msdn2.microsoft.com/en-us/library/f20w0x5e(VS.80).aspx. There are two players who have to obey memory barriers: the hardware and the compiler. So even on hardware that is sequentially consistent, a memory barrier is required to keep the compiler from doing unwanted code transformations.
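
For readers on a C++11 or later compiler, the barrier placement from the question (load, then #loadload | #loadstore; #loadstore | #storestore, then store) can be sketched with standard fences, which constrain both the compiler and the hardware. This is only an illustration, not the TBB implementation: the names load_with_acquire_sketch and store_with_release_sketch are hypothetical, and it assumes the location is a std::atomic<T> rather than a volatile T.

```cpp
#include <atomic>

// Hypothetical portable counterpart of load_with_acquire (not TBB code).
template <typename T>
T load_with_acquire_sketch(const std::atomic<T>& location) {
    // Plain (relaxed) atomic load of the location.
    T value = location.load(std::memory_order_relaxed);
    // Acquire fence: later loads and stores may not be reordered before
    // the load above. This plays the role of #loadload | #loadstore.
    std::atomic_thread_fence(std::memory_order_acquire);
    return value;
}

// Hypothetical portable counterpart of a store with release semantics.
template <typename T>
void store_with_release_sketch(std::atomic<T>& location, T value) {
    // Release fence: earlier loads and stores may not be reordered after
    // the store below. This plays the role of #loadstore | #storestore.
    std::atomic_thread_fence(std::memory_order_release);
    location.store(value, std::memory_order_relaxed);
}
```

In everyday code the same effect is usually written more directly as location.load(std::memory_order_acquire) and location.store(value, std::memory_order_release); the explicit fences are used here only to mirror the barrier notation.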

The keyword volatile does not imply a memory fence. Compilers are free to reorder volatile accesses with respect to non-volatile accesses, and they do so in practice. So the proposed function is not equivalent. See http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2006/n2016.html for a good summary of what volatile means.

The exception to the rule is Itanium, where volatile loads are defined as having acquire semantics and volatile stores are defined as having release semantics. But that's a cool feature of the Itanium world, and not generally applicable.
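
To make the acquire/release pairing concrete, here is a minimal message-passing sketch using C++11 std::atomic instead of volatile. It is an assumption-laden illustration rather than anything taken from TBB: the function name run_message_passing and the variables payload and ready are hypothetical.

```cpp
#include <atomic>
#include <thread>

// Hypothetical example: one thread publishes data, another consumes it.
int run_message_passing() {
    int payload = 0;                 // ordinary, non-atomic data
    std::atomic<bool> ready{false};  // flag that publishes the payload
    int observed = -1;

    std::thread producer([&] {
        payload = 42;                                  // (1) write the data
        ready.store(true, std::memory_order_release);  // (2) store-release:
                                                       //     (1) cannot sink below it
    });

    std::thread consumer([&] {
        // (3) load-acquire: spin until the flag is seen as set.
        while (!ready.load(std::memory_order_acquire)) {
            std::this_thread::yield();
        }
        observed = payload;  // (4) cannot be hoisted above (3)
    });

    producer.join();
    consumer.join();
    return observed;
}
```

Because the acquire load in (3) reads the value written by the release store in (2), the write in (1) is guaranteed to be visible at (4), so the function returns 42. Declaring ready as a plain volatile bool would not give that guarantee in portable C++.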

lxconan

Thanks very much! I've seen your blog, and "volatile" is also clear to me now :-)

As for load-acquire and store-release semantics, I found some explanations in N1680 and N1876 at www.open-std.org
