Intel® oneAPI Threading Building Blocks
Ask questions and share information about adding parallelism to your applications when using this threading library.

atomic boolean initializes to true?

kfriddile
Beginner
3,611 Views

According to my debugger, atomic variables are initialized to true. This was counter-intuitive to me since atomic numeric types are initialized to zero. What is the rationale for initializing to true? I'm using the0080605 release of TBB with VC9.

0 Kudos
58 Replies
Alexey-Kukanov
Employee
569 Views
Quoting - Dmitriy V'jukov

The assignment operator has to perform a load-*acquire* from the source operand, so it will also break on the Itanium platform.


I agree. So when rhs is passed by value, the compiler-generated copy constructor will not issue a load-acquire for rhs but a simple load; then the tbb::atomic implementation will enforce load-acquire on the temporary copy, and store-release to the destination. Thanks Dmitry.
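To make the by-value versus by-reference point concrete, here is a minimal sketch of an assignment operator that takes its source by const reference, so the acquire load reads the original shared object rather than a temporary copy. It is not TBB's implementation: `sketch_atomic` and `copy_through_assignment` are hypothetical names, C++11 `std::atomic` stands in for TBB's internals, and the constructors are added only for convenience (the real tbb::atomic discussed here has none).

```cpp
#include <atomic>
#include <cassert>

// Hypothetical wrapper for illustration only; not the real tbb::atomic.
template <typename T>
class sketch_atomic {
    std::atomic<T> value_;
public:
    sketch_atomic() : value_() {}
    explicit sketch_atomic(T v) : value_(v) {}

    // Conversion to T performs a load with acquire semantics.
    operator T() const { return value_.load(std::memory_order_acquire); }

    // Assignment from a plain T performs a store with release semantics.
    T operator=(T rhs) {
        value_.store(rhs, std::memory_order_release);
        return rhs;
    }

    // rhs is taken by const reference, so the acquire load below reads
    // the original object. With a by-value parameter the compiler would
    // first copy rhs using an ordinary load.
    sketch_atomic& operator=(const sketch_atomic& rhs) {
        value_.store(rhs.value_.load(std::memory_order_acquire),
                     std::memory_order_release);
        return *this;
    }
};

// Assigns one wrapper to another and reads the result back.
inline int copy_through_assignment(int v) {
    sketch_atomic<int> src(v);
    sketch_atomic<int> dst;
    dst = src;    // selects operator=(const sketch_atomic&)
    return dst;   // acquire load through the conversion operator
}
```

Calling `copy_through_assignment(42)` exercises the by-reference overload and then reads the destination back through the conversion operator.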

By the way, the fix is ready and will be published in the next development update.

0 Kudos
RafSchietekat
Valued Contributor III
569 Views

"I think I start getting your point. You mean that load from source object will NOT be atomic nor acquire. Right? I was confused by your statement that it's not Ok to declare assignment operator as receiving operand by value." Indeed, the load may not be atomic (in the original sense of the word). I'm not concerned about acquire, which logically occurs after the load, so an intermediate copy wouldn't hurt, whatever its memory semantics. But my question about those unintended copies has not been addressed yet: can anything still go wrong?

0 Kudos
Dmitry_Vyukov
Valued Contributor I
569 Views
Quoting - Raf Schietekat

I'm not concerned about acquire, which logically occurs after the load, so an intermediate copy wouldn't hurt, whatever its memory semantics.

I am not sure here.

On platforms where memory fences are separate and "global" (for example, Sparc, where acquire is membar #LoadLoad | #LoadStore) your reasoning is correct. But on platforms where fences are an integrated part of memory operations and "local" (for example, Itanium, where acquire is ld.acq) your reasoning may not be correct.

Quoting - Raf Schietekat

But my question about those unintended copies has not been addressed yet: can anything still go wrong?

Can you elaborate a bit more here? What copies? Can you provide some examples of what you are talking about?

0 Kudos
RafSchietekat
Valued Contributor III
569 Views

"But on platforms where fences are an integrated part of memory operations and "local" (for example, Itanium, where acquire is ld.acq) your reasoning may not be correct." It makes no difference whether the fence is part of the instruction.

"Can you elaborate a bit more here? What copies? Can you provide some examples of what you are talking about?" See #13 and #18.

0 Kudos
Dmitry_Vyukov
Valued Contributor I
569 Views
Quoting - Raf Schietekat

"Can you elaborate a bit more here? What copies? Can you provide some examples of what you are talking about?" See #13 and #18.

The interface of tbb::atomic<> is quite simple and minimalistic; I don't see any other places where a similar problem can occur.

MSVC is able to call 2 user defined conversion operators implicitly, i.e. user defined cast operator followed by implicit constructor. But atomic<> doesn't have any (non copy) constructors...

Btw, I think that it's possible to just remove assignment operator which gets 'atomic const& rhs', because there is assignment operator which gets 'T rhs', and atomic has cast operator to T which makes load-acquire. So assignment of one atomic to another atomic will still work.... Hmm...


0 Kudos
Dmitry_Vyukov
Valued Contributor I
569 Views
Quoting - Raf Schietekat

"But on platforms where fences are an integrated part of memory operations and "local" (for example, Itanium, where acquire is ld.acq) your reasoning may not be correct." It makes no difference whether the fence is part of the instruction.

Do you mean the fact that TBB's load-acquire is stronger than C++0x's load-acquire and incurs unnecessary overheads in some cases?

I really don't know what the semantics of TBB's load-acquire are. Where are the formal semantics? :)

0 Kudos
RafSchietekat
Valued Contributor III
569 Views

"Btw, I think that it's possible to just remove assignment operator which gets 'atomic const& rhs', because there is assignment operator which gets 'T rhs', and atomic has cast operator to T which makes load-acquire. So assignment of one atomic to another atomic will still work.... Hmm..." It would be nice to restore the symmetry between copy constructor and copy assignment operator, which should normally be defined together, but the former cannot be defined because it would also require a user-defined default constructor which would break the zero-initialisation expectations, and the latter is required because otherwise an incorrect implicit copy assignment operator would be defined.

0 Kudos
RafSchietekat
Valued Contributor III
569 Views

"Do you mean the fact that TBB's load-acquire is stronger than C++0x's load-acquire and incurs unnecessary overheads in some cases?" Why stronger or more expensive? That's just how it works: release-store and load-acquire, each time logically in that order, whether integrated in an instruction or not. That means that copy-load-acquire, with the load from the otherwise invisible copied value, has the same semantics as the original load-acquire. Not so?

0 Kudos
Dmitry_Vyukov
Valued Contributor I
569 Views
Quoting - Raf Schietekat

"Btw, I think that it's possible to just remove assignment operator which gets 'atomic const& rhs', because there is assignment operator which gets 'T rhs', and atomic has cast operator to T which makes load-acquire. So assignment of one atomic to another atomic will still work.... Hmm..." It would be nice to restore the symmetry between copy constructor and copy assignment operator, which should normally be defined together, but the former cannot be defined because it would also require a user-defined default constructor which would break the zero-initialisation expectations, and the latter is required because otherwise an incorrect implicit copy assignment operator would be defined.

Arghhh!

You are totally right!
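The point about the implicit copy assignment operator can be checked with a small sketch. `plain_wrapper` and `assign_wrapper_to_wrapper` are hypothetical names, and the counter member exists only to show which operator ran: when no `operator=(const plain_wrapper&)` is declared, the compiler supplies one, and it is chosen over the "convert to T, then assign T" path.

```cpp
#include <cassert>

// Hypothetical class with a conversion operator and an assignment from
// int, but no user-declared copy assignment operator.
struct plain_wrapper {
    int value;
    int converting_assignments;   // counts calls to operator=(int)

    operator int() const { return value; }

    plain_wrapper& operator=(int rhs) {
        value = rhs;
        ++converting_assignments;
        return *this;
    }
    // The compiler still generates operator=(const plain_wrapper&),
    // which copies each member with ordinary loads and stores.
};

inline plain_wrapper assign_wrapper_to_wrapper() {
    plain_wrapper a = {1, 0};
    plain_wrapper b = {2, 0};
    a = b;   // exact match: the implicit copy assignment is selected
    return a;
}
```

After the assignment the value has been copied but the counter is still zero, so `operator=(int)` and the conversion operator were never involved. For an atomic type this means the fenced load and store would be bypassed, which is why the user-defined copy assignment operator has to stay.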

0 Kudos
Dmitry_Vyukov
Valued Contributor I
569 Views
Quoting - Raf Schietekat

"Do you mean the fact that TBB's load-acquire is stronger than C++0x's load-acquire and incurs unnecessary overheads in some cases?" Why stronger or more expensive? That's just how it works: release-store and load-acquire, each time logically in that order, whether integrated in an instruction or not. That means that copy-load-acquire, with the load from the otherwise invisible copied value, has the same semantics as the original load-acquire. Not so?

Yes, they are logically in that order. But this doesn't mean that they have a global effect. Load-acquire is an acquire only on the location that is loaded. At least this is how it works in C++0x. And they have reasons for such semantics. I think that a "global" acquire cannot be implemented efficiently on some architectures.

0 Kudos
RafSchietekat
Valued Contributor III
569 Views
Quoting - Dmitriy V'jukov
Quoting - Raf Schietekat

"Do you mean the fact that TBB's load-acquire is stronger than C++0x's load-acquire and incurs unnecessary overheads in some cases?" Why stronger or more expensive? That's just how it works: release-store and load-acquire, each time logically in that order, whether integrated in an instruction or not. That means that copy-load-acquire, with the load from the otherwise invisible copied value, has the same semantics as the original load-acquire. Not so?

Yes, they are logically in that order. But this doesn't mean that they have a global effect. Load-acquire is an acquire only on the location that is loaded. At least this is how it works in C++0x. And they have reasons for such semantics. I think that a "global" acquire cannot be implemented efficiently on some architectures.

So there's no causality chain of the message value for built-in fences? That's weird, are you sure about that?

0 Kudos
robert_jay_gould
Beginner
569 Views
Quoting - Dmitriy V'jukov

Yeah, it seems that it's a bug in MSVC (I've checked MSVC8 and 9). MSVC correctly zero-initializes POD types, but fails to do so for non-POD types w/o user-defined constructor (i.e. tbb::atomic<>).

g++ 3.4.6 correctly zero-initializes types like tbb::atomic<>.

Yikes!

I thought the standard said that zero initialization of non-POD types was not required by compilers. So I wouldn't call this a bug as much as a lacking feature in MSVC, that GCC gives us as a bonus. Although I'm not sure if g++ 4.x will do the zero initialization if you use -O3 flags for optimization. GCC has really done away with all the fluff, and is finally catching up on speed.

0 Kudos
RafSchietekat
Valued Contributor III
569 Views

It does seem true that default-initialisation does not translate to zero-initialisation for non-POD objects, so for non-static atomics it would be necessary to explicitly assign a value (it is not enough to put my_atomic() in the initialiser list, etc.). So is this matter about incorrect zero-initialisation or about a misunderstanding of when zero-initialisation is meant to occur?

0 Kudos
Dmitry_Vyukov
Valued Contributor I
569 Views
Quoting - Raf Schietekat

So there's no causality chain of the message value for built-in fences? That's weird, are you sure about that?

For sure!

N2798 1.10/7:

an atomic operation A that performs a release operation on an object M synchronizes with an
atomic operation B that performs an acquire operation on M and reads a value written by any
side effect in the release sequence headed by A.

I.e. acquire operation must be on the same variable on which there was release operation.

Although, for stand-alone fences there are different rules.

N2798 29.6/4:

An atomic operation A that is a release operation on an atomic object M synchronizes with an acquire fence
B if there exists some atomic operation X on M such that X is sequenced before B and reads the value
written by A or a value written by any side effect in the release sequence headed by A.
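The two quoted rules can be illustrated with a message-passing sketch written against the C++11 `std::atomic` interface, which is assumed here as the eventual form of the C++0x draft. The function names are hypothetical. In the first function the acquire load is on the same object M (`ready`) as the release store; in the second, a relaxed load on M is followed by a stand-alone acquire fence.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Release store and acquire load on the same atomic object.
inline int pass_message() {
    int payload = 0;                  // ordinary, non-atomic data
    std::atomic<bool> ready(false);   // plays the role of M

    std::thread producer([&] {
        payload = 42;
        ready.store(true, std::memory_order_release);    // operation A
    });

    while (!ready.load(std::memory_order_acquire)) {      // operation B
        std::this_thread::yield();
    }
    int seen = payload;   // ordered after A's write by synchronizes-with
    producer.join();
    return seen;
}

// Release store, then a relaxed load followed by an acquire fence.
inline int pass_message_with_fence() {
    int payload = 0;
    std::atomic<bool> ready(false);

    std::thread producer([&] {
        payload = 42;
        ready.store(true, std::memory_order_release);    // operation A
    });

    while (!ready.load(std::memory_order_relaxed)) {      // operation X
        std::this_thread::yield();
    }
    std::atomic_thread_fence(std::memory_order_acquire);  // fence B
    int seen = payload;
    producer.join();
    return seen;
}
```

In both variants the reader is guaranteed to see the payload written before the release store, because the value it read from `ready` was written by that store.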

0 Kudos
Dmitry_Vyukov
Valued Contributor I
569 Views

I thought the standard said that zero initialization of non-POD types was not required by compilers. So I wouldn't call this a bug as much as a lacking feature in MSVC, that GCC gives us as a bonus. Although I'm not sure if g++ 4.x will do the zero initialization if you use -O3 flags for optimization. GCC has really done away with all the fluff, and is finally catching up on speed.

ISO/IEC 14882:2003

8.5/7:

An object whose initializer is an empty set of parentheses, i.e., (), shall be value-initialized.

8.5/5:

To value-initialize an object of type T means:
- if T is a class type (clause 9) with a user-declared constructor (12.1), then the default constructor for T is
called (and the initialization is ill-formed if T has no accessible default constructor);
- if T is a non-union class type without a user-declared constructor, then every non-static data member
and base-class component of T is value-initialized;

- if T is an array type, then each element is value-initialized;
- otherwise, the object is zero-initialized

To zero-initialize an object of type T means:
- if T is a scalar type (3.9), the object is set to the value of 0 (zero) converted to T;
- if T is a non-union class type, each nonstatic data member and each base-class subobject is zero-initialized;
- if T is a union type, the object's first named data member is zero-initialized;
- if T is an array type, each element is zero-initialized;
- if T is a reference type, no initialization is performed.
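As a rough check of the quoted rules, the sketch below value-initializes a class that has no user-declared constructor, once through a member initialiser with empty parentheses and once through a temporary. `no_ctor_wrapper` and `holder` are hypothetical stand-ins, not TBB types. The expected zero assumes a compiler that follows the quoted rules, which, according to the earlier posts in this thread, MSVC 8 and 9 did not for types like tbb::atomic<>.

```cpp
#include <cassert>

// Hypothetical stand-in for an atomic-like class: it has a user-defined
// assignment operator (so it is not a POD under C++03 rules) but no
// user-declared constructor.
struct no_ctor_wrapper {
    int value;
    no_ctor_wrapper& operator=(int rhs) { value = rhs; return *this; }
};

struct holder {
    no_ctor_wrapper member;
    holder() : member() {}   // empty parentheses: value-initialization
};

// Reads the member that was value-initialized in the initialiser list.
inline int value_initialized_member() {
    holder h;
    return h.member.value;
}

// Reads a copy of a value-initialized temporary.
inline int value_initialized_temporary() {
    no_ctor_wrapper w = no_ctor_wrapper();
    return w.value;
}
```

By contrast, a local declared as `no_ctor_wrapper w;` with no initializer is only default-initialized, so its `value` is indeterminate and must be assigned before it is read.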

0 Kudos
Dmitry_Vyukov
Valued Contributor I
569 Views
Quoting - Raf Schietekat

It does seem true that default-initialisation does not translate to zero-initialisation for non-POD objects, so for non-static atomics it would be necessary to explicitly assign a value (it is not enough to put my_atomic() in the initialiser list, etc.). So is this matter about incorrect zero-initialisation or about a misunderstanding of when zero-initialisation is meant to occur?

ISO/IEC 14882:2003

8.5/7:

An object whose initializer is an empty set of parentheses, i.e., (), shall be value-initialized.

8.5/5:

To value-initialize an object of type T means:
- if T is a class type (clause 9) with a user-declared constructor (12.1), then the default constructor for T is
called (and the initialization is ill-formed if T has no accessible default constructor);
- if T is a non-union class type without a user-declared constructor, then every non-static data member
and base-class component of T is value-initialized;

- if T is an array type, then each element is value-initialized;
- otherwise, the object is zero-initialized

To zero-initialize an object of type T means:
- if T is a scalar type (3.9), the object is set to the value of 0 (zero) converted to T;
- if T is a non-union class type, each nonstatic data member and each base-class subobject is zero-initialized;
- if T is a union type, the object's first named data member is zero-initialized;
- if T is an array type, each element is zero-initialized;
- if T is a reference type, no initialization is performed.

0 Kudos
RafSchietekat
Valued Contributor III
569 Views

"For sure!" I don't know, it seems very formal, but without examples I'm sceptical that the authors didn't trip over their own feet (they're only human, and who says enough iterations were made). For example, what is "an atomic operation B that performs an acquire operation on M *and* reads a value written by any side effect in the release sequence headed by A" (my *emphasis*): what kind of operation does all that at once? But perhaps more to the point: what do the hardware people say about this? Perhaps somebody from Intel about Itanium?

Sorry, I was looking at an earlier version of the C++ standard regarding initialisation; I'll check tonight whether I overlooked value-initialisation or whether it was added later.

0 Kudos
robert_jay_gould
Beginner
569 Views
Quoting - Dmitriy V'jukov

ISO/IEC 14882:2003

8.5/7:

An object whose initializer is an empty set of parentheses, i.e., (), shall be value-initialized.

Aha! Looking at atomic.h

bool specialization has no constructor.

void* specialization has no constructor.

Neither inherits from atomic_impl, this means:

- if T is a class type (clause 9) with a user-declared constructor (12.1), then the default constructor for T is called (and the initialization is ill-formed if T has no accessible default constructor);

Mystery solved?

0 Kudos
RafSchietekat
Valued Contributor III
569 Views
"Mystery solved?" The referenced paragraph is about a class type *with* a user-declared constructor, so it is not applicable here.

0 Kudos
Dmitry_Vyukov
Valued Contributor I
543 Views
Quoting - Raf Schietekat

"For sure!" I don't know, it seems very formal, but without examples I'm sceptical that the authors didn't trip over their own feet (they're only human, and who says enough iterations were made). For example, what is "an atomic operation B that performs an acquire operation on M *and* reads a value written by any side effect in the release sequence headed by A" (my *emphasis*): what kind of operation does all that at once? But perhaps more to the point: what do the hardware people say about this? Perhaps somebody from Intel about Itanium?

Load-acquire does all that at once:

X.load(memory_order_acquire);

It's acquire operation on X AND it reads some value from X.

I don't know which platform causes such rules, but I believe there are reasons for it. It must be some platform with fences combined with load/store instructions (Itanium? ld.acq, st.rel).

0 Kudos
RafSchietekat
Valued Contributor III
543 Views
"It's acquire operation on X AND it reads some value from X." That's not what the proposal says. See? What good are "formal semantics" if even their proponents don't understand them... :-)

0 Kudos