Consider the example code below,
int msg = 0;
P1            P2
msg = 1;      r = msg;
Is it OK to write such code? Is there any possibility that P2 reads some "intermediate" value (i.e. neither 0 nor 1)?
I have researched this topic for a while. However, I am still not sure whether we are allowed to write the above code.
Some said we should use a mutex.
Some said that a native-word store is atomic.
Some said that an aligned native-word store is atomic.
Which one is right? Can programmers just write the above code, then assume that a native-word / aligned native-word store is atomic?
Any comment will be appreciated.
Thanks for your reply.
Let me ask it another way:
int msg = 0;
P1            P2
msg = 1;      r = msg;
Can we write the above code and assume that "msg = 1" is done atomically, no matter which compiler we use and no matter which platform we run on?
Or should we rely on the guarantees provided by programming languages or special APIs?
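For what it's worth, here is a minimal sketch of the "rely on the language" option, assuming a C++11 (or later) compiler. With std::atomic<int>, the standard itself guarantees that the store and the load are indivisible, independent of alignment or word size on a given platform. The function names writer, reader and run_once are made up for illustration and do not come from the original post.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Shared variable. std::atomic<int> guarantees indivisible loads and stores.
std::atomic<int> msg{0};

// P1: publish the new value.
void writer() { msg.store(1, std::memory_order_relaxed); }

// P2: read whatever value is currently visible.
int reader() { return msg.load(std::memory_order_relaxed); }

// Hypothetical helper: runs P1 and P2 concurrently once and
// returns the value that P2 observed.
int run_once() {
    msg.store(0, std::memory_order_relaxed);
    int r = -1;
    std::thread p1(writer);
    std::thread p2([&r] { r = reader(); });
    p1.join();
    p2.join();
    return r;  // either 0 or 1, never a torn value
}
```

Relaxed ordering is enough here only because the question is about atomicity of a single variable. If msg were used to signal that other data is ready, stronger ordering would be needed.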
// enqueue function
put_nonblock(...) {
  if (NULL == queue[head]) {
    queue[head] = ptr;
    head = NEXT(head);
  }
}
// dequeue function
get_nonblock(...) {
  if (NULL != queue[tail]) {
    ptr = queue[tail];
    queue[tail] = NULL;
    tail = NEXT(tail);
  }
}
I am surprised that FastForward can rely only on store atomicity, as the authors claim. But I don't know whether the authors' claim is right. Any comment will be appreciated.
The authors write:
"...notice that this optimization also allows a CLF implementation that only depends on the coherence protocol (and not the consistency model) to ensure mutual exclusion.
To see this, recall that stores are atomic on all coherent memory systems, thus the getter and putter will never update the same node simultaneously since the getter waits for a non-NULL value and the setter a NULL value."
Where do you get the info?
According to ISO C/C++, volatile has nothing to do with inter-thread ordering. Even if a variable is volatile, compilers are free to rearrange ordinary stores around accesses to it. For sure.
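To illustrate the ordering point, here is a minimal sketch assuming C++11 atomics. A plain (or volatile) flag gives no guarantee that the payload write becomes visible before the flag write. A release store paired with an acquire load does. The names payload, ready, produce, consume and run_handoff are hypothetical and are not taken from the FastForward code.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

int payload = 0;                 // ordinary, non-atomic data
std::atomic<bool> ready{false};  // flag that publishes the data

// Writer: fill in the data, then publish it with a release store.
void produce() {
    payload = 42;
    ready.store(true, std::memory_order_release);
}

// Reader: spin until the flag is set. The acquire load guarantees
// that the write to payload is visible once the flag is seen as true.
int consume() {
    while (!ready.load(std::memory_order_acquire)) {
        std::this_thread::yield();
    }
    return payload;
}

// Hypothetical helper: runs one hand-off and returns what the reader saw.
int run_handoff() {
    payload = 0;
    ready.store(false);
    int seen = -1;
    std::thread c([&seen] { seen = consume(); });
    std::thread p(produce);
    p.join();
    c.join();
    return seen;
}
```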
Hey, wait, the following code is just incorrect:
if (NULL != queue[tail]) {
ptr = queue[tail];
queue[tail] = NULL;
tail = NEXT(tail);
}
... unless it's C++, where the code can have any meaning, or Microsoft C with the variables declared volatile. Anyway, it's incorrect without fences.
// enqueue function
put_nonblock(...) {
if (NULL == queue[head]) {
queue[head] = ptr;
head = NEXT(head);
}
}
// dequeue function
get_nonblock(...) {
if (NULL != queue[tail]) {
ptr = queue[tail];
queue[tail] = NULL;
tail = NEXT(tail);
}
}
[cpp]// enqueue function            Thread 1                      Thread 2
put_nonblock(...) {
  if (NULL == queue[head]) {        // Finds queue[head] NULL     Instant later does the same
    queue[head] = ptr;              // queue[head] = ptr1         Overwrites ptr1 with ptr2
    head = NEXT(head);              // head = head2               head = head3
  }
}[/cpp]
[cpp] size_t head;
size_t tail;
atomic_address queue [QUEUE_LENGTH];
put_nonblock(...)
{
if (atomic_load_explicit(&queue[head], memory_order_relaxed) == 0)
{
atomic_store_explicit(&queue[head], ptr, memory_order_release);
head = NEXT(head);
}
}
get_nonblock(...)
{
if (atomic_load_explicit(&queue[tail], memory_order_relaxed) != 0)
{
ptr = atomic_load_explicit(&queue[tail], memory_order_consume);
atomic_store_explicit(&queue[tail], 0, memory_order_relaxed);
tail = NEXT(tail);
}
}
[/cpp]
"ptr = atomic_load_explicit(&queue[tail], memory_order_consume);"
I think that should be memory_order_acquire, shouldn't it? But I still don't "get" memory_order_consume, so maybe it's just me...
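For comparison, here is a sketch of the same single-producer/single-consumer queue written with C++11 std::atomic and acquire/release throughout. This is the conservative choice. memory_order_consume only orders operations that carry a data dependency on the loaded pointer, and compilers generally treat it as acquire in practice anyway. The class name SpscQueue, the capacity of 8, and the use of acquire/release on every slot access are my assumptions, not part of the code posted above.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>

constexpr std::size_t QUEUE_LENGTH = 8;  // assumed capacity

// Sketch of a queue for exactly one producer thread and one consumer thread.
// A slot holding nullptr is empty, so nullptr cannot be enqueued as a value.
class SpscQueue {
public:
    SpscQueue() {
        for (auto& slot : slots_) slot.store(nullptr, std::memory_order_relaxed);
    }

    // Producer side only. Returns false if the next slot is still occupied.
    bool put_nonblock(void* ptr) {
        if (slots_[head_].load(std::memory_order_acquire) != nullptr) return false;
        // Release: everything written to *ptr before this call becomes
        // visible to the consumer that acquires this slot.
        slots_[head_].store(ptr, std::memory_order_release);
        head_ = next(head_);
        return true;
    }

    // Consumer side only. Returns nullptr if the queue is empty.
    void* get_nonblock() {
        void* ptr = slots_[tail_].load(std::memory_order_acquire);
        if (ptr == nullptr) return nullptr;
        // Hand the slot back to the producer.
        slots_[tail_].store(nullptr, std::memory_order_release);
        tail_ = next(tail_);
        return ptr;
    }

private:
    static std::size_t next(std::size_t i) { return (i + 1) % QUEUE_LENGTH; }

    std::atomic<void*> slots_[QUEUE_LENGTH];
    std::size_t head_ = 0;  // touched by the producer only
    std::size_t tail_ = 0;  // touched by the consumer only
};
```

Note that head_ and tail_ stay plain variables because each is private to one thread. With more than one producer or more than one consumer this sketch is no longer correct.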