An optimization problem on atomic<T> - Page 2

azru0512 · ‎07-30-2010

The following is a producer/consumer example.

atomic<size_t> in, out;
in = 0;
out = 0;

P1                                       P2

while ((in + 1) % capacity == out);      while (in == out);

/* do something */                       /* do something */

in = (in + 1) % capacity;                out = (out + 1) % capacity;

As you can see, in (out) is only modified by P1 (P2).

I am wondering whether there is any benefit in replacing the repeated accesses to in (out) with an ordinary local variable, so that the above code becomes:

P1                                             P2

const size_t local_in = (in + 1) % capacity;   const size_t local_out = out;
while (local_in == out);                       while (in == local_out);

/* do something */                             /* do something */

in = local_in;                                 out = (local_out + 1) % capacity;

Any comments will be appreciated.

Dmitry_Vyukov · ‎07-30-2010

Quoting azru0512

But of course. It is 'data' that synchronizes data transfer between threads.

Producer and consumer should operate on different buffer slots, right? Producer inserts data into buffer[1], and consumer retrieves data from buffer[2], for example.

Then, what exactly do you mean by "how 'data' is declared"?

Provide the complete program code, then we will be able to discuss further. If it were C, a fragment of code might be enough. But what does "data[in] = d" mean in C++ only God can possibly know.

RafSchietekat · ‎07-30-2010

Dmitriy, I think that you didn't make the link between "data" in the third-last line of #15 and the example code above, where it is a function parameter declared as "char*", otherwise you would probably have responded differently.

IMHO, the read-acquire in "buffer[in] = data;" is irrelevant, just like the read-acquire on the right-hand side of "in = (in + 1) % capacity;". It's the release-store in "in = (in + 1) % capacity;" (the assignment) that matters. Throwing in extra releases or acquires only affects performance (adversely, if at all), not semantics (as long as the program is otherwise correct).

(Rephrased in the second person, and I apparently lost the race with #21.)

azru0512 · ‎07-30-2010

So the example code just works fine, right? I mean it works correctly, of course.

RafSchietekat · ‎07-30-2010

"So the example code just work fine, right? I mean work correctly, of course."
The code in #15? No, "data = buffer[out];" does nothing useful. Maybe you mean "*data=*buffer[out];", or "strcpy(data, buffer[out]);"? Then yes, probably, unless I missed something.

(Added) Although... let me meditate on it a bit longer. In the meantime, Dmitriy can put it through Relacy. :-)

Dmitry_Vyukov · ‎07-30-2010

Quoting Raf Schietekat

Dmitriy, I think that you didn't make the link between "data" in the third-last line of #15 and the example code above, where it is a function parameter declared as "char*", otherwise you would probably have responded differently.

Ah, you're right. I meant 'buffer', not 'data'.

azru0512 · ‎07-30-2010

Well, buffer will be declared as follows:

char *buffer[32];

And producer/consumer inserts/retrieves a "pointer" into/from the buffer.

azru0512 · ‎07-30-2010

But what does "buffer[in] = data" mean in C++ only God can possibly know.
(I changed the sentence a little bit.)

Just curious, why did you say so?

RafSchietekat · ‎07-31-2010

#26 "And producer/consumer inserts/retrieves a "pointer" into/from the buffer. "
As I wrote in #24, the consumer doesn't do anything useful: you must instead return "data" as a return value or as an out parameter (by reference or by pointer), or use the referent inside the consumer.
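A minimal single-threaded sketch of that point, in C++. Only the "char *buffer[32]" declaration comes from the thread; the function names, the message array, and the plain index are hypothetical and exist only for illustration. It contrasts assigning to a by-value pointer parameter, which the caller never sees, with an out parameter by reference and with a return value.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical setup: only "char *buffer[32]" comes from the thread.
static char message[] = "payload";
static char* buffer[32] = {message};
static std::size_t out = 0;  // plain index, this sketch is single-threaded

// Assigns to the function's own copy of the pointer; the caller sees nothing.
void consume_by_value(char* data) {
    data = buffer[out];
    (void)data;  // only silences "set but not used" warnings
}

// Out parameter by reference: the caller's pointer is updated.
void consume_by_reference(char*& data) {
    data = buffer[out];
}

// Return value: often the simplest interface.
char* consume_by_return() {
    return buffer[out];
}

// Returns true when the three variants behave as the comments claim.
bool demo_consumers() {
    char* a = nullptr;
    char* b = nullptr;
    consume_by_value(a);      // a stays nullptr
    consume_by_reference(b);  // b now points at message
    char* c = consume_by_return();
    assert(buffer[out] == message);
    return a == nullptr && b == message && c == message;
}
```

Calling demo_consumers() is enough to exercise all three variants; in the real consumer the index would of course be the atomic "out" rather than a plain variable.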

#27 "Just curious, why did you say so?"
C++ has objects...

#24 "(Added) Although... let me meditate on it a bit longer. In the meantime, Dmitriy can put it through Relacy. :-)"
Hmm, so P2 has to read a buffer location (and its referent if that may be reused) before P1 overwrites it (and possibly the referent) on the next round. If we assume a processor that can reorder loads, how is the load forced to occur on time? Is there a load-store associated with "out=(out+1)%capacity;"? I see load-load for reading "out", and store-store for writing its new value, but no load-store, and once "out" is written P1 is free to overwrite the data, right? So does that mean rel_acq (or acq_rel) for the atomic operations, or even sequential consistency? It would also mean that "data" cannot be just returned if its referent can be reused, it must all be treated before "out" is updated. Dmitriy, am I seeing ghosts, or maybe not?

Dmitry_Vyukov · ‎07-31-2010

Quoting azru0512

But what does "buffer[in] = data" mean in C++ only God can possibly know.
(I changed the sentence a little bit.)

Just curious, why did you say so?

operator[] can be overloaded

operator= can be overloaded

some implicit conversion functions can take place

some fancy temporary helper objects can be created

Without knowing the exact types and their definitions, the expression can mean basically anything in this world.
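To make the list above concrete, here is a small sketch with two invented types, Slot and Buffer (neither appears anywhere in the thread). With them, the innocent-looking statement "buffer[in] = data;" wraps the index, counts lookups, and copies a string instead of storing a pointer.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical types, invented purely for illustration.
struct Slot {
    std::string text;
    int assignments = 0;

    // Overloaded operator=: takes a const char*, copies the characters,
    // and counts how often the slot was assigned.
    Slot& operator=(const char* s) {
        text = s;
        ++assignments;
        return *this;
    }
};

struct Buffer {
    std::vector<Slot> slots;
    int lookups = 0;

    explicit Buffer(std::size_t n) : slots(n) {}

    // Overloaded operator[]: wraps the index and counts each lookup.
    Slot& operator[](std::size_t i) {
        ++lookups;
        return slots[i % slots.size()];
    }
};

// Returns true if the assignment had the side effects described above.
bool demo_overloads() {
    Buffer buffer(4);
    assert(buffer.slots.size() == 4);
    std::size_t in = 5;          // would be out of range for a plain array
    const char* data = "hello";

    buffer[in] = data;           // Buffer::operator[] then Slot::operator=

    return buffer.lookups == 1
        && buffer.slots[1].assignments == 1
        && buffer.slots[1].text == "hello";
}
```

With a plain "char *buffer[32]" the same statement is a single pointer store, which is why the declarations matter for any discussion of memory ordering.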

Dmitry_Vyukov · ‎07-31-2010

Quoting Raf Schietekat

Dmitriy can put it through Relacy. :-)

I do not have time for that right now, but everybody is free to do that manually.

http://groups.google.com/group/relacy

Dmitry_Vyukov · ‎07-31-2010

Quoting Raf Schietekat

Hmm, so P2 has to read a buffer location (and its referent if that may be reused) before P1 overwrites it (and possibly the referent) on the next round. If we assume a processor that can reorder loads, how is the load forced to occur on time? Is there a load-store associated with "out=(out+1)%capacity;"? I see load-load for reading "out", and store-store for writing its new value, but no load-store, and once "out" is written P1 is free to overwrite the data, right? So does that mean rel_acq (or acq_rel) for the atomic operations, or even sequential consistency? It would also mean that "data" cannot be just returned if its referent can be reused, it must all be treated before "out" is updated. Dmitriy, am I seeing ghosts, or maybe not?

As far as I see everything is Ok here.

If you are thinking in terms of bidirectional fences, then acquire=#LoadLoad|#LoadStore, and release=#LoadStore|#StoreStore.

> Hmm, so P2 has to read a buffer location (and its referent if that may be reused) before P1 overwrites it (and possibly the referent) on the next round. If we assume a processor that can reorder loads, how is the load forced to occur on time?

The load happens-before the store-release to 'out'. And P1 can overwrite the location only after its load-acquire of 'out'.

> Is there a load-store associated with "out=(out+1)%capacity;"?

It should be there. It's store-release.
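A minimal sketch of how this acquire/release pairing could be written out, assuming C++11 std::atomic with explicit memory orders as a stand-in for the atomic<T> discussed in this thread. The type name PointerRing, the non-blocking try_push/try_pop interface, and the const char* element type are assumptions made for illustration, not the original poster's code.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>

// Assumption: 32 slots, matching the "char *buffer[32]" declaration.
constexpr std::size_t ring_capacity = 32;

struct PointerRing {
    const char* buffer[ring_capacity] = {};
    std::atomic<std::size_t> in{0};   // written only by the producer (P1)
    std::atomic<std::size_t> out{0};  // written only by the consumer (P2)

    // P1. Returns false instead of spinning when the ring is full.
    bool try_push(const char* data) {
        const std::size_t local_in = in.load(std::memory_order_relaxed);
        const std::size_t next = (local_in + 1) % ring_capacity;
        // Acquire pairs with the consumer's release store to out, so the
        // consumer's read of a slot happens-before we overwrite that slot.
        if (next == out.load(std::memory_order_acquire)) return false;
        buffer[local_in] = data;
        // Release publishes the slot contents to the consumer.
        in.store(next, std::memory_order_release);
        return true;
    }

    // P2. Returns false instead of spinning when the ring is empty.
    bool try_pop(const char*& data) {
        const std::size_t local_out = out.load(std::memory_order_relaxed);
        if (local_out == in.load(std::memory_order_acquire)) return false;
        data = buffer[local_out];  // this load is ordered before the store below
        out.store((local_out + 1) % ring_capacity, std::memory_order_release);
        return true;
    }
};

// Single-threaded check of the index arithmetic only; in real use try_push
// and try_pop are called from two different threads.
bool demo_ring() {
    static const char* const words[] = {"a", "b", "c"};
    PointerRing ring;
    const char* got = nullptr;
    if (ring.try_pop(got)) return false;                // empty at start
    std::size_t pushed = 0;
    while (ring.try_push(words[pushed % 3])) ++pushed;  // fill until full
    assert(pushed == ring_capacity - 1);                // one slot stays free
    for (std::size_t i = 0; i < pushed; ++i) {
        if (!ring.try_pop(got) || got != words[i % 3]) return false;  // FIFO
    }
    return !ring.try_pop(got);                          // empty again
}
```

Each side reads its own index with a relaxed load because no other thread writes it; only the index owned by the other side needs the acquire.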

Dmitry_Vyukov · ‎07-31-2010

Quoting azru0512

But in the above producer/consumer example, we see there are some read-acquires in between (e.g., buffer[in] = data;).

So my question is:

Is the above example OK? And what constraints or rules should we apply to the memory operations between a read-acquire/store-release pair?

As far as I see, it's Ok.

The load-acquire in between is superfluous; you can cache 'in' in a local variable instead of reloading 'in' several times.

As for rules, that is too general a question, and the general answer is that you can use any operations as long as the code stays correct (no data races, intended behavior, etc.).

RafSchietekat · ‎07-31-2010

"If you are thinking in term of bidirectional fences, then acquire=#LoadLoad|#LoadStore, and release=#LoadStore|#StoreStore."
Ah, right, thanks... Just ghosts then. :-)

(Added) Well, except for the very real need to also process the referent before updating "out", if the referent's location can be reused.
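A sketch of that last caveat, under assumptions that are not in the thread: each slot owns a small fixed char array that the producer reuses, and std::atomic stands in for the thread's atomic<T>. The names ReusedSlots, slot_count, and slot_size are hypothetical. The consumer copies the characters out before the release store to "out", so a later overwrite of the slot cannot affect what it already read.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <initializer_list>
#include <string>

constexpr std::size_t slot_count = 4;   // assumption, kept small for the demo
constexpr std::size_t slot_size = 16;   // assumption

struct ReusedSlots {
    char storage[slot_count][slot_size];  // referents live here and get reused
    std::atomic<std::size_t> in{0};
    std::atomic<std::size_t> out{0};

    bool try_push(const char* text) {
        const std::size_t local_in = in.load(std::memory_order_relaxed);
        const std::size_t next = (local_in + 1) % slot_count;
        if (next == out.load(std::memory_order_acquire)) return false;  // full
        const std::size_t n = std::min(std::strlen(text), slot_size - 1);
        std::memcpy(storage[local_in], text, n);
        storage[local_in][n] = '\0';
        in.store(next, std::memory_order_release);
        return true;
    }

    bool try_pop(std::string& result) {
        const std::size_t local_out = out.load(std::memory_order_relaxed);
        if (local_out == in.load(std::memory_order_acquire)) return false;  // empty
        // Process (here: copy) the referent BEFORE giving the slot back.
        result.assign(storage[local_out]);
        out.store((local_out + 1) % slot_count, std::memory_order_release);
        return true;
    }
};

// Single-threaded walk-through: slot 0 is reused, the earlier copy survives.
bool demo_copy_before_release() {
    ReusedSlots ring;
    std::string first, scratch;
    bool ok = ring.try_push("one") && ring.try_pop(first);
    // Cycle through the remaining slots until the producer is back at slot 0.
    for (const char* text : {"two", "three", "four"}) {
        ok = ok && ring.try_push(text) && ring.try_pop(scratch);
    }
    ok = ok && ring.try_push("five");  // lands in slot 0 again
    assert(ok);
    return first == "one" && std::string(ring.storage[0]) == "five";
}
```

Had try_pop handed back a pointer into storage instead of a copy, the caller would have to finish using it before "out" is advanced, which is exactly the constraint described above.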

azru0512 · ‎07-31-2010

Files cannot be downloaded from http://groups.google.com/group/relacy/files right now.

Dmitry_Vyukov · ‎07-31-2010

http://groups.google.com/group/relacy/browse_frm/thread/8a7ecceb5f1bf80b#