<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic consistent but stale reads in Intel® Moderncode for Parallel Architectures</title>
    <link>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/consistent-but-stale-reads/m-p/783607#M290</link>
    <description>After looking at my code, I would advise adding a flag to the YourWritersData struct. The writer sets it to indicate that the buffer is full of new data, and the reader clears it to indicate that it is done with the buffer.&lt;BR /&gt;&lt;BR /&gt;&lt;P&gt;struct YourWritersData&lt;/P&gt;&lt;P&gt;{&lt;BR /&gt; volatile bool WaitingForConsumer;&lt;BR /&gt; YourWritersData() {WaitingForConsumer = false;}&lt;BR /&gt; ...&lt;BR /&gt;};&lt;/P&gt;&lt;BR /&gt;YourWritersData* getBuffer() {&lt;BR /&gt; if(RingBuffer[nextFillIndex].WaitingForConsumer)&lt;BR /&gt; return NULL; // buffer overrun&lt;BR /&gt; return &amp;amp;RingBuffer[nextFillIndex % YourRingBufferSize]; } // no advance!!!&lt;BR /&gt;&lt;BR /&gt; // indicate buffer filled in, get next buffer&lt;BR /&gt; // called by single writer&lt;BR /&gt;YourWritersData* nextBuffer() {&lt;BR /&gt; RingBuffer[nextFillIndex].WaitingForConsumer = true;&lt;BR /&gt; nextFillIndex = (nextFillIndex + 1) % YourRingBufferSize; // advance after fill complete&lt;BR /&gt; return getBuffer(); } // return buffer* or NULL if buffer overrun&lt;BR /&gt;&lt;BR /&gt; // pop item from buffer, return NULL if empty&lt;BR /&gt; // called by multiple readers&lt;BR /&gt; YourWritersData* pop() {&lt;BR /&gt; for(;;) {&lt;BR /&gt; long copyNextEmptyIndex = nextEmptyIndex; // snapshot once, use only the copy below&lt;BR /&gt; if(nextFillIndex == (copyNextEmptyIndex % YourRingBufferSize))&lt;BR /&gt; return NULL;&lt;BR /&gt; if(InterlockedCompareExchange(&lt;BR /&gt; &amp;amp;nextEmptyIndex, // location&lt;BR /&gt; copyNextEmptyIndex + 1, // exchange *** NOT % YourRingBufferSize&lt;BR /&gt; copyNextEmptyIndex ) == copyNextEmptyIndex) // comparand&lt;BR /&gt; return &amp;amp;RingBuffer[copyNextEmptyIndex % YourRingBufferSize];&lt;BR /&gt; } // for(;;)&lt;BR /&gt; } // pop&lt;BR /&gt;&lt;BR /&gt; void ReaderDone(YourWritersData* b) { b-&amp;gt;WaitingForConsumer = false; }&lt;BR /&gt;&lt;BR /&gt;Jim Dempsey</description>
    <pubDate>Thu, 04 Aug 2011 17:28:06 GMT</pubDate>
    <dc:creator>jimdempseyatthecove</dc:creator>
    <dc:date>2011-08-04T17:28:06Z</dc:date>
    <item>
      <title>consistent but stale reads</title>
      <link>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/consistent-but-stale-reads/m-p/783603#M286</link>
      <description>Hi&lt;DIV&gt;I'm wondering whether, in the sequence below, Core 2 Socket 3 can return the stale value x = 10 long after x = 20 has been stored on the other core. Nothing in this sequence seems to violate TLO-CC (from Rick's YouTube presentation).&lt;/DIV&gt;&lt;DIV&gt;&lt;/DIV&gt;&lt;DIV&gt;I understand that with MESI-like coherence protocols there is a single write owner, but I am wondering whether processors have internal optimizations that cheat by returning older values as long as causality and total lock order are not violated.&lt;/DIV&gt;&lt;DIV&gt;&lt;/DIV&gt;&lt;DIV&gt;Regards&lt;/DIV&gt;&lt;DIV&gt;Banks&lt;/DIV&gt;&lt;DIV&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;DIV&gt;&lt;DIV id="_mcePaste"&gt;=====================&lt;/DIV&gt;&lt;DIV id="_mcePaste"&gt;Core 0 Socket 0&lt;/DIV&gt;&lt;DIV id="_mcePaste"&gt;=====================&lt;/DIV&gt;&lt;DIV id="_mcePaste"&gt;store(x, 10)&lt;/DIV&gt;&lt;DIV id="_mcePaste"&gt;mfence&lt;/DIV&gt;&lt;DIV id="_mcePaste"&gt;store(y, 1)&lt;/DIV&gt;&lt;DIV id="_mcePaste"&gt;mfence&lt;/DIV&gt;&lt;DIV id="_mcePaste"&gt;&lt;/DIV&gt;&lt;DIV id="_mcePaste"&gt;......&lt;/DIV&gt;&lt;DIV id="_mcePaste"&gt;store(x, 20)&lt;/DIV&gt;&lt;DIV id="_mcePaste"&gt;mfence&lt;/DIV&gt;&lt;DIV id="_mcePaste"&gt;store(y, 2)&lt;/DIV&gt;&lt;DIV id="_mcePaste"&gt;mfence&lt;/DIV&gt;&lt;DIV id="_mcePaste"&gt;&lt;/DIV&gt;&lt;DIV id="_mcePaste"&gt;&lt;/DIV&gt;&lt;DIV id="_mcePaste"&gt;=====================&lt;/DIV&gt;&lt;DIV id="_mcePaste"&gt;Core 2 Socket 3&lt;/DIV&gt;&lt;DIV id="_mcePaste"&gt;=====================&lt;/DIV&gt;&lt;DIV id="_mcePaste"&gt;r0 = load(x) // returns 10&lt;/DIV&gt;&lt;DIV id="_mcePaste"&gt;r1 = load(y) // returns 1&lt;/DIV&gt;&lt;DIV id="_mcePaste"&gt; ... consistent but stale. Is this possible?&lt;/DIV&gt;&lt;DIV id="_mcePaste"&gt;&lt;/DIV&gt;&lt;DIV id="_mcePaste"&gt;&lt;/DIV&gt;&lt;DIV id="_mcePaste"&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;/DIV&gt;</description>
      <pubDate>Sun, 24 Jul 2011 20:24:58 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/consistent-but-stale-reads/m-p/783603#M286</guid>
      <dc:creator>bank_kus</dc:creator>
      <dc:date>2011-07-24T20:24:58Z</dc:date>
    </item>
    <item>
      <title>consistent but stale reads</title>
      <link>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/consistent-but-stale-reads/m-p/783604#M287</link>
      <description>How are you asserting your assumptions about time phasing between cores/sockets?&lt;BR /&gt;You cannot rely on the time of a breakpoint to assert time phase between processors.&lt;BR /&gt;A debug break is not instantaneous across processors.&lt;BR /&gt;&lt;BR /&gt;When r0 (x) returns 10, r1 (y) could see:&lt;BR /&gt;&lt;BR /&gt; a) unknown state prior to store(y,1)&lt;BR /&gt; b) 1 after store(y,1) and prior to store(y,2)&lt;BR /&gt; c) 2 after store(y,2) and preceding a subsequent store(y,??)&lt;BR /&gt; d) ?? after a store(y,??) following store(y,2) above&lt;BR /&gt;&lt;BR /&gt;All four returns are consistent - and not necessarily stale.&lt;BR /&gt;&lt;BR /&gt;Jim Dempsey</description>
      <pubDate>Thu, 28 Jul 2011 15:38:55 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/consistent-but-stale-reads/m-p/783604#M287</guid>
      <dc:creator>jimdempseyatthecove</dc:creator>
      <dc:date>2011-07-28T15:38:55Z</dc:date>
    </item>
    <item>
      <title>consistent but stale reads</title>
      <link>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/consistent-but-stale-reads/m-p/783605#M288</link>
      <description>Hi Jim&lt;DIV&gt;I asked in the context of SEQLOCKs, and perhaps I should have asked this directly instead of trying to draw a similar example.&lt;/DIV&gt;&lt;DIV&gt;&lt;/DIV&gt;&lt;DIV&gt;SEQLOCK:&lt;/DIV&gt;&lt;DIV&gt;=====&lt;/DIV&gt;&lt;DIV&gt;Writer:&lt;/DIV&gt;&lt;DIV&gt;=====&lt;/DIV&gt;&lt;DIV&gt;mutex_lock&lt;/DIV&gt;&lt;DIV&gt;incr_version  // WRITER in, reader please don't advance if you see this (is_odd test)&lt;/DIV&gt;&lt;DIV&gt;mfence&lt;/DIV&gt;&lt;DIV&gt;&lt;/DIV&gt;&lt;DIV&gt;..... modify data&lt;/DIV&gt;&lt;DIV&gt;&lt;/DIV&gt;&lt;DIV&gt;incr_version&lt;/DIV&gt;&lt;DIV&gt;mfence&lt;/DIV&gt;&lt;DIV&gt;mutex_unlock&lt;/DIV&gt;&lt;DIV&gt;&lt;/DIV&gt;&lt;DIV&gt;======&lt;/DIV&gt;&lt;DIV&gt;Reader:&lt;/DIV&gt;&lt;DIV&gt;======&lt;/DIV&gt;&lt;DIV&gt;do {&lt;/DIV&gt;&lt;DIV&gt; v0 = getversion()&lt;/DIV&gt;&lt;DIV&gt; d = snapshot_data&lt;/DIV&gt;&lt;DIV&gt; v1 = getversion()&lt;/DIV&gt;&lt;DIV&gt;} while (v0 is odd || v0 != v1)  // retry if a writer was in or the version changed&lt;/DIV&gt;&lt;DIV&gt;&lt;/DIV&gt;&lt;DIV&gt;queue(d) // to a queue protected by a lock&lt;/DIV&gt;&lt;DIV&gt;&lt;/DIV&gt;&lt;DIV&gt;---------------&lt;/DIV&gt;&lt;DIV&gt;Let's put in some numbers: version = 0, data = 20&lt;/DIV&gt;&lt;DIV&gt;&lt;/DIV&gt;&lt;DIV&gt;Writer(0)  Version = 1 ; data = 30; Version = 2&lt;/DIV&gt;&lt;DIV&gt;&lt;/DIV&gt;&lt;DIV&gt;Writer(1)  Version = 3 ; data = 50; Version = 4&lt;/DIV&gt;&lt;DIV&gt;&lt;/DIV&gt;&lt;DIV&gt;Writer(2)  Version = 5 ; data = 70; Version = 6&lt;/DIV&gt;&lt;DIV&gt;&lt;/DIV&gt;&lt;DIV&gt;.....&lt;/DIV&gt;&lt;DIV&gt;Question:&lt;/DIV&gt;&lt;DIV&gt;Can I expect the queue to contain monotonically non-decreasing values, assuming several readers are running on separate cores?&lt;/DIV&gt;&lt;DIV&gt;&lt;/DIV&gt;&lt;DIV&gt;Regards&lt;/DIV&gt;&lt;DIV&gt;Banks&lt;/DIV&gt;&lt;DIV&gt;&lt;/DIV&gt;</description>
      <pubDate>Thu, 04 Aug 2011 03:42:40 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/consistent-but-stale-reads/m-p/783605#M288</guid>
      <dc:creator>bank_kus</dc:creator>
      <dc:date>2011-08-04T03:42:40Z</dc:date>
    </item>
    <item>
      <title>consistent but stale reads</title>
      <link>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/consistent-but-stale-reads/m-p/783606#M289</link>
      <description>The way your write loop is written, a reader could observe:&lt;BR /&gt;&lt;BR /&gt;Version = 1, data = 50, Version = 6 (or longer interval)&lt;BR /&gt;(@Writer 0+), (@Writer 1+), (@Writer 2+)&lt;BR /&gt;&lt;BR /&gt;The snapshot_data could also be a blur of writer versions.&lt;BR /&gt;&lt;BR /&gt;The IA32 and Intel64 processors assure that the writes issued by one core/hw thread/processor are observed by the other cores/hw threads/processors either in order or simultaneously, without mfence (i.e. write combined to the same cache line).&lt;BR /&gt;&lt;BR /&gt;If you have one writer and multiple readers, this is called Single Producer Multi-Consumer (SPMC). It makes little sense to use SPMC if the "work" produced by your consumers is solely to insert into a locked queue. You did not indicate whether the consumers' queue is to be ordered. If it is unordered and queue insertion is intermittently relatively long, then the consumers could buffer copies of the writer's data through the intermittent delay period.&lt;BR /&gt;&lt;BR /&gt;*** However, as you sketched the code, the writer is oblivious as to whether a reader has captured the data.&lt;BR /&gt;&lt;BR /&gt;A better route to take (there are several routes) might be for the writer to use a ring buffer.&lt;BR /&gt;&lt;BR /&gt;struct YourWritersData&lt;BR /&gt;{&lt;BR /&gt; ...&lt;BR /&gt;};&lt;BR /&gt;&lt;BR /&gt;struct YourWritersRingBuffer_SPMC&lt;BR /&gt;{&lt;BR /&gt; volatile long nextFillIndex;&lt;BR /&gt; YourWritersData RingBuffer[YourRingBufferSize];&lt;BR /&gt; volatile long nextEmptyIndex; // assumes RingBuffer larger than cache line&lt;BR /&gt; YourWritersRingBuffer_SPMC() { nextFillIndex = nextEmptyIndex = 0; }&lt;BR /&gt;&lt;BR /&gt; // get pointer to next fill buffer, returns NULL on buffer overrun&lt;BR /&gt; // called by single writer&lt;BR /&gt;YourWritersData* getBuffer() {&lt;BR /&gt; if((nextFillIndex + 1) % YourRingBufferSize == (nextEmptyIndex % YourRingBufferSize))&lt;BR /&gt; return NULL; // buffer overrun&lt;BR /&gt; return &amp;amp;RingBuffer[nextFillIndex % YourRingBufferSize]; } // no advance!!!&lt;BR /&gt;&lt;BR /&gt; // indicate buffer filled in, get next buffer&lt;BR /&gt; // called by single writer&lt;BR /&gt;YourWritersData* nextBuffer() {&lt;BR /&gt; nextFillIndex = (nextFillIndex + 1) % YourRingBufferSize; // advance after fill complete&lt;BR /&gt; return getBuffer(); } // return buffer* or NULL if buffer overrun&lt;BR /&gt;&lt;BR /&gt; // pop item from buffer, return NULL if empty&lt;BR /&gt; // called by multiple readers&lt;BR /&gt; YourWritersData* pop() {&lt;BR /&gt; for(;;) {&lt;BR /&gt; long copyNextEmptyIndex = nextEmptyIndex; // snapshot once, use only the copy below&lt;BR /&gt; if(nextFillIndex == (copyNextEmptyIndex % YourRingBufferSize))&lt;BR /&gt; return NULL;&lt;BR /&gt; if(InterlockedCompareExchange(&lt;BR /&gt; &amp;amp;nextEmptyIndex, // location&lt;BR /&gt; copyNextEmptyIndex + 1, // exchange *** NOT % YourRingBufferSize&lt;BR /&gt; copyNextEmptyIndex ) == copyNextEmptyIndex) // comparand&lt;BR /&gt; return &amp;amp;RingBuffer[copyNextEmptyIndex % YourRingBufferSize];&lt;BR /&gt; } // for(;;)&lt;BR /&gt; } // pop&lt;BR /&gt;};&lt;BR /&gt;&lt;BR /&gt;Note that nextEmptyIndex (and its copy) is kept unwrapped and is reduced modulo YourRingBufferSize only when it is used as an index. This provides practical protection against a consumer thread being preempted, between reading the index and the compare-exchange, for a duration in which the number of intervening fill/empty cycles is a multiple of YourRingBufferSize. On IA32 the preemption would have to last about 4 billion such insertions before an adverse situation becomes possible (roughly 72 minutes of preemption at 1 us/insertion). 
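&lt;BR /&gt;&lt;BR /&gt;As a rough check of the wraparound interval, here is a small Python sketch of the arithmetic. It assumes a steady insertion rate and a counter that wraps after 2^bits increments; the function name wrap_seconds and the rates are illustrative only and are not part of the ring buffer code.

```python
# Back-of-envelope check: how long until a wrapping index counter repeats?
# Assumption: insertions arrive at a fixed, steady rate (rates are illustrative).

SECONDS_PER_DAY = 24 * 60 * 60


def wrap_seconds(counter_bits, seconds_per_insertion):
    """Seconds of continuous insertion before a counter of this width wraps."""
    return (2 ** counter_bits) * seconds_per_insertion


# 32-bit index (a Windows `long`) at 1 microsecond per insertion.
wrap32_minutes = wrap_seconds(32, 1e-6) / 60

# 64-bit index at 1 microsecond and at 1 nanosecond per insertion.
wrap64_days_us = wrap_seconds(64, 1e-6) / SECONDS_PER_DAY
wrap64_days_ns = wrap_seconds(64, 1e-9) / SECONDS_PER_DAY

print("32-bit at 1 us:", round(wrap32_minutes, 1), "minutes")
print("64-bit at 1 us:", round(wrap64_days_us), "days")
print("64-bit at 1 ns:", round(wrap64_days_ns), "days")
```

Under these assumptions a 32-bit index wraps after about 71.6 minutes at 1 us/insertion, and a 64-bit index after roughly 213.5 million days at the same rate. A consumer would have to stay preempted for that whole interval to hit the wraparound.&lt;BR /&gt;&lt;BR /&gt;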
This should not occur unless the consumer thread:&lt;BR /&gt;&lt;BR /&gt; has crashed&lt;BR /&gt; is waiting at a prompt&lt;BR /&gt; is stopped at a debug breakpoint&lt;BR /&gt; is starved because the O/S is severely overloaded&lt;BR /&gt;&lt;BR /&gt;If you must have higher protection, then use a 64-bit value for nextEmptyIndex and its copy, together with the 64-bit InterlockedCompareExchange (roughly 213 million days at a 1 us insertion rate, about 213 thousand days at a 1 ns insertion rate).&lt;BR /&gt;&lt;BR /&gt;There are other strategies for avoiding the Interlocked... call on dequeue, but these may have their own set of issues with respect to thread preemption latencies (so does your writer thread, unless it runs on a dedicated processor).&lt;BR /&gt;&lt;BR /&gt;Jim Dempsey</description>
      <pubDate>Thu, 04 Aug 2011 17:10:37 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/consistent-but-stale-reads/m-p/783606#M289</guid>
      <dc:creator>jimdempseyatthecove</dc:creator>
      <dc:date>2011-08-04T17:10:37Z</dc:date>
    </item>
    <item>
      <title>consistent but stale reads</title>
      <link>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/consistent-but-stale-reads/m-p/783607#M290</link>
      <description>After looking at my code, I would advise adding a flag to the YourWritersData struct. The writer sets it to indicate that the buffer is full of new data, and the reader clears it to indicate that it is done with the buffer.&lt;BR /&gt;&lt;BR /&gt;&lt;P&gt;struct YourWritersData&lt;/P&gt;&lt;P&gt;{&lt;BR /&gt; volatile bool WaitingForConsumer;&lt;BR /&gt; YourWritersData() {WaitingForConsumer = false;}&lt;BR /&gt; ...&lt;BR /&gt;};&lt;/P&gt;&lt;BR /&gt;YourWritersData* getBuffer() {&lt;BR /&gt; if(RingBuffer[nextFillIndex].WaitingForConsumer)&lt;BR /&gt; return NULL; // buffer overrun&lt;BR /&gt; return &amp;amp;RingBuffer[nextFillIndex % YourRingBufferSize]; } // no advance!!!&lt;BR /&gt;&lt;BR /&gt; // indicate buffer filled in, get next buffer&lt;BR /&gt; // called by single writer&lt;BR /&gt;YourWritersData* nextBuffer() {&lt;BR /&gt; RingBuffer[nextFillIndex].WaitingForConsumer = true;&lt;BR /&gt; nextFillIndex = (nextFillIndex + 1) % YourRingBufferSize; // advance after fill complete&lt;BR /&gt; return getBuffer(); } // return buffer* or NULL if buffer overrun&lt;BR /&gt;&lt;BR /&gt; // pop item from buffer, return NULL if empty&lt;BR /&gt; // called by multiple readers&lt;BR /&gt; YourWritersData* pop() {&lt;BR /&gt; for(;;) {&lt;BR /&gt; long copyNextEmptyIndex = nextEmptyIndex; // snapshot once, use only the copy below&lt;BR /&gt; if(nextFillIndex == (copyNextEmptyIndex % YourRingBufferSize))&lt;BR /&gt; return NULL;&lt;BR /&gt; if(InterlockedCompareExchange(&lt;BR /&gt; &amp;amp;nextEmptyIndex, // location&lt;BR /&gt; copyNextEmptyIndex + 1, // exchange *** NOT % YourRingBufferSize&lt;BR /&gt; copyNextEmptyIndex ) == copyNextEmptyIndex) // comparand&lt;BR /&gt; return &amp;amp;RingBuffer[copyNextEmptyIndex % YourRingBufferSize];&lt;BR /&gt; } // for(;;)&lt;BR /&gt; } // pop&lt;BR /&gt;&lt;BR /&gt; void ReaderDone(YourWritersData* b) { b-&amp;gt;WaitingForConsumer = false; }&lt;BR /&gt;&lt;BR /&gt;Jim Dempsey</description>
      <pubDate>Thu, 04 Aug 2011 17:28:06 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/consistent-but-stale-reads/m-p/783607#M290</guid>
      <dc:creator>jimdempseyatthecove</dc:creator>
      <dc:date>2011-08-04T17:28:06Z</dc:date>
    </item>
    <item>
      <title>consistent but stale reads</title>
      <link>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/consistent-but-stale-reads/m-p/783608#M291</link>
      <description>&lt;DIV&gt;I think we're completely missing the point here, and I'm honestly asking this of Intel architects and hardware engineers.&lt;/DIV&gt;&lt;DIV&gt;&lt;/DIV&gt;&lt;DIV&gt;Does MESI truly guarantee that all reads of the same address return the very last write, or can there be optimizations such that second last write is returned? The definition of last write should be fairly modest: the last core to own that cache line in the "E" state?&lt;/DIV&gt;&lt;DIV&gt;&lt;/DIV&gt;&lt;DIV&gt;I'm not trying to find a good way to do MPSC or SPMC; rather, I am asking whether SEQLOCKs, as implemented today, can suffer from returning stale but consistent reads.&lt;/DIV&gt;&lt;DIV&gt;&lt;/DIV&gt;&lt;DIV&gt;Regards&lt;/DIV&gt;&lt;DIV&gt;Banks&lt;/DIV&gt;&lt;DIV&gt;&lt;/DIV&gt;</description>
      <pubDate>Thu, 04 Aug 2011 23:03:27 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/consistent-but-stale-reads/m-p/783608#M291</guid>
      <dc:creator>bank_kus</dc:creator>
      <dc:date>2011-08-04T23:03:27Z</dc:date>
    </item>
    <item>
      <title>consistent but stale reads</title>
      <link>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/consistent-but-stale-reads/m-p/783609#M292</link>
      <description>&amp;gt;&amp;gt;such that &lt;STRONG&gt;&lt;EM&gt;second last&lt;/EM&gt;&lt;/STRONG&gt; write is returned. &lt;BR /&gt;&lt;BR /&gt;I will assume you meant &lt;STRONG&gt;&lt;EM&gt;second to last&lt;/EM&gt;&lt;/STRONG&gt; write.&lt;BR /&gt;&lt;BR /&gt;Writer: write 0&lt;BR /&gt;Reader 1: read 0; Reader 2: read 0&lt;BR /&gt;Reader 1: observe 0; Reader 2: (interrupt)&lt;BR /&gt;Writer: write 1&lt;BR /&gt;Reader 1: read 1&lt;BR /&gt;Writer: write 2&lt;BR /&gt;Reader 1: observe 1&lt;BR /&gt;Writer: write 3&lt;BR /&gt;Reader 2: observe 0&lt;BR /&gt;----&lt;BR /&gt;So yes, you can observe a value that you read in a prior write state (as well as the current state).&lt;BR /&gt;---------------------------------&lt;BR /&gt;Near-simultaneous:&lt;BR /&gt;&lt;BR /&gt;Writer: write 0&lt;BR /&gt;Writer: write 1; Reader: read 0&lt;BR /&gt;&lt;BR /&gt;where the read occurs immediately prior to cache invalidation (in the reader's cache system), which may occur after the fetch of the write instruction, which in turn occurs prior to the write to cache/RAM and the cache eviction on other cores/processors. This makes it subjective as to what is defined as being first.&lt;BR /&gt;&lt;BR /&gt;What is assured is:&lt;BR /&gt;&lt;BR /&gt;Writer:&lt;BR /&gt;x = 0&lt;BR /&gt;(start)&lt;BR /&gt;loop:&lt;BR /&gt;inc x&lt;BR /&gt;if(x &amp;lt; XMAX) goto loop&lt;BR /&gt;&lt;BR /&gt;Reader:&lt;BR /&gt;(start)&lt;BR /&gt;loop:&lt;BR /&gt;x0 = x&lt;BR /&gt;x1 = x&lt;BR /&gt;assert(x0 &amp;lt;= x1)&lt;BR /&gt;if(x1 &amp;lt; XMAX) goto loop&lt;BR /&gt;&lt;BR /&gt;The reader assert should never trigger. You can insert in the reader whatever you want between the samples of x0 and x1, provided you do not modify x.&lt;BR /&gt;&lt;BR /&gt;*** Note, the above represents the generated assembly code and is not representative of the source code, where the compiler may optimize variables into registers and/or reorder instructions.&lt;BR /&gt;&lt;BR /&gt;The reader may observe: x0==x1, x0==(x1-1), ... x0==(x1-n), i.e. x0&amp;lt;=x1&lt;BR /&gt;It should never observe x0&amp;gt;x1&lt;BR /&gt;&lt;BR /&gt;This can be stated as: the read sequence order (ascending) follows the write sequence order (ascending), although the observed sequences are not necessarily the same (writer 1,2,3,4,5..., reader 1,3,6...)&lt;BR /&gt;&lt;BR /&gt;Jim Dempsey</description>
      <pubDate>Fri, 05 Aug 2011 14:52:47 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/consistent-but-stale-reads/m-p/783609#M292</guid>
      <dc:creator>jimdempseyatthecove</dc:creator>
      <dc:date>2011-08-05T14:52:47Z</dc:date>
    </item>
  </channel>
</rss>

