<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Cache-coherence traffic in Intel® Moderncode for Parallel Architectures</title>
    <link>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/Cache-coherence-traffic/m-p/953815#M5229</link>
    <description>&lt;P&gt;Hello,&lt;BR /&gt;&lt;BR /&gt;I have come to an interesting subject. In computer science we calculate complexity to get an idea of how good or bad an algorithm is. It is the same with locks: you have to do some calculations on lock algorithms to get an idea of how good or bad a lock is with respect to cache-coherence traffic. So follow along with me, please. Take a look at the source code of my scalable distributed fair Lock (you can download the source code at: &lt;A href="http://pages.videotron.com/aminer/" target="_blank"&gt;http://pages.videotron.com/aminer/&lt;/A&gt;).&lt;BR /&gt;&lt;BR /&gt;Inside LW_DFLOCK.pas, in "procedure TDFLOCK.Enter;", you will read this:&lt;BR /&gt;&lt;BR /&gt;==&lt;BR /&gt;&lt;BR /&gt;if ((FCount3^.FCount3=2) and CAS(FCount2.FCount2,1,0))&lt;BR /&gt;then&lt;BR /&gt;begin&lt;BR /&gt;&amp;nbsp;myobj^.myid:=-1;&lt;BR /&gt;&amp;nbsp;break;&lt;BR /&gt;end;&lt;BR /&gt;&lt;BR /&gt;==&lt;BR /&gt;&lt;BR /&gt;As you may have noticed, whenever FCount2.FCount2 changes, this generates N cache-line misses and cache-line transfers (N is the number of threads). But look more closely, please: in a contention scenario we are looping around:&lt;BR /&gt;&lt;BR /&gt;"if ((FCount3^.FCount3=2) and (FCount1^[myid].FCount1=myobj^.count)) then break;"&lt;BR /&gt;&lt;BR /&gt;and FCount1^[myid].FCount1 is in a local cache. That means the cache-coherence traffic is reduced to 1 cache miss and 1 cache-line transfer every time FCount1^[myid].FCount1 changes. So this is better than the cache-coherence traffic of 1+2+3+...+N = (N^2+N)/2, or even worse the N+N+N+...+N, of the spinlock with backoff or the Ticket spinlock. Other than that, my scalable distributed Lock is fair and it avoids starvation.&lt;BR /&gt;&lt;BR /&gt;Thank you,&lt;BR /&gt;Amine Moulay Ramdane.&lt;/P&gt;</description>
    <pubDate>Fri, 11 Oct 2013 15:19:21 GMT</pubDate>
    <dc:creator>aminer10</dc:creator>
    <dc:date>2013-10-11T15:19:21Z</dc:date>
    <item>
      <title>Cache-coherence traffic</title>
      <link>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/Cache-coherence-traffic/m-p/953815#M5229</link>
      <description>&lt;P&gt;Hello,&lt;BR /&gt;&lt;BR /&gt;I have come to an interesting subject. In computer science we calculate complexity to get an idea of how good or bad an algorithm is. It is the same with locks: you have to do some calculations on lock algorithms to get an idea of how good or bad a lock is with respect to cache-coherence traffic. So follow along with me, please. Take a look at the source code of my scalable distributed fair Lock (you can download the source code at: &lt;A href="http://pages.videotron.com/aminer/" target="_blank"&gt;http://pages.videotron.com/aminer/&lt;/A&gt;).&lt;BR /&gt;&lt;BR /&gt;Inside LW_DFLOCK.pas, in "procedure TDFLOCK.Enter;", you will read this:&lt;BR /&gt;&lt;BR /&gt;==&lt;BR /&gt;&lt;BR /&gt;if ((FCount3^.FCount3=2) and CAS(FCount2.FCount2,1,0))&lt;BR /&gt;then&lt;BR /&gt;begin&lt;BR /&gt;&amp;nbsp;myobj^.myid:=-1;&lt;BR /&gt;&amp;nbsp;break;&lt;BR /&gt;end;&lt;BR /&gt;&lt;BR /&gt;==&lt;BR /&gt;&lt;BR /&gt;As you may have noticed, whenever FCount2.FCount2 changes, this generates N cache-line misses and cache-line transfers (N is the number of threads). But look more closely, please: in a contention scenario we are looping around:&lt;BR /&gt;&lt;BR /&gt;"if ((FCount3^.FCount3=2) and (FCount1^[myid].FCount1=myobj^.count)) then break;"&lt;BR /&gt;&lt;BR /&gt;and FCount1^[myid].FCount1 is in a local cache. That means the cache-coherence traffic is reduced to 1 cache miss and 1 cache-line transfer every time FCount1^[myid].FCount1 changes. So this is better than the cache-coherence traffic of 1+2+3+...+N = (N^2+N)/2, or even worse the N+N+N+...+N, of the spinlock with backoff or the Ticket spinlock. Other than that, my scalable distributed Lock is fair and it avoids starvation.&lt;BR /&gt;&lt;BR /&gt;Thank you,&lt;BR /&gt;Amine Moulay Ramdane.&lt;/P&gt;</description>
      <pubDate>Fri, 11 Oct 2013 15:19:21 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/Cache-coherence-traffic/m-p/953815#M5229</guid>
      <dc:creator>aminer10</dc:creator>
      <dc:date>2013-10-11T15:19:21Z</dc:date>
    </item>
    <item>
      <title>I correct some typos...</title>
      <link>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/Cache-coherence-traffic/m-p/953816#M5230</link>
      <description>&lt;P&gt;I correct some typos...&lt;BR /&gt;&lt;BR /&gt;Hello,&lt;BR /&gt;&lt;BR /&gt;I have come to an interesting subject. In computer science we calculate complexity to get an idea of how good or bad an algorithm is. It is the same with locks: you have to do some calculations on lock algorithms to get an idea of how good or bad a lock is with respect to cache-coherence traffic. So follow along with me, please. Take a look at the source code of my scalable distributed fair Lock (you can download the source code at: &lt;A href="http://pages.videotron.com/aminer/" target="_blank"&gt;http://pages.videotron.com/aminer/&lt;/A&gt;).&lt;BR /&gt;&lt;BR /&gt;Inside LW_DFLOCK.pas, in "procedure TDFLOCK.Enter;", you will read this:&lt;BR /&gt;&lt;BR /&gt;==&lt;BR /&gt;&lt;BR /&gt;if ((FCount3^.FCount3=2) and CAS(FCount2.FCount2,1,0))&lt;BR /&gt;then&lt;BR /&gt;begin&lt;BR /&gt;&amp;nbsp;myobj^.myid:=-1;&lt;BR /&gt;&amp;nbsp;break;&lt;BR /&gt;end;&lt;BR /&gt;&lt;BR /&gt;==&lt;BR /&gt;&lt;BR /&gt;As you may have noticed, whenever FCount2.FCount2 changes, this generates N cache-line misses and cache-line transfers (N is the number of threads). But look more closely, please: in a contention scenario we are looping around:&lt;BR /&gt;&lt;BR /&gt;"if ((FCount3^.FCount3=2) and (FCount1^[myid].FCount1=myobj^.count)) then break;"&lt;BR /&gt;&lt;BR /&gt;and FCount1^[myid].FCount1 is in a local cache. That means the cache-coherence traffic is reduced to 1 cache miss and 1 cache-line transfer every time FCount1^[myid].FCount1 changes. So this is better than the cache-coherence traffic of 1+2+3+...+N = (N^2+N)/2, or even worse the N+N+N+...+N, of the spinlock with backoff or the Ticket spinlock. Other than that, my scalable distributed Lock is fair and it avoids starvation.&lt;BR /&gt;&lt;BR /&gt;Thank you,&lt;BR /&gt;Amine Moulay Ramdane.&lt;/P&gt;</description>
      <pubDate>Fri, 11 Oct 2013 15:22:43 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/Cache-coherence-traffic/m-p/953816#M5230</guid>
      <dc:creator>aminer10</dc:creator>
      <dc:date>2013-10-11T15:22:43Z</dc:date>
    </item>
  </channel>
</rss>

