<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Cost of IPI (inter-processor interrupt) ? in Intel® Moderncode for Parallel Architectures</title>
    <link>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/Cost-of-IPI-inter-processor-interrupt/m-p/867882#M2754</link>
    <description>&lt;DIV style="margin:0px;"&gt;
&lt;DIV id="quote_reply" style="width: 100%; margin-top: 5px;"&gt;
&lt;DIV style="margin-left:2px;margin-right:2px;"&gt;Quoting - &lt;A href="https://community.intel.com/en-us/profile/336004"&gt;Robert Reed (Intel)&lt;/A&gt;&lt;/DIV&gt;
&lt;DIV style="background-color:#E5E5E5; padding:5px;border: 1px; border-style: inset;margin-left:2px;margin-right:2px;"&gt;&lt;EM&gt;
&lt;DIV style="margin:0px;"&gt;&lt;/DIV&gt;
&lt;BR /&gt;I haven't read the paper (either paper), but I have looked at the appropriate systems programming guide, which suggests the common uses for IPIs: startup (SIPIs), self-interrupting, and propagating interrupts (either interrupting another processor or allowing a processor to forward an interrupt to another processor). It seems logical that setting process and network card affinity would increase the likelihood that the processor that first gets the initial NIC interrupt would be able to handle it rather than deferring to another processor. The recommended uses seem very limited, though Dmitriy's observation that Win32 provides a function call could mean all kinds of crazies are using it out there.&lt;/EM&gt;&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;BR /&gt;&lt;BR /&gt;Thank you and others for the answers.&lt;BR /&gt;&lt;BR /&gt;</description>
    <pubDate>Wed, 23 Sep 2009 08:16:48 GMT</pubDate>
    <dc:creator>gallus2</dc:creator>
    <dc:date>2009-09-23T08:16:48Z</dc:date>
    <item>
      <title>Cost of IPI (inter-processor interrupt) ?</title>
      <link>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/Cost-of-IPI-inter-processor-interrupt/m-p/867875#M2747</link>
      <description>Dear forum contributors,&lt;BR /&gt;What is the cost of an IPI? As far as I know, inter-processor interrupts are used to synchronize caches between cores and processors. Such synchronization can be "costly" (my state of knowledge does not allow me to be more precise...). However, what is the cost of an IPI itself? Is there anything else, besides cache synchronization, that can trigger an IPI?&lt;BR /&gt;&lt;BR /&gt;Please share some information on this topic.&lt;BR /&gt;</description>
      <pubDate>Fri, 18 Sep 2009 15:01:32 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/Cost-of-IPI-inter-processor-interrupt/m-p/867875#M2747</guid>
      <dc:creator>gallus2</dc:creator>
      <dc:date>2009-09-18T15:01:32Z</dc:date>
    </item>
    <item>
      <title>Re: Cost of IPI (inter-processor interrupt) ?</title>
      <link>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/Cost-of-IPI-inter-processor-interrupt/m-p/867876#M2748</link>
      <description>&lt;DIV style="margin:0px;"&gt;
&lt;DIV id="quote_reply" style="margin-top: 5px; width: 100%;"&gt;
&lt;DIV style="margin-left:2px;margin-right:2px;"&gt;Quoting - &lt;A href="https://community.intel.com/en-us/profile/443685"&gt;gallus2&lt;/A&gt;&lt;/DIV&gt;
&lt;DIV style="background-color:#E5E5E5; padding:5px;border: 1px; border-style: inset;margin-left:2px;margin-right:2px;"&gt;&lt;EM&gt;what is the cost of IPI? As far I know, inter-processor interrupts are used to synchronize cache between cores and processors. Such synchronization can be "costly" (my state of knowledge does not allow me to use precise expressions...). However, what is the cost of IPI itself? Is there anything else, besides the cache synchronization, that can trigger IPI?&lt;BR /&gt;&lt;/EM&gt;&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;/DIV&gt;
Actually, cache synchronization between HW threads relies on one of several cache coherence protocols that do &lt;A href="http://en.wikipedia.org/wiki/Bus_sniffing"&gt;bus sniffing and snooping&lt;/A&gt; to keep track, for each of the cores, of which cache lines are valid or in other states of the protocol. As far as I can tell, IPIs are limited to some startup activities and other OS or driver-related tasks. So the cost of a HW thread having to synchronize a cache line is on the order of a memory write (assuming the desired memory was cached in a different HW thread), a memory read (to get the data into the desired HW thread), plus a few bus transactions to account for the associated snoop traffic. This is a very rough estimate that glosses over a lot of details, which may vary from one architecture to another.&lt;BR /&gt;</description>
      <pubDate>Fri, 18 Sep 2009 18:40:31 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/Cost-of-IPI-inter-processor-interrupt/m-p/867876#M2748</guid>
      <dc:creator>robert-reed</dc:creator>
      <dc:date>2009-09-18T18:40:31Z</dc:date>
    </item>
    <item>
      <title>Re: Cost of IPI (inter-processor interrupt) ?</title>
      <link>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/Cost-of-IPI-inter-processor-interrupt/m-p/867877#M2749</link>
      <description>&lt;DIV style="margin:0px;"&gt;
&lt;DIV id="quote_reply" style="width: 100%; margin-top: 5px;"&gt;
&lt;DIV style="margin-left:2px;margin-right:2px;"&gt;Quoting - &lt;A href="https://community.intel.com/en-us/profile/336004"&gt;Robert Reed (Intel)&lt;/A&gt;&lt;/DIV&gt;
&lt;DIV style="background-color:#E5E5E5; padding:5px;border: 1px; border-style: inset;margin-left:2px;margin-right:2px;"&gt;&lt;EM&gt; Actually, cache synchronization between HW threads rely on one of several cache coherence protocols that do &lt;A href="http://en.wikipedia.org/wiki/Bus_sniffing"&gt;bus sniffing and snooping&lt;/A&gt; to keep trackfor each of the cores which cache lines are valid or in other states of the protocol. As far as I can tell, IPI is limited to some startup activities and other OS or driver-related tasks. So the cost of a HW thread having to synchronize a cache line is on the order of a memory write (assuming the desired memory was cached in a different HW thread), a memory read (to get the data into the desired HW thread) plus a few bus transactions to amortize the cost of the associated snoop traffic. This is a very rough estimate that glosses over a lot of details, which may vary from specific architecture to specific architecture.&lt;BR /&gt;&lt;/EM&gt;&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;BR /&gt;I was following what Wikipedia says:&lt;BR /&gt;&lt;BR /&gt;"An inter-processor interrupt (IPI) is a special type of interrupt by which one processor may interrupt another processor in a multiprocessor system. IPIs are typically used to implement a cache coherency synchronization point.&lt;BR /&gt;&lt;BR /&gt;(...)&lt;BR /&gt;&lt;BR /&gt;In x86 based systems, an IPI synchronizes the cache and Memory Management Unit (MMU) between processors.&lt;BR /&gt;"&lt;BR /&gt;&lt;BR /&gt;Are you sure that IPIs have nothing to do with cache synchronization?&lt;BR /&gt;&lt;BR /&gt;Thanks in advance.&lt;BR /&gt;Gallus2&lt;BR /&gt;&lt;BR /&gt;</description>
      <pubDate>Tue, 22 Sep 2009 13:40:53 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/Cost-of-IPI-inter-processor-interrupt/m-p/867877#M2749</guid>
      <dc:creator>gallus2</dc:creator>
      <dc:date>2009-09-22T13:40:53Z</dc:date>
    </item>
    <item>
      <title>Re: Cost of IPI (inter-processor interrupt) ?</title>
      <link>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/Cost-of-IPI-inter-processor-interrupt/m-p/867878#M2750</link>
      <description>&lt;DIV style="margin:0px;"&gt;
&lt;DIV id="quote_reply" style="width: 100%; margin-top: 5px;"&gt;
&lt;DIV style="margin-left:2px;margin-right:2px;"&gt;Quoting - &lt;A href="https://community.intel.com/en-us/profile/443685"&gt;gallus2&lt;/A&gt;&lt;/DIV&gt;
&lt;DIV style="background-color:#E5E5E5; padding:5px;border: 1px; border-style: inset;margin-left:2px;margin-right:2px;"&gt;&lt;EM&gt;Dear forum contributors,&lt;BR /&gt;what is the cost of IPI? As far I know, inter-processor interrupts are used to synchronize cache between cores and processors. Such synchronization can be "costly" (my state of knowledge does not allow me to use precise expressions...). However, what is the cost of IPI itself? Is there anything else, besides the cache synchronization, that can trigger IPI?&lt;BR /&gt;&lt;/EM&gt;&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;BR /&gt;
&lt;P style="margin-bottom: 0in;" lang="en-US"&gt;Cache-coherency protocols do not use IPIs, and as a user-space level developer you do not care about IPIs at all. One is most interested in the cost of cache-coherency itself.&lt;/P&gt;
&lt;P style="margin-bottom: 0in;"&gt;&lt;SPAN lang="en-US"&gt;However, Win32 API provides a function that issues IPIs to all processors (in the affinity mask of the current process)  &lt;/SPAN&gt;FlushProcessWriteBuffers&lt;SPAN lang="en-US"&gt;().  You can use it to investigate the cost of IPIs if you are still interested in them. When I do simple synthetic test on a dual core machine I've obtained following numbers.&lt;/SPAN&gt;&lt;/P&gt;
&lt;P style="margin-bottom: 0in;"&gt;&lt;SPAN lang="en-US"&gt;420 cycles is the minimum cost of the function on issuing core.&lt;/SPAN&gt;&lt;/P&gt;
&lt;P style="margin-bottom: 0in;"&gt;&lt;SPAN lang="en-US"&gt;1600 cycles is mean cost of the function on issuing core.&lt;/SPAN&gt;&lt;/P&gt;
&lt;P style="margin-bottom: 0in;"&gt;&lt;SPAN lang="en-US"&gt;1300 cycles is mean cost of the function on remote core.&lt;/SPAN&gt;&lt;/P&gt;
&lt;P style="margin-bottom: 0in;"&gt;&lt;SPAN lang="en-US"&gt;Note that, as far as I understand, the function issues IPI to remote core, then remote core acks it with another IPI, issuing core waits for ack IPI and then returns.&lt;BR /&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;BR /&gt;</description>
      <pubDate>Tue, 22 Sep 2009 14:20:19 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/Cost-of-IPI-inter-processor-interrupt/m-p/867878#M2750</guid>
      <dc:creator>Dmitry_Vyukov</dc:creator>
      <dc:date>2009-09-22T14:20:19Z</dc:date>
    </item>
    <item>
      <title>Re: Cost of IPI (inter-processor interrupt) ?</title>
      <link>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/Cost-of-IPI-inter-processor-interrupt/m-p/867879#M2751</link>
      <description>&lt;DIV style="margin:0px;"&gt;
&lt;DIV id="quote_reply" style="width: 100%; margin-top: 5px;"&gt;
&lt;DIV style="margin-left:2px;margin-right:2px;"&gt;Quoting - &lt;A href="https://community.intel.com/en-us/profile/443685"&gt;gallus2&lt;/A&gt;&lt;/DIV&gt;
&lt;DIV style="background-color:#E5E5E5; padding:5px;border: 1px; border-style: inset;margin-left:2px;margin-right:2px;"&gt;&lt;EM&gt; &lt;BR /&gt;I was following what wikipedia says:&lt;BR /&gt;&lt;BR /&gt;"An inter-processor interrupt (IPI) is a special type of interrupt by which one processor may interrupt another processor in a multiprocessor system. IPIs are typically used to implement a cache coherency synchronization point.&lt;BR /&gt;&lt;BR /&gt;(...)&lt;BR /&gt;&lt;BR /&gt;In x86 based systems, an IPI synchronizes the cache and Memory Management Unit (MMU) between processors.&lt;BR /&gt;"&lt;BR /&gt;&lt;BR /&gt;Are you sure that IPIs do not have anything to do with cache synchronization?&lt;BR /&gt;&lt;BR /&gt;&lt;/EM&gt;&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;BR /&gt;Note "This computer hardware-related article is a stub" at the bottom. It's unclear what is "a &lt;A class="mw-redirect" title="Cache coherency" href="http://en.wikipedia.org/wiki/Cache_coherency"&gt;cache coherency&lt;/A&gt; &lt;A title="Synchronization" href="http://en.wikipedia.org/wiki/Synchronization"&gt;synchronization&lt;/A&gt; point" and what is "to synchronize the cache". To the best of my knowledge there is no such terms.&lt;BR /&gt;Perhaps the author means that with IPI one can ensure instruction ordering on remote processor. This does not directly relate to cache-coherency.&lt;BR /&gt;And I was wrong saying that user-space level developer does not care about IPIs. Because of the above mentioned application (ensure instruction ordering on remote processor) with IPIs one is able to develop algorithms that draw it's strength &lt;SPAN style="color: gray;"&gt;&lt;/SPAN&gt;from dark side of force. As an example you may see following asymmetric reader-writer mutex algorithm which slaughters all other rw mutexes on read-mostly workload:&lt;BR /&gt;&lt;A href="http://groups.google.com/group/lock-free/browse_frm/thread/1efdc652571c6137"&gt;http://groups.google.com/group/lock-free/browse_frm/thread/1efdc652571c6137&lt;/A&gt;&lt;BR /&gt;It may be implemented with other means, but IPIs are preferable because of their "reactivity".&lt;BR /&gt;&lt;BR /&gt;</description>
      <pubDate>Tue, 22 Sep 2009 14:41:27 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/Cost-of-IPI-inter-processor-interrupt/m-p/867879#M2751</guid>
      <dc:creator>Dmitry_Vyukov</dc:creator>
      <dc:date>2009-09-22T14:41:27Z</dc:date>
    </item>
    <item>
      <title>Re: Cost of IPI (inter-processor interrupt) ?</title>
      <link>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/Cost-of-IPI-inter-processor-interrupt/m-p/867880#M2752</link>
      <description>&lt;DIV style="margin:0px;"&gt;
&lt;DIV id="quote_reply" style="width: 100%; margin-top: 5px;"&gt;
&lt;DIV style="margin-left:2px;margin-right:2px;"&gt;Quoting - &lt;A href="https://community.intel.com/en-us/profile/347331"&gt;Dmitriy Vyukov&lt;/A&gt;&lt;/DIV&gt;
&lt;DIV style="background-color:#E5E5E5; padding:5px;border: 1px; border-style: inset;margin-left:2px;margin-right:2px;"&gt;&lt;EM&gt; &lt;BR /&gt;&lt;BR /&gt;And I was wrong saying that &lt;STRONG&gt;user-space level developer does not care about IPIs&lt;/STRONG&gt;. Because of the above mentioned &lt;BR /&gt;&lt;/EM&gt;&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;BR /&gt;I'm investigating IPIs because of the following paper:&lt;BR /&gt;&lt;A href="http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=1430575" target="_blank"&gt;http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=1430575&lt;/A&gt;&lt;BR /&gt;&lt;BR /&gt;(which is a successor of the following paper: &lt;A href="http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=1409136" target="_blank"&gt;http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=1409136&lt;/A&gt;)&lt;BR /&gt;&lt;BR /&gt;The authors state that setting process and network card affinity yields a performance gain because of:&lt;BR /&gt;- better cache coherency,&lt;BR /&gt;- fewer IPIs.&lt;BR /&gt;&lt;BR /&gt;IPIs have the indirect cost of flushing the processor pipeline. Until today I thought that the most common trigger for IPIs (and thus pipeline flushes) was false sharing in the cache. But it turns out that this isn't true. Not bad news :)&lt;BR /&gt;&lt;BR /&gt;However, are there any easily abusable ways to trigger IPIs? It would be good to avoid them, in order to avoid processor pipeline flushes.&lt;BR /&gt;&lt;BR /&gt;</description>
      <pubDate>Tue, 22 Sep 2009 15:11:36 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/Cost-of-IPI-inter-processor-interrupt/m-p/867880#M2752</guid>
      <dc:creator>gallus2</dc:creator>
      <dc:date>2009-09-22T15:11:36Z</dc:date>
    </item>
    <item>
      <title>Re: Cost of IPI (inter-processor interrupt) ?</title>
      <link>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/Cost-of-IPI-inter-processor-interrupt/m-p/867881#M2753</link>
      <description>&lt;DIV style="margin:0px;"&gt;
&lt;DIV id="quote_reply" style="margin-top: 5px; width: 100%;"&gt;
&lt;DIV style="margin-left:2px;margin-right:2px;"&gt;Quoting - &lt;A href="https://community.intel.com/en-us/profile/443685"&gt;gallus2&lt;/A&gt;&lt;EM&gt;&lt;/EM&gt;&lt;/DIV&gt;
&lt;DIV style="background-color:#E5E5E5; padding:5px;border: 1px; border-style: inset;margin-left:2px;margin-right:2px;"&gt;I'm investigating the IPIs because of following paper:&lt;BR /&gt;&lt;A href="http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=1430575" target="_blank"&gt;http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=1430575&lt;/A&gt;&lt;BR /&gt;&lt;BR /&gt;They stated, that setting process and network card affinity causes performance gain because of:&lt;BR /&gt;- better cache coherency,&lt;BR /&gt;- lower amount of IPIs.&lt;BR /&gt;&lt;BR /&gt;The IPIs have indirect cost of flushing the processor pipeline. Until today I was thinking that most common way of triggering the IPIs (causing pipeline flush) is false sharing on cache. But it turns out that this isn't the truth. Not bad news :)&lt;BR /&gt;&lt;BR /&gt;However, are there any easily abusable ways to trigger the IPIs? It would be good to avoid them, to avoid processor pipeline flushes.&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;BR /&gt;I haven't read the paper (either paper), but I have looked at the appropriate systems programming guide, which suggests the common uses for IPIs: startup (SIPIs), self-interrupting, and propagating interrupts (either interrupting another processor or allowing a processor to forward an interrupt to another processor). It seems logical that setting process and network card affinity would increase the likelihood that the processor that first gets the initial NIC interrupt would be able to handle it rather than deferring to another processor. The recommended uses seem very limited, though Dmitriy's observation that Win32 provides a function call could mean all kinds of crazies are using it out there.</description>
      <pubDate>Wed, 23 Sep 2009 01:23:05 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/Cost-of-IPI-inter-processor-interrupt/m-p/867881#M2753</guid>
      <dc:creator>robert-reed</dc:creator>
      <dc:date>2009-09-23T01:23:05Z</dc:date>
    </item>
    <item>
      <title>Re: Cost of IPI (inter-processor interrupt) ?</title>
      <link>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/Cost-of-IPI-inter-processor-interrupt/m-p/867882#M2754</link>
      <description>&lt;DIV style="margin:0px;"&gt;
&lt;DIV id="quote_reply" style="width: 100%; margin-top: 5px;"&gt;
&lt;DIV style="margin-left:2px;margin-right:2px;"&gt;Quoting - &lt;A href="https://community.intel.com/en-us/profile/336004"&gt;Robert Reed (Intel)&lt;/A&gt;&lt;/DIV&gt;
&lt;DIV style="background-color:#E5E5E5; padding:5px;border: 1px; border-style: inset;margin-left:2px;margin-right:2px;"&gt;&lt;EM&gt;
&lt;DIV style="margin:0px;"&gt;&lt;/DIV&gt;
&lt;BR /&gt;I haven't read the paper (either paper), but I have looked at the appropriate systems programming guide, which suggests the common uses for IPIs: startup (SIPIs), self-interrupting, and propagating interrupts (either interrupting another processor or allowing a processor to forward an interrupt to another processor). It seems logical that setting process and network card affinity would increase the likelihood that the processor that first gets the initial NIC interrupt would be able to handle it rather than deferring to another processor. The recommended uses seem very limited, though Dmitriy's observation that Win32 provides a function call could mean all kinds of crazies are using it out there.&lt;/EM&gt;&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;BR /&gt;&lt;BR /&gt;Thank you and others for the answers.&lt;BR /&gt;&lt;BR /&gt;</description>
      <pubDate>Wed, 23 Sep 2009 08:16:48 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/Cost-of-IPI-inter-processor-interrupt/m-p/867882#M2754</guid>
      <dc:creator>gallus2</dc:creator>
      <dc:date>2009-09-23T08:16:48Z</dc:date>
    </item>
  </channel>
</rss>

