<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic PCIe transfers vs core-to-core communication in Software Tuning, Performance Optimization &amp; Platform Monitoring</title>
    <link>https://community.intel.com/t5/Software-Tuning-Performance/PCIe-transfers-vs-core-to-core-communication/m-p/784278#M434</link>
    <description>Typically, core-to-core latency and bandwidth are orders of magnitude better than those of any off-chip communication.&lt;BR /&gt;&lt;BR /&gt;In fact, if you design your code so that the working set of the designated thread that reads the data from PCIe fits in the LLC, you can achieve fairly fast data transfer.&lt;BR /&gt;&lt;BR /&gt;However, why isn't a single shared buffer appropriate for your needs?</description>
    <pubDate>Thu, 12 Jul 2012 16:47:32 GMT</pubDate>
    <dc:creator>Hussam_Mousa__Intel_</dc:creator>
    <dc:date>2012-07-12T16:47:32Z</dc:date>
    <item>
      <title>PCIe transfers vs core-to-core communication</title>
      <link>https://community.intel.com/t5/Software-Tuning-Performance/PCIe-transfers-vs-core-to-core-communication/m-p/784277#M433</link>
      <description>Hi all,&lt;BR /&gt;&lt;BR /&gt;I have to get data from a card in a PCIe slot to all the cores in my (2-socket Sandy Bridge) system. I am wondering whether it would be better to have the card communicate the data directly to all the cores, or to have it communicate the data to only one core and then have that core forward the data to the remaining cores via core-to-core communication.&lt;BR /&gt;&lt;BR /&gt;Doing it the first way involves several more PCIe transactions, and doing it the second way relies on the performance of a single-producer-multiple-consumer queue.&lt;BR /&gt;&lt;BR /&gt;Any thoughts on which might be faster?&lt;BR /&gt;&lt;BR /&gt;Thanks!</description>
      <pubDate>Mon, 26 Mar 2012 10:12:27 GMT</pubDate>
      <guid>https://community.intel.com/t5/Software-Tuning-Performance/PCIe-transfers-vs-core-to-core-communication/m-p/784277#M433</guid>
      <dc:creator>stardust496</dc:creator>
      <dc:date>2012-03-26T10:12:27Z</dc:date>
    </item>
    <item>
      <title>PCIe transfers vs core-to-core communication</title>
      <link>https://community.intel.com/t5/Software-Tuning-Performance/PCIe-transfers-vs-core-to-core-communication/m-p/784278#M434</link>
      <description>Typically, core-to-core latency and bandwidth are orders of magnitude better than those of any off-chip communication.&lt;BR /&gt;&lt;BR /&gt;In fact, if you design your code so that the working set of the designated thread that reads the data from PCIe fits in the LLC, you can achieve fairly fast data transfer.&lt;BR /&gt;&lt;BR /&gt;However, why isn't a single shared buffer appropriate for your needs?</description>
      <pubDate>Thu, 12 Jul 2012 16:47:32 GMT</pubDate>
      <guid>https://community.intel.com/t5/Software-Tuning-Performance/PCIe-transfers-vs-core-to-core-communication/m-p/784278#M434</guid>
      <dc:creator>Hussam_Mousa__Intel_</dc:creator>
      <dc:date>2012-07-12T16:47:32Z</dc:date>
    </item>
  </channel>
</rss>

