
PCIe transfers vs core-to-core communication

Hi all,

I need to get data from a card in a PCIe slot to all the cores in my two-socket Sandy Bridge system. I am wondering whether it would be better to have the card deliver the data directly to all the cores, or to have it deliver the data to only one core and have that core forward it to the remaining cores via core-to-core communication.

Doing it the first way involves several more PCIe transactions, while doing it the second way relies on the performance of a single-producer, multiple-consumer queue.

Any thoughts on which might be faster?

1 Reply
Typically, core-to-core communication has far lower latency and higher bandwidth than any off-chip communication.

In fact, if you design your code so that the working set of the designated thread that reads the data from the PCIe card fits in the last-level cache (LLC), you can achieve fairly fast data transfer.

However, why isn't a single shared buffer appropriate for your needs?
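
To make the shared-buffer idea concrete, here is a minimal sketch in C++, not a tuned implementation. It assumes one producer thread (standing in for the thread that receives data from the PCIe card) and several consumer threads that each read every item. The names `SharedBuffer`, `consume_all`, `run_demo`, and `kCapacity` are hypothetical, and the buffer is assumed to be large enough that it never wraps, so no overwrite protection is shown.

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <thread>
#include <vector>

// Hypothetical capacity; assumed large enough that the buffer never wraps.
constexpr std::size_t kCapacity = 1024;

// One producer appends items; any number of consumers read all of them.
struct SharedBuffer {
    std::array<std::uint64_t, kCapacity> slots{};
    // Keep the counter on its own cache line, away from the start of the data.
    alignas(64) std::atomic<std::size_t> published{0};

    // Producer only. Returns false if the buffer is full.
    bool publish(std::uint64_t value) {
        std::size_t i = published.load(std::memory_order_relaxed);
        if (i >= kCapacity) return false;
        slots[i] = value;
        // Release: the slot write above becomes visible before the new count.
        published.store(i + 1, std::memory_order_release);
        return true;
    }

    // Consumers: number of items that are safe to read.
    std::size_t available() const {
        return published.load(std::memory_order_acquire);
    }
};

// Each consumer keeps a private read index and sums every item it sees.
std::uint64_t consume_all(const SharedBuffer& buf, std::size_t expected) {
    std::uint64_t sum = 0;
    std::size_t next = 0;
    while (next < expected) {
        std::size_t avail = buf.available();
        for (; next < avail; ++next) sum += buf.slots[next];
        if (next < expected) std::this_thread::yield();
    }
    return sum;
}

// Starts `consumers` reader threads, publishes 1..items, returns each sum.
std::vector<std::uint64_t> run_demo(std::size_t consumers, std::size_t items) {
    assert(items <= kCapacity);
    SharedBuffer buf;
    std::vector<std::uint64_t> sums(consumers, 0);
    std::vector<std::thread> threads;
    for (std::size_t c = 0; c < consumers; ++c) {
        threads.emplace_back([&buf, &sums, c, items] {
            sums[c] = consume_all(buf, items);
        });
    }
    for (std::uint64_t v = 1; v <= items; ++v) buf.publish(v);
    for (auto& t : threads) t.join();
    return sums;
}
```

Because each consumer only reads slots below the published count, the data is written once and then shared through the cache hierarchy rather than copied per consumer. A real design would also need to handle wrap-around and back-pressure, which this sketch leaves out.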