
Arria 10 GX Dev Kit PCIe DMA Timeout on a NUMA HPC

Altera_Forum

Hello, everyone 

 

I am currently working on FPGA-based acceleration for an HPC cluster, and I am having trouble communicating with my FPGA boards using DMA in a NUMA environment.

 

My host topology is as follows:

                        (UPI)
    DDR0 <----> cpu0 <---------------> cpu1 <---> DDR1

    arria 10 gx <----------> arria 10 gx
     dev kit a     (PCIe)     dev kit b


I have two Arria 10 GX Dev Kit endpoints attached to the first CPU, which acts as the PCIe root complex. I am also using an adaptation of the "altera_dma" driver to send and receive data to and from the FPGAs. On the boards I have programmed only the "pcie dma with external memory (https://www.altera.com/documentation/nik1412547570040.html)" reference design, with no additional logic. The design was synthesized with both Quartus 17.0 and Quartus 18.0, with the same results. The two CPUs are identical (Xeon 6148) with integrated PCIe controllers.

 

With this configuration, I am able to read and write any Avalon MM address from the host (through PCIe BAR[4]+offset) using io[write/read] calls. However, I can't get the DMA controller to transfer any data. Every DMA transaction ends with a DMA timeout.

 

Some extra information: 

 

- The same driver and reference design work perfectly on my workstation, which has a single CPU and a single FPGA board;

- The same problem occurs when I use only one FPGA on the HPC Host; 

- Both devices are correctly identified by the driver probe function, and a corresponding "/dev/devicenode" is created for each FPGA. 

 

The configuration that works:

                   (PCIe)
    DDR <----> CPU <----> Arria 10 GX Dev Kit


I believe the problem might be related to PCI memory allocation in a NUMA environment, since that is the only difference from my workstation.
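
One way to start testing this hypothesis is to check which NUMA node the kernel associates with each board. Linux exposes this per PCI device in sysfs as a `numa_node` attribute. The following is a minimal sketch, assuming a Linux host; the function name `pci_numa_nodes` is made up for illustration, and the vendor ID filter 0x1172 (Altera) is an assumption that only holds if the reference design keeps the default vendor ID.

```python
from pathlib import Path

# Assumption: the boards enumerate with Altera's PCI vendor ID.
# Adjust this if the reference design was built with a different ID.
ALTERA_VENDOR_ID = 0x1172


def pci_numa_nodes(vendor_id=ALTERA_VENDOR_ID, sysfs_root="/sys/bus/pci/devices"):
    """Return {PCI address: NUMA node} for every device matching vendor_id.

    A node of -1 means the kernel has no NUMA affinity information
    for that device.
    """
    nodes = {}
    root = Path(sysfs_root)
    if not root.is_dir():
        return nodes
    for dev in sorted(root.iterdir()):
        try:
            vendor = int((dev / "vendor").read_text().strip(), 16)
            node = int((dev / "numa_node").read_text().strip())
        except (OSError, ValueError):
            continue  # skip entries without readable attributes
        if vendor == vendor_id:
            nodes[dev.name] = node
    return nodes


if __name__ == "__main__":
    for address, node in pci_numa_nodes().items():
        print(f"{address}: NUMA node {node}")
```

If the boards report node 0 while the test process or its DMA buffers end up on node 1, running the test under `numactl --cpunodebind=0 --membind=0` might help confirm or rule out the NUMA placement as the cause.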

 

Have any of you experienced similar problems, or are you aware of any limitations of the Arria 10 PCIe DMA Hard IP used in the reference design in this type of environment?

If so, could you share some information to help me out?

 

Thanks a lot, you guys! 

 

Regards