
Arkville PCIe Gen4 Data Mover Using Intel® Agilex™ FPGAs Webinar: Tuesday 14 Dec 2021 @11:00 am EST


I’ve said it before, and I’ll say it again -- today’s computational workloads are larger, more complex, and more diverse than ever before. The explosion of applications like high-performance computing (HPC), artificial intelligence (AI), machine vision, analytics, and other specialized tasks is driving the exponential growth of data.

Processing this data requires humongous amounts of computational power. Many of these data processing tasks can be sped up dramatically if implemented in a massively parallel fashion using the programmable fabric inside an FPGA. A common way to implement this hardware acceleration is to augment the x86 CPU-based motherboard in a workstation or server with an FPGA-based acceleration card connected via a PCIe interface. Examples of such cards are the IA-420F (low-profile) and the IA-840F (double-width) accelerator cards from BittWare, both of which are powered by Intel Agilex F-Series FPGAs.

A major consideration is how to move the source data from the host computer to the FPGA for processing, and return the result data from the FPGA to the host, at the highest possible speed. Many new designs call for PCIe Gen4 x16 throughput between the host and the FPGA. Intel Agilex F-Series FPGAs support PCIe Gen4 x16 with a theoretical bandwidth of up to 220 Gbps, but traditional interrupt-driven processing on the host side makes it impossible to achieve this bandwidth.
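As a rough sanity check on these numbers, the raw PCIe Gen4 x16 line rate can be worked out from the per-lane signaling rate (16 GT/s) and the 128b/130b line encoding. The sketch below is only arithmetic. The function name is made up for illustration, and attributing the gap between the raw rate and the 220 Gbps figure to packet and protocol overhead is an assumption, not something stated in this post.

```python
# Back-of-the-envelope PCIe Gen4 x16 bandwidth, per direction.
GT_PER_S_PER_LANE = 16.0          # PCIe Gen4 signaling rate per lane (GT/s)
LANES = 16                        # x16 link
ENCODING_EFFICIENCY = 128 / 130   # 128b/130b line encoding (Gen3 and later)
QUOTED_GBPS = 220.0               # throughput figure quoted in this post


def raw_link_gbps(gt_per_s, lanes, efficiency):
    """Raw payload-capable bit rate of the link in Gbps, before packet overhead."""
    return gt_per_s * lanes * efficiency


raw_gbps = raw_link_gbps(GT_PER_S_PER_LANE, LANES, ENCODING_EFFICIENCY)
print(f"raw line rate: {raw_gbps:.1f} Gbps")                      # 252.1 Gbps
print(f"quoted figure is {QUOTED_GBPS / raw_gbps:.0%} of raw")    # 87%
```

The remaining difference is plausibly consumed by transaction-layer headers, flow control, and similar link overhead, so a figure around 220 Gbps is a reasonable ceiling for usable payload throughput.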

One part of the solution for reaching the maximum possible bandwidth is the Data Plane Development Kit (DPDK), an open-source software project managed by the Linux Foundation. The DPDK provides a set of data plane libraries and network interface controller polling-mode drivers for offloading packet processing from the operating system kernel to processes running in user space. This offloading can achieve higher computing efficiency and higher packet throughput than is possible using the interrupt-driven processing provided in the kernel.
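The polling-mode idea can be illustrated with a toy model. This is not the DPDK API (DPDK itself is a C library); it is a minimal Python sketch in which a `deque` stands in for a receive ring, and the names `rx_burst`, `poll_loop`, and `BURST_SIZE` are hypothetical. The point is that the consumer repeatedly asks for a burst of packets and never blocks or waits for an interrupt.

```python
from collections import deque

BURST_SIZE = 32  # hypothetical burst size chosen for this sketch


def rx_burst(ring, max_pkts):
    """Dequeue up to max_pkts packets without blocking; may return an empty list."""
    burst = []
    while ring and len(burst) < max_pkts:
        burst.append(ring.popleft())
    return burst


def poll_loop(ring, handle, max_idle_polls=3):
    """Busy-poll the ring; stop after several consecutive empty polls.

    A real polling loop would typically spin forever on a dedicated core.
    The idle limit exists only so this sketch terminates.
    """
    processed = 0
    idle = 0
    while idle < max_idle_polls:
        burst = rx_burst(ring, BURST_SIZE)
        if not burst:
            idle += 1
            continue
        idle = 0
        for pkt in burst:
            handle(pkt)
            processed += 1
    return processed


ring = deque(f"pkt{i}" for i in range(100))
seen = []
count = poll_loop(ring, seen.append)
print(count)  # 100
```

Handling packets in bursts spreads the per-call cost over many packets, and polling avoids the context switch that each interrupt would otherwise cause. The trade-off is that a polling core stays busy even when no traffic arrives.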

Another part of the solution is the Arkville DPDK IP Core from Atomic Rules. This “data mover” core, which is implemented in the FPGA, provides a high-throughput, line-rate-agnostic conduit between the general-purpose processor (GPP) software on the host and the FPGA hardware on the accelerator card. It uses industry-standard DPDK interfaces on the software API/ABI side and AXI interfaces on the FPGA side.



Arkville provides a DPDK packet conduit (Image source: Atomic Rules)


The Arkville DPDK IP Core was recently updated (rev. 21.11) to support Intel Agilex FPGAs, including those powering BittWare’s latest IA-series of products. The Arkville core moves data at up to 220 Gbps over PCIe Gen4 x16.

All of this will be discussed in detail in the Arkville PCIe Gen4 Data Mover Using Intel® Agilex™ FPGAs Webinar, which will be held on Tuesday 14 Dec 2021 @11:00 am EST. In this webinar, you’ll hear from Jeff Milrod of BittWare, who will introduce products supporting Intel Agilex FPGAs and the use of data mover IP in a variety of markets. Tom Schulte from Intel will provide perspective on the Agilex product line, including future features such as PCIe Gen5 x16 support. The webinar will conclude with Shep Siegel from Atomic Rules, who will give a demo and explain the performance achieved with the Arkville data mover IP on Agilex FPGAs. Shep will also provide insight into how Arkville reduces time-to-market and makes development easier without sacrificing performance.

Register today to gain access to this event, including a live Q&A session with our presenters. Registration also lets you watch the recording of the webinar on demand.


1 Comment


It was a pleasure being part of this webinar a month ago.

Since then, that single BittWare IA-840F card, with your Agilex F-Series FPGA, has reliably moved about 100 PB (petabytes) between the FPGA and the host Xeon user-space memory. While there is no reason to suspect that the results we presented over our observation window of minutes could not be sustained continuously and indefinitely, we thought it would be a good experiment to try. No surprises: running Arkville on Agilex F-Series 24/7 produces the same throughput in the long term as it does in the short term.

It's notable and commendable that the Agilex F-Series device on the BittWare IA-840F card, at Gen4 x16, even on a Dell R750 server's riser card, has a low-enough BER that in normal observation windows there are *zero* PCIe TLP replays in either direction. Kudos to the Intel, BittWare, and Dell SI teams.

Shep Siegel, CTO