As Intel develops and evolves the oneAPI programming model and Data Parallel C++ (DPC++) compiler for heterogeneous processing architectures, you can use the latest Intel® FPGAs in the Intel® Agilex™ FPGA family to accelerate high-performance computing (HPC) algorithms by as much as two orders of magnitude. A new webinar titled “High-Performance Computing with Next-Generation Intel® Agilex™ FPGAs” will walk you through one such HPC workload, which infers molecular structures from electron cloud configurations. This analysis is important for many scientific investigations, including drug discovery and nanotechnology research. The webinar will also cover next-generation technologies such as PCIe 5.0, Compute Express Link (CXL), and HBM2e, which are performance enhancements that you can expect to see on future FPGA-based accelerator cards.
The HPC code discussed in this webinar was originally written in OpenCL™ to run on hardware accelerators based on Intel® Arria® 10 and Intel® Stratix® 10 FPGAs. This code was later ported to DPC++, which required very few changes. The webinar will discuss the porting details. The port also made it possible to greatly simplify the HPC code because many elements that had to be explicitly managed in OpenCL are handled automatically by DPC++. Consequently, the resulting DPC++ host code, developed using the oneAPI programming model, is half the size of its OpenCL counterpart. This DPC++ code was then compiled for execution on an Intel Agilex FPGA on a BittWare IA-840f Enterprise-Class FPGA accelerator card.
When run on a single Intel® Xeon® CPU core, this HPC workload processed a 2-million-point data set in two hours and forty minutes. Hardware acceleration using the Intel Arria 10 and Intel Stratix 10 FPGAs resulted in speedups ranging from 17.8X to 118.2X. As impressive as those results are, the DPC++ implementation running on the BittWare IA-840f card with an Intel Agilex FPGA processed the same data set in 61 seconds, which is 157X faster than the same code running on one CPU core. Using 40-bit integers instead of floating-point numbers to represent the data further reduced processing time to 41 seconds, a 233X speedup. This sort of tradeoff (using large integers instead of floating-point numbers) is easily achieved when using DPC++ to accelerate HPC workloads on FPGAs.
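As a quick sanity check on these figures, here is a minimal standard C++ sketch that recomputes the speedups from the run times quoted above. The function and constant names (`speedup`, `kCpuSeconds`, and so on) are illustrative and do not come from the webinar code. The baseline is assumed to be exactly two hours and forty minutes (9,600 seconds), because the precise measured time is not given. With that rounded baseline, the results come out at roughly 157X and 234X, which agrees with the quoted 157X and 233X to within rounding of the baseline.

```cpp
#include <cassert>

// Speedup of an accelerated run relative to a baseline run of the same workload.
double speedup(double baseline_seconds, double accelerated_seconds) {
    assert(baseline_seconds > 0.0 && accelerated_seconds > 0.0);
    return baseline_seconds / accelerated_seconds;
}

// Single-core CPU baseline quoted in the text: two hours and forty minutes.
// ASSUMPTION: the exact measured time is not published, so this is the rounded value.
constexpr double kCpuSeconds = (2 * 60 + 40) * 60.0;  // 9600 s

// Intel Agilex FPGA run times quoted in the text.
constexpr double kAgilexFloatSeconds = 61.0;  // floating-point data
constexpr double kAgilexInt40Seconds = 41.0;  // 40-bit integer data
```

Calling `speedup(kCpuSeconds, kAgilexFloatSeconds)` gives a value between 157 and 158, and `speedup(kCpuSeconds, kAgilexInt40Seconds)` gives a value between 234 and 235. Comparing the two FPGA runs directly, `speedup(kAgilexFloatSeconds, kAgilexInt40Seconds)` shows that the switch to 40-bit integers alone makes the run a little under 1.5 times faster.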
The HPC workload acceleration delivered by the Intel Agilex FPGA on the BittWare IA-840f card depended on three major factors: faster clock rates, more parallelism, and faster host I/O. The advanced 10 nm SuperFin technology used to manufacture Intel Agilex FPGAs and the BittWare IA-840f card’s PCIe 4.0 host interface both help to enable the acceleration improvement.
Panelists for this webinar include:
- Christian Stenzel, Technical Sales Specialist – Cloud and Enterprise Acceleration Division, Intel
- Craig Petrie, VP Sales and Marketing, BittWare
- Maurizio Paolini, Field Applications Engineer – Cloud and Enterprise Acceleration Division, Intel
- César González, Barcelona Supercomputing Center, Institute for Advanced Chemistry of Catalonia – CSIC
Two live sessions of the webinar, each with questions and answers, will be held to accommodate different time zones. The dates and times are:
- Tuesday, November 8, 2022: 11am EST/5pm CET
- Thursday, November 10, 2022: 9am CET/4pm SGT
To learn more about this HPC application, its implementation details, and the hardware used to accelerate it, register for the webinar.
The webinar will be available on demand later. If you register for the webinar now, you’ll receive the details for on-demand access.
For more information about the BittWare IA-840f accelerator card, see “BittWare IA-840F FPGA Accelerator PCIe Card bristles with high-speed I/O, is based on an Intel Agilex FPGA.”
OpenCL™ and the OpenCL logo are trademarks of Apple Inc. used by permission by Khronos.