Ginkgo is a high-performance open-source C++ library for numerical linear algebra on multi- and many-core systems. It currently supports GPU kernel implementations in CUDA*, HIP*, and Intel® oneAPI-compliant Data Parallel C++ with SYCL*. It is a community-driven project developed by the Karlsruhe Institute of Technology (KIT), the University of Tennessee, and Universitat Jaume I under the modified BSD (Berkeley Software Distribution) license.
A recent case study describes how porting Ginkgo’s linear algebra functionality to the SYCL ecosystem through oneAPI allows scientists from various research domains to run Ginkgo workloads on the latest Intel architectures, including Intel® Iris® Xe Graphics and the Intel® Data Center GPU Max Series. It also enables multiarchitecture, cross-vendor programming with the framework. Integrating Ginkgo with the Intel® DPC++ Compatibility Tool for automated CUDA-to-SYCL migration and the Intel® oneAPI DPC++/C++ Compiler increases its platform portability.
Porting Ginkgo to SYCL and Intel GPUs delivers performance improvements for key building blocks of scientific numerical simulations on the Intel Data Center GPU Max 1550:
- Ginkgo’s sparse matrix-vector product (SpMV), on average, performs 2x [1] better than the Intel® oneAPI Math Kernel Library (oneMKL) compressed sparse row (CSR) matrix-vector implementation. The speedup can even reach 100x [1] for problems from the SuiteSparse Matrix Collection.
- Ginkgo’s batched iterative solvers on the Intel Data Center GPU Max 1550 2s, on average, run 3.1x and 2.4x faster [2] than on NVIDIA* A100 and H100 GPUs, respectively. Similarly, the Intel Data Center GPU Max 1550 1s outperforms the A100 and H100 GPUs by an average factor of 1.7x and 1.3x [2], respectively.
Moreover, Ginkgo’s optimizations help accelerate and scale the OpenFOAM* simulation framework on a single node with six Intel Data Center GPU Max 1550 devices. The Ginkgo team leveraged Intel® Developer Cloud for early-stage access to the Intel hardware.
Check out the complete Ginkgo success story.
What’s Next?
We encourage you to explore the Intel® Software Development Products (tools, libraries, and frameworks) powered by oneAPI. Learn about the AI, HPC, and rendering tools in the oneAPI-powered software portfolio for accelerated, multiarchitecture, vendor-agnostic computing.
Utilize the Intel Developer Cloud platform, a sandbox for trying oneAPI software optimizations on the latest Intel hardware for accelerated AI, HPC, and edge computing workloads.
Additional Resources
- Video: Benefits of oneMKL for Optimized Math Functions on CPUs and GPUs
- Migrate from CUDA to C++ with SYCL portal
- Easy CUDA to SYCL Migration
- oneAPI SYCL C++ code samples
- Intel® Data Center GPU Max Series
References
[1] See the “No Transistor Left Behind: Performance Boost Using Intel GPUs for Fast Sparse Matrix-Vector (SpMV) Products” section of the original article.
[2] See the “Batched Iterative Solvers for Selected Applications” section of the original article.
Performance varies by use, configuration, and other factors. Learn more at www.Intel.com/PerformanceIndex.