
SC24 Experience with and Benefits of oneAPI Application Offload at LRZ*

Rob_Mueller-Albrecht

A Quick Snapshot from Supercomputing 2024 (SC24)

 

At SC24 in Atlanta, Dr. Gerald Mathias, Group Lead Application Support at LRZ*'s SuperMUC-NG, one of the world's largest CPU-based supercomputing systems, took to the stage in Intel's exhibitor booth theater to share LRZ's experience porting and optimizing the DPEcho* and GROMACS* codes to GPU architectures using oneAPI with SYCL*.

  • DPEcho is a C++ reimplementation of the Echo astrophysical magneto-hydrodynamics code.
  • GROMACS is a molecular dynamics package used for simulating biochemical molecules like proteins, lipids, and nucleic acids.

LRZ's computing capacity is used by a large community of scientists and developers. Among them are members of the life science community running molecular dynamics codes like Amber* and GROMACS, as well as astrophysicists using DPEcho, a general relativistic magnetohydrodynamics codebase originally written to simulate black hole accretion of pulsars with neighboring neutron stars.

In his talk, Dr. Mathias covers the journey of enabling the DPEcho and GROMACS code bases to support Intel® Data Center GPU offload using SYCL*, further driving performance scalability through accelerator offload parallelism. It is one of the projects he and his team worked on in their role mentoring scientists who deploy large-scale computing projects at the supercomputing center.
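To give a flavor of what such an offload looks like, here is a minimal, generic SYCL kernel submission in C++. It is not code from DPEcho or GROMACS, just a sketch of the buffer/accessor pattern that accelerator offload with SYCL builds on:

```cpp
// Minimal SYCL offload sketch (not from the DPEcho or GROMACS sources):
// a data-parallel kernel submitted to whichever accelerator the runtime selects.
#include <sycl/sycl.hpp>
#include <vector>
#include <iostream>

int main() {
    constexpr size_t n = 1 << 20;
    std::vector<float> a(n, 1.0f), b(n, 2.0f), c(n, 0.0f);

    sycl::queue q{sycl::default_selector_v};  // picks a GPU if one is available
    std::cout << "Running on: "
              << q.get_device().get_info<sycl::info::device::name>() << "\n";

    {   // Buffers hand data ownership to the SYCL runtime for the scope of this block.
        sycl::buffer<float> ba(a.data(), sycl::range<1>(n));
        sycl::buffer<float> bb(b.data(), sycl::range<1>(n));
        sycl::buffer<float> bc(c.data(), sycl::range<1>(n));

        q.submit([&](sycl::handler& h) {
            sycl::accessor A(ba, h, sycl::read_only);
            sycl::accessor B(bb, h, sycl::read_only);
            sycl::accessor C(bc, h, sycl::write_only, sycl::no_init);
            h.parallel_for(sycl::range<1>(n), [=](sycl::id<1> i) {
                C[i] = A[i] + B[i];   // the offloaded compute kernel
            });
        });
    }   // buffer destruction copies the results back into c

    std::cout << "c[0] = " << c[0] << "\n";   // expect 3
    return 0;
}
```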

Check out the full video of his presentation here:

 


Leibniz Supercomputing Centre (LRZ): Application Offloading with oneAPI | SC24 | Intel Software

 

 

He details the port of DPEcho's Fortran codebase to C++, the introduction of SYCL, and the identification of opportunities to isolate parallel compute kernels for GPU offload, all while maintaining MPI cluster compatibility and scalability.
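A common way to keep MPI scalability while adding per-rank GPU offload is to bind each MPI rank to one device and leave halo exchange and reductions on the existing MPI path. The sketch below is hypothetical and simplified, not taken from the LRZ codes:

```cpp
// Hypothetical sketch of one common MPI + SYCL pattern: each rank binds to one
// GPU (round-robin per node) and offloads its local kernel, while communication
// stays on the familiar MPI path.
#include <mpi.h>
#include <sycl/sycl.hpp>
#include <vector>

int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);
    int rank = 0, size = 1;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    // Round-robin device binding; fall back to the CPU if no GPU is visible.
    auto gpus = sycl::device::get_devices(sycl::info::device_type::gpu);
    sycl::queue q{gpus.empty() ? sycl::device{sycl::cpu_selector_v}
                               : gpus[rank % gpus.size()]};

    constexpr size_t n = 1 << 16;
    std::vector<double> local(n, static_cast<double>(rank));

    {   // Offload the rank-local update; MPI communication happens outside.
        sycl::buffer<double> buf(local.data(), sycl::range<1>(n));
        q.submit([&](sycl::handler& h) {
            sycl::accessor v(buf, h, sycl::read_write);
            h.parallel_for(sycl::range<1>(n), [=](sycl::id<1> i) { v[i] *= 2.0; });
        });
    }

    // Cluster-wide reduction stays plain MPI.
    double local_sum = 0.0, global_sum = 0.0;
    for (double x : local) local_sum += x;
    MPI_Reduce(&local_sum, &global_sum, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

    MPI_Finalize();
    return 0;
}
```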

The port of GROMACS to this latest advancement in LRZ's parallel compute offering followed a similar trajectory.

Along the way came key lessons about the best methods for vectorization and memory management, and about how a step-by-step approach and reliance on open, standards-based parallel programming models benefit performance scalability and performance portability.
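One of those memory-management choices in SYCL is buffers/accessors versus Unified Shared Memory (USM). The illustrative snippet below, with generic names not drawn from DPEcho or GROMACS, shows the explicit USM variant where the programmer controls the copies:

```cpp
// Illustrative only: explicit USM device allocation as an alternative to the
// buffer/accessor model shown earlier. The programmer owns the data movement.
#include <sycl/sycl.hpp>
#include <vector>
#include <iostream>

int main() {
    constexpr size_t n = 1 << 20;
    sycl::queue q;  // default device

    // Device-resident allocation plus a host-side mirror.
    float* d_x = sycl::malloc_device<float>(n, q);
    std::vector<float> h_x(n, 1.0f);

    q.memcpy(d_x, h_x.data(), n * sizeof(float)).wait();  // host -> device

    // A simple scaling kernel; the contiguous access pattern is friendly to
    // vectorization on both CPU and GPU targets.
    const float a = 2.0f;
    q.parallel_for(sycl::range<1>(n), [=](sycl::id<1> i) {
        d_x[i] = a * d_x[i];
    }).wait();

    q.memcpy(h_x.data(), d_x, n * sizeof(float)).wait();  // device -> host
    std::cout << "h_x[0] = " << h_x[0] << "\n";            // expect 2

    sycl::free(d_x, q);
    return 0;
}
```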

I encourage you to check out this very informative video and join Dr. Mathias as he maps out the journey to open accelerated compute for high-performance computing scientific use cases, including the pitfalls and the performance implications.

Join the Open Accelerated Computing Revolution

Accelerated heterogeneous computing is finding its way into every facet of technology development. By embracing openness, you can access an active ecosystem of software developers. Source code for the oneAPI specification elements is available on GitHub.

Let us drive the future of accelerated computing together; become a UXL Foundation member today:

Get Started with oneAPI

If you find the possibilities for accelerated compute discussed here intriguing, check out the latest Intel® oneAPI DPC++/C++ Compiler, available stand-alone or as part of the Intel® oneAPI Base Toolkit, the Intel® oneAPI HPC Toolkit, or the toolkit selector's essentials packages.
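Once the compiler is installed, a quick sanity check is to list the SYCL-visible devices. The compile line in the comment is the typical one for the DPC++/C++ compiler; verify it against your installation:

```cpp
// Quick check that the toolchain sees your accelerator once the Intel oneAPI
// DPC++/C++ Compiler is installed. Typical compile line (verify locally):
//     icpx -fsycl device_query.cpp -o device_query
#include <sycl/sycl.hpp>
#include <iostream>

int main() {
    // List every SYCL-visible device; a Data Center GPU should appear here
    // once the GPU driver and runtime are set up.
    for (const auto& dev : sycl::device::get_devices()) {
        std::cout << dev.get_info<sycl::info::device::name>() << "\n";
    }
    return 0;
}
```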

Additional Resources

  • GROMACS - A free and open-source software suite for high-performance molecular dynamics and output analysis.
  • DPEcho (GitHub*) - General Relativity with SYCL* for the 2020s and beyond

 

About the Author
Rob enables developers to streamline programming efforts across multiarchitecture compute devices for high-performance applications that take advantage of Intel's family of development tools. He has more than 20 years of experience in technical consulting, software architecture, and platform engineering, working in IoT, edge, embedded software and hardware, and developer enablement.