NAMD: Highly Parallel Scalable Nanoscale Simulation Powered by SYCL

Rob_Mueller-Albrecht · 09-04-2023

In this fifth post of our series on projects that have been ported to SYCL* and are already in active use by end users, researchers, and computer scientists, we stick with molecular dynamics and its use in biomedical discovery.

The Theoretical and Computational Biophysics Group (TCBG) at the University of Illinois at Urbana-Champaign (UIUC) joined the oneAPI Academic Centers of Excellence (CoE) in late 2020 with the goal of applying oneAPI’s heterogeneous programming model and C++ with SYCL to NAMD. Dr. David J. Hardy, lead developer of NAMD, is spearheading this effort together with Prof. Emad Tajkhorshid and other researchers.

NAMD is a parallel molecular dynamics code designed for the high-performance simulation of biomolecular systems, particularly tuned for the simulation of very large, cell-scale molecular systems. It is based on highly parallel computation of the forces within a molecule as well as the interactions between neighboring molecules.

Figure 1: Bonded and Non-Bonded Forces in a Molecular System
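To make the two kinds of forces concrete, here is a minimal host-side sketch in standard C++. It is not NAMD code. It assumes a harmonic bond energy U = k (r - r0)^2 for the bonded term and a Lennard-Jones potential with a simple cutoff for the non-bonded term. The names Vec3, Bond, add_bond_forces, and add_lj_forces are hypothetical and chosen only for illustration. A production code adds many more terms (angles, dihedrals, electrostatics, exclusions) and far more sophisticated parallel decomposition.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Vec3 = std::array<double, 3>;

// Hypothetical bond record: atom indices, force constant k, rest length r0.
struct Bond {
    std::size_t i, j;
    double k, r0;
};

// Bonded term (assumed form): harmonic bond with U = k * (r - r0)^2.
void add_bond_forces(const std::vector<Vec3>& pos,
                     const std::vector<Bond>& bonds,
                     std::vector<Vec3>& force) {
    assert(pos.size() == force.size());
    for (const Bond& b : bonds) {
        Vec3 d{};
        double r2 = 0.0;
        for (std::size_t a = 0; a < 3; ++a) {
            d[a] = pos[b.i][a] - pos[b.j][a];
            r2 += d[a] * d[a];
        }
        const double r = std::sqrt(r2);
        // F_i = -(dU/dr) * d / r, and F_j = -F_i (Newton's third law).
        const double scale = -2.0 * b.k * (r - b.r0) / r;
        for (std::size_t a = 0; a < 3; ++a) {
            force[b.i][a] += scale * d[a];
            force[b.j][a] -= scale * d[a];
        }
    }
}

// Non-bonded term (assumed form): Lennard-Jones,
// U = 4 * epsilon * ((sigma/r)^12 - (sigma/r)^6), ignored beyond the cutoff.
// This all-pairs loop skips exclusions and neighbor lists for brevity.
void add_lj_forces(const std::vector<Vec3>& pos, double epsilon, double sigma,
                   double cutoff, std::vector<Vec3>& force) {
    assert(pos.size() == force.size());
    const double cutoff2 = cutoff * cutoff;
    for (std::size_t i = 0; i < pos.size(); ++i) {
        for (std::size_t j = i + 1; j < pos.size(); ++j) {
            Vec3 d{};
            double r2 = 0.0;
            for (std::size_t a = 0; a < 3; ++a) {
                d[a] = pos[i][a] - pos[j][a];
                r2 += d[a] * d[a];
            }
            if (r2 >= cutoff2) continue;  // pair is out of range
            const double s2 = sigma * sigma / r2;
            const double s6 = s2 * s2 * s2;
            // F_i = 24 * epsilon * (2 (sigma/r)^12 - (sigma/r)^6) / r^2 * d
            const double scale = 24.0 * epsilon * (2.0 * s6 * s6 - s6) / r2;
            for (std::size_t a = 0; a < 3; ++a) {
                force[i][a] += scale * d[a];
                force[j][a] -= scale * d[a];
            }
        }
    }
}
```

Every pair and every bond in these loops can be evaluated independently of the others, which is what makes this kind of calculation a good fit for highly parallel hardware. A parallel version would also need to handle concurrent updates to the shared force array, for example with atomics or per-thread accumulation.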

NAMD's simulation engine enables researchers to look at the molecular interactions that drive cellular processes in atomic-scale detail.

The oneAPI initiative is an open, cross-industry, standards-based, unified, multiarchitecture, multi-vendor programming model. It brings a common set of languages and libraries to the table, supporting heterogeneous computing resources from an ever-growing list of vendors. It is based on the SYCL open standard and provides additional language constructs that improve hardware support. Many of these have found their way into SYCL 2020.

In their article and presentation at IWOCL 2022,

Experiences Porting NAMD to the Data Parallel C++ Programming Model

David J. Hardy, Jaemin Choi, Wei Jiang, and Emad Tajkhorshid

IWOCL'22: International Workshop on OpenCL, May 2022, Article No.: 15, Pages 1–5

Published online 2022 May 10. doi: 10.1145/3529538.3529560

David Hardy and team provided a detailed walkthrough of the techniques employed to enable NAMD for scalable compute across diverse multi-vendor platform architectures, taking advantage of SYCL.

NAMD and COVID-19

The ability of molecular dynamics (MD) simulation software running on high-performance computing (HPC) platforms to calculate nanoscale interactions with high temporal resolution allows applications like NAMD to unlock unique insights into the molecular structure and dynamics of pathogens, revealing disease mechanisms.

When the SARS-CoV-2 virus triggered a pandemic in early 2020, NAMD became a key tool the biomedical research community deployed to understand and combat its outbreak.

The University of California San Diego (UCSD) led a large team of scientists to simulate the virus and its spike protein assembly. To enable this work, NAMD experts at the University of Illinois at Urbana-Champaign (UIUC) assisted in scaling the required simulations.

The lessons learned as part of this effort drove home the importance of having powerful molecular dynamics simulation solutions ready and deployable on a variety of GPU-offload computers as well as multi-node supercomputers at very short notice.

NAMD and the Aurora Supercomputer

Dave Hardy's team at UIUC's Theoretical and Computational Biophysics Group took on the challenge of making NAMD scalable across multi-vendor hardware solutions by enabling it for oneAPI and C++ with SYCL. In the process, they also took on the challenge of getting NAMD ready for deployment on the Aurora* Exascale Supercomputer at Argonne National Laboratory, in close collaboration with the Argonne Leadership Computing Facility (ALCF) and the Aurora Early Science Program (ESP).

Figure 2: Argonne National Laboratory (ANL) Aurora Sunspot

Porting NAMD to C++ with SYCL is critical to running it on ANL's Aurora. In addition, the vision behind adopting a standards-based framework for heterogeneous programming is that it will usher in an era in which workloads such as NAMD are more easily maintainable, scalable, and deployable across a multitude of current and future distributed compute configurations.

NAMD and SYCL

Keeping all this in mind, the work of adding oneAPI and C++ with SYCL support commenced.

NAMD has been around for over 25 years and was originally developed using C++ and the Charm++* parallel programming framework. It has a substantial user base of over 25,000 registered users. Over time, CUDA* support has been added for GPU offload.

NAMD is all about compute scalability, even for the most compute-intensive simulations. It scales to hundreds of cores for typical simulations and beyond 1,000,000 cores for the largest ones.

Porting it to SYCL implies maintaining the overall program structure, translating existing CUDA* streams into SYCL queues, recalibrating the compute workload distribution, and synchronizing parallel kernels between CPU and GPU for optimum performance and architectural flexibility.

Two high-level characteristics differentiate SYCL from the CUDA programming model. First, SYCL uses modern C++ and typically expresses parallel kernels as lambda functions (which SYCL 2020 allows to be unnamed) that are submitted via device queues. Second, error handling in SYCL is implemented via C++ try-catch exception handling.

All this boils down to SYCL being more in line with C++ standards-based programming paradigms and thus a more natural approach to implementing cross-architectural compute.

Please check out the article on the experiences during the porting effort for more detailed insights into the considerations you may want to be aware of when adding SYCL-based cross-architecture parallelism to complex workloads.

UIUC used the Intel® DPC++ Compatibility Tool for the initial CUDA to SYCL migration.

They used the Intel® oneAPI DPC++ Library (oneDPL) to introduce C++ STL reduce, shuffle, atomic_ref, sort, and scan functionality with offload support. Finally, they used the Intel® oneAPI Math Kernel Library (oneMKL) FFT domain with its SYCL API to replace proprietary NVIDIA* cuFFT calls with a more open and flexible solution.
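As a rough illustration of where sort, scan, and reduce show up in a molecular dynamics code, the sketch below groups atoms by spatial cell and sums per-atom energies. It is not taken from NAMD. It uses the host-only C++17 standard library algorithms so that it compiles without a SYCL toolchain, and the names CellList, build_cell_list, and total_energy are hypothetical. oneDPL offers counterparts of these standard algorithms that accept an execution policy as an additional first argument to run them on a device. Consult the oneDPL documentation for the exact policies and headers.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical result type for illustration.
struct CellList {
    std::vector<std::size_t> order;       // atom indices grouped by cell
    std::vector<std::size_t> cell_start;  // first slot in `order` per cell
};

// Groups atoms by the cell they fall into.
// cell_of_atom[i] is the (precomputed) cell index of atom i.
CellList build_cell_list(const std::vector<std::size_t>& cell_of_atom,
                         std::size_t num_cells) {
    CellList cl;

    // Sort: order atom indices by cell, breaking ties by atom index
    // so the result is deterministic.
    cl.order.resize(cell_of_atom.size());
    std::iota(cl.order.begin(), cl.order.end(), std::size_t{0});
    std::sort(cl.order.begin(), cl.order.end(),
              [&](std::size_t a, std::size_t b) {
                  return cell_of_atom[a] != cell_of_atom[b]
                             ? cell_of_atom[a] < cell_of_atom[b]
                             : a < b;
              });

    // Count atoms per cell (plain host loop in this sketch).
    std::vector<std::size_t> count(num_cells, 0);
    for (std::size_t c : cell_of_atom) {
        assert(c < num_cells);
        ++count[c];
    }

    // Scan: an exclusive prefix sum turns counts into start offsets.
    cl.cell_start.resize(num_cells);
    std::exclusive_scan(count.begin(), count.end(), cl.cell_start.begin(),
                        std::size_t{0});
    return cl;
}

// Reduce: sum per-atom energy contributions into one total.
double total_energy(const std::vector<double>& per_atom_energy) {
    return std::reduce(per_atom_energy.begin(), per_atom_energy.end(), 0.0);
}
```

For example, with cell indices {2, 0, 1, 0, 2} and three cells, the sorted atom order is {1, 3, 2, 0, 4} and the cell start offsets are {0, 2, 3}. Expressing these steps as standard algorithms rather than hand-written loops is what lets a library supply tuned implementations per target device.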

Bringing Choice to Accelerated Compute

Become part of the effort to make high-performance cross-architecture compute transparent, portable, and flexible. Include SYCL as the accelerator and GPU offload solution in your code path. Adopt oneAPI as the means to implement it free from vendor lock-in.

Get started with NAMD, its source repository, and its Argonne Leadership Computing Facility (ALCF) use on Sunspot.

The Intel® DPC++ Compatibility Tool and the CUDA to C++ with SYCL Migration Portal are convenient starting points for your own migration to SYCL.

Make SYCL part of your software solution.

Please stay tuned for next week’s oneAPI project focus.

Notices and Disclaimers

Performance varies by use, configuration, and other factors. Learn more at www.Intel.com/PerformanceIndex. Results may vary.

Performance results are based on testing as of dates shown in configurations and may not reflect all publicly available updates. No product or component can be absolutely secure.

Your costs and results may vary.

Intel technologies may require enabled hardware, software or service activation.

Intel does not control or audit third-party data. You should consult other sources to evaluate accuracy.

*Other names and brands may be claimed as the property of others.