Santiago Miret is a deep learning researcher at Intel Labs, where he focuses on developing artificial intelligence (AI) solutions and exploring the intersection of AI and the physical sciences.
- Intel Labs has set several works in motion to further the development and application of advanced artificial intelligence technologies to scientific challenges, particularly in the field of materials science.
- Intel Labs and Intel Accelerated Computing Group (AXG) detail the open-source release of the Open MatSci ML Toolkit to simplify AI model development and augment training capabilities for large-scale datasets.
- The toolkit is showing initial signs of success, including an upcoming spotlight presentation by Intel researchers at the AI for Accelerated Materials Design Workshop at NeurIPS 2022.
- Intel Labs and Intel AXG researchers plan to build upon the toolkit by exploring novel AI-based solutions for materials design challenges while improving the framework's software capabilities.
Developing and applying advanced artificial intelligence (AI) technologies to scientific challenges has become an increasingly meaningful research effort within Intel Labs. In 2022, Intel Labs built upon its existing collaboration with Alán Aspuru-Guzik, announced a new research effort with MILA that focuses on AI for scientific discovery and organized a workshop on AI for Accelerated Materials Design (AI4Mat) at NeurIPS 2022. In addition to actively participating in relevant research activities, Intel Labs is committed to promoting open-source software, following Intel’s Open Ecosystem strategy.
The recent open-source release of the Open MatSci ML Toolkit brings scientific machine learning (ML) innovations to the open AI ecosystem, enabling AI researchers to develop, prototype and train advanced deep learning (DL) models on materials science problems. As described in the research paper accompanying the release, the goals of the Open MatSci ML Toolkit are to:
- Provide ease-of-use for ML researchers and practitioners to develop and apply new DL models to scientific challenges.
- Enable scalable computation across different compute capabilities (laptop, server, cluster) and hardware platforms (CPU, GPU, XPU) to fit the needs of various use cases and availability.
- Extend the ability of AI researchers to apply new frameworks for scientific discovery challenges, including new development platforms and problem formulations.
The broader goal of the Open MatSci ML Toolkit is to be versatile in its applications to advanced AI4Mat science discovery problems. However, the focus of the first release is to facilitate the development of novel DL models on the OpenCatalyst dataset. The OpenCatalyst dataset is a joint research effort by Fundamental AI Research (FAIR) at Meta AI and Carnegie Mellon University's Department of Chemical Engineering and is one of the first large-scale datasets to enable the application of ML techniques to catalysis. The dataset aims to enable the design of novel catalytic materials, which could then be applied to future clean energy and sustainable agriculture technologies. Specifically, the dataset consists of results from high-quality, computationally expensive physics-based simulations (e.g., density functional theory), which ML researchers can use to develop advanced DL techniques that approximate such simulations. While the OpenCatalyst dataset mainly focuses on simulations of catalytic materials, the ability to effectively approximate and enhance physical simulations has substantially broader applications in many scientific fields, including semiconductor materials design and manufacturing.
Geometric Deep Learning
In trying to apply ML techniques to approximate physical simulation methods, researchers realized that new methods would be required to mimic the symmetries and constraints present in the natural world. These efforts led to the creation of geometric deep learning (GDL), a field within AI that has recently seen enormous success in this kind of real-world modeling. Historically, it has been difficult for DL models to respect the inherent symmetries and inductive biases that underlie physical properties: for example, the mass of an object does not change with its position. Fortunately, GDL has produced a variety of novel DL methods that respect different kinds of symmetries and have proven useful in modeling physical and chemical structures, such as those found in the OpenCatalyst dataset. The Open MatSci ML Toolkit provides a platform for ML researchers and practitioners to continue innovating in the field of GDL and to introduce new ways of applying AI to general physical and chemical challenges.
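To make the idea of respecting a symmetry concrete, here is a minimal sketch in plain Python. It is not code from the Open MatSci ML Toolkit or from any GDL model: the three-atom structure, the helper names and the `toy_energy` function are all hypothetical and chosen only for illustration. The sketch builds a prediction from interatomic distances, which do not change when a structure is rotated, shifted or has its atoms reordered, so the prediction inherits those invariances by construction.

```python
import math


def rotate_z(points, angle):
    """Rotate 3D points about the z-axis by `angle` radians."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y, s * x + c * y, z) for x, y, z in points]


def translate(points, shift):
    """Shift every point by the same (dx, dy, dz) offset."""
    dx, dy, dz = shift
    return [(x + dx, y + dy, z + dz) for x, y, z in points]


def pairwise_distances(points):
    """Sorted interatomic distances.

    Distances are unchanged by rotation and translation, and sorting
    removes any dependence on the order in which atoms are listed.
    """
    dists = []
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            dists.append(math.dist(points[i], points[j]))
    return sorted(dists)


def toy_energy(points):
    """Hypothetical stand-in for a learned model (an assumption, not a
    real potential): it only ever sees invariant features."""
    return sum(1.0 / d for d in pairwise_distances(points))


# Hypothetical three-atom structure, coordinates in arbitrary units.
atoms = [(0.0, 0.0, 0.0), (0.96, 0.0, 0.0), (-0.24, 0.93, 0.0)]

# The same structure, rotated and moved elsewhere in space.
moved = translate(rotate_z(atoms, 0.7), (1.5, -2.0, 3.0))

# The raw coordinates differ, but the prediction should agree
# up to floating-point error.
print(math.isclose(toy_energy(atoms), toy_energy(moved), rel_tol=1e-9))
```

A model that consumed the raw coordinates directly would have to learn this behavior from data. Building it into the features or the architecture is the kind of inductive bias described above, although practical GDL models use far richer representations than a sorted distance list.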
Current Status and Results
Figure 1: Features of the Open MatSci ML Toolkit – Flexible and easy-to-use software design with hardware agnostic experiment scaling and research-friendly graph neural network design using DGL, a major graph neural network framework.
In the research paper accompanying the release of the Open MatSci ML Toolkit, we show promising results in applying various GDL models to the OpenCatalyst dataset. Our study demonstrates that experiments can be scaled to the compute required for the OpenCatalyst dataset while maintaining competitive modeling performance. Given the promise of the toolkit, Intel Labs researchers have been invited to give a spotlight presentation at the AI for Accelerated Materials Design Workshop at NeurIPS 2022 and are continuing to build upon the current version of the toolkit. Our continued research efforts will focus on developing and training new GDL-based AI models that can better approximate chemical properties and behavior. Furthermore, we are working on standardizing performance measurements of our workflow to conform with the original OpenCatalyst leaderboard, which will allow us to better assess how models developed using the Open MatSci ML Toolkit compare to models developed with the original OpenCatalyst implementation. The GitHub statistics for the open-source repository already show active engagement from the ML research community, which will help guide additional developments of the framework in the future.