
Intel® Labs’ Graph Neural Networks Research Featured as Groundbreaking Work in AI


Published November 9, 2021

Matthias Mueller leads the Embodied AI Lab at Intel Labs, where he and his team develop algorithms in robotics, computer vision, graphics, and machine learning. The lab’s goal is to build intelligent systems that perceive and understand the world around them and interact with it.

Highlights:

  • Intel Labs’ paper, “Training Graph Neural Networks with 1000 Layers,” presented at the 2021 International Conference on Machine Learning (ICML), was featured as groundbreaking work in AI in the “State of AI Report.”
  • Research finds that reversible Graph Neural Networks (RevGNNs) significantly outperform existing methods on multiple datasets and improve large models’ memory efficiency for AI applications.

Intel Labs presented several papers at this year’s International Conference on Machine Learning (ICML), the industry’s top conference in machine learning. One of the papers, “Training Graph Neural Networks with 1000 Layers,” gained the attention of two AI investors, Nathan Benaich and Ian Hogarth. Their State of AI Report 2021, which provides an analysis of the most exciting developments in AI, featured the paper and highlighted the promising results of Intel Labs’ Graph Neural Network (GNN) research.

The State of AI Report, now in its fourth year, compiles the most significant industry contributions from companies and research groups. It covers AI research, talent, industry and business impact, and political and economic implications. This year’s key themes include the wider deployment of AI in national electric grids, automated supermarket warehouse optimization, and healthcare.

Intel Labs’ paper on GNNs was included in the report. The project’s goal is to improve the memory efficiency of GNNs, one of the major bottlenecks of current network architectures. This project was driven by Guohao Li as part of his internship at Intel Labs under the supervision of Matthias Mueller. Bernard Ghanem and Vladlen Koltun also advised the project.

Convolutional neural networks (CNNs) have revolutionized AI with state-of-the-art results in many computer vision tasks, where data is structured in a uniform grid of pixels that forms an image. However, most data in our world is better represented by nodes and edges that form a graph; a grid is just a special case of a graph. GNNs can represent 3D data such as point clouds and meshes, biological data such as molecular structures and protein interactions, as well as social graphs, citation graphs, and co-purchasing networks.
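To make the graph representation concrete, here is a minimal sketch in plain PyTorch of a graph stored as an edge list together with one round of neighborhood aggregation, the basic operation underlying GNN layers. The toy graph and the mean_aggregate helper are illustrative assumptions, not code from the paper:

```python
import torch

# Toy graph: 4 nodes, 5 directed edges stored as a 2 x E tensor
# (row 0: source nodes, row 1: target nodes) -- a common GNN layout.
edge_index = torch.tensor([[0, 1, 1, 2, 3],
                           [1, 0, 2, 3, 1]])
x = torch.randn(4, 8)  # 4 nodes with 8 features each

def mean_aggregate(x, edge_index):
    """One round of message passing: each node averages the features
    of its in-neighbors, the core primitive a GNN layer builds on."""
    src, dst = edge_index
    out = torch.zeros_like(x)
    out.index_add_(0, dst, x[src])  # sum of incoming messages
    deg = torch.zeros(x.size(0)).index_add_(0, dst, torch.ones(dst.size(0)))
    return out / deg.clamp(min=1).unsqueeze(-1)

h = mean_aggregate(x, edge_index)  # updated node representations
```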



Figure 1. State of AI Report 2021.

The paper proposes several approaches for improving the memory and parameter efficiency of existing GNN architectures. The State of AI Report mentions RevGNN-Wide, Intel Labs’ largest model, and RevGNN-Deep, the deepest GNN to date with 1,000 layers. The deep reversible architecture (RevGNN) enables training GNNs of virtually unlimited depth at the memory cost of a single layer.
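The trick behind this memory saving is reversibility: each block’s inputs can be reconstructed exactly from its outputs, so intermediate activations can be recomputed on the fly during the backward pass instead of being stored. The sketch below illustrates the idea with a RevNet-style residual block in plain PyTorch; it is a simplified assumption for exposition, while the paper’s actual RevGNN uses grouped reversible connections built on graph convolutions (see the released code for the real implementation):

```python
import torch.nn as nn

class ReversibleBlock(nn.Module):
    """Illustrative reversible residual block (RevNet-style).

    The input is split into two halves (x1, x2). Because the inverse
    below is exact, activations need not be stored during training:
    they can be recomputed from the block's outputs in the backward
    pass, so activation memory stays constant regardless of depth.
    """
    def __init__(self, f: nn.Module, g: nn.Module):
        super().__init__()
        self.f, self.g = f, g  # e.g., two small GNN layers

    def forward(self, x1, x2):
        y1 = x1 + self.f(x2)
        y2 = x2 + self.g(y1)
        return y1, y2

    def inverse(self, y1, y2):
        # Exact reconstruction of the inputs from the outputs.
        x2 = y2 - self.g(y1)
        x1 = y1 - self.f(x2)
        return x1, x2
```

Stacking hundreds of such blocks keeps activation memory at roughly the cost of a single layer, which is what makes a 1,000-layer GNN trainable on one GPU.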

This recomputation adds some computational overhead, but it also enables models with very large capacity, one order of magnitude larger than existing models, that outperform them on several node property prediction benchmarks. The relationship between performance and parameter count is best illustrated by language modeling, where recent progress in natural language processing (NLP) has been driven by a massive increase in parameter counts: GPT (110M), BERT (340M), GPT-3 (175B), GShard-M4 (600B), and DeepSpeed (1T).

The Open Graph Benchmark (OGB) is a collection of realistic, large-scale, and diverse benchmark datasets for machine learning on graphs. Compared to language models, the parameter counts of GNNs are tiny, but our work is an important step in the right direction. RevGNN-Wide (68.5M parameters) and RevGNN-Deep (20M parameters) have about one order of magnitude more parameters than existing GNNs and have occupied the #1 and #2 spots on the ogbn-proteins leaderboard for the last five months. While this is a significant development, depth still does not help on some tasks, and further investigation is needed.
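For readers who want to explore the benchmark themselves, the ogbn-proteins dataset is available through the ogb Python package (pip install ogb). The snippet below is a minimal sketch following the package’s documented loading API; it is not the paper’s training code:

```python
# Minimal sketch: load ogbn-proteins and its official evaluator.
from ogb.nodeproppred import NodePropPredDataset, Evaluator

dataset = NodePropPredDataset(name="ogbn-proteins")
graph, labels = dataset[0]            # graph dict (edges, features) + labels
split_idx = dataset.get_idx_split()   # official train/valid/test node splits

evaluator = Evaluator(name="ogbn-proteins")  # reports ROC-AUC for this task
print(graph["num_nodes"], labels.shape)
```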

Conclusion

RevGNNs open a new path to overparameterized GNNs by overcoming the high GPU memory consumption that bottlenecks current deep GNNs. This approach makes it possible to train GNNs with an order of magnitude more parameters than current state-of-the-art models, which translates to significant performance gains on several graph benchmarks.

Since many real-world applications rely on large graphs, it is no surprise that this work was featured in the State of AI Report 2021. Perhaps someday a deep RevGNN will help you discover your new favorite movie on Netflix, recommend a product on Amazon that changes your life, or help you reconnect with old friends on social media. Combining RevGNNs with reinforcement learning also has potential applications in chip design.

While the results are very promising, further work is required to reduce the training time of GNNs and better understand why overparameterized models help on some datasets but not others. 

 

Please read the paper for further details and check out the code.

Training Graph Neural Networks with 1000 Layers


Abstract

Deep graph neural networks (GNNs) have achieved excellent results on various tasks on increasingly large graph datasets with millions of nodes and edges. However, memory complexity has become a major obstacle when training deep GNNs for practical applications due to the immense number of nodes, edges, and intermediate activations.

To improve the scalability of GNNs, prior works propose smart graph sampling or partitioning strategies to train GNNs with a smaller set of nodes or sub-graphs. In this work, we study reversible connections, group convolutions, weight-tying, and equilibrium models to advance the memory and parameter efficiency of GNNs.

We find that reversible connections in combination with deep network architectures enable the training of overparameterized GNNs that significantly outperform existing methods on multiple datasets. Our models RevGNN-Deep (1,001 layers with 80 channels each) and RevGNN-Wide (448 layers with 224 channels each) were both trained on a single commodity GPU and achieved an ROC-AUC of 87.74 ± 0.13 and 88.24 ± 0.15 on the ogbn-proteins dataset. To the best of our knowledge, RevGNN-Deep is the deepest GNN in the literature by one order of magnitude.

 


Figure 2. ROC-AUC score vs. GPU memory consumption on the ogbn-proteins dataset. We find that deep RevGNNs are very powerful and outperform existing models by a clear margin; our best models are RevGNN-Deep and RevGNN-Wide. We also compare reversible connections (RevGNN-x), weight-tying (WTx), and equilibrium models (DEQ-x) for 112-layer deep GNNs (x denotes the number of channels per layer). Reversible models consistently match or exceed the baseline’s performance using only a fraction of the memory. Weight-tied and equilibrium models offer a good performance-to-parameter-efficiency trade-off. Datapoint size is proportional to √p, where p is the number of parameters.

 

 

About the Author
Matthias holds a B.Sc. in Electrical Engineering with a Math minor from Texas A&M University. Early in his career, he worked at P+Z Engineering as an Electrical Engineer, developing mild-hybrid electric machines for BMW. He later obtained an M.Sc. and a PhD in Electrical Engineering from KAUST, with a focus on persistent aerial tracking and sim-to-real transfer for autonomous navigation. Matthias has contributed to more than 15 publications in top-tier conferences and journals such as CVPR, ECCV, ICCV, ICML, PAMI, Science Robotics, RSS, CoRL, ICRA, and IROS. He has extensive experience in object tracking and autonomous navigation of embodied agents such as cars and UAVs. He was recognized as an outstanding reviewer at CVPR’18 and won the best paper award at the ECCV’18 UAVision workshop.