Recipe for a General, Powerful, Scalable Graph Transformer | Ladislav Rampášek


Join the Learning on Graphs and Geometry Reading Group: https://hannes-stark.com/logag-readin...

Paper "Recipe for a General, Powerful, Scalable Graph Transformer": https://arxiv.org/abs/2205.12454

Abstract: We propose a recipe on how to build a general, powerful, scalable (GPS) graph Transformer with linear complexity and state-of-the-art results on a diverse set of benchmarks. Graph Transformers (GTs) have gained popularity in the field of graph representation learning with a variety of recent publications but they lack a common foundation about what constitutes a good positional or structural encoding, and what differentiates them. In this paper, we summarize the different types of encodings with a clearer definition and categorize them as being local, global or relative. Further, GTs remain constrained to small graphs with a few hundred nodes, and we propose the first architecture with a complexity linear in the number of nodes and edges O(N+E) by decoupling the local real-edge aggregation from the fully-connected Transformer. We argue that this decoupling does not negatively affect the expressivity, with our architecture being a universal function approximator for graphs. Our GPS recipe consists of choosing 3 main ingredients: (i) positional/structural encoding, (ii) local message-passing mechanism, and (iii) global attention mechanism. We build and open-source a modular framework GraphGPS that supports multiple types of encodings and that provides efficiency and scalability both in small and large graphs. We test our architecture on 11 benchmarks and show very competitive results on all of them, showcasing the empirical benefits gained by the modularity and the combination of different strategies.
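
To make the three ingredients concrete, here is a minimal sketch in plain Python. It is not the GraphGPS code: the function names are hypothetical, node degree stands in for a structural encoding, mean aggregation stands in for the message-passing layer, and there are no learned weights, MLP, or normalization. The global step uses dense softmax attention, which costs O(N^2); the linear O(N+E) complexity mentioned in the abstract would need a linear-attention mechanism in its place.

```python
import math

def add_degree_encoding(x, edges):
    """Ingredient (i): append node degree as a simple structural encoding."""
    deg = [0] * len(x)
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return [row + [float(deg[i])] for i, row in enumerate(x)]

def local_message_passing(h, edges):
    """Ingredient (ii): mean-aggregate neighbor features over real edges, O(E)."""
    n, d = len(h), len(h[0])
    agg = [[0.0] * d for _ in range(n)]
    deg = [0] * n
    for u, v in edges:
        # Treat every edge as undirected: send messages both ways.
        for dst, src in ((u, v), (v, u)):
            deg[dst] += 1
            for k in range(d):
                agg[dst][k] += h[src][k]
    return [[a / deg[i] if deg[i] else 0.0 for a in agg[i]] for i in range(n)]

def global_attention(h):
    """Ingredient (iii): dense softmax self-attention over all node pairs."""
    n, d = len(h), len(h[0])
    out = []
    for i in range(n):
        scores = [sum(h[i][k] * h[j][k] for k in range(d)) / math.sqrt(d)
                  for j in range(n)]
        top = max(scores)  # subtract the max for numerical stability
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        out.append([sum(weights[j] / total * h[j][k] for j in range(n))
                    for k in range(d)])
    return out

def gps_layer(h, edges):
    """Combine the decoupled local and global branches with a residual sum."""
    local = local_message_passing(h, edges)
    glob = global_attention(h)
    return [[h[i][k] + local[i][k] + glob[i][k] for k in range(len(h[0]))]
            for i in range(len(h))]

# Toy path graph 0 - 1 - 2 with one scalar feature per node.
edges = [(0, 1), (1, 2)]
x = [[1.0], [2.0], [3.0]]
h0 = add_degree_encoding(x, edges)
h1 = gps_layer(h0, edges)
```

The point of the sketch is the decoupling: the local branch only touches real edges, while the global branch lets every node attend to every other node, and the two outputs are summed per node.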

Authors: Ladislav Rampášek, Mikhail Galkin, Vijay Prakash Dwivedi, Anh Tuan Luu, Guy Wolf, Dominique Beaini

Twitter Hannes:   / hannesstaerk  
Twitter Dominique:   / dom_beaini  
Twitter Valence Discovery:   / valence_ai  

Reading Group Slack: https://join.slack.com/t/logag/shared...

~

Chapters

00:00 - Intro
01:08 - Overview
02:15 - Message Passing vs. Graph Transformers
04:12 - Pros and Cons of Transformers on Graphs
08:10 - Positional and Structural Encodings
26:47 - The GraphGPS Framework
57:49 - Results & Discussion
1:14:40 - Ablation - Global Attention and MPNN Layers
1:18:36 - Ablation - Positional/Structural Encodings
1:23:40 - Conclusion and Summary
1:26:45 - Q&A
1:33:13 - Long Range Graph Benchmark
