Principal Neighbourhood Aggregation for Graph Nets
Wednesday Sep 2 2020 16:00 GMT
Why This Is Interesting
Graph Neural Networks (GNNs) have been shown to be effective models for different predictive tasks on graph-structured data. Recent work on their expressive power has focused on isomorphism tasks and countable feature spaces. In this work, the authors focus on a theoretically motivated aggregation function that includes continuous feature spaces. The authors propose Principal Neighbourhood Aggregation (PNA), a novel architecture combining multiple aggregators with degree-scalers (which generalize the sum aggregator). They compare the capacity of different models to capture and exploit the graph structure via a novel benchmark containing multiple tasks taken from classical graph theory, alongside existing benchmarks from real-world domains, all of which demonstrate the strength of their model. This work sheds some light on new aggregation methods which are essential in the search for powerful and robust models.
Discussion Points
Limitations of single aggregator functions for various neighbourhood sizes and feature distributions
The best type of aggregator (hint: there isn't one)
Optimally expressive GNNs
Training multi-task GNNs
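As a small illustration of the first discussion point, the sketch below checks which of five common aggregators can tell two neighbourhoods of scalar features apart. It uses only the Python standard library. The two pairs of neighbourhoods and the helper name distinguishing are made up for this example and are not taken from the paper.

```python
import math
import statistics

# Five common neighbourhood aggregators over scalar features.
AGGREGATORS = {
    "sum": sum,
    "mean": statistics.fmean,
    "max": max,
    "min": min,
    "std": statistics.pstdev,  # population standard deviation
}


def distinguishing(a, b):
    """Return the names of the aggregators whose outputs differ on a and b."""
    return [
        name
        for name, fn in AGGREGATORS.items()
        if not math.isclose(fn(a), fn(b))
    ]


# Hypothetical neighbourhoods, chosen by hand for this illustration.
pair_1 = ([2.0, 2.0], [1.0, 3.0])            # same sum and same mean
pair_2 = ([1.0, 3.0], [1.0, 1.0, 3.0, 3.0])  # same mean, max, min and std

separates_pair_1 = distinguishing(*pair_1)
separates_pair_2 = distinguishing(*pair_2)

print(separates_pair_1)  # ['max', 'min', 'std']
print(separates_pair_2)  # ['sum']
```

Each pair is separated by at least one aggregator, but no single aggregator separates both pairs, which is the intuition behind combining several of them.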
Takeaways
There is no one-size-fits-all solution for aggregation; it is a task-specific design choice
The authors prove that, in a continuous feature space, the commonly used sum aggregation is not injective, so sum alone cannot distinguish between all neighbourhoods
PNA combines several aggregation functions so that, taken together, they distinguish neighbourhoods that a single aggregator can conflate
PNA aggregates neighbour features in 4 ways (mean, standard deviation, max and min) and scales each aggregated representation in 3 ways using degree scalers (identity, amplification and attenuation)
Combining the 4 aggregators with the 3 scalers yields 12 aggregated representations of F features each (12F values in total), which are then compressed back down to F
PNA can be used in combination with any GNN framework, since only the aggregator has to be swapped out
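The 4 x 3 = 12 combination described above can be sketched for a single node in plain Python. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names pna_aggregate and compress are hypothetical, the value of delta is made up (in the paper it is the average of log(d + 1) over the training nodes), the standard deviation omits the small epsilon used for numerical stability, and the random weight matrix merely stands in for a learned layer.

```python
import math
import random
import statistics


def pna_aggregate(neighbours, delta):
    """Turn d neighbour vectors of F features into one vector of 12F values.

    Assumes at least one neighbour, so that log(d + 1) > 0.
    """
    d = len(neighbours)
    columns = list(zip(*neighbours))  # one tuple per feature dimension

    # The 4 aggregators, each giving F values.
    aggregated = [
        [statistics.fmean(c) for c in columns],
        [statistics.pstdev(c) for c in columns],
        [max(c) for c in columns],
        [min(c) for c in columns],
    ]

    # The 3 degree scalers: identity, amplification, attenuation.
    log_degree = math.log(d + 1)
    scalers = [1.0, log_degree / delta, delta / log_degree]

    # Every scaler applied to every aggregator: 3 * 4 * F = 12F values.
    return [s * v for s in scalers for agg in aggregated for v in agg]


def compress(vector, weights):
    """Linear map from 12F values down to F (stand-in for a learned layer)."""
    return [sum(w * v for w, v in zip(row, vector)) for row in weights]


F = 2
neighbours = [[1.0, 0.0], [3.0, 4.0]]  # d = 2 neighbours, F = 2 features
delta = 1.0  # made-up value for illustration only

wide = pna_aggregate(neighbours, delta)

rng = random.Random(0)  # placeholder weights, not trained
weights = [[rng.gauss(0.0, 1.0) for _ in range(12 * F)] for _ in range(F)]
compact = compress(wide, weights)

print(len(wide), len(compact))  # prints "24 2"
```

Because the function only maps a set of neighbour features to a fixed-size vector, it could in principle replace the aggregation step of an existing message-passing layer without touching the rest of the model.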