[PNA] Principal Neighbourhood Aggregation for Graph Nets

Time: Wednesday 2-Sept-2020 16:00 (This is a past event.)

Discussion Facilitator:

Artifacts

Motivation / Abstract
Graph Neural Networks (GNNs) have been shown to be effective models for different predictive tasks on graph-structured data. Recent work on their expressive power has focused on isomorphism tasks and countable feature spaces. In this work, the authors focus on a theoretically motivated aggregation scheme that also covers continuous feature spaces. The authors propose Principal Neighbourhood Aggregation (PNA), a novel architecture combining multiple aggregators with degree-scalers (which generalize the sum aggregator). They compare the capacity of different models to capture and exploit the graph structure via a novel benchmark containing multiple tasks taken from classical graph theory, alongside existing benchmarks from real-world domains, all of which demonstrate the strength of their model. This work sheds some light on new aggregation methods, which are essential in the search for powerful and robust models.
Questions Discussed
- Limitations of single aggregator functions for various neighbourhood sizes and feature distributions
- The best type of aggregator (hint: there is none)
- Optimally expressive GNNs 
- Training multi-task GNNs 
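
The first two questions can be illustrated with a small sketch. It is not taken from the paper or the session; the neighbourhoods and the helper name `collisions` are made up for illustration. Each neighbourhood is a list of scalar (continuous) node features, and the helper reports which aggregators return the same value for both, i.e. which ones cannot tell them apart.

```python
import numpy as np

# Candidate single aggregators over a neighbourhood's features.
AGGREGATORS = {
    "sum": np.sum,
    "mean": np.mean,
    "std": np.std,
    "max": np.max,
    "min": np.min,
}

def collisions(x, y):
    """Hypothetical helper: True where an aggregator maps x and y to the same value."""
    return {name: bool(np.isclose(f(x), f(y))) for name, f in AGGREGATORS.items()}

# Illustrative neighbourhoods (not from the paper).
pair_1 = collisions(np.array([1.0, 3.0]), np.array([2.0, 2.0]))
pair_2 = collisions(np.array([1.0, 1.0, 3.0]), np.array([1.0, 3.0, 3.0]))

print(pair_1)  # sum and mean collide; std, max and min do not
print(pair_2)  # std, max and min collide; sum and mean do not
```

Each aggregator fails on at least one of the two pairs, while in both pairs at least one other aggregator still separates the neighbourhoods. This is the intuition behind combining several of them.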
Key Takeaways
- There is no one-size-fits-all solution for aggregation; this is a task-specific design choice
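- The authors prove that the commonly used sum aggregator is not injective over a continuous feature space, so sum alone cannot distinguish between all neighbourhoods (for example, neighbours with features {1, 3} and {2, 2} both sum to 4)
- PNA instead combines several aggregation functions whose joint output separates neighbourhoods that a single aggregator conflates; in the paper's analysis, at least n aggregators are needed to distinguish all neighbourhoods of size n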
- PNA uses 4 aggregators (mean, std, max and min) and 3 degree scalers (identity, amplification and attenuation) that rescale the aggregated representation according to the node's degree
- Each of the 4 x 3 = 12 aggregator-scaler combinations produces an F-dimensional vector; the resulting 12*F features are then compressed back down to F
- PNA can be used in combination with any GNN framework as only the aggregator has to be swapped out
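
The aggregation step described in the takeaways can be sketched in NumPy for a single node. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `pna_aggregate` is hypothetical, the scalers follow the logarithmic form (log(d + 1) / delta)^alpha with alpha in {0, 1, -1}, `delta` is assumed to be the average of log(d + 1) over the training graphs, and a random matrix `W` stands in for the learned layer that compresses 12*F features to F.

```python
import numpy as np

def pna_aggregate(neigh_feats, delta):
    """Sketch of PNA aggregation for one node.

    neigh_feats: (d, F) array, one row per neighbour.
    delta: normalisation constant, assumed to be the mean of log(degree + 1)
           over the training graphs.
    Returns a vector of length 12 * F.
    """
    d = neigh_feats.shape[0]
    # 4 aggregators, each giving an F-dimensional vector -> (4F,)
    aggregated = np.concatenate([
        neigh_feats.mean(axis=0),
        neigh_feats.std(axis=0),
        neigh_feats.max(axis=0),
        neigh_feats.min(axis=0),
    ])
    # 3 degree scalers: identity, amplification, attenuation
    s = np.log(d + 1) / delta
    scalers = [1.0, s, 1.0 / s]
    # Every scaler applied to every aggregator -> (12F,)
    return np.concatenate([c * aggregated for c in scalers])

rng = np.random.default_rng(0)
F = 5
feats = rng.normal(size=(3, F))          # a node with 3 neighbours
out = pna_aggregate(feats, delta=1.0)    # delta chosen arbitrarily here
W = rng.normal(size=(12 * F, F))         # stand-in for a learned projection
h = out @ W                              # compressed back to F features
print(out.shape, h.shape)
```

Because only the aggregation step changes, a function like this could in principle replace the sum or mean inside an existing message-passing layer, which is the sense in which the aggregator is swappable.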
Stream Categories:
 Graph Neural Nets