SAGPool (Self-Attention Graph Pooling) is a graph pooling method that uses graph convolution to compute topology-aware attention scores for each node, then retains only the top-scoring nodes to produce a coarsened graph. It improves on TopKPool by incorporating neighborhood structure into the importance scoring, so a node's retention depends not only on its own features but also on its structural context within the graph.
What Is SAGPool?
- Definition: SAGPool (Lee et al., 2019) computes node importance scores using a graph convolution layer: $\mathbf{z} = \sigma(\tilde{D}^{-1/2} \tilde{A} \tilde{D}^{-1/2} X \Theta_{att})$, where $\tilde{A} = A + I$ is the adjacency matrix with self-loops, $\tilde{D}$ is its degree matrix, $\sigma$ is an activation function (e.g. tanh), $\Theta_{att} \in \mathbb{R}^{d \times 1}$ is a learnable attention vector, and $\mathbf{z} \in \mathbb{R}^N$ gives each node a scalar importance score that incorporates both its own features and its neighbors' features. The top-$k$ nodes by score are retained: $\text{idx} = \text{top-}k(\mathbf{z}, \lceil rN \rceil)$, where $r \in (0, 1]$ is the pooling ratio. The coarsened graph is the induced subgraph on the retained nodes with gated features: $X' = X_{\text{idx}} \odot \mathbf{z}_{\text{idx}}$.
- Topology-Aware Scoring: The key difference from TopKPool (which uses a simple linear projection $\mathbf{z} = X\mathbf{p}$ without graph convolution) is that SAGPool's scores are computed after message passing, so a node surrounded by strongly featured neighbors can receive a higher score even if its own features are unremarkable. This helps keep important structural bridges from being dropped.
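- Feature Gating: Retained nodes' features are multiplied element-wise by their activated attention scores (the entries of $\mathbf{z}$ at the retained indices). This soft weighting scales feature magnitudes by importance: highly scored nodes pass their features through nearly unchanged, while borderline nodes are attenuated. Because the scores multiply the features that feed the downstream loss, gradients reach the attention parameters even though top-$k$ selection itself is not differentiable.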
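The three steps above (scoring, top-$k$ selection, gating) can be sketched in a few lines of NumPy. This is a minimal illustration, not a reference implementation: it assumes a dense, undirected adjacency matrix, uses tanh as the activation, and the function names `normalized_adjacency` and `sag_pool` are hypothetical.

```python
import numpy as np

def normalized_adjacency(A):
    """Return D~^{-1/2} (A + I) D~^{-1/2} for a dense adjacency matrix A."""
    A_tilde = A + np.eye(A.shape[0])           # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(A_tilde.sum(axis=1))
    return A_tilde * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def sag_pool(X, A, theta_att, ratio=0.5):
    """One pooling step: score nodes with a GCN layer, keep top-k, gate features.

    X: (N, d) node features, A: (N, N) adjacency, theta_att: (d, 1) weights.
    The tanh activation is an assumption of this sketch.
    """
    n = X.shape[0]
    z = np.tanh(normalized_adjacency(A) @ X @ theta_att).ravel()  # (N,) scores
    k = int(np.ceil(ratio * n))
    # Indices of the k highest scores, re-sorted to keep the original node order.
    idx = np.sort(np.argsort(-z, kind="stable")[:k])
    X_pooled = X[idx] * z[idx, None]           # gate retained features by score
    A_pooled = A[np.ix_(idx, idx)]             # induced subgraph on retained nodes
    return X_pooled, A_pooled, idx, z

# Toy example: a 4-node path graph with random features and weights.
rng = np.random.default_rng(0)
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
X = rng.normal(size=(4, 3))
theta_att = rng.normal(size=(3, 1))

X_pooled, A_pooled, idx, z = sag_pool(X, A, theta_att, ratio=0.5)
print("retained nodes:", idx)
print("pooled shapes:", X_pooled.shape, A_pooled.shape)
```

With $r = 0.5$ and $N = 4$, the sketch keeps $\lceil rN \rceil = 2$ nodes. In a trained model `theta_att` would be learned by backpropagation through the gated features rather than sampled at random.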
Why SAGPool Matters
- Efficient Hierarchical Pooling: SAGPool requires only one additional GCN layer per pooling step (the attention scorer), compared with DiffPool's two GNNs per pooling step and its dense $O(NK)$ assignment matrix for $K$ clusters. This keeps SAGPool practical for graphs with thousands of nodes, where DiffPool's memory requirements can become prohibitive.
- Structure-Preserving Reduction: By retaining the induced subgraph on selected nodes (preserving original edges between retained nodes), SAGPool maintains the topological relationships of important nodes — the coarsened graph is a genuine subgraph of the original, not a soft approximation. This preserves interpretability: the retained nodes are actual nodes from the input graph.
- Interpretability: The attention scores $\mathbf{z}$ provide a direct node importance ranking, showing which nodes the model considers most informative for the downstream task. For molecular graphs, this can reveal which atoms or functional groups the model focuses on for property prediction, providing chemical interpretability.
- Graph Classification Pipeline: SAGPool is typically used in a hierarchical architecture: [GNN → SAGPool → GNN → SAGPool → ... → Readout], progressively reducing the graph while refining features. The readout combines global mean and max pooling over the final reduced graph. This architecture achieves competitive performance on standard benchmarks (D&D, PROTEINS, NCI1) with significantly fewer parameters than DiffPool.
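The hierarchical pipeline described above can be sketched end to end with NumPy. This is an illustrative sketch under several assumptions: weights are random rather than trained, ReLU and tanh are chosen as activations, the readout is taken once over the final reduced graph as described above, and all function names (`norm_adj`, `gcn`, `sag_pool`, `readout`, `hierarchical_embed`) are hypothetical.

```python
import numpy as np

def norm_adj(A):
    """Symmetrically normalized adjacency with self-loops."""
    A_t = A + np.eye(A.shape[0])
    s = 1.0 / np.sqrt(A_t.sum(axis=1))
    return A_t * s[:, None] * s[None, :]

def gcn(X, A, W):
    """One GCN layer with ReLU (activation choice is an assumption)."""
    return np.maximum(norm_adj(A) @ X @ W, 0.0)

def sag_pool(X, A, theta, ratio):
    """Score with a GCN layer, keep the top ceil(ratio * N) nodes, gate features."""
    z = np.tanh(norm_adj(A) @ X @ theta).ravel()
    k = int(np.ceil(ratio * X.shape[0]))
    idx = np.sort(np.argsort(-z, kind="stable")[:k])
    return X[idx] * z[idx, None], A[np.ix_(idx, idx)]

def readout(X):
    """Concatenate global mean and global max over the remaining nodes."""
    return np.concatenate([X.mean(axis=0), X.max(axis=0)])

def hierarchical_embed(X, A, params, ratio=0.5):
    """Apply [GNN -> SAGPool] once per (W, theta) pair, then read out."""
    for W, theta in params:
        X = gcn(X, A, W)
        X, A = sag_pool(X, A, theta, ratio)
    return readout(X), X.shape[0]

# Toy example: random undirected graph with 8 nodes and 3 input features.
rng = np.random.default_rng(1)
upper = np.triu(rng.random((8, 8)) < 0.4, k=1).astype(float)
A = upper + upper.T
X = rng.normal(size=(8, 3))
params = [
    (rng.normal(size=(3, 4)), rng.normal(size=(4, 1))),  # stage 1
    (rng.normal(size=(4, 4)), rng.normal(size=(4, 1))),  # stage 2
]

emb, n_final = hierarchical_embed(X, A, params, ratio=0.5)
print("nodes after pooling:", n_final)
print("graph embedding size:", emb.shape[0])
```

With a ratio of 0.5, the node count goes 8, then 4, then 2, and the mean and max readouts over 4 hidden channels give an 8-dimensional graph embedding that a classifier head would consume.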
SAGPool vs. Alternative Pooling Methods
| Method | Score Computation | Memory | Preserves Topology |
|---|---|---|---|
| TopKPool | Linear projection $X\mathbf{p}$ | $O(N)$ | Yes (induced subgraph) |
| SAGPool | GCN attention $\tilde{A}X\Theta$ | $O(N + E)$ | Yes (induced subgraph) |
| DiffPool | GNN soft assignment $S \in \mathbb{R}^{N \times K}$ | $O(NK)$ dense | No (soft approximation) |
| MinCutPool | Spectral objective on $S$ | $O(NK)$ | No (soft approximation) |
| ASAPool | Attention + local structure preservation | $O(N + E)$ | Yes (master nodes) |
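The difference between the first two rows of the table can be seen on a tiny hand-built graph. The sketch below is an illustration only: the graph, the features, and the unit weights `p` and `theta` are made up, and the monotonic activation is omitted because it does not change the ranking.

```python
import numpy as np

def norm_adj(A):
    """Symmetrically normalized adjacency with self-loops."""
    A_t = A + np.eye(A.shape[0])
    s = 1.0 / np.sqrt(A_t.sum(axis=1))
    return A_t * s[:, None] * s[None, :]

# Path 0 - 1 - 2 plus an isolated node 3.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 0],
              [0, 0, 0, 0]], dtype=float)

# Nodes 0, 1 and 3 share the same feature; node 2 has a strong feature.
X = np.array([[1.0], [1.0], [4.0], [1.0]])

p = np.array([[1.0]])      # TopKPool projection vector (illustrative value)
theta = np.array([[1.0]])  # SAGPool attention weights (illustrative value)

topk_scores = (X @ p).ravel()                   # features only
sag_scores = (norm_adj(A) @ X @ theta).ravel()  # features mixed with neighbors

print("TopKPool scores:", topk_scores)
print("SAGPool scores: ", np.round(sag_scores, 3))
```

Under the linear projection, nodes 0, 1 and 3 tie because their features are identical. Under the graph-convolution scorer, node 1 scores higher than nodes 0 and 3 because it is adjacent to the strongly featured node 2.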
SAGPool is context-aware node selection — using graph convolution to evaluate which nodes matter most given their neighborhood context, providing an efficient and interpretable hierarchical pooling strategy that balances structural preservation with learnable importance scoring.