Graph Neural Networks (GNNs)


Graph Neural Networks (GNNs) are deep learning architectures designed to operate on graph-structured data, where entities (nodes) and their relationships (edges) form irregular, non-Euclidean structures that standard CNNs and sequence models cannot process. By learning representations directly on graphs, GNNs enable applications such as molecular property prediction, social network analysis, recommendation systems, circuit design, and combinatorial optimization.

Why Graphs Need Specialized Architectures

Images have a regular grid structure and text a sequential one; graphs have arbitrary topology: varying node degrees, no natural node ordering, and the requirement that outputs be invariant to node permutation. A 2D convolution kernel has no meaning on a graph. GNNs instead define operations that respect graph structure through message passing between connected nodes.

Message Passing Framework

All GNNs follow the message-passing paradigm:
1. Message: Each node aggregates information from its neighbors: mᵢ = AGG({hⱼ : j ∈ N(i)})
2. Update: Each node updates its representation by combining its current state with the aggregated message: hᵢ' = UPDATE(hᵢ, mᵢ)
3. Repeat: K rounds of message passing allow information to propagate K hops through the graph (a minimal sketch of one round follows below).
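
As a concrete illustration, here is a minimal sketch of one message-passing round in PyTorch, using mean aggregation and an MLP update. The dense adjacency matrix, the toy graph, and all class and variable names are illustrative assumptions, not a reference implementation.

```python
import torch
import torch.nn as nn

class MessagePassingLayer(nn.Module):
    """One round of message passing: AGG = neighbor mean, UPDATE = small MLP."""
    def __init__(self, dim: int):
        super().__init__()
        self.update = nn.Sequential(nn.Linear(2 * dim, dim), nn.ReLU())

    def forward(self, H: torch.Tensor, A: torch.Tensor) -> torch.Tensor:
        # H: (num_nodes, dim) node states; A: (num_nodes, num_nodes) adjacency.
        deg = A.sum(dim=1, keepdim=True).clamp(min=1.0)  # degrees, guard against 0
        M = (A @ H) / deg                                # mᵢ = mean of neighbor states
        return self.update(torch.cat([H, M], dim=-1))    # hᵢ' = UPDATE(hᵢ, mᵢ)

# Toy 4-node path graph (edges 0-1, 1-2, 2-3); stacking K layers reaches K hops.
A = torch.tensor([[0., 1., 0., 0.],
                  [1., 0., 1., 0.],
                  [0., 1., 0., 1.],
                  [0., 0., 1., 0.]])
H = torch.randn(4, 8)
layer = MessagePassingLayer(8)
H = layer(H, A)
```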

The choice of AGG and UPDATE functions defines different GNN variants:

- GCN (Graph Convolutional Network): Normalized sum of neighbor features followed by a linear transformation: hᵢ' = σ(Σⱼ (1/√(dᵢdⱼ)) · W · hⱼ), where the sum runs over i's neighbors (plus i itself via self-loops) and dᵢ, dⱼ are node degrees. Simple and effective, but the neighbor weights are fixed normalization constants rather than learned (see the sketch after this list).

- GAT (Graph Attention Network): Learns attention weights αᵢⱼ between connected node pairs, allowing the model to focus on the most relevant neighbors: hᵢ' = σ(Σⱼ αᵢⱼ · W · hⱼ). Each coefficient is computed by a small feedforward network over the concatenated transformed features of the two nodes, then softmax-normalized over the neighborhood.

- GraphSAGE: Samples a fixed number of neighbors (instead of using all) and applies learnable aggregation functions (mean, LSTM, or max-pool). Enables inductive learning on unseen nodes.

- GIN (Graph Isomorphism Network): Provably as powerful as the 1-WL graph isomorphism test, the theoretical upper bound for standard message-passing GNNs. Uses sum aggregation with a learned epsilon parameter: hᵢ' = MLP((1 + ε) · hᵢ + Σⱼ hⱼ).
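
As one concrete instance, here is a hedged sketch of a single GCN layer implementing the normalized-sum propagation above with dense tensors; the class and variable names are my own, and a production implementation would use sparse operations (e.g., via a library such as PyTorch Geometric).

```python
import torch
import torch.nn as nn

class GCNLayer(nn.Module):
    """hᵢ' = σ(Σⱼ (1/√(dᵢdⱼ)) · W · hⱼ) over neighbors j, with self-loops added."""
    def __init__(self, in_dim: int, out_dim: int):
        super().__init__()
        self.W = nn.Linear(in_dim, out_dim, bias=False)

    def forward(self, H: torch.Tensor, A: torch.Tensor) -> torch.Tensor:
        # H: (N, in_dim) node features; A: (N, N) dense adjacency (no self-loops).
        A_hat = A + torch.eye(A.size(0))          # self-loops keep each node's own state
        d_inv_sqrt = A_hat.sum(dim=1).pow(-0.5)   # D̂^(-1/2) as a vector
        A_norm = d_inv_sqrt[:, None] * A_hat * d_inv_sqrt[None, :]  # 1/√(dᵢdⱼ) weights
        return torch.relu(A_norm @ self.W(H))
```

GAT replaces the fixed 1/√(dᵢdⱼ) constants in this sketch with learned, softmax-normalized coefficients αᵢⱼ, and GIN replaces the normalized sum with (1 + ε) · hᵢ + Σⱼ hⱼ fed through an MLP.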

Common Tasks

- Node Classification: Predict labels for individual nodes (user categorization in social networks, atom type prediction).
- Edge Classification/Prediction: Predict edge existence or properties (drug-drug interaction, link prediction in knowledge graphs).
- Graph Classification: Predict a property of the entire graph (molecular toxicity, circuit functionality). Requires a graph-level readout (pooling) layer, as sketched below.
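
To make the readout step concrete, here is a small hedged sketch for a single graph: mean-pool the final node embeddings into one permutation-invariant vector, then apply a linear classifier head. The names and the choice of mean pooling are assumptions; sum or max pooling slot in the same way.

```python
import torch
import torch.nn as nn

class GraphClassifier(nn.Module):
    """Graph-level readout: pool node embeddings, then classify the whole graph."""
    def __init__(self, dim: int, num_classes: int):
        super().__init__()
        self.head = nn.Linear(dim, num_classes)

    def forward(self, H: torch.Tensor) -> torch.Tensor:
        # H: (num_nodes, dim) node embeddings after K message-passing rounds.
        g = H.mean(dim=0)    # readout: permutation-invariant graph embedding
        return self.head(g)  # logits for graph-level labels (e.g., toxicity)
```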

Over-Squashing and Depth Limitations

GNNs suffer from over-squashing: information from an exponentially growing number of distant nodes must be compressed into fixed-size vectors through repeated aggregation. Together with over-smoothing (node representations becoming indistinguishable as depth grows), this limits the effective receptive field to roughly 3-5 hops for most architectures. Graph Transformers (e.g., GPS, Graphormer) add global attention to supplement local message passing.

Graph Neural Networks extend neural computation beyond grids and sequences, bringing the power of learned representations to the rich, irregular relational structures that describe molecules, networks, and systems.
