Tensor Field Networks (TFNs) are the pioneering framework for 3D rotation-equivariant deep learning on point clouds and molecular structures. Rather than scalars, TFNs define features as geometric tensors of specified rank: scalars (rank 0), vectors (rank 1), matrices (rank 2), and higher-order tensors. By combining features with spherical harmonic basis functions and Clebsch-Gordan tensor products, the architecture maintains exact SO(3) equivariance, and it established the mathematical foundation for subsequent equivariant architectures used in molecular modeling, protein structure prediction, and 3D scientific computing.
What Are Tensor Field Networks?
- Definition: Tensor Field Networks (Thomas et al., 2018) represent features at each point (atom, particle) as type-$l$ spherical harmonic tensors — type-0 features are scalars (invariant under rotation), type-1 features are 3D vectors (rotate as vectors), and type-$l$ features transform under the $(2l+1)$-dimensional irreducible representation of SO(3). The network layers combine features of different types using Clebsch-Gordan coefficients, which are the mathematical objects that describe how tensor products of representations decompose.
- Spherical Harmonics: TFNs express spatial relationships between points using real spherical harmonics $Y_l^m(\hat{r}_{ij})$, where $\hat{r}_{ij}$ is the unit vector from point $i$ to point $j$. This directional encoding captures angular information (bond angles, torsional angles) that distance-only models like EGNNs cannot represent, at the cost of increased computational complexity (see the first sketch after this list).
- Tensor Product Layers: The core operation in TFNs is the Clebsch-Gordan tensor product, which combines two features of types $l_1$ and $l_2$ to produce features of every type from $|l_1 - l_2|$ through $l_1 + l_2$. Up to changes of basis, the Clebsch-Gordan coefficients parameterize all bilinear maps that preserve SO(3) equivariance, so this product replaces the element-wise operations used in standard neural networks (see the second sketch after this list).
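The transformation rules above can be checked numerically. Below is a minimal sketch using the e3nn library, which implements TFN-style operations; the library choice, the $l_{\max} = 2$ cutoff, and the example vector are illustrative, not part of the original TFN code:

```python
# Minimal sketch of TFN directional encoding with e3nn
# (https://github.com/e3nn/e3nn); vector and lmax are illustrative.
import torch
from e3nn import o3

r_ij = torch.tensor([[1.0, 2.0, 0.5]])  # relative position from point i to j
lmax = 2

# Real spherical harmonics Y_l^m(r_hat) for l = 0, 1, 2 (normalize=True
# projects r_ij onto the unit sphere first); output has 1 + 3 + 5 = 9 columns.
irreps_sh = o3.Irreps.spherical_harmonics(lmax)  # 1x0e + 1x1o + 1x2e
sh = o3.spherical_harmonics(irreps_sh, r_ij, normalize=True)

# Equivariance check: rotating the input rotates each type-l block by the
# corresponding (2l+1)-dimensional Wigner D matrix (block-diagonal, 9x9 here).
R = o3.rand_matrix()
D = irreps_sh.D_from_matrix(R)
sh_rot = o3.spherical_harmonics(irreps_sh, r_ij @ R.T, normalize=True)
assert torch.allclose(sh_rot, sh @ D.T, atol=1e-5)
```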
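A second sketch, under the same e3nn assumption, shows the Clebsch-Gordan tensor product: `FullyConnectedTensorProduct` enumerates every path $l_1 \otimes l_2 \to l_{\text{out}}$ allowed by $|l_1 - l_2| \le l_{\text{out}} \le l_1 + l_2$ and learns a weight per path. The irreps strings and point count here are illustrative choices:

```python
# Clebsch-Gordan tensor product sketch with e3nn; irreps are illustrative.
import torch
from e3nn import o3

irreps_in1 = o3.Irreps("8x0e + 4x1o")         # node features: 8 scalars, 4 vectors
irreps_in2 = o3.Irreps("1x0e + 1x1o + 1x2e")  # e.g. spherical harmonics of r_ij
irreps_out = o3.Irreps("8x0e + 4x1o + 2x2e")  # mixed-type output features

tp = o3.FullyConnectedTensorProduct(irreps_in1, irreps_in2, irreps_out)

x = irreps_in1.randn(10, -1)  # features for 10 points
y = irreps_in2.randn(10, -1)
out = tp(x, y)                # shape (10, irreps_out.dim)

# Equivariance: rotating both inputs rotates the output in the same way.
R = o3.rand_matrix()
out_rot = tp(x @ irreps_in1.D_from_matrix(R).T,
             y @ irreps_in2.D_from_matrix(R).T)
assert torch.allclose(out_rot, out @ irreps_out.D_from_matrix(R).T, atol=1e-4)
```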
Why Tensor Field Networks Matter
- Directional Information: TFNs can represent and process directional quantities — force vectors, dipole moments, molecular orbitals — that scalar-only models cannot capture. Predicting that a force acts "in the positive x-direction" requires type-1 features; predicting a stress tensor requires type-2 features. TFNs provide the equivariant framework for outputting these geometric quantities.
- Physical Outputs: Many scientific predictions are tensor-valued — forces are vectors (type-1), polarizability and stress are matrices (type-2), and higher-order response functions are higher-rank tensors. TFNs provide the architectural machinery to produce these outputs with correct transformation properties, which is essential for physics applications.
- Foundation Architecture: TFNs established the blueprint for subsequent architectures: EGNN (simplified to scalar-only messages), SE(3)-Transformers (added attention), NequIP (added efficient message passing), MACE (added body-ordered messages), and Allegro (added local equivariant operations). Understanding TFNs is prerequisite for understanding the entire equivariant deep learning ecosystem.
- Expressiveness vs. Efficiency Trade-off: TFNs demonstrated that higher-order features ($l > 0$) improve model expressiveness for angular-dependent tasks but increase computational cost, since the number of Clebsch-Gordan paths and the size of each feature grow with the maximum order. This trade-off, expressiveness vs. efficiency as a function of maximum feature order $l_{\max}$, remains the central design choice in all equivariant architectures (the sketch after this list makes the growth concrete).
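One way to make the trade-off concrete is to count feature dimensions, Clebsch-Gordan paths, and learnable weights of a full tensor product as $l_{\max}$ grows. This sketch again assumes e3nn; the 16-channel width is an arbitrary choice:

```python
# Illustrative count of how tensor-product cost grows with l_max (e3nn).
from e3nn import o3

for lmax in range(4):
    # 16 channels of each type l = 0..lmax, with parity (-1)^l as for
    # spherical harmonics.
    feats = o3.Irreps([(16, (l, (-1) ** l)) for l in range(lmax + 1)])
    sh = o3.Irreps.spherical_harmonics(lmax)
    tp = o3.FullyConnectedTensorProduct(feats, sh, feats)
    print(f"l_max={lmax}: feature dim={feats.dim:4d}, "
          f"CG paths={len(tp.instructions):3d}, weights={tp.weight_numel}")
```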
TFN Feature Hierarchy
| Type $l$ | Dimension $(2l+1)$ | Geometric Object | Physical Example |
|----------|-----------|-----------------|------------------|
| 0 | 1 | Scalar | Energy, charge, temperature |
| 1 | 3 | Vector | Force, velocity, dipole moment |
| 2 | 5 | Rank-2 tensor | Polarizability, quadrupole, stress |
| 3 | 7 | Rank-3 tensor | Octupole moment, piezoelectric tensor |
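The dimensions in this table map directly onto irreducible-representation blocks; a quick e3nn check, where the irreps string simply mirrors the table rows:

```python
# Each type-l feature occupies a (2l+1)-dimensional block; the irreps
# string mirrors the rows of the table above.
from e3nn import o3

irreps = o3.Irreps("1x0e + 1x1o + 1x2e + 1x3o")  # types 0 through 3
for mul, ir in irreps:
    print(f"type l={ir.l}: dimension {ir.dim}")  # prints 1, 3, 5, 7
```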
Tensor Field Networks are tensor algebra inside neural networks: they perform tensor calculus within hidden layers to model physical systems where scalar representations are insufficient, and they established the mathematical vocabulary for the entire field of equivariant deep learning.