Tensor Field Networks (TFNs) are the pioneering framework for 3D rotation-equivariant deep learning on point clouds and molecular structures. Rather than scalars, TFNs define features as geometric tensors of specified rank: scalars (rank 0), vectors (rank 1), matrices (rank 2), and higher-order tensors. By combining features with spherical harmonic basis functions and Clebsch-Gordan tensor products, the architecture maintains exact SO(3) equivariance, and it established the mathematical foundation for subsequent equivariant architectures used in molecular modeling, protein structure prediction, and 3D scientific computing.
What Are Tensor Field Networks?
- Definition: Tensor Field Networks (Thomas et al., 2018) represent features at each point (atom, particle) as type-$l$ spherical harmonic tensors — type-0 features are scalars (invariant under rotation), type-1 features are 3D vectors (rotate as vectors), and type-$l$ features transform under the $(2l+1)$-dimensional irreducible representation of SO(3). The network layers combine features of different types using Clebsch-Gordan coefficients, which are the mathematical objects that describe how tensor products of representations decompose.
- Spherical Harmonics: TFNs express spatial relationships between points using real spherical harmonics $Y_l^m(\hat{r}_{ij})$, where $\hat{r}_{ij}$ is the unit vector from point $i$ to point $j$. This directional encoding captures angular information (bond angles, torsional angles) that distance-only models like EGNNs cannot represent, at the cost of increased computational complexity (see the first sketch after this list).
- Tensor Product Layers: The core operation in TFNs is the Clebsch-Gordan tensor product, which combines two features of types $l_1$ and $l_2$ to produce features of every type from $|l_1 - l_2|$ through $l_1 + l_2$. Up to changes of basis, the Clebsch-Gordan coefficients parameterize all bilinear maps that preserve SO(3) equivariance, so this product replaces the element-wise operations used in standard neural networks (see the second sketch after this list).
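The transformation rules above can be checked numerically. Below is a minimal sketch using the e3nn library, which implements TFN-style operations; the library choice, the $l_{\max} = 2$ cutoff, and the example vector are illustrative, not part of the original TFN code:

```python
# Minimal sketch of TFN directional encoding with e3nn
# (https://github.com/e3nn/e3nn); vector and lmax are illustrative.
import torch
from e3nn import o3

r_ij = torch.tensor([[1.0, 2.0, 0.5]])  # relative position from point i to j
lmax = 2

# Real spherical harmonics Y_l^m(r_hat) for l = 0, 1, 2 (normalize=True
# projects r_ij onto the unit sphere first); output has 1 + 3 + 5 = 9 columns.
irreps_sh = o3.Irreps.spherical_harmonics(lmax)  # 1x0e + 1x1o + 1x2e
sh = o3.spherical_harmonics(irreps_sh, r_ij, normalize=True)

# Equivariance check: rotating the input rotates each type-l block by the
# corresponding (2l+1)-dimensional Wigner D matrix (block-diagonal, 9x9 here).
R = o3.rand_matrix()
D = irreps_sh.D_from_matrix(R)
sh_rot = o3.spherical_harmonics(irreps_sh, r_ij @ R.T, normalize=True)
assert torch.allclose(sh_rot, sh @ D.T, atol=1e-5)
```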
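A second sketch, under the same e3nn assumption, shows the Clebsch-Gordan tensor product: `FullyConnectedTensorProduct` enumerates every path $l_1 \otimes l_2 \to l_{\text{out}}$ allowed by $|l_1 - l_2| \le l_{\text{out}} \le l_1 + l_2$ and learns a weight per path. The irreps strings and point count here are illustrative choices:

```python
# Clebsch-Gordan tensor product sketch with e3nn; irreps are illustrative.
import torch
from e3nn import o3

irreps_in1 = o3.Irreps("8x0e + 4x1o")         # node features: 8 scalars, 4 vectors
irreps_in2 = o3.Irreps("1x0e + 1x1o + 1x2e")  # e.g. spherical harmonics of r_ij
irreps_out = o3.Irreps("8x0e + 4x1o + 2x2e")  # mixed-type output features

tp = o3.FullyConnectedTensorProduct(irreps_in1, irreps_in2, irreps_out)

x = irreps_in1.randn(10, -1)  # features for 10 points
y = irreps_in2.randn(10, -1)
out = tp(x, y)                # shape (10, irreps_out.dim)

# Equivariance: rotating both inputs rotates the output in the same way.
R = o3.rand_matrix()
out_rot = tp(x @ irreps_in1.D_from_matrix(R).T,
             y @ irreps_in2.D_from_matrix(R).T)
assert torch.allclose(out_rot, out @ irreps_out.D_from_matrix(R).T, atol=1e-4)
```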
Why Tensor Field Networks Matter
- Directional Information: TFNs can represent and process directional quantities — force vectors, dipole moments, molecular orbitals — that scalar-only models cannot capture. Predicting that a force acts "in the positive x-direction" requires type-1 features; predicting a stress tensor requires type-2 features. TFNs provide the equivariant framework for outputting these geometric quantities.
- Physical Outputs: Many scientific predictions are tensor-valued — forces are vectors (type-1), polarizability and stress are matrices (type-2), and higher-order response functions are higher-rank tensors. TFNs provide the architectural machinery to produce these outputs with correct transformation properties, which is essential for physics applications.
- Foundation Architecture: TFNs established the blueprint for subsequent architectures: EGNN (simplified to scalar-only messages), SE(3)-Transformers (added attention), NequIP (added efficient message passing), MACE (added body-ordered messages), and Allegro (added local equivariant operations). Understanding TFNs is prerequisite for understanding the entire equivariant deep learning ecosystem.
- Expressiveness vs. Efficiency Trade-off: TFNs demonstrated that higher-order features ($l > 0$) improve model expressiveness for angular-dependent tasks but increase computational cost, since the number of Clebsch-Gordan paths and the size of each feature grow with the maximum order. This trade-off, expressiveness vs. efficiency as a function of maximum feature order $l_{\max}$, remains the central design choice in all equivariant architectures (the sketch after this list makes the growth concrete).
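One way to make the trade-off concrete is to count feature dimensions, Clebsch-Gordan paths, and learnable weights of a full tensor product as $l_{\max}$ grows. This sketch again assumes e3nn; the 16-channel width is an arbitrary choice:

```python
# Illustrative count of how tensor-product cost grows with l_max (e3nn).
from e3nn import o3

for lmax in range(4):
    # 16 channels of each type l = 0..lmax, with parity (-1)^l as for
    # spherical harmonics.
    feats = o3.Irreps([(16, (l, (-1) ** l)) for l in range(lmax + 1)])
    sh = o3.Irreps.spherical_harmonics(lmax)
    tp = o3.FullyConnectedTensorProduct(feats, sh, feats)
    print(f"l_max={lmax}: feature dim={feats.dim:4d}, "
          f"CG paths={len(tp.instructions):3d}, weights={tp.weight_numel}")
```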
TFN Feature Hierarchy
| Type $l$ | Dimension $(2l+1)$ | Geometric Object | Physical Example |
|----------|-----------|-----------------|------------------|
| 0 | 1 | Scalar | Energy, charge, temperature |
| 1 | 3 | Vector | Force, velocity, dipole moment |
| 2 | 5 | Rank-2 tensor | Polarizability, quadrupole, stress |
| 3 | 7 | Rank-3 tensor | Octupole moment, piezoelectric tensor |
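The dimensions in this table map directly onto irreducible-representation blocks; a quick e3nn check, where the irreps string simply mirrors the table rows:

```python
# Each type-l feature occupies a (2l+1)-dimensional block; the irreps
# string mirrors the rows of the table above.
from e3nn import o3

irreps = o3.Irreps("1x0e + 1x1o + 1x2e + 1x3o")  # types 0 through 3
for mul, ir in irreps:
    print(f"type l={ir.l}: dimension {ir.dim}")  # prints 1, 3, 5, 7
```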
Tensor Field Networks are tensor algebra inside neural networks: they perform tensor calculus within hidden layers to model physical systems where scalar representations are insufficient, and they established the mathematical vocabulary for the entire field of equivariant deep learning.