An Energy-Based Model (EBM) is a generative model that assigns a scalar energy to each configuration of variables. It learns a function $E_\theta(x)$ such that low-energy states correspond to likely, data-like configurations and high-energy states to unlikely ones.
Core Concept
- Probability: $p_\theta(x) = \frac{\exp(-E_\theta(x))}{Z(\theta)}$
- $Z(\theta) = \int \exp(-E_\theta(x)) dx$ — partition function (intractable in general).
- Training: Lower $E_\theta(x)$ on real data and raise it on negative samples drawn from the model.
- No explicit generative process required: the model is just a scalar energy function.
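The definition of $p_\theta(x)$ can be made concrete on a state space small enough to sum over exactly. The sketch below is illustrative only: the quadratic `energy` function and the five-element `states` list are hypothetical choices, not part of any particular model.

```python
import math

def energy(x):
    # Hypothetical toy energy: a quadratic well centered at x = 2.
    return 0.5 * (x - 2.0) ** 2

def boltzmann_probs(states, energy_fn):
    """Return p(x) = exp(-E(x)) / Z over a finite list of states."""
    energies = [energy_fn(x) for x in states]
    e_min = min(energies)
    # Subtracting e_min avoids overflow; the common factor cancels in the ratio.
    weights = [math.exp(-(e - e_min)) for e in energies]
    z = sum(weights)  # equals Z * exp(e_min), the rescaled partition function
    return [w / z for w in weights]

states = [0, 1, 2, 3, 4]
probs = boltzmann_probs(states, energy)
for x, p in zip(states, probs):
    print(f"x={x}  E={energy(x):.2f}  p={p:.3f}")
```

The lowest-energy state receives the highest probability. For continuous, high-dimensional $x$ the sum becomes an integral that cannot be enumerated this way, which is what motivates the approximations discussed next.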
Training Challenges
- Computing $Z(\theta)$: Intractable for continuous high-dimensional data.
- Contrastive Divergence (CD): Replace the exact log-likelihood gradient with an approximation computed from MCMC samples.
- CD-k: Run k MCMC steps starting from data points and use the resulting samples to approximate the negative phase (the expectation under the model).
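To see where the negative phase comes from, it may help to write out the log-likelihood gradient. This is the standard derivation under the definitions of $p_\theta(x)$ and $Z(\theta)$ given above; $x'$ denotes a sample from the model.

$$
\begin{aligned}
\nabla_\theta \log p_\theta(x) &= -\nabla_\theta E_\theta(x) - \nabla_\theta \log Z(\theta) \\
\nabla_\theta \log Z(\theta) &= \frac{1}{Z(\theta)} \int \nabla_\theta \exp(-E_\theta(x'))\, dx' = -\mathbb{E}_{x' \sim p_\theta}\left[\nabla_\theta E_\theta(x')\right] \\
\nabla_\theta \log p_\theta(x) &= -\nabla_\theta E_\theta(x) + \mathbb{E}_{x' \sim p_\theta}\left[\nabla_\theta E_\theta(x')\right]
\end{aligned}
$$

The first term (positive phase) lowers the energy of the data point; the second (negative phase) raises the energy of model samples. CD-k replaces the expectation with samples obtained after k MCMC steps, which makes the estimate biased but cheap.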
Restricted Boltzmann Machine (RBM)
- Bipartite graph: Visible units $v$ and hidden units $h$, no intra-layer connections.
- Energy: $E(v,h) = -v^T W h - b^T v - c^T h$
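- Exact conditional distributions: $p(h|v)$ and $p(v|h)$ factorize over units, so each layer can be sampled in one block, which makes Gibbs sampling efficient.
- Deep Belief Networks: Stacks of RBMs trained greedily layer by layer, an early deep learning approach (Hinton, 2006).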
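The pieces above (bipartite energy, factorial conditionals, CD-k) can be combined into a small CD-1 training loop for a binary RBM. This is a minimal sketch in plain Python, assuming binary units, single-example updates, and a tiny hand-made dataset; the helper names (`sample_hidden`, `sample_visible`, `cd1_step`), the learning rate, and the layer sizes are all hypothetical.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def sample_hidden(v, W, c, rng):
    """p(h_j = 1 | v) = sigmoid(c_j + sum_i v_i W_ij); units are independent given v."""
    probs = [sigmoid(c[j] + sum(v[i] * W[i][j] for i in range(len(v))))
             for j in range(len(c))]
    return probs, [1 if rng.random() < p else 0 for p in probs]

def sample_visible(h, W, b, rng):
    """p(v_i = 1 | h) = sigmoid(b_i + sum_j W_ij h_j); units are independent given h."""
    probs = [sigmoid(b[i] + sum(W[i][j] * h[j] for j in range(len(h))))
             for i in range(len(b))]
    return probs, [1 if rng.random() < p else 0 for p in probs]

def cd1_step(v0, W, b, c, lr, rng):
    """Apply one CD-1 update in place, using a single binary training vector v0."""
    ph0, h0 = sample_hidden(v0, W, c, rng)   # positive phase statistics
    _, v1 = sample_visible(h0, W, b, rng)    # one Gibbs step away from the data
    ph1, _ = sample_hidden(v1, W, c, rng)    # negative phase statistics
    for i in range(len(b)):
        for j in range(len(c)):
            W[i][j] += lr * (v0[i] * ph0[j] - v1[i] * ph1[j])
        b[i] += lr * (v0[i] - v1[i])
    for j in range(len(c)):
        c[j] += lr * (ph0[j] - ph1[j])

rng = random.Random(0)
n_visible, n_hidden = 4, 2  # hypothetical sizes for illustration
W = [[rng.gauss(0.0, 0.01) for _ in range(n_hidden)] for _ in range(n_visible)]
b = [0.0] * n_visible
c = [0.0] * n_hidden
data = [[1, 1, 0, 0], [0, 0, 1, 1]]  # toy dataset with two patterns
for epoch in range(200):
    for v in data:
        cd1_step(v, W, b, c, lr=0.1, rng=rng)
```

Using the hidden probabilities rather than sampled hidden states in the update is a common variance-reduction choice, not a requirement of the method. Running more Gibbs steps before collecting the negative statistics would turn this into CD-k.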
Modern EBMs
- JEM (Joint Energy-Based Model): EBM for both classification and generation.
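- Score-based models: Learn the score function $\nabla_x \log p_\theta(x) = -\nabla_x E_\theta(x)$ directly. Because $Z(\theta)$ does not depend on $x$, it drops out of the score. The two views are closely related rather than strictly equivalent, since a learned score network need not be the gradient of a scalar energy.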
- Diffusion models: Can be viewed as hierarchical EBMs.
MCMC Sampling
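- Langevin dynamics (often called Stochastic Gradient Langevin Dynamics, SGLD, in the EBM literature): Sample from an EBM by gradient descent on the energy plus Gaussian noise.
- $x_{t+1} = x_t - \frac{\alpha}{2} \nabla_x E_\theta(x_t) + \sqrt{\alpha}\, \epsilon_t$, $\epsilon_t \sim N(0,I)$, where $\alpha$ is the step size. The noise must scale with $\sqrt{\alpha}$ for the chain to target $p_\theta(x)$ as $\alpha \to 0$.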
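A small sampler illustrates the idea. The sketch below assumes the standard unadjusted Langevin discretization, in which each step moves by half the step size along the negative energy gradient and adds Gaussian noise scaled by the square root of the step size. The energy is a hypothetical one-dimensional quadratic whose target distribution is $N(3, 1)$, chosen so the result is easy to check.

```python
import math
import random

def grad_energy(x):
    # Hypothetical toy energy E(x) = 0.5 * (x - 3)^2, so dE/dx = x - 3.
    return x - 3.0

def langevin_sample(x0, step, n_steps, rng):
    """Unadjusted Langevin dynamics: x <- x - (step/2) * dE/dx + sqrt(step) * noise."""
    x = x0
    for _ in range(n_steps):
        noise = rng.gauss(0.0, 1.0)
        x = x - 0.5 * step * grad_energy(x) + math.sqrt(step) * noise
    return x

rng = random.Random(0)
# Run many independent chains, each started far from the mode at x = 3.
samples = [langevin_sample(x0=0.0, step=0.1, n_steps=200, rng=rng)
           for _ in range(2000)]
mean = sum(samples) / len(samples)
var = sum((s - mean) ** 2 for s in samples) / len(samples)
print(f"sample mean {mean:.2f}, sample variance {var:.2f}")
```

The sample mean and variance should land near 3 and 1. With a finite step size the chain is slightly biased (the variance comes out a little above 1 here); a smaller step or a Metropolis correction reduces that bias at extra cost.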
Applications
- Anomaly detection: Outliers have high energy.
- Data-efficient learning: EBMs learn a compact energy landscape.
- Scientific applications: Molecular energy functions, such as the MMFF force field or potentials evaluated with the OpenMM simulation toolkit.
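For anomaly detection, one simple recipe is to threshold the energy at a high quantile of the energies seen on reference data. The sketch below is illustrative only: `energy` is a hypothetical stand-in for a trained $E_\theta$, and the quantile and reference values are arbitrary.

```python
def energy(x):
    # Hypothetical stand-in for a trained E_theta: grows with distance from 0.
    return 0.5 * x * x

def fit_threshold(reference, quantile=0.95):
    """Return an energy value at roughly the given quantile of the reference energies."""
    energies = sorted(energy(x) for x in reference)
    index = min(len(energies) - 1, int(quantile * len(energies)))
    return energies[index]

def is_anomaly(x, threshold):
    """Flag inputs whose energy exceeds the fitted threshold."""
    return energy(x) > threshold

reference = [-1.0, -0.5, -0.2, 0.0, 0.1, 0.3, 0.4, 0.8, 1.1, 1.2]
threshold = fit_threshold(reference)
print(is_anomaly(4.0, threshold), is_anomaly(0.2, threshold))
```

Only energy differences matter here, so the intractable $Z(\theta)$ never has to be computed.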
Energy-based models provide a unifying framework connecting Boltzmann machines, score-based models, and diffusion models. Their probabilistic formulation suits physics-inspired applications and anomaly detection, where a relative (unnormalized) likelihood score is often sufficient.