Latent ODEs

Keywords: latent odes, neural architecture

Latent ODEs are a generative model for irregularly-sampled time series that combines a Variational Autoencoder (VAE) framework with Neural ODE dynamics in the latent space. A recognition network encodes sparse, irregular observations into an initial latent state, a Neural ODE propagates that state continuously through time, and a decoder reconstructs observations at arbitrary time points. This enables principled uncertainty quantification, missing-value imputation, and generation of smooth continuous trajectories from irregularly-sampled clinical, scientific, or financial data.

The Irregular Time Series Challenge

Standard RNN architectures (LSTM, GRU) assume fixed-interval time steps. Real-world time series are often irregularly sampled:
- Clinical data: Lab measurements at patient-specific visit times (not daily)
- Environmental sensors: Readings at varying intervals based on detected events
- Financial data: Tick data with variable inter-trade intervals
- Astronomical observations: Telescope measurements constrained by weather and scheduling

Standard approaches (zero-imputation, linear interpolation, resampling to a regular grid) all discard or distort the temporal structure. Latent ODEs instead treat irregular sampling as the natural setting.

Architecture

Recognition Network (Encoder): Processes all observations in reverse chronological order using a bidirectional RNN or attention mechanism, producing parameters (μ₀, σ₀) of a Gaussian distribution over the initial latent state z₀.

z₀ ~ N(μ₀, σ₀²) (the reparameterization trick enables gradient flow)
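A minimal NumPy sketch of this sampling step (the encoder itself is omitted; `mu0` and `log_var0` are hypothetical stand-ins for its outputs):

```python
import numpy as np

def reparameterize(mu, log_var, rng):
    """Draw z0 = mu + sigma * eps with eps ~ N(0, I).

    Randomness enters only through eps, so gradients can flow
    through mu and sigma in an autodiff framework."""
    eps = rng.standard_normal(np.shape(mu))
    return mu + np.exp(0.5 * log_var) * eps

rng = np.random.default_rng(0)
mu0 = np.array([0.3, -1.2])        # hypothetical encoder outputs
log_var0 = np.array([-0.5, 0.1])   # log sigma^2, also from the encoder
z0 = reparameterize(mu0, log_var0, rng)
```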

Neural ODE Dynamics: The latent state evolves continuously:
dz/dt = f(z, t; θ_ode)

Given the initial latent state z₀, the ODE is integrated to any desired prediction time t:
z(t) = z₀ + ∫₀ᵗ f(z(s), s) ds

The ODE solver (e.g., Dopri5) handles arbitrary, irregular prediction times; no discretization is required.
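As an illustration only, here is a fixed-step classical RK4 integrator in NumPy that evaluates a latent trajectory at irregular times (in practice an adaptive solver such as Dopri5 is used, and `decay` below is a toy stand-in for the learned dynamics):

```python
import numpy as np

def odeint_rk4(f, z0, ts):
    """Integrate dz/dt = f(z, t) with one classical RK4 step per
    interval, returning the state at each (possibly irregular) time in ts."""
    zs = [z0]
    z = z0
    for t0, t1 in zip(ts[:-1], ts[1:]):
        h = t1 - t0
        k1 = f(z, t0)
        k2 = f(z + 0.5 * h * k1, t0 + 0.5 * h)
        k3 = f(z + 0.5 * h * k2, t0 + 0.5 * h)
        k4 = f(z + h * k3, t1)
        z = z + (h / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)
        zs.append(z)
    return np.stack(zs)

# Toy linear dynamics in place of the learned network f(z, t; theta_ode)
decay = lambda z, t: -z
ts = np.array([0.0, 0.1, 0.35, 1.0])   # irregular prediction times
traj = odeint_rk4(decay, np.array([1.0, 2.0]), ts)
```

Note that the query times need not be evenly spaced; the step size adapts to each interval.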

Decoder: Maps the latent state z(tₙ) to the observed space:
x̂(tₙ) = g(z(tₙ); θ_dec)

This can be any architecture: an MLP for scalar observations, a CNN for image sequences, or domain-specific networks for clinical variables.
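A sketch of the smallest such decoder, a one-hidden-layer MLP in NumPy; the weights here are random placeholders for the learned parameters θ_dec:

```python
import numpy as np

rng = np.random.default_rng(2)
latent_dim, obs_dim, hidden = 4, 2, 16

# Random placeholder weights; in practice these are learned.
W1 = rng.normal(0.0, 0.1, (latent_dim, hidden))
b1 = np.zeros(hidden)
W2 = rng.normal(0.0, 0.1, (hidden, obs_dim))
b2 = np.zeros(obs_dim)

def decode(z):
    """g(z; theta_dec): map latent states to observation-space means."""
    h = np.tanh(z @ W1 + b1)
    return h @ W2 + b2

# Decode latent states at five query times in one batched call
x_hat = decode(rng.standard_normal((5, latent_dim)))
```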

Training Objective

The ELBO (Evidence Lower Bound) for Latent ODEs:

ELBO = E_{z₀~q(z₀|x)}[Σₙ log p(xₙ | z(tₙ))] - KL[q(z₀|x) || p(z₀)]

Term 1 (reconstruction): The latent trajectory z(t) should decode back to the observed values at observation times.
Term 2 (regularization): The posterior distribution of z₀ should not deviate too far from the prior (a standard Gaussian).

The KL term regularizes the latent space, keeping the posterior close to the prior so that structured latent representations emerge and samples from the prior decode to plausible trajectories.
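When the observation model and the posterior are both diagonal Gaussians, both ELBO terms have simple closed forms. A NumPy sketch of a single-sample ELBO estimate (the names and the fixed observation variance are illustrative):

```python
import numpy as np

def gaussian_kl(mu, log_var):
    """Closed-form KL[ N(mu, diag(exp(log_var))) || N(0, I) ]."""
    return 0.5 * np.sum(np.exp(log_var) + mu ** 2 - 1.0 - log_var)

def elbo(x_obs, x_rec, obs_var, mu0, log_var0):
    """Gaussian reconstruction log-likelihood at the observation
    times minus the KL regularizer on q(z0 | x)."""
    log_lik = -0.5 * np.sum((x_obs - x_rec) ** 2 / obs_var
                            + np.log(2.0 * np.pi * obs_var))
    return log_lik - gaussian_kl(mu0, log_var0)
```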

Inference Capabilities

| Task | Latent ODE Approach |
|------|---------------------|
| Reconstruction | Encode all observations, decode at same times |
| Forecasting | Encode observed window, integrate forward to future times |
| Imputation | Encode available observations, decode at missing time points |
| Uncertainty | Sample multiple z₀ from the posterior to produce a trajectory ensemble |
| Generation | Sample z₀ from the prior, integrate the ODE, decode at desired times |

Uncertainty Quantification

Unlike deterministic sequence models, Latent ODEs provide principled uncertainty:
- Sampling multiple z₀ from the posterior distribution produces multiple plausible trajectories
- Uncertainty is high where observations are sparse or noisy, low where observations are dense
- The Neural ODE smoothly interpolates between observations rather than producing discontinuous step functions

This calibrated uncertainty is essential for clinical decision support: a model predicting patient deterioration must communicate whether the prediction is confident or uncertain.
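A toy NumPy illustration of the ensemble idea, with closed-form linear dynamics dz/dt = -z standing in for the learned Neural ODE:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical posterior q(z0 | x), e.g. produced by the encoder
mu0, sigma0 = 0.5, 0.2
ts = np.array([0.0, 0.3, 1.0, 2.5])   # irregular query times

# Sample an ensemble of initial states and propagate each one;
# dz/dt = -z has the closed form z(t) = z0 * exp(-t)
z0 = rng.normal(mu0, sigma0, size=(1000, 1))
trajectories = z0 * np.exp(-ts)[None, :]     # (samples, times)

mean_traj = trajectories.mean(axis=0)
band = trajectories.std(axis=0)              # pointwise uncertainty band
```

Each trajectory is smooth in t, and the spread of the ensemble at each query time is the model's uncertainty there.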

Comparison to ODE-RNN

Latent ODE is a generative model (it defines a joint distribution over trajectories); ODE-RNN is a discriminative model (it predicts outputs given inputs). Latent ODE provides better uncertainty quantification and generation capability; ODE-RNN offers simpler training and better performance on prediction tasks where generation is not needed. The two architectures are complementary: Latent ODE for scientific discovery and generation, ODE-RNN for forecasting and classification.
