Latent ODEs

Keywords: latent odes, neural architecture

Latent ODEs are a generative model for irregularly-sampled time series that combines a Variational Autoencoder (VAE) framework with Neural ODE dynamics in the latent space. A recognition network encodes sparse, irregular observations into an initial latent state, a Neural ODE propagates that state continuously through time, and a decoder reconstructs observations at arbitrary time points. This enables principled uncertainty quantification, missing-value imputation, and generation of smooth continuous trajectories from irregularly-sampled clinical, scientific, or financial data.

The Irregular Time Series Challenge

Standard RNN architectures (LSTM, GRU) assume fixed-interval time steps. Real-world time series are often irregularly sampled:
- Clinical data: Lab measurements at patient-specific visit times (not daily)
- Environmental sensors: Readings at varying intervals based on detected events
- Financial data: Tick data with variable inter-trade intervals
- Astronomical observations: Telescope measurements constrained by weather and scheduling

Standard approaches (zero-imputation, linear interpolation, resampling to a regular grid) all discard or distort the temporal structure. Latent ODEs instead treat irregular sampling as the natural setting.

Architecture

Recognition Network (Encoder): Processes all observations in reverse chronological order using a bidirectional RNN or attention mechanism, producing parameters (μ₀, σ₀) of a Gaussian distribution over the initial latent state z₀.

z₀ ~ N(μ₀, σ₀²) (the reparameterization trick enables gradient flow)
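A minimal NumPy sketch of this sampling step (the encoder itself is omitted; `mu0` and `log_var0` are hypothetical stand-ins for its outputs):

```python
import numpy as np

def reparameterize(mu, log_var, rng):
    """Draw z0 = mu + sigma * eps with eps ~ N(0, I).

    Randomness enters only through eps, so gradients can flow
    through mu and sigma in an autodiff framework."""
    eps = rng.standard_normal(np.shape(mu))
    return mu + np.exp(0.5 * log_var) * eps

rng = np.random.default_rng(0)
mu0 = np.array([0.3, -1.2])        # hypothetical encoder outputs
log_var0 = np.array([-0.5, 0.1])   # log sigma^2, also from the encoder
z0 = reparameterize(mu0, log_var0, rng)
```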

Neural ODE Dynamics: The latent state evolves continuously:
dz/dt = f(z, t; θ_ode)

Given the initial latent state z₀, the ODE is integrated to any desired prediction time t:
z(t) = z₀ + ∫₀ᵗ f(z(s), s) ds

The ODE solver (e.g., Dopri5) handles arbitrary, irregular prediction times; no discretization is required.
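As an illustration only, here is a fixed-step classical RK4 integrator in NumPy that evaluates a latent trajectory at irregular times (in practice an adaptive solver such as Dopri5 is used, and `decay` below is a toy stand-in for the learned dynamics):

```python
import numpy as np

def odeint_rk4(f, z0, ts):
    """Integrate dz/dt = f(z, t) with one classical RK4 step per
    interval, returning the state at each (possibly irregular) time in ts."""
    zs = [z0]
    z = z0
    for t0, t1 in zip(ts[:-1], ts[1:]):
        h = t1 - t0
        k1 = f(z, t0)
        k2 = f(z + 0.5 * h * k1, t0 + 0.5 * h)
        k3 = f(z + 0.5 * h * k2, t0 + 0.5 * h)
        k4 = f(z + h * k3, t1)
        z = z + (h / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)
        zs.append(z)
    return np.stack(zs)

# Toy linear dynamics in place of the learned network f(z, t; theta_ode)
decay = lambda z, t: -z
ts = np.array([0.0, 0.1, 0.35, 1.0])   # irregular prediction times
traj = odeint_rk4(decay, np.array([1.0, 2.0]), ts)
```

Note that the query times need not be evenly spaced; the step size adapts to each interval.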

Decoder: Maps the latent state z(tₙ) to the observed space:
x̂(tₙ) = g(z(tₙ); θ_dec)

This can be any architecture: an MLP for scalar observations, a CNN for image sequences, or domain-specific networks for clinical variables.
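A sketch of the smallest such decoder, a one-hidden-layer MLP in NumPy; the weights here are random placeholders for the learned parameters θ_dec:

```python
import numpy as np

rng = np.random.default_rng(2)
latent_dim, obs_dim, hidden = 4, 2, 16

# Random placeholder weights; in practice these are learned.
W1 = rng.normal(0.0, 0.1, (latent_dim, hidden))
b1 = np.zeros(hidden)
W2 = rng.normal(0.0, 0.1, (hidden, obs_dim))
b2 = np.zeros(obs_dim)

def decode(z):
    """g(z; theta_dec): map latent states to observation-space means."""
    h = np.tanh(z @ W1 + b1)
    return h @ W2 + b2

# Decode latent states at five query times in one batched call
x_hat = decode(rng.standard_normal((5, latent_dim)))
```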

Training Objective

The ELBO (Evidence Lower Bound) for Latent ODEs:

ELBO = E_{z₀~q(z₀|x)}[Σₙ log p(xₙ | z(tₙ))] - KL[q(z₀|x) || p(z₀)]

Term 1 (reconstruction): The latent trajectory z(t) should decode back to the observed values at observation times.
Term 2 (regularization): The posterior distribution of z₀ should not deviate too far from the prior (a standard Gaussian).

The KL term regularizes the latent space, keeping the posterior close to the prior so that structured latent representations emerge and samples from the prior decode to plausible trajectories.
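When the observation model and the posterior are both diagonal Gaussians, both ELBO terms have simple closed forms. A NumPy sketch of a single-sample ELBO estimate (the names and the fixed observation variance are illustrative):

```python
import numpy as np

def gaussian_kl(mu, log_var):
    """Closed-form KL[ N(mu, diag(exp(log_var))) || N(0, I) ]."""
    return 0.5 * np.sum(np.exp(log_var) + mu ** 2 - 1.0 - log_var)

def elbo(x_obs, x_rec, obs_var, mu0, log_var0):
    """Gaussian reconstruction log-likelihood at the observation
    times minus the KL regularizer on q(z0 | x)."""
    log_lik = -0.5 * np.sum((x_obs - x_rec) ** 2 / obs_var
                            + np.log(2.0 * np.pi * obs_var))
    return log_lik - gaussian_kl(mu0, log_var0)
```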

Inference Capabilities

| Task | Latent ODE Approach |
|------|---------------------|
| Reconstruction | Encode all observations, decode at same times |
| Forecasting | Encode observed window, integrate forward to future times |
| Imputation | Encode available observations, decode at missing time points |
| Uncertainty | Sample multiple z₀ from the posterior to produce a trajectory ensemble |
| Generation | Sample z₀ from the prior, integrate the ODE, decode at desired times |

Uncertainty Quantification

Unlike deterministic sequence models, Latent ODEs provide principled uncertainty:
- Sampling multiple z₀ from the posterior distribution produces multiple plausible trajectories
- Uncertainty is high where observations are sparse or noisy, low where observations are dense
- The Neural ODE smoothly interpolates between observations rather than producing discontinuous step functions

This calibrated uncertainty is essential for clinical decision support: a model predicting patient deterioration must communicate whether the prediction is confident or uncertain.
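A toy NumPy illustration of the ensemble idea, with closed-form linear dynamics dz/dt = -z standing in for the learned Neural ODE:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical posterior q(z0 | x), e.g. produced by the encoder
mu0, sigma0 = 0.5, 0.2
ts = np.array([0.0, 0.3, 1.0, 2.5])   # irregular query times

# Sample an ensemble of initial states and propagate each one;
# dz/dt = -z has the closed form z(t) = z0 * exp(-t)
z0 = rng.normal(mu0, sigma0, size=(1000, 1))
trajectories = z0 * np.exp(-ts)[None, :]     # (samples, times)

mean_traj = trajectories.mean(axis=0)
band = trajectories.std(axis=0)              # pointwise uncertainty band
```

Each trajectory is smooth in t, and the spread of the ensemble at each query time is the model's uncertainty there.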

Comparison to ODE-RNN

Latent ODE is a generative model (it defines a joint distribution over trajectories); ODE-RNN is a discriminative model (it predicts outputs given inputs). Latent ODE provides better uncertainty quantification and generation capability; ODE-RNN offers simpler training and better performance on prediction tasks where generation is not needed. The two architectures are complementary: Latent ODE for scientific discovery and generation, ODE-RNN for forecasting and classification.
