stop tokens, text generation
Tokens indicating end of generation.
Prevent collapse by stopping gradients.
Conditions for ending generation.
Temperature and humidity limits.
High-throughput storage for training data.
Stochastic Temporal Optimal control using Recurrent Networks combines control theory with stochastic RNNs.
Create coherent narratives.
Generate creative stories. Plot, characters, dialogue.
Straight fin heat sinks use parallel plates optimized for unidirectional airflow.
Vertical leads.
Straight-through estimators approximate gradients through non-differentiable quantization operations.
Straight-through Gumbel-Softmax enables gradient-based learning with discrete variables.
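A minimal PyTorch sketch of both ideas (an illustrative addition, not from the original text; rounding stands in for an arbitrary non-differentiable quantization step):

```python
import torch

def ste_round(x):
    # Forward pass applies the hard, non-differentiable rounding;
    # backward pass treats it as identity, so gradients pass "straight through".
    return x + (torch.round(x) - x).detach()

# Straight-through Gumbel-Softmax: hard one-hot sample in the forward pass,
# soft (differentiable) relaxation used for the backward pass.
logits = torch.randn(4, 10, requires_grad=True)
hard_sample = torch.nn.functional.gumbel_softmax(logits, tau=1.0, hard=True)
```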
Apply mechanical strain to increase carrier mobility.
Silicon with stress to improve mobility.
Strategic sourcing aligns procurement with business objectives, optimizing cost, quality, and risk.
Strategy adaptation modifies approaches based on success feedback.
Stratification analyzes data separately by source, revealing hidden patterns.
Sample points along rays.
Stratified sampling ensures representation from each important subgroup.
Stratified split maintains class proportions. Important for imbalanced datasets.
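For example, scikit-learn preserves class proportions via the `stratify` argument (an illustrative sketch; the variable names are assumptions):

```python
from sklearn.model_selection import train_test_split

# X: features, y: imbalanced class labels (assumed to exist)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)
# Each split now contains roughly the same class proportions as y.
```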
Streaming generation outputs tokens incrementally as they are generated, improving perceived latency.
Update cache as sequence grows.
Process infinite sequences.
GPU building blocks.
Streaming sends tokens as they are generated via SSE or WebSocket. The user sees output immediately, so responses feel faster.
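A minimal Flask sketch of server-sent events (SSE) streaming, added for illustration; `generate_tokens` is a hypothetical stand-in for a model's incremental token source:

```python
from flask import Flask, Response

app = Flask(__name__)

def generate_tokens(prompt):
    # Hypothetical incremental token source (e.g., wrapping a model's generator).
    for token in ["Hello", ",", " world", "!"]:
        yield token

@app.route("/stream")
def stream():
    def event_stream():
        for token in generate_tokens("hi"):
            # SSE format: each message is a "data:" line followed by a blank line.
            yield f"data: {token}\n\n"
    return Response(event_stream(), mimetype="text/event-stream")
```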
Streamlit builds Python web apps fast. Great for ML demos. Interactive widgets.
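A tiny Streamlit sketch of an interactive demo (illustrative only; the model call is a placeholder):

```python
import streamlit as st

st.title("Demo")
prompt = st.text_input("Prompt")
temperature = st.slider("Temperature", 0.0, 2.0, 0.7)

if st.button("Generate") and prompt:
    # Placeholder for a real model call.
    st.write(f"(echo at T={temperature}) {prompt}")
```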
Stress engineering introduces mechanical strain in channels to modulate mobility and enhance drive current.
Deliberately apply stress to enhance performance.
Induce stress then remove stressor.
Copper movement from thermal stress.
Model thermal stress effects.
Metal movement due to thermal stress.
Reduce stress in thinned wafer.
Expose defects before shipment.
Compute mechanical stress from processing.
Test model under distribution shift.
Stress-induced voids form in interconnects when mechanical stress gradients exceed critical thresholds.
Relate Raman shift to stress.
Attend to every k-th position.
Two types of hard-to-change factors.
Stripe provides payment processing. Easy integration.
# Structural Time Series Models (STS)

Structural time series (STS) models, also called **state space models** or **unobserved components models**, decompose a time series into interpretable components—each representing a distinct source of variation.

## 1. Core Components

A structural time series model decomposes an observed series $y_t$ into additive components:

$$
y_t = \mu_t + \gamma_t + \psi_t + X_t\beta + \varepsilon_t
$$

Where:

- $\mu_t$ — Trend component
- $\gamma_t$ — Seasonal component
- $\psi_t$ — Cyclical component
- $X_t\beta$ — Regression/explanatory effects
- $\varepsilon_t$ — Irregular (white noise) component

## 2. Component Specifications

### 2.1 Trend Component

The trend ($\mu_t$) captures the underlying level and growth pattern of the series.

#### Local Level Model (Random Walk)

$$
\mu_t = \mu_{t-1} + \eta_t, \quad \eta_t \sim N(0, \sigma_\eta^2)
$$

- Level evolves as a random walk
- No slope/growth rate component
- Suitable for series without systematic growth

#### Local Linear Trend Model

$$
\begin{aligned}
\mu_t &= \mu_{t-1} + \nu_{t-1} + \eta_t, \quad \eta_t \sim N(0, \sigma_\eta^2) \\
\nu_t &= \nu_{t-1} + \zeta_t, \quad \zeta_t \sim N(0, \sigma_\zeta^2)
\end{aligned}
$$

- $\mu_t$ — Stochastic level
- $\nu_t$ — Stochastic slope (growth rate)
- Both level and slope evolve over time
- When $\sigma_\zeta^2 = 0$: slope is fixed (deterministic growth)
- When $\sigma_\eta^2 = 0$: smooth trend (integrated random walk)

#### Smooth Trend (Integrated Random Walk)

$$
\begin{aligned}
\mu_t &= \mu_{t-1} + \nu_{t-1} \\
\nu_t &= \nu_{t-1} + \zeta_t, \quad \zeta_t \sim N(0, \sigma_\zeta^2)
\end{aligned}
$$

- Level changes are smooth (no level disturbance)
- Only slope receives stochastic shocks

#### Deterministic Trend

$$
\mu_t = \alpha + \beta t
$$

- Fixed intercept $\alpha$ and slope $\beta$
- No stochastic evolution

### 2.2 Seasonal Component

The seasonal component ($\gamma_t$) captures recurring patterns at fixed intervals.

#### Dummy Variable Form

$$
\gamma_t = -\sum_{j=1}^{s-1} \gamma_{t-j} + \omega_t, \quad \omega_t \sim N(0, \sigma_\omega^2)
$$

- $s$ — Number of seasons (e.g., $s=12$ for monthly data)
- Seasonal effects sum to zero over a complete cycle
- When $\sigma_\omega^2 = 0$: deterministic (fixed) seasonality

#### Trigonometric/Fourier Form

$$
\gamma_t = \sum_{j=1}^{[s/2]} \gamma_{j,t}
$$

Each harmonic $j$ follows:

$$
\begin{bmatrix} \gamma_{j,t} \\ \gamma_{j,t}^* \end{bmatrix}
=
\begin{bmatrix} \cos \lambda_j & \sin \lambda_j \\ -\sin \lambda_j & \cos \lambda_j \end{bmatrix}
\begin{bmatrix} \gamma_{j,t-1} \\ \gamma_{j,t-1}^* \end{bmatrix}
+
\begin{bmatrix} \omega_{j,t} \\ \omega_{j,t}^* \end{bmatrix}
$$

Where:

- $\lambda_j = \frac{2\pi j}{s}$ — Frequency of harmonic $j$
- $\omega_{j,t}, \omega_{j,t}^* \sim N(0, \sigma_\omega^2)$
- Allows different variances for different harmonics
- More parsimonious when few harmonics are needed
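The trend and seasonal recursions above are easy to simulate directly. The sketch below is an added illustration (not from the original text; the variance values are arbitrary assumptions) and generates a local linear trend plus dummy-form seasonal series with NumPy, following Sections 2.1 and 2.2.

```python
import numpy as np

rng = np.random.default_rng(0)

n, s = 200, 12                      # series length, number of seasons
sigma_eta, sigma_zeta = 0.5, 0.05   # level and slope disturbance s.d. (assumed values)
sigma_omega, sigma_eps = 0.3, 1.0   # seasonal and irregular s.d. (assumed values)

mu, nu = 10.0, 0.1                  # initial level and slope
gamma = rng.normal(0, 1, s - 1)     # previous s-1 seasonal effects
y = np.empty(n)

for t in range(n):
    # Dummy-form seasonal: new effect = -(sum of previous s-1 effects) + noise
    gamma_t = -gamma.sum() + rng.normal(0, sigma_omega)
    # Observation: y_t = mu_t + gamma_t + eps_t
    y[t] = mu + gamma_t + rng.normal(0, sigma_eps)
    # Local linear trend: level and slope each receive stochastic shocks
    mu += nu + rng.normal(0, sigma_eta)
    nu += rng.normal(0, sigma_zeta)
    # Shift the seasonal state
    gamma = np.concatenate(([gamma_t], gamma[:-1]))
```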
### 2.3 Cyclical Component

The cyclical component ($\psi_t$) captures medium-term fluctuations not tied to fixed calendar periods.

$$
\begin{bmatrix} \psi_t \\ \psi_t^* \end{bmatrix}
=
\rho
\begin{bmatrix} \cos \lambda_c & \sin \lambda_c \\ -\sin \lambda_c & \cos \lambda_c \end{bmatrix}
\begin{bmatrix} \psi_{t-1} \\ \psi_{t-1}^* \end{bmatrix}
+
\begin{bmatrix} \kappa_t \\ \kappa_t^* \end{bmatrix}
$$

Where:

- $\lambda_c \in (0, \pi)$ — Cycle frequency
- $\rho \in (0, 1)$ — Damping factor (ensures stationarity)
- $\kappa_t, \kappa_t^* \sim N(0, \sigma_\kappa^2)$
- Period of cycle: $\frac{2\pi}{\lambda_c}$ time units

### 2.4 Regression Component

The regression component ($X_t\beta$) incorporates explanatory variables:

$$
\text{Regression effect} = \sum_{k=1}^{K} \beta_k x_{k,t}
$$

Common applications:

- **Intervention effects**: Step functions, pulse dummies, ramp effects
- **Calendar effects**: Trading days, holidays, leap years
- **Explanatory variables**: Economic indicators, weather, etc.

#### Time-Varying Coefficients (Optional)

$$
\beta_t = \beta_{t-1} + \xi_t, \quad \xi_t \sim N(0, \sigma_\xi^2)
$$

### 2.5 Irregular Component

The irregular component ($\varepsilon_t$) is white noise:

$$
\varepsilon_t \sim N(0, \sigma_\varepsilon^2)
$$

- White noise (serially uncorrelated)
- Captures measurement error and short-term fluctuations
- Also called "observation noise"

## 3. State Space Representation

### 3.1 General Form

Any structural time series model can be written in state space form:

**Observation Equation:**

$$
y_t = Z_t \alpha_t + \varepsilon_t, \quad \varepsilon_t \sim N(0, H_t)
$$

**State Equation:**

$$
\alpha_{t+1} = T_t \alpha_t + R_t \eta_t, \quad \eta_t \sim N(0, Q_t)
$$

Where:

- $y_t$ — Observed data (scalar or vector)
- $\alpha_t$ — State vector (unobserved components)
- $Z_t$ — Observation matrix (links states to observations)
- $T_t$ — Transition matrix (governs state evolution)
- $R_t$ — Selection matrix
- $H_t$ — Observation noise variance
- $Q_t$ — State noise covariance matrix

### 3.2 Example: Local Linear Trend + Seasonal

State vector:

$$
\alpha_t = \begin{bmatrix} \mu_t \\ \nu_t \\ \gamma_t \\ \gamma_{t-1} \\ \vdots \\ \gamma_{t-s+2} \end{bmatrix}
$$

## 4. Estimation via Kalman Filter

### 4.1 Kalman Filter Recursions

**Prediction Step:**

$$
\begin{aligned}
\alpha_{t|t-1} &= T_t \alpha_{t-1|t-1} \\
P_{t|t-1} &= T_t P_{t-1|t-1} T_t' + R_t Q_t R_t'
\end{aligned}
$$

**Update Step:**

$$
\begin{aligned}
v_t &= y_t - Z_t \alpha_{t|t-1} \quad \text{(prediction error)} \\
F_t &= Z_t P_{t|t-1} Z_t' + H_t \quad \text{(prediction error variance)} \\
K_t &= P_{t|t-1} Z_t' F_t^{-1} \quad \text{(Kalman gain)} \\
\alpha_{t|t} &= \alpha_{t|t-1} + K_t v_t \\
P_{t|t} &= (I - K_t Z_t) P_{t|t-1}
\end{aligned}
$$

Where:

- $\alpha_{t|t-1}$ — Predicted state (prior)
- $\alpha_{t|t}$ — Filtered state (posterior)
- $P_{t|t-1}$ — Predicted state covariance
- $P_{t|t}$ — Filtered state covariance

### 4.2 Kalman Smoother

Refines estimates using the full sample (backward pass):

$$
\begin{aligned}
\alpha_{t|n} &= \alpha_{t|t} + P_{t|t} T_{t+1}' P_{t+1|t}^{-1} (\alpha_{t+1|n} - \alpha_{t+1|t}) \\
P_{t|n} &= P_{t|t} + P_{t|t} T_{t+1}' P_{t+1|t}^{-1} (P_{t+1|n} - P_{t+1|t}) P_{t+1|t}^{-1} T_{t+1} P_{t|t}
\end{aligned}
$$

Where $n$ is the total number of observations.
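As a concrete illustration of the Section 4.1 recursions, here is a minimal NumPy sketch of the filter for the scalar local level model (Section 2.1). It is an added example rather than part of the original text; the initialization and variance values are assumptions.

```python
import numpy as np

def local_level_filter(y, sigma2_eps, sigma2_eta, a1=0.0, p1=1e7):
    """Minimal Kalman filter for the scalar local level model.

    Observation: y_t = mu_t + eps_t,      eps_t ~ N(0, sigma2_eps)
    State:       mu_{t+1} = mu_t + eta_t, eta_t ~ N(0, sigma2_eta)
    Here Z_t = T_t = R_t = 1, H_t = sigma2_eps, Q_t = sigma2_eta.
    """
    n = len(y)
    a_filt = np.empty(n)          # filtered states alpha_{t|t}
    p_filt = np.empty(n)          # filtered variances P_{t|t}
    loglik = 0.0

    a, p = a1, p1                 # predicted state and variance for t = 1 (diffuse-ish prior)
    for t in range(n):
        # Update step
        v = y[t] - a              # prediction error v_t
        f = p + sigma2_eps        # prediction error variance F_t
        k = p / f                 # Kalman gain K_t
        a_filt[t] = a + k * v
        p_filt[t] = (1.0 - k) * p
        # Prediction error decomposition of the log-likelihood (Section 5.1)
        loglik += -0.5 * (np.log(2 * np.pi) + np.log(f) + v**2 / f)

        # Prediction step for t + 1 (T = 1)
        a = a_filt[t]
        p = p_filt[t] + sigma2_eta

    return a_filt, p_filt, loglik

# Example usage (variance values are assumed):
# a_filt, p_filt, ll = local_level_filter(y, sigma2_eps=1.0, sigma2_eta=0.25)
```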
## 5. Hyperparameter Estimation

### 5.1 Maximum Likelihood

The log-likelihood is computed via prediction error decomposition:

$$
\log L(\theta) = -\frac{n}{2} \log(2\pi) - \frac{1}{2} \sum_{t=1}^{n} \left( \log |F_t| + v_t' F_t^{-1} v_t \right)
$$

Where:

- $\theta$ — Vector of hyperparameters (variance terms)
- $v_t$ — Prediction errors from Kalman filter
- $F_t$ — Prediction error variances

Optimization methods:

- Quasi-Newton (BFGS, L-BFGS)
- EM algorithm
- Scoring algorithms

### 5.2 Bayesian Estimation

$$
p(\theta | y_{1:n}) \propto p(y_{1:n} | \theta) \cdot p(\theta)
$$

Common approaches:

- **MCMC**: Gibbs sampling, Hamiltonian Monte Carlo
- **Variational inference**: Faster approximation
- **Integrated nested Laplace approximation (INLA)**

Common priors:

- Inverse-gamma for variance parameters
- Half-Cauchy or half-normal for scale parameters

## 6. Model Selection and Diagnostics

### 6.1 Information Criteria

$$
\begin{aligned}
\text{AIC} &= -2 \log L + 2k \\
\text{BIC} &= -2 \log L + k \log n \\
\text{AICc} &= \text{AIC} + \frac{2k(k+1)}{n-k-1}
\end{aligned}
$$

Where $k$ is the number of hyperparameters.

### 6.2 Diagnostic Checks

Standardized prediction errors should be:

- **Zero mean**: $E[v_t / \sqrt{F_t}] = 0$
- **Unit variance**: $\text{Var}[v_t / \sqrt{F_t}] = 1$
- **Serially uncorrelated**: Check with Ljung-Box test
- **Normally distributed**: Check with Jarque-Bera test

### 6.3 Auxiliary Residuals

- **Observation residuals**: Detect outliers
- **State residuals**: Detect structural breaks

$$
\begin{aligned}
e_t &= \frac{y_t - Z_t \alpha_{t|n}}{\sqrt{\text{Var}(y_t - Z_t \alpha_{t|n})}} \\
r_t &= \frac{\eta_t}{\sqrt{\text{Var}(\eta_t)}}
\end{aligned}
$$

## 7. Comparison

| Approach | Philosophy | Strengths | Limitations |
|:---------|:-----------|:----------|:------------|
| **ARIMA** | Reduced-form; models stationary transformations | Parsimonious, well-understood | Components not interpretable |
| **Exponential Smoothing** | Weighted averages with decay | Simple, effective | Less flexible seasonality |
| **Structural TS** | Explicit component decomposition | Interpretable, handles missing data | More parameters |
| **Prophet** | Additive trend + seasonality + holidays | User-friendly | Less rigorous uncertainty |
| **Deep Learning** | Learn patterns from data | Powerful with big data | Black box, data hungry |

## 8. Topics

### 8.1 Handling Missing Data

The Kalman filter naturally handles missing observations:

- When $y_t$ is missing, skip the update step
- Prediction step proceeds normally
- Smoother propagates information through gaps

(A minimal sketch of this logic follows Section 8.3.)

### 8.2 Multivariate Extensions

For vector $y_t \in \mathbb{R}^p$:

$$
y_t = Z_t \alpha_t + \varepsilon_t, \quad \varepsilon_t \sim N(0, H_t)
$$

Applications:

- Common trends across multiple series
- Factor models
- Dynamic factor analysis

### 8.3 Non-Gaussian Extensions

- **Student-t errors**: Heavy tails, robust to outliers
- **Mixture models**: Regime switching
- **Non-linear state space**: Extended Kalman filter, particle filters
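A hedged sketch of the missing-data handling described in Section 8.1, again for the scalar local level model (an illustrative addition; NaN marks a missing observation and the variance arguments are assumptions):

```python
import numpy as np

def local_level_filter_missing(y, sigma2_eps, sigma2_eta, a1=0.0, p1=1e7):
    """Local level Kalman filter that treats NaN as a missing observation:
    the update step is skipped while the prediction step proceeds normally."""
    n = len(y)
    a_filt = np.empty(n)
    p_filt = np.empty(n)

    a, p = a1, p1
    for t in range(n):
        if np.isnan(y[t]):
            # Missing observation: no update, carry the prediction forward
            a_filt[t], p_filt[t] = a, p
        else:
            v = y[t] - a
            f = p + sigma2_eps
            k = p / f
            a_filt[t] = a + k * v
            p_filt[t] = (1.0 - k) * p
        # Prediction step for t + 1
        a = a_filt[t]
        p = p_filt[t] + sigma2_eta
    return a_filt, p_filt
```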
## 9. Software Implementations

### R Packages

```r
# KFAS - Kalman Filter and Smoother
library(KFAS)
model <- SSModel(y ~ SSMtrend(2, Q = list(NA, NA)) + SSMseasonal(12, Q = NA), H = NA)
fit <- fitSSM(model, inits = rep(0, 4))

# bsts - Bayesian Structural Time Series
library(bsts)
ss <- AddLocalLinearTrend(list(), y)
ss <- AddSeasonal(ss, y, nseasons = 12)
model <- bsts(y, state.specification = ss, niter = 1000)

# dlm - Dynamic Linear Models
library(dlm)
build <- function(theta) {
  dlmModPoly(2, dV = exp(theta[1]), dW = exp(theta[2:3])) +
    dlmModSeas(12, dV = 0, dW = exp(theta[4]))
}
fit <- dlmMLE(y, parm = rep(0, 4), build = build)
```

### Python

```python
# statsmodels
from statsmodels.tsa.statespace.structural import UnobservedComponents

model = UnobservedComponents(
    y,
    level='local linear trend',
    seasonal=12,
    stochastic_seasonal=True
)
results = model.fit()

# TensorFlow Probability
import tensorflow_probability as tfp

trend = tfp.sts.LocalLinearTrend(observed_time_series=y)
seasonal = tfp.sts.Seasonal(num_seasons=12, observed_time_series=y)
model = tfp.sts.Sum([trend, seasonal], observed_time_series=y)
```

## 10. Summary

Structural time series models provide:

- **Interpretability**: Each component has clear economic/statistical meaning
- **Flexibility**: Add/remove components based on domain knowledge
- **Robustness**: Natural handling of missing data and irregular spacing
- **Uncertainty quantification**: Full probability distributions for components and forecasts
- **Intervention analysis**: Easy incorporation of known breaks and policy changes

The state space framework unifies estimation, filtering, smoothing, and forecasting within a coherent probabilistic structure, making structural time series models a powerful tool for understanding and predicting temporal phenomena.
Recover 3D structure from 2D images.
Reconstruct 3D from video.
Geometric and topological descriptors.
Predefined sparsity patterns.
Generate outputs in specific formats (JSON, XML, code) using grammar constraints.
Structured logs (JSON) are searchable and parseable. Include context, metrics, request IDs.
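For instance, a JSON log line can be emitted with the standard library (an illustrative sketch; the field names are assumptions):

```python
import json
import logging

logger = logging.getLogger("app")
logging.basicConfig(level=logging.INFO, format="%(message)s")

def log_event(event, **context):
    # Emit one JSON object per line so log aggregators can parse and index it.
    logger.info(json.dumps({"event": event, **context}))

log_event("inference_complete", request_id="req-123", latency_ms=41, model="demo")
```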
Extract structured data from generation.
Structured output constrains generation to follow specified formats like JSON or schemas.