Tukey's Biweight Loss

Tukey's biweight loss is an M-estimator loss function that ignores errors exceeding a threshold c: the loss saturates and the gradient vanishes for large deviations, so gross errors stop influencing the fit and a model can learn the underlying pattern from heavily contaminated data.

What Is Tukey's Biweight Loss?

Tukey's biweight (also called bisquare) is a redescending M-estimator from robust statistics. It behaves like the quadratic penalty x²/2 near zero, gives moderate errors progressively less weight, and rejects errors larger than the threshold c outright (zero gradient). With Huber loss, large errors still contribute a constant gradient, and with Cauchy loss they contribute a gradient that shrinks but never reaches zero; with Tukey's biweight they contribute nothing.

Mathematical Definition

Tukey biweight loss:

ρ(x) = 
  (c²/6) * [1 - (1 - (x/c)²)³]  if |x| ≤ c  (influence region)
  c²/6                           if |x| > c  (rejection region)

Weight function w(x) = (1 - (x/c)²)² if |x| ≤ c, else 0
Gradient: ∂ρ/∂x = x * (1 - (x/c)²)² if |x| ≤ c, else 0

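Three distinct regions:

1. |x| < c/√5 ≈ 0.447c: Quadratic-like behavior; influence still rises with the error, but more slowly than linearly because the weight is already falling.
2. c/√5 ≤ |x| ≤ c: Redescending region; influence peaks at |x| = c/√5 and then falls smoothly to zero at |x| = c.
3. |x| > c: Gradient exactly zero; complete outlier rejection.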
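
As a quick sanity check on the definitions above, the following NumPy sketch compares the stated gradient against a central finite difference of the loss and confirms that the weight is zero beyond c. The function names `rho`, `weight`, and `psi` are illustrative choices, not part of any library.

```python
import numpy as np

def rho(x, c=1.0):
    """Tukey biweight loss; constant c^2/6 once |x| > c."""
    u = np.minimum((np.asarray(x, dtype=float) / c) ** 2, 1.0)
    return (c**2 / 6.0) * (1.0 - (1.0 - u) ** 3)

def weight(x, c=1.0):
    """Weight w(x) = (1 - (x/c)^2)^2 inside the threshold, 0 outside."""
    x = np.asarray(x, dtype=float)
    return np.where(np.abs(x) <= c, (1.0 - (x / c) ** 2) ** 2, 0.0)

def psi(x, c=1.0):
    """Gradient of rho (the influence function): psi(x) = x * w(x)."""
    return np.asarray(x, dtype=float) * weight(x, c)

c = 2.0
x = np.linspace(-3.0 * c, 3.0 * c, 601)
h = 1e-6
# Central finite difference of the loss, to compare with the closed-form gradient
finite_diff = (rho(x + h, c) - rho(x - h, c)) / (2.0 * h)
max_gap = np.max(np.abs(finite_diff - psi(x, c)))
print(f"largest gap between finite difference and psi: {max_gap:.2e}")
```

The gap should be at the level of floating-point noise, which supports reading the gradient formula as the exact derivative of the loss.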

Why Tukey's Biweight Matters

The Redescending Property

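Huber's influence function never decreases: it grows linearly and then saturates at a constant. Cauchy's influence does redescend, but it only approaches zero asymptotically. Tukey's biweight reaches maximum influence at error = c/√5 ≈ 0.447c, then decreases, reaching exactly zero at c: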

Influence vs Error Magnitude:
|
|     ╱╲
|    ╱  ╲
|   ╱    ╲___
|  ╱         (zero influence beyond c)
|___________|____
0      c
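
The location of the peak follows from differentiating the influence function ψ(x) = x(1 − (x/c)²)²: with u = (x/c)², dψ/dx = (1 − u)(1 − 5u), which vanishes at |x| = c/√5 ≈ 0.447c (and again at |x| = c, where the influence has already dropped to zero). A small grid search, shown here only as a numerical check with c = 1 chosen arbitrarily, agrees with that value:

```python
import numpy as np

c = 1.0
x = np.linspace(0.0, c, 100001)

# Influence function of Tukey's biweight on [0, c]
influence = x * (1.0 - (x / c) ** 2) ** 2

x_peak = x[np.argmax(influence)]
peak_value = influence.max()

print("grid peak location :", x_peak)
print("analytic c/sqrt(5) :", c / np.sqrt(5.0))
print("peak influence     :", peak_value)
```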

Comparison: Outlier Rejection Approaches

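Error = 5c          | MSE           | Huber (δ = c)         | Cauchy               | Tukey
Loss                | (5c)² = 25c²  | c(5c − c/2) = 4.5c²   | c² ln(26) ≈ 3.26c²   | c²/6 ≈ 0.167c²
Influence           | Extreme       | High                  | Moderate             | Zero
Gradient magnitude  | 10c           | c                     | Small (≈ 0.38c)      | Exactly 0

Conventions: MSE is taken as x², Huber as x²/2 for |x| ≤ δ and δ(|x| − δ/2) beyond, and Cauchy as c² ln(1 + (x/c)²).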
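
The comparison at an error of 5c can be reproduced with a few lines of Python. The sketch below assumes particular scalings that are not the only ones in use: MSE as x², Huber as δ(|x| − δ/2) on its linear branch with δ = c, and Cauchy as c² ln(1 + (x/c)²). Other conventions change the constants but not the ordering.

```python
import math

c = 1.0
x = 5.0 * c

# MSE: loss x^2, gradient 2x
mse_loss, mse_grad = x**2, 2.0 * x

# Huber with delta = c; |x| > delta, so the linear branch applies
huber_loss, huber_grad = c * (abs(x) - 0.5 * c), c

# Cauchy in the c^2 * ln(1 + (x/c)^2) scaling (an assumed convention)
cauchy_loss = c**2 * math.log(1.0 + (x / c) ** 2)
cauchy_grad = 2.0 * x / (1.0 + (x / c) ** 2)

# Tukey: |x| > c, so the loss is the constant c^2/6 and the gradient is zero
tukey_loss, tukey_grad = c**2 / 6.0, 0.0

rows = [
    ("MSE", mse_loss, mse_grad),
    ("Huber", huber_loss, huber_grad),
    ("Cauchy", cauchy_loss, cauchy_grad),
    ("Tukey", tukey_loss, tukey_grad),
]
for name, loss, grad in rows:
    print(f"{name:7s} loss = {loss:7.3f} c^2   |gradient| = {grad:6.3f} c")
```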

Parameter Selection
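
A common convention in robust statistics is to set c ≈ 4.685 in units of the residual scale, which gives roughly 95% efficiency relative to least squares when the noise is Gaussian. Since residuals rarely have unit scale, the constant is usually multiplied by a robust scale estimate such as the normalized median absolute deviation (MAD). The sketch below illustrates that recipe; the helper name `tukey_threshold` and the synthetic data are assumptions made for this example.

```python
import numpy as np

def tukey_threshold(residuals, tuning=4.685):
    """Hypothetical helper: tuning constant times a MAD-based scale estimate."""
    r = np.asarray(residuals, dtype=float)
    mad = np.median(np.abs(r - np.median(r)))
    sigma = 1.4826 * mad  # rescales MAD to match the standard deviation for Gaussian data
    return tuning * sigma

rng = np.random.default_rng(0)
residuals = rng.normal(0.0, 2.0, size=5000)  # clean noise with standard deviation 2
residuals[:250] += 100.0                     # 5% gross errors

c = tukey_threshold(residuals)
naive = 4.685 * np.std(residuals)            # same constant with a non-robust scale

print(f"threshold from MAD scale: {c:.2f}")
print(f"threshold from std dev  : {naive:.2f}")
```

The MAD-based threshold stays close to 4.685 times the clean noise level, while the standard-deviation version is inflated by the outliers it is meant to reject. A smaller c rejects more aggressively at the cost of discarding some good data.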

Implementation

PyTorch:

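import torch

def tukey_biweight_loss(predictions, targets, c=1.0):
    errors = predictions - targets
    # Clamp (error/c)^2 at 1 so errors beyond c give the constant c^2/6
    # with zero gradient, matching the rejection region of the definition.
    u = torch.clamp((errors / c) ** 2, max=1.0)
    loss = (c**2 / 6) * (1 - (1 - u) ** 3)
    return loss.mean()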

NumPy (for offline analysis):

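import numpy as np

def tukey_biweight(x, c=1.0):
    x = np.asarray(x, dtype=float)   # accept lists and scalars as well as arrays
    mask = np.abs(x) <= c            # True inside the influence region
    loss = np.zeros_like(x)
    loss[mask] = (c**2/6) * (1 - (1 - (x[mask]/c)**2)**3)
    loss[~mask] = c**2/6             # constant loss in the rejection region
    return loss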

When to Use Tukey's Biweight

Practical Applications

Robust Least Squares: Fitting lines, planes, and curves to data with gross errors. Points with large residuals receive zero weight automatically, so the fit is determined by the remaining measurements.
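
One standard way to carry out such a fit is iteratively reweighted least squares (IRLS): fit, compute residuals, convert them to Tukey weights, and refit with those weights. The following is a minimal sketch under stated assumptions: the function names, the synthetic data, the least-squares starting point, and the scale estimate 1.4826 × median|r| are choices made for this illustration, not a prescribed method.

```python
import numpy as np

def tukey_weights(r, c):
    """Biweight weights: (1 - (r/c)^2)^2 inside the threshold, 0 outside."""
    u = (r / c) ** 2
    return np.where(u <= 1.0, (1.0 - u) ** 2, 0.0)

def fit_line_irls(x, y, tuning=4.685, n_iter=30):
    """Fit y ~ slope * x + intercept by IRLS with Tukey weights (illustrative)."""
    A = np.column_stack([x, np.ones_like(x)])
    beta = np.linalg.lstsq(A, y, rcond=None)[0]   # ordinary least-squares start
    w = np.ones_like(y)
    for _ in range(n_iter):
        r = y - A @ beta
        sigma = 1.4826 * np.median(np.abs(r))     # robust scale of the residuals
        w = tukey_weights(r, tuning * sigma)
        sw = np.sqrt(w)
        # Weighted least squares via row scaling by sqrt(weight)
        beta = np.linalg.lstsq(A * sw[:, None], y * sw, rcond=None)[0]
    return beta, w

rng = np.random.default_rng(0)
x = rng.uniform(0.0, 10.0, size=200)
y = 2.0 * x + 1.0 + rng.normal(0.0, 0.5, size=200)
outliers = rng.choice(200, size=20, replace=False)
y[outliers] += 30.0                               # gross errors in 10% of the points

(slope, intercept), w = fit_line_irls(x, y)
print(f"robust fit: slope = {slope:.3f}, intercept = {intercept:.3f}")
print("points with zero weight:", int(np.sum(w == 0.0)))
```

Because the biweight loss is non-convex, IRLS converges to a local minimum, so the starting fit and the scale estimate matter. With heavier contamination, a more robust starting point than ordinary least squares would be advisable.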

Astronomical Data: Detecting planets from stellar brightness measurements, where cosmic rays and instrumental glitches contaminate a significant portion of the observations; Tukey makes it possible to use all the data while ignoring artifact-corrupted observations.

Survey Data: Statistical analysis of survey responses with occasional fraudulent/nonsense entries; Tukey automatically downweights or ignores impossible values without manual cleaning.

Geospatial Analysis: GPS trajectories with occasional wild spikes (multipath, jamming); Tukey filters outlier positions while preserving real movements.

Quality Control: Monitoring manufacturing processes, where readings caused by equipment malfunctions are flagged and ignored while the statistical model of normal operation is maintained.

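Tukey's biweight provides hard rejection of gross errors, which lets a model learn from contaminated data that would badly bias a least-squares fit. Because the loss is non-convex, the result depends on the starting point and on the scale estimate; a breakdown point near 50% is attainable only when the biweight is combined with a high-breakdown initial estimate, as in S- or MM-estimation, not from the loss alone.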
