Semiconductor Manufacturing Process SPC: Statistical Process Control Mathematics
Keywords: capability, cpk, cp, capability index, six sigma, dpmo, statistical process control, SPC mathematics
1. Introduction
Why SPC Mathematics Matters in Semiconductor Fabs
Semiconductor manufacturing operates at nanometer scales across hundreds of process steps, presenting unique challenges:
- High Value: A single wafer can be worth $10,000 to $100,000+
- Tight Tolerances: Process variations of a few nanometers cause yield collapse
- Long Feedback Loops: Days to weeks between process and measurement
- Compounding Variation: Variance from many sources accumulates through the process flow
The mathematics of SPC provides the framework to:
- Detect process shifts before they cause defects
- Quantify and decompose sources of variation
- Maintain processes within nanometer-scale tolerances
- Optimize yield through statistical understanding
2. Fundamental Statistical Measures
2.1 Descriptive Statistics
For a sample of $n$ measurements $x_1, x_2, \ldots, x_n$:
| Measure | Formula | Description |
|---|---|---|
| Sample Mean | $\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$ | Central tendency |
| Sample Variance | $s^2 = \frac{\sum_{i=1}^{n}(x_i - \bar{x})^2}{n-1}$ | Spread (unbiased) |
| Sample Std Dev | $s = \sqrt{s^2}$ | Spread in original units |
| Range | $R = x_{max} - x_{min}$ | Total spread |
2.2 The Normal (Gaussian) Distribution
The mathematical backbone of classical SPC:
$$ f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right) $$
Where:
- $\mu$ = population mean
- $\sigma$ = population standard deviation
- $\sigma^2$ = population variance
2.3 Critical Probability Intervals
| Interval | Probability Contained | Application |
|---|---|---|
| $\pm 1\sigma$ | 68.27% | Typical variation |
| $\pm 2\sigma$ | 95.45% | Warning limits |
| $\pm 3\sigma$ | 99.73% | Control limits |
| $\pm 4\sigma$ | 99.9937% | Cpk = 1.33 |
| $\pm 5\sigma$ | 99.99994% | Cpk = 1.67 |
| $\pm 6\sigma$ | 99.9999998% | Cpk = 2.00 (Six Sigma; the 3.4 DPMO figure assumes a 1.5σ shift) |
2.4 Standard Normal Transformation
Any normal variable can be standardized:
$$ Z = \frac{X - \mu}{\sigma} $$
Where $Z \sim N(0, 1)$ (standard normal distribution).
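The coverage figures in Section 2.3 follow directly from the standard normal CDF. The sketch below recomputes them with Python's standard library; the function names `standardize` and `coverage` are illustrative choices, not part of any SPC package.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist(mu=0.0, sigma=1.0)

def standardize(x, mu, sigma):
    """Z = (X - mu) / sigma."""
    return (x - mu) / sigma

def coverage(k):
    """Probability that a normal variable lies within +/- k sigma of its mean."""
    return STD_NORMAL.cdf(k) - STD_NORMAL.cdf(-k)

for k in range(1, 7):
    print(f"+/-{k} sigma: {100 * coverage(k):.7f}%")
```

The printed percentages should agree with the table to the precision shown there.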
3. Control Chart Mathematics
3.1 Shewhart X̄-R Charts
The workhorse of semiconductor SPC for monitoring subgroup data.
X̄ Chart (Monitoring Process Mean)
$$ \begin{aligned} CL &= \bar{\bar{X}} \quad \text{(grand mean of subgroup means)} \\ UCL &= \bar{\bar{X}} + A_2 \bar{R} \\ LCL &= \bar{\bar{X}} - A_2 \bar{R} \end{aligned} $$
Theoretical basis:
$$ UCL / LCL = \mu \pm \frac{3\sigma}{\sqrt{n}} $$
R Chart (Monitoring Process Spread)
$$ \begin{aligned} CL &= \bar{R} \\ UCL &= D_4 \bar{R} \\ LCL &= D_3 \bar{R} \end{aligned} $$
Control Chart Constants
| $n$ | $A_2$ | $D_3$ | $D_4$ | $d_2$ |
|---|---|---|---|---|
| 2 | 1.880 | 0 | 3.267 | 1.128 |
| 3 | 1.023 | 0 | 2.574 | 1.693 |
| 4 | 0.729 | 0 | 2.282 | 2.059 |
| 5 | 0.577 | 0 | 2.114 | 2.326 |
| 6 | 0.483 | 0 | 2.004 | 2.534 |
| 7 | 0.419 | 0.076 | 1.924 | 2.704 |
| 8 | 0.373 | 0.136 | 1.864 | 2.847 |
| 9 | 0.337 | 0.184 | 1.816 | 2.970 |
| 10 | 0.308 | 0.223 | 1.777 | 3.078 |
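As a minimal sketch of how the X̄-R limits are computed from subgroup data, the example below uses the $n = 5$ constants from the table above. The thickness readings are hypothetical, and the function name `xbar_r_limits` is an illustrative choice.

```python
from statistics import mean

# Constants for subgroup size n = 5, taken from the table above.
A2, D3, D4 = 0.577, 0.0, 2.114

def xbar_r_limits(subgroups):
    """Return (LCL, CL, UCL) for the X-bar chart and for the R chart."""
    xbars = [mean(g) for g in subgroups]
    ranges = [max(g) - min(g) for g in subgroups]
    grand_mean = mean(xbars)
    r_bar = mean(ranges)
    xbar_chart = (grand_mean - A2 * r_bar, grand_mean, grand_mean + A2 * r_bar)
    r_chart = (D3 * r_bar, r_bar, D4 * r_bar)
    return xbar_chart, r_chart

# Hypothetical film-thickness readings (nm), five sites per wafer.
data = [
    [100.2, 99.8, 100.1, 100.0, 99.9],
    [100.4, 100.0, 99.7, 100.1, 100.3],
    [99.6, 100.2, 100.0, 99.9, 100.3],
]
xbar_chart, r_chart = xbar_r_limits(data)
print("X-bar chart (LCL, CL, UCL):", xbar_chart)
print("R chart     (LCL, CL, UCL):", r_chart)
```

Three subgroups keep the example short; in practice limits are usually estimated from 20 to 25 or more subgroups.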
3.2 Individuals-Moving Range (I-MR) Charts
Common in semiconductor manufacturing when rational subgrouping isn't practical (e.g., one measurement per wafer lot).
Individuals Chart
$$ \begin{aligned} CL &= \bar{X} \\ UCL &= \bar{X} + 3 \cdot \frac{\overline{MR}}{d_2} \\ LCL &= \bar{X} - 3 \cdot \frac{\overline{MR}}{d_2} \end{aligned} $$
Where $d_2 = 1.128$ for moving range of span 2.
Moving Range Chart
$$ \begin{aligned} CL &= \overline{MR} \\ UCL &= D_4 \cdot \overline{MR} = 3.267 \cdot \overline{MR} \\ LCL &= D_3 \cdot \overline{MR} = 0 \end{aligned} $$
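The I-MR formulas above can be sketched as follows, assuming a moving range of span 2. The lot-level CD values are hypothetical and `imr_limits` is an illustrative name.

```python
from statistics import mean

D2_SPAN2 = 1.128  # d2 for a moving range of span 2
D4_SPAN2 = 3.267  # D4 for a moving range of span 2

def imr_limits(x):
    """Return (LCL, CL, UCL) for the individuals chart and the MR chart."""
    moving_ranges = [abs(b - a) for a, b in zip(x, x[1:])]
    mr_bar = mean(moving_ranges)
    center = mean(x)
    sigma_hat = mr_bar / D2_SPAN2  # short-term sigma estimate
    individuals = (center - 3 * sigma_hat, center, center + 3 * sigma_hat)
    mr_chart = (0.0, mr_bar, D4_SPAN2 * mr_bar)
    return individuals, mr_chart

# Hypothetical critical-dimension readings (nm), one per lot.
lot_cd = [45.1, 44.8, 45.3, 45.0, 44.9, 45.2]
individuals, mr_chart = imr_limits(lot_cd)
print("Individuals (LCL, CL, UCL):", individuals)
print("Moving range (LCL, CL, UCL):", mr_chart)
```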
3.3 EWMA Charts (Exponentially Weighted Moving Average)
More sensitive to small, persistent shifts than Shewhart charts.
EWMA Statistic
$$ EWMA_t = \lambda x_t + (1-\lambda) EWMA_{t-1} $$
Where:
- $\lambda$ = smoothing parameter ($0 < \lambda \leq 1$, typically 0.05–0.25)
- $EWMA_0 = \mu_0$ (target mean)
Control Limits (Time-Varying)
$$ UCL/LCL = \mu_0 \pm L\sigma\sqrt{\frac{\lambda}{2-\lambda}\left[1-(1-\lambda)^{2t}\right]} $$
Asymptotic Control Limits
As $t \to \infty$:
$$ UCL/LCL = \mu_0 \pm L\sigma\sqrt{\frac{\lambda}{2-\lambda}} $$
Typical parameters:
- $\lambda = 0.10$ to $0.20$
- $L = 2.5$ to $3.0$
EWMA Variance
$$ Var(EWMA_t) = \sigma^2 \cdot \frac{\lambda}{2-\lambda} \left[1 - (1-\lambda)^{2t}\right] $$
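A minimal sketch of the EWMA recursion with the time-varying limits above, assuming $\mu_0$ and $\sigma$ are known. The data, the default $\lambda = 0.2$ and $L = 3$, and the name `ewma_chart` are illustrative assumptions.

```python
import math

def ewma_chart(x, mu0, sigma, lam=0.2, L=3.0):
    """Return (ewma, lcl, ucl, out_of_control) for each observation in x."""
    rows = []
    z = mu0  # EWMA_0 = mu_0
    for t, xt in enumerate(x, start=1):
        z = lam * xt + (1 - lam) * z
        half_width = L * sigma * math.sqrt(
            lam / (2 - lam) * (1 - (1 - lam) ** (2 * t))
        )
        lcl, ucl = mu0 - half_width, mu0 + half_width
        rows.append((z, lcl, ucl, z < lcl or z > ucl))
    return rows

# Hypothetical data: five in-control runs, then a sustained +2 sigma shift.
readings = [10.0] * 5 + [12.0] * 10
rows = ewma_chart(readings, mu0=10.0, sigma=1.0)
first_signal = next(i for i, r in enumerate(rows) if r[3])
print("first out-of-control index:", first_signal)
```

The limits widen over the first few observations and then settle at the asymptotic value, which is why the time-varying form matters mainly at chart start-up.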
3.4 CUSUM Charts (Cumulative Sum)
Accumulates deviations from target—excellent for detecting sustained shifts.
Tabular CUSUM
Upper CUSUM (detecting upward shifts):
$$ C_t^+ = \max\left[0, x_t - (\mu_0 + K) + C_{t-1}^+\right] $$
Lower CUSUM (detecting downward shifts):
$$ C_t^- = \max\left[0, (\mu_0 - K) - x_t + C_{t-1}^-\right] $$
Signal condition:
$$ C_t^+ > H \quad \text{or} \quad C_t^- > H $$
Where:
- $K$ = allowable slack (reference value), typically $0.5\sigma$
- $H$ = decision interval, typically $4\sigma$ to $5\sigma$
- $C_0^+ = C_0^- = 0$
Standardized Form
For standardized observations $z_t = (x_t - \mu_0)/\sigma$:
$$ \begin{aligned} S_t^+ &= \max(0, z_t - k + S_{t-1}^+) \\ S_t^- &= \max(0, -z_t - k + S_{t-1}^-) \end{aligned} $$
With $k = 0.5$ (half of a $1\sigma$ shift, the shift size the chart is tuned to detect) and $h = 4$ or $5$.
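The tabular CUSUM recursions can be sketched as below, with $K = k\sigma$ and $H = h\sigma$. The data and the name `tabular_cusum` are illustrative assumptions.

```python
def tabular_cusum(x, mu0, sigma, k=0.5, h=5.0):
    """Return (C_plus, C_minus, signal) for each observation in x."""
    K, H = k * sigma, h * sigma
    c_plus = c_minus = 0.0  # C_0^+ = C_0^- = 0
    rows = []
    for xt in x:
        c_plus = max(0.0, xt - (mu0 + K) + c_plus)
        c_minus = max(0.0, (mu0 - K) - xt + c_minus)
        rows.append((c_plus, c_minus, c_plus > H or c_minus > H))
    return rows

# Hypothetical data: three in-control runs, then a sustained +1 sigma shift.
readings = [10.0] * 3 + [11.0] * 12
rows = tabular_cusum(readings, mu0=10.0, sigma=1.0)
first_signal = next(i for i, r in enumerate(rows) if r[2])
print("first signal at index:", first_signal)
```

With a $1\sigma$ shift and $k = 0.5$, the upper sum grows by $0.5\sigma$ per run, so it takes eleven shifted runs to exceed $H = 5\sigma$ in this noise-free example.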
4. Process Capability Indices
4.1 Cp (Potential Capability)
Measures the ratio of specification width to process spread:
$$ C_p = \frac{USL - LSL}{6\sigma} $$
Where:
- $USL$ = Upper Specification Limit
- $LSL$ = Lower Specification Limit
- $\sigma$ = process standard deviation
Interpretation:
- $C_p$ does not account for centering
- Represents potential capability if process were perfectly centered
4.2 Cpk (Actual Capability)
Accounts for off-center processes:
$$ C_{pk} = \min\left[\frac{USL - \mu}{3\sigma}, \frac{\mu - LSL}{3\sigma}\right] $$
Alternative formulation:
$$ C_{pk} = C_p(1 - k) $$
Where $k = \frac{|T - \mu|}{(USL - LSL)/2}$ and $T$ is the target (specification midpoint).
Key property: $C_{pk} \leq C_p$ always.
4.3 Cpm (Taguchi Capability Index)
Penalizes deviation from target $T$:
$$ C_{pm} = \frac{USL - LSL}{6\sqrt{\sigma^2 + (\mu - T)^2}} $$
Or equivalently:
$$ C_{pm} = \frac{C_p}{\sqrt{1 + \left(\frac{\mu - T}{\sigma}\right)^2}} $$
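The three indices defined so far can be compared on one set of numbers. This is a sketch that takes $\mu$ and $\sigma$ as known; the function name `capability` and the example values are hypothetical, and the target defaults to the specification midpoint.

```python
import math

def capability(mu, sigma, lsl, usl, target=None):
    """Return (Cp, Cpk, Cpm) for a process with mean mu and std dev sigma."""
    if target is None:
        target = (usl + lsl) / 2  # specification midpoint
    cp = (usl - lsl) / (6 * sigma)
    cpk = min((usl - mu) / (3 * sigma), (mu - lsl) / (3 * sigma))
    cpm = (usl - lsl) / (6 * math.sqrt(sigma**2 + (mu - target) ** 2))
    return cp, cpk, cpm

# Hypothetical process: spec 48 to 52, mean 0.5 above the midpoint.
cp, cpk, cpm = capability(mu=50.5, sigma=0.5, lsl=48.0, usl=52.0)
print(f"Cp={cp:.3f}  Cpk={cpk:.3f}  Cpm={cpm:.3f}")
```

Here the off-center mean leaves $C_p$ unchanged but lowers both $C_{pk}$ and $C_{pm}$, and $C_{pk}$ matches $C_p(1-k)$ with $k = 0.25$.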
4.4 Pp and Ppk (Performance Indices)
Same formulas but use overall standard deviation (including between-subgroup variation):
$$ P_p = \frac{USL - LSL}{6s_{overall}} $$
$$ P_{pk} = \min\left[\frac{USL - \bar{x}}{3s_{overall}}, \frac{\bar{x} - LSL}{3s_{overall}}\right] $$
Relationship:
- $C_p, C_{pk}$: Use within-subgroup $\sigma$ (short-term capability)
- $P_p, P_{pk}$: Use overall $s$ (long-term performance)
4.5 Relating Cpk to Defect Rates
| $C_{pk}$ | $\sigma$-level | DPMO | Yield |
|---|---|---|---|
| 0.33 | 1σ | 317,311 | 68.27% |
| 0.67 | 2σ | 45,500 | 95.45% |
| 1.00 | 3σ | 2,700 | 99.73% |
| 1.33 | 4σ | 63 | 99.9937% |
| 1.67 | 5σ | 0.57 | 99.99994% |
| 2.00 | 6σ | 0.002 | 99.9999998% |
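Note: The table assumes a centered, normally distributed process with two-sided specification limits. With the conventional 1.5σ shift allowance, 6σ corresponds to 3.4 DPMO.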
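The DPMO column can be reproduced from normal tail areas. In this sketch `sigma_level` is the distance from the process center to each specification limit in units of $\sigma$ (that is, $3 C_{pk}$ for a centered process) and `shift` is an optional mean offset, also in units of $\sigma$. Both names are illustrative.

```python
from statistics import NormalDist

def dpmo(sigma_level, shift=0.0):
    """Two-sided defects per million for a normal process.

    Spec limits sit at +/- sigma_level from the nominal center;
    the mean is displaced by `shift` (both in units of sigma).
    """
    z = NormalDist()
    p_out = z.cdf(-(sigma_level - shift)) + z.cdf(-(sigma_level + shift))
    return 1e6 * p_out

for level in (1, 2, 3, 4, 5, 6):
    print(f"{level} sigma (Cpk={level / 3:.2f}): {dpmo(level):.4g} DPMO")
print(f"6 sigma with 1.5 sigma shift: {dpmo(6, 1.5):.2f} DPMO")
```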
4.6 Confidence Intervals for Cpk
$$ \hat{C}_{pk} \pm z_{\alpha/2} \sqrt{\frac{1}{9n} + \frac{\hat{C}_{pk}^2}{2(n-1)}} $$
Reliable capability estimates need $n \geq 30$, preferably $n \geq 50$.
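To see how wide this interval is at typical sample sizes, the sketch below evaluates the approximation for an estimated Cpk of 1.33. The function name and the example values are illustrative.

```python
import math
from statistics import NormalDist

def cpk_confidence_interval(cpk_hat, n, confidence=0.95):
    """Approximate two-sided confidence interval for Cpk (normal approximation)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    half_width = z * math.sqrt(1 / (9 * n) + cpk_hat**2 / (2 * (n - 1)))
    return cpk_hat - half_width, cpk_hat + half_width

for n in (30, 50, 100):
    lo, hi = cpk_confidence_interval(1.33, n)
    print(f"n={n:3d}: Cpk in [{lo:.3f}, {hi:.3f}]")

lo50, hi50 = cpk_confidence_interval(1.33, 50)
```

Even at $n = 50$ the interval spans roughly 1.05 to 1.61, which is why a point estimate of 1.33 alone does not demonstrate that the process meets a 1.33 requirement.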
5. Variance Components Analysis
5.1 Typical Variance Hierarchy in Semiconductor Manufacturing
$$ \sigma^2_{total} = \sigma^2_{lot} + \sigma^2_{wafer(lot)} + \sigma^2_{site(wafer)} + \sigma^2_{measurement} $$
Each component represents:
- Lot-to-lot ($\sigma^2_{lot}$): Variation between production lots
- Wafer-to-wafer ($\sigma^2_{wafer(lot)}$): Variation between wafers within a lot
- Within-wafer ($\sigma^2_{site(wafer)}$): Variation across measurement sites on a wafer
- Measurement ($\sigma^2_{measurement}$): Gauge/metrology variation
5.2 One-Way ANOVA
Sum of Squares Decomposition
$$ SS_T = SS_B + SS_W $$
Total Sum of Squares:
$$ SS_T = \sum_{i=1}^{k}\sum_{j=1}^{n}(x_{ij} - \bar{x}_{..})^2 $$
Between-Groups Sum of Squares:
$$ SS_B = n\sum_{i=1}^{k}(\bar{x}_{i.} - \bar{x}_{..})^2 $$
Within-Groups Sum of Squares:
$$ SS_W = \sum_{i=1}^{k}\sum_{j=1}^{n}(x_{ij} - \bar{x}_{i.})^2 $$
Mean Squares
$$ \begin{aligned} MS_B &= \frac{SS_B}{k-1} \\ MS_W &= \frac{SS_W}{N-k} \end{aligned} $$
Where $N = kn$ is the total number of observations ($k$ groups of $n$ measurements each).
F-Statistic
$$ F = \frac{MS_B}{MS_W} \sim F_{k-1, N-k} $$
5.3 Variance Component Estimation
From mean squares:
$$ \begin{aligned} \hat{\sigma}^2_{within} &= MS_W \\ \hat{\sigma}^2_{between} &= \frac{MS_B - MS_W}{n} \end{aligned} $$
If $MS_B < MS_W$, set $\hat{\sigma}^2_{between} = 0$.
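The one-way ANOVA estimates can be sketched for a balanced design ($k$ groups of $n$ measurements). The data are hypothetical and the function name `variance_components` is illustrative.

```python
from statistics import mean

def variance_components(groups):
    """Balanced one-way ANOVA: return (MS_B, MS_W, var_within, var_between)."""
    k, n = len(groups), len(groups[0])
    grand = mean(v for g in groups for v in g)
    group_means = [mean(g) for g in groups]
    ss_b = n * sum((m - grand) ** 2 for m in group_means)
    ss_w = sum((v - m) ** 2 for g, m in zip(groups, group_means) for v in g)
    ms_b = ss_b / (k - 1)
    ms_w = ss_w / (k * n - k)  # N - k with N = k * n
    var_within = ms_w
    var_between = max(0.0, (ms_b - ms_w) / n)  # truncate negative estimates at 0
    return ms_b, ms_w, var_within, var_between

# Hypothetical data: three lots, two wafers measured per lot.
lots = [[10.0, 12.0], [14.0, 16.0], [18.0, 20.0]]
ms_b, ms_w, var_within, var_between = variance_components(lots)
print(f"MS_B={ms_b}, MS_W={ms_w}, within={var_within}, between={var_between}")
```

In this example the lot-to-lot component (15) dominates the within-lot component (2), which would point improvement effort at lot-level causes.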
5.4 Nested (Hierarchical) ANOVA
For the nested structure of semiconductor data (sites within wafers within lots):
$$ x_{ijk} = \mu + \alpha_i + \beta_{j(i)} + \varepsilon_{k(ij)} $$
Where:
- $\alpha_i$ = lot effect (random)
- $\beta_{j(i)}$ = wafer effect nested within lot (random)
- $\varepsilon_{k(ij)}$ = site/measurement error
6. Measurement System Analysis (Gauge R&R)
6.1 Variance Decomposition
$$ \sigma^2_{total} = \sigma^2_{part} + \sigma^2_{gauge} $$
$$ \sigma^2_{gauge} = \sigma^2_{repeatability} + \sigma^2_{reproducibility} $$
Where:
- Repeatability (Equipment Variation): Same operator, same equipment, multiple measurements
- Reproducibility (Appraiser Variation): Different operators or equipment
6.2 ANOVA Method for Gauge R&R
Two-Factor Crossed Design
| Source | SS | df | MS | EMS |
|---|---|---|---|---|
| Part (P) | $SS_P$ | $p-1$ | $MS_P$ | $\sigma^2_E + r\sigma^2_{OP} + or\sigma^2_P$ |
| Operator (O) | $SS_O$ | $o-1$ | $MS_O$ | $\sigma^2_E + r\sigma^2_{OP} + pr\sigma^2_O$ |
| P×O | $SS_{PO}$ | $(p-1)(o-1)$ | $MS_{PO}$ | $\sigma^2_E + r\sigma^2_{OP}$ |
| Error (E) | $SS_E$ | $po(r-1)$ | $MS_E$ | $\sigma^2_E$ |
Variance Component Estimates
$$ \begin{aligned} \hat{\sigma}^2_{repeatability} &= MS_E \\ \hat{\sigma}^2_{operator} &= \frac{MS_O - MS_{PO}}{pr} \\ \hat{\sigma}^2_{interaction} &= \frac{MS_{PO} - MS_E}{r} \\ \hat{\sigma}^2_{reproducibility} &= \hat{\sigma}^2_{operator} + \hat{\sigma}^2_{interaction} \\ \hat{\sigma}^2_{part} &= \frac{MS_P - MS_{PO}}{or} \end{aligned} $$
6.3 Key Metrics
Percentage of Total Variation
$$ \%GRR = 100 \times \frac{\sigma_{gauge}}{\sigma_{total}} $$
Or, equivalently, using study variation ($5.15\sigma$ spans 99% of a normal distribution; the multiplier cancels in the ratio):
$$ \%GRR = 100 \times \frac{5.15 \cdot \sigma_{gauge}}{5.15 \cdot \sigma_{total}} $$
Precision-to-Tolerance Ratio (P/T)
$$ P/T = \frac{k \cdot \sigma_{gauge}}{USL - LSL} $$
Where $k = 5.15$ (99%) or $k = 6$ (99.73%).
Number of Distinct Categories (ndc)
$$ ndc = 1.41 \cdot \frac{\sigma_{part}}{\sigma_{gauge}} $$
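Given variance component estimates, the three metrics above reduce to a few lines. This sketch takes the components as inputs rather than running the ANOVA; the values and the name `grr_metrics` are hypothetical.

```python
import math

def grr_metrics(var_repeat, var_reprod, var_part, lsl, usl, k=6.0):
    """Return (%GRR, P/T, ndc) from gauge R&R variance components."""
    var_gauge = var_repeat + var_reprod
    var_total = var_part + var_gauge
    pct_grr = 100 * math.sqrt(var_gauge / var_total)
    p_to_t = k * math.sqrt(var_gauge) / (usl - lsl)
    ndc = 1.41 * math.sqrt(var_part / var_gauge)
    return pct_grr, p_to_t, ndc

# Hypothetical components (nm^2) and a +/- 3 nm tolerance.
pct_grr, p_to_t, ndc = grr_metrics(
    var_repeat=0.03, var_reprod=0.01, var_part=0.96, lsl=-3.0, usl=3.0
)
print(f"%GRR={pct_grr:.1f}%  P/T={p_to_t:.2f}  ndc={ndc:.1f}")
```

Note that %GRR is a ratio of standard deviations, so a gauge contributing 4% of the total variance still yields a %GRR of 20%.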
6.4 Acceptance Criteria
| %GRR | Assessment | Action |
|---|---|---|
| < 10% | Excellent | Acceptable for all applications |
| 10–30% | Acceptable | May be acceptable depending on application |
| > 30% | Unacceptable | Measurement system needs improvement |
| ndc | Assessment |
|---|---|
| ≥ 5 | Acceptable |
| < 5 | Measurement system cannot distinguish parts |
7. Run Rules (Western Electric / Nelson Rules)
7.1 Standard Run Rules
Pattern detection beyond simple control limits:
| Rule | Pattern | Interpretation |
|---|---|---|
| 1 | 1 point beyond ±3σ | Large shift or outlier |
| 2 | 9 consecutive points on same side of CL | Small sustained shift |
| 3 | 6 consecutive points steadily increasing or decreasing | Trend/drift |
| 4 | 14 consecutive points alternating up and down | Systematic oscillation (over-adjustment) |
| 5 | 2 of 3 consecutive points beyond ±2σ (same side) | Shift warning |
| 6 | 4 of 5 consecutive points beyond ±1σ (same side) | Shift warning |
| 7 | 15 consecutive points within ±1σ | Stratification (variation smaller than expected) |
| 8 | 8 consecutive points beyond ±1σ (either side) | Mixture of populations |
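As an illustration of how such rules are checked in software, the sketch below implements Rules 1 and 2 from the table. The function names are illustrative, and the center line and $\sigma$ are assumed known.

```python
def rule1(x, center, sigma):
    """Rule 1: indices of points beyond +/- 3 sigma."""
    return [i for i, v in enumerate(x) if abs(v - center) > 3 * sigma]

def rule2(x, center, run=9):
    """Rule 2: indices at which `run` or more consecutive points
    have fallen on the same side of the center line."""
    hits, streak, side = [], 0, 0
    for i, v in enumerate(x):
        s = (v > center) - (v < center)  # +1 above, -1 below, 0 on the line
        if s != 0 and s == side:
            streak += 1
        else:
            streak = 1 if s != 0 else 0
            side = s
        if streak >= run:
            hits.append(i)
    return hits

print(rule1([0.0, 3.5, -0.2, -3.1], center=0.0, sigma=1.0))
print(rule2([0.1] * 8 + [-0.1] + [0.1] * 9, center=0.0))
```

In the second call the single point below the center line resets the run, so the rule fires only after nine further points above it.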
7.2 Zone Definitions
Control charts are divided into zones:
- Zone A: Between 2σ and 3σ from center line
- Zone B: Between 1σ and 2σ from center line
- Zone C: Within 1σ of center line
7.3 False Alarm Probabilities
| Rule | Probability (per test) |
|---|---|
| Rule 1 (±3σ) | 0.0027 |
| Rule 2 (9 same side) | 0.0039 |
| Rule 3 (6 trending) | 0.0028 |
| Rule 5 (2 of 3 beyond ±2σ, same side) | 0.0031 |
Combined false alarm rate increases when multiple rules are applied.
8. Average Run Length (ARL)
8.1 Definitions
- ARL₀ (In-Control ARL): Average number of samples until false alarm when process is in control (want high)
- ARL₁ (Out-of-Control ARL): Average number of samples to detect a shift (want low)
8.2 Shewhart Chart ARL
For 3σ limits:
$$ ARL_0 = \frac{1}{\alpha} = \frac{1}{0.0027} \approx 370 $$
For detecting a shift of $\delta$ standard deviations:
$$ ARL_1 = \frac{1}{P(\text{signal} | \text{shift})} $$
$$ P(\text{signal}) = 1 - \Phi(3-\delta) + \Phi(-3-\delta) $$
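These two formulas are enough to tabulate Shewhart ARL values for individual observations. A minimal sketch, with `shewhart_arl` as an illustrative name:

```python
from statistics import NormalDist

def shewhart_arl(delta, L=3.0):
    """ARL of a Shewhart individuals chart with +/- L sigma limits
    after a mean shift of delta standard deviations."""
    z = NormalDist()
    p_signal = 1 - z.cdf(L - delta) + z.cdf(-L - delta)
    return 1 / p_signal

for delta in (0.0, 0.5, 1.0, 2.0, 3.0):
    print(f"shift {delta:.1f} sigma: ARL = {shewhart_arl(delta):.1f}")
```

Setting `delta=0` recovers the in-control value of about 370.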
8.3 Comparison of Chart Performance
| Shift ($\delta\sigma$) | Shewhart ARL₁ | EWMA ARL₁ ($\lambda$=0.1) | CUSUM ARL₁ |
|---|---|---|---|
| 0.25 | 281 | 66 | 38 |
| 0.50 | 155 | 26 | 17 |
| 0.75 | 81 | 15 | 10 |
| 1.00 | 44 | 10 | 8 |
| 1.50 | 15 | 5 | 5 |
| 2.00 | 6 | 4 | 4 |
| 3.00 | 2 | 2 | 2 |
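Key insight: EWMA and CUSUM are far superior for detecting small shifts (below about $1.5\sigma$). The EWMA and CUSUM columns depend on the chart design ($L$ for EWMA, $k$ and $h$ for CUSUM), which the table does not specify, so treat those values as indicative.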
8.4 ARL Formulas for CUSUM
Approximate ARL for CUSUM detecting shift of size $\delta$:
$$ ARL_1 \approx \frac{e^{-2\Delta b} + 2\Delta b - 1}{2\Delta^2} $$
Where:
- $\Delta = \delta - k$ (excess over reference value)
- $b = h + 1.166$ (adjusted decision interval)
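The approximation above, commonly attributed to Siegmund, applies to a one-sided CUSUM on standardized data. The sketch below evaluates it and handles the $\Delta = 0$ case with the limiting value $b^2$. Combining the two sides through reciprocals is a standard approach but is an addition here, as are the function names.

```python
import math

def cusum_arl_one_sided(delta, k=0.5, h=5.0):
    """Approximate ARL of a one-sided standardized CUSUM for a shift of delta sigma."""
    big_delta = delta - k
    b = h + 1.166
    if abs(big_delta) < 1e-12:
        return b * b  # limit of the formula as Delta -> 0
    return (math.exp(-2 * big_delta * b) + 2 * big_delta * b - 1) / (2 * big_delta**2)

def cusum_arl_two_sided(delta, k=0.5, h=5.0):
    """Combine upper and lower sides: 1/ARL = 1/ARL_plus + 1/ARL_minus."""
    upper = cusum_arl_one_sided(delta, k, h)
    lower = cusum_arl_one_sided(-delta, k, h)
    return 1 / (1 / upper + 1 / lower)

print(f"one-sided ARL, no shift:      {cusum_arl_one_sided(0.0):.1f}")
print(f"two-sided ARL, no shift:      {cusum_arl_two_sided(0.0):.1f}")
print(f"one-sided ARL, 1 sigma shift: {cusum_arl_one_sided(1.0):.2f}")
```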
9. Multivariate SPC
9.1 Why Multivariate?
Semiconductor processes involve many correlated parameters. Univariate charts on correlated variables:
- Increase false alarm rates
- Miss shifts in correlated directions
- Fail to detect process changes that affect multiple parameters simultaneously
9.2 Hotelling's T² Statistic
For $p$ variables measured on a sample of size $n$:
$$ T^2 = n(\bar{\mathbf{x}} - \boldsymbol{\mu}_0)' \mathbf{S}^{-1} (\bar{\mathbf{x}} - \boldsymbol{\mu}_0) $$
Where:
- $\bar{\mathbf{x}}$ = sample mean vector ($p \times 1$)
- $\boldsymbol{\mu}_0$ = target mean vector ($p \times 1$)
- $\mathbf{S}$ = sample covariance matrix ($p \times p$)
9.3 Control Limit for T²
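Phase I (establishing control, individual observations):
$$ UCL = \frac{(m-1)^2}{m} \beta_{\alpha, p/2, (m-p-1)/2} $$
Phase II (monitoring, individual observations):
$$ UCL = \frac{p(m+1)(m-1)}{m(m-p)} F_{\alpha, p, m-p} $$
Where $m$ = number of historical samples and $\beta_{\alpha, a, b}$ is the upper $\alpha$ percentage point of the beta distribution with parameters $a$ and $b$.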
For large $m$:
$$ UCL \approx \chi^2_{\alpha, p} $$
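A two-variable example shows why the multivariate statistic matters. The sketch below hard-codes the $2 \times 2$ inverse to stay within the standard library and uses the large-$m$ chi-square limit, which for $p = 2$ has the closed form $-2\ln\alpha$. The covariance matrix and function names are hypothetical.

```python
import math

def hotelling_t2_2d(x, mu0, S, n=1):
    """T^2 = n (x - mu0)' S^{-1} (x - mu0) for p = 2 variables."""
    d0, d1 = x[0] - mu0[0], x[1] - mu0[1]
    (a, b), (c, d) = S
    det = a * d - b * c
    # S^{-1} = (1/det) * [[d, -b], [-c, a]]
    quad = (d * d0 * d0 - (b + c) * d0 * d1 + a * d1 * d1) / det
    return n * quad

def chi2_ucl_p2(alpha):
    """Upper alpha quantile of chi-square with 2 degrees of freedom."""
    return -2.0 * math.log(alpha)

S = [[1.0, 0.8], [0.8, 1.0]]  # two strongly correlated parameters
mu0 = (0.0, 0.0)
ucl = chi2_ucl_p2(0.05)
t2_with = hotelling_t2_2d((1.0, 1.0), mu0, S)      # moves with the correlation
t2_against = hotelling_t2_2d((1.0, -1.0), mu0, S)  # moves against it
print(f"UCL={ucl:.2f}  T2(1,1)={t2_with:.2f}  T2(1,-1)={t2_against:.2f}")
```

Both points are only $1\sigma$ from target on each axis, so univariate charts would pass them, yet the point that breaks the correlation pattern exceeds the 95% limit.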
9.4 Multivariate EWMA (MEWMA)
$$ \mathbf{Z}_t = \Lambda\mathbf{X}_t + (\mathbf{I} - \Lambda)\mathbf{Z}_{t-1} $$
Where $\Lambda = \text{diag}(\lambda_1, \lambda_2, \ldots, \lambda_p)$.
Statistic:
$$ T^2_t = \mathbf{Z}_t' \boldsymbol{\Sigma}_{\mathbf{Z}_t}^{-1} \mathbf{Z}_t $$
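Covariance of MEWMA (when all $\lambda_i = \lambda$, with $\boldsymbol{\Sigma}$ the covariance matrix of $\mathbf{X}_t$):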
$$ \boldsymbol{\Sigma}_{\mathbf{Z}_t} = \frac{\lambda}{2-\lambda}\left[1 - (1-\lambda)^{2t}\right]\boldsymbol{\Sigma} $$
9.5 Principal Component Analysis (PCA) for SPC
Decompose correlated variables into uncorrelated principal components:
$$ \mathbf{X} = \mathbf{T}\mathbf{P}' + \mathbf{E} $$
Where:
- $\mathbf{T}$ = scores matrix
- $\mathbf{P}$ = loadings matrix
- $\mathbf{E}$ = residuals
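Hotelling's T² in PC space (with $k$ retained components, where $\lambda_i$ here denotes the eigenvalue, i.e. the variance, of the $i$-th component):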
$$ T^2 = \sum_{i=1}^{k} \frac{t_i^2}{\lambda_i} $$
Squared Prediction Error (SPE):
$$ SPE = \mathbf{e}'\mathbf{e} = \sum_{i=k+1}^{p} t_i^2 $$
10. Autocorrelation Handling
10.1 The Problem
Semiconductor tool data often exhibits serial correlation, violating the independence assumption of standard SPC.
Consequences of ignoring autocorrelation:
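- Actual ARL₀ << 370 under positive autocorrelation (excessive false alarms)
- Control limits are too tight, because short-term estimates such as $\overline{MR}/d_2$ understate $\sigma$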
- Patterns in data are misinterpreted
10.2 Autocorrelation Function (ACF)
Population autocorrelation at lag $k$:
$$ \rho_k = \frac{Cov(X_t, X_{t+k})}{Var(X_t)} = \frac{\gamma_k}{\gamma_0} $$
Sample autocorrelation:
$$ r_k = \frac{\sum_{t=1}^{n-k}(x_t - \bar{x})(x_{t+k} - \bar{x})}{\sum_{t=1}^{n}(x_t - \bar{x})^2} $$
10.3 AR(1) Process
The simplest autocorrelated model:
$$ X_t = \phi X_{t-1} + \varepsilon_t $$
Where:
- $\phi$ = autoregressive parameter ($|\phi| < 1$ for stationarity)
- $\varepsilon_t \sim N(0, \sigma^2_\varepsilon)$ (white noise)
Properties:
$$ \begin{aligned} Var(X_t) &= \frac{\sigma^2_\varepsilon}{1 - \phi^2} \\ \rho_k &= \phi^k \end{aligned} $$
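The property $\rho_k = \phi^k$ can be checked numerically. This sketch simulates a zero-mean AR(1) series and computes the sample autocorrelation from Section 10.2. The seed, series length, and function names are arbitrary choices, and the series starts at zero rather than from the stationary distribution.

```python
import random

def simulate_ar1(phi, sigma_eps, n, seed=0):
    """Simulate X_t = phi * X_{t-1} + eps_t with Gaussian white noise."""
    rng = random.Random(seed)
    x, prev = [], 0.0
    for _ in range(n):
        prev = phi * prev + rng.gauss(0.0, sigma_eps)
        x.append(prev)
    return x

def sample_acf(x, lag):
    """Sample autocorrelation r_k at the given lag."""
    n = len(x)
    xbar = sum(x) / n
    num = sum((x[t] - xbar) * (x[t + lag] - xbar) for t in range(n - lag))
    den = sum((v - xbar) ** 2 for v in x)
    return num / den

series = simulate_ar1(phi=0.7, sigma_eps=1.0, n=5000)
for lag in (1, 2, 3):
    print(f"lag {lag}: r={sample_acf(series, lag):.3f}  phi^k={0.7 ** lag:.3f}")
```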
10.4 Solutions for Autocorrelated Data
1. Residual Charts:
- Fit time series model (AR, ARMA, etc.)
- Apply SPC to residuals $\hat{\varepsilon}_t = X_t - \hat{X}_t$
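2. Modified Control Limits (individuals chart, using the marginal AR(1) standard deviation from Section 10.3): $$ UCL/LCL = \mu \pm 3\sigma_X = \mu \pm \frac{3\sigma_\varepsilon}{\sqrt{1 - \phi^2}} $$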
3. EWMA with Adjusted Parameters:
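- Choose $\lambda$ by minimizing the one-step-ahead prediction error on historical data (for an IMA(1,1) disturbance with moving-average parameter $\theta$, the optimum is $\lambda = 1 - \theta$)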
4. Special Cause Charts:
- Designed specifically for autocorrelated processes
11. Run-to-Run (R2R) Process Control
11.1 Basic Concept
Active feedback control layered on SPC—adjust recipe parameters based on measured outputs.
11.2 EWMA Controller
Prediction:
$$ \hat{y}_{t+1} = \lambda y_t + (1-\lambda)\hat{y}_t $$
Recipe Adjustment:
$$ u_{t+1} = u_t - G(\hat{y}_{t+1} - y_{target}) $$
Where:
- $G$ = controller gain
- $u$ = recipe parameter (e.g., etch time, dose)
- $y$ = output measurement (e.g., CD, thickness)
11.3 Double EWMA (for Drifting Processes)
Track both level and slope:
Level estimate:
$$ L_t = \lambda y_t + (1-\lambda)(L_{t-1} + T_{t-1}) $$
Trend estimate:
$$ T_t = \gamma(L_t - L_{t-1}) + (1-\gamma)T_{t-1} $$
Forecast:
$$ \hat{y}_{t+1} = L_t + T_t $$
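The level and trend recursions above can be sketched as follows. The initial values `level0` and `trend0`, the default weights, and the function name are assumptions made for illustration; the source does not specify an initialization.

```python
def double_ewma_forecasts(y, lam=0.3, gamma=0.1, level0=None, trend0=0.0):
    """Return the one-step-ahead forecast made after each observation in y."""
    level = y[0] if level0 is None else level0
    trend = trend0
    forecasts = []
    for yt in y:
        new_level = lam * yt + (1 - lam) * (level + trend)
        trend = gamma * (new_level - level) + (1 - gamma) * trend
        level = new_level
        forecasts.append(level + trend)
    return forecasts

# Hypothetical noise-free drift of 0.5 units per run.
drift = [100.0 + 0.5 * t for t in range(10)]
forecasts = double_ewma_forecasts(drift, level0=99.5, trend0=0.5)
print(forecasts[:3])
```

When the initial level and trend match the drift, the forecasts track a linear ramp exactly, which a single EWMA cannot do because it lags a steady drift.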
11.4 Process Model Integration
For process with known gain $\beta$:
$$ y_t = \alpha + \beta u_t + \varepsilon_t $$
Optimal control:
$$ u_{t+1} = \frac{y_{target} - \hat{\alpha}_{t+1}}{\beta} $$
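A noise-free simulation illustrates how this control law removes an offset. The intercept update $\hat{\alpha}_{t+1} = \lambda(y_t - \beta u_t) + (1-\lambda)\hat{\alpha}_t$ used below is a common EWMA choice but is an assumption here, since the text does not state how $\hat{\alpha}$ is estimated. All numeric values and names are hypothetical.

```python
def simulate_ewma_controller(alpha_true, beta, target, n_runs, lam=0.3, alpha0=0.0):
    """Noise-free run-to-run simulation; returns the output y_t of every run."""
    alpha_hat = alpha0
    outputs = []
    for _ in range(n_runs):
        u = (target - alpha_hat) / beta  # recipe setting for this run
        y = alpha_true + beta * u        # process response (no noise)
        # Assumed EWMA update of the intercept estimate.
        alpha_hat = lam * (y - beta * u) + (1 - lam) * alpha_hat
        outputs.append(y)
    return outputs

outputs = simulate_ewma_controller(alpha_true=2.0, beta=1.5, target=100.0, n_runs=20)
print([round(y, 3) for y in outputs[:5]])
```

In this setting the deviation from target shrinks by a factor of $(1-\lambda)$ per run, so a larger $\lambda$ corrects faster but would also pass more measurement noise into the recipe.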
12. Yield Modeling Mathematics
12.1 Defect Density
$$ D_0 = \frac{\text{Number of defects}}{\text{Area (cm}^2\text{)}} $$
12.2 Poisson Model (Random Defects)
Assumes defects are randomly distributed:
$$ Y = e^{-D_0 A} $$
Where:
- $D_0$ = defect density (defects/cm²)
- $A$ = die area (cm²)
Probability of $k$ defects on a die:
$$ P(k) = \frac{(D_0 A)^k e^{-D_0 A}}{k!} $$
12.3 Murphy's Model (Distributed Defects)
Accounts for defect density variation across wafer:
$$ Y = \left[\frac{1 - e^{-D_0 A}}{D_0 A}\right]^2 $$
12.4 Negative Binomial Model (Clustered Defects)
More realistic for semiconductor wafers, where defects tend to cluster:
$$ Y = \left(1 + \frac{D_0 A}{\alpha}\right)^{-\alpha} $$
Where $\alpha$ = clustering parameter:
- $\alpha \to \infty$: Approaches Poisson (random)
- $\alpha$ small: Highly clustered
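The three yield models above can be compared at the same defect density and die area. This is a sketch with hypothetical inputs; `yield_murphy` assumes $D_0 A > 0$.

```python
import math

def yield_poisson(d0, area):
    """Poisson model: Y = exp(-D0 * A)."""
    return math.exp(-d0 * area)

def yield_murphy(d0, area):
    """Murphy's model: Y = ((1 - exp(-D0 * A)) / (D0 * A))^2, for D0 * A > 0."""
    x = d0 * area
    return ((1 - math.exp(-x)) / x) ** 2

def yield_negative_binomial(d0, area, alpha):
    """Negative binomial model: Y = (1 + D0 * A / alpha)^(-alpha)."""
    return (1 + d0 * area / alpha) ** (-alpha)

d0, area = 0.5, 1.0  # hypothetical: 0.5 defects/cm^2 on a 1 cm^2 die
print(f"Poisson:            {yield_poisson(d0, area):.4f}")
print(f"Murphy:             {yield_murphy(d0, area):.4f}")
print(f"Neg. binomial a=2:  {yield_negative_binomial(d0, area, 2.0):.4f}")
print(f"Neg. binomial a=1e6:{yield_negative_binomial(d0, area, 1e6):.4f}")
```

For the same $D_0 A$, clustering raises predicted yield because defects concentrate on fewer dies, and a very large $\alpha$ reproduces the Poisson result.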
12.5 Seeds Model
$$ Y = \frac{1}{1 + D_0 A_s} $$
Where $A_s$ = sensitive area (the portion of the die area susceptible to defects). This has the same form as the negative binomial model with $\alpha = 1$.
12.6 Yield Loss Calculations
Defect-Limited Yield:
$$ Y_D = e^{-D_0 A} $$
Parametric Yield:
$$ Y_P = \prod_{i} P(\text{parameter}_i \text{ in spec}) $$
Total Yield:
$$ Y_{total} = Y_D \times Y_P $$
13. Spatial Statistics for Wafer Maps
13.1 Radial Uniformity
$$ \sigma_{radial} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}(x_i - f(r_i))^2} $$
Where $f(r_i)$ is the fitted radial profile at radius $r_i$.
13.2 Wafer-Level Variation Components
$$ \sigma^2_{total} = \sigma^2_{W2W} + \sigma^2_{WIW} $$
Within-wafer variation often decomposed:
$$ \sigma^2_{WIW} = \sigma^2_{systematic} + \sigma^2_{random} $$
Where:
- Systematic WIW: Modeled and corrected (radial, azimuthal patterns)
- Random WIW: Inherent noise
13.3 Spatial Correlation Function
For locations $\mathbf{s}_i$ and $\mathbf{s}_j$:
$$ C(h) = Cov(X(\mathbf{s}_i), X(\mathbf{s}_j)) $$
Where $h = \|\mathbf{s}_i - \mathbf{s}_j\|$ (distance between points).
Variogram:
$$ \gamma(h) = \frac{1}{2}Var[X(\mathbf{s}_i) - X(\mathbf{s}_j)] $$
13.4 Common Wafer Signatures
Mathematical models for common spatial patterns:
Radial (bowl/dome):
$$ f(r) = a_0 + a_1 r + a_2 r^2 $$
Azimuthal:
$$ f(\theta) = b_0 + b_1 \cos(\theta) + b_2 \sin(\theta) $$
Combined:
$$ f(r, \theta) = \sum_{n,m} a_{nm} Z_n^m(r, \theta) $$
Where $Z_n^m$ are Zernike polynomials.
14. Practical Implementation Considerations
14.1 Sample Size Effects
Uncertainty in estimated standard deviation:
$$ SE(\hat{\sigma}) \approx \frac{\sigma}{\sqrt{2(n-1)}} $$
For reliable capability estimates:
- Minimum: $n \geq 30$
- Preferred: $n \geq 50$
- For critical processes: $n \geq 100$
14.2 Confidence Interval for σ
$$ \sqrt{\frac{(n-1)s^2}{\chi^2_{\alpha/2, n-1}}} \leq \sigma \leq \sqrt{\frac{(n-1)s^2}{\chi^2_{1-\alpha/2, n-1}}} $$
14.3 Rational Subgrouping
Principles:
- Subgroups should capture short-term (within) variation
- Between-subgroup variation captures long-term drift
- Subgroup size $n$ typically 3–5 for continuous data
In semiconductor manufacturing:
- Subgroup = wafers from same lot, run, or time window
- Site-to-site variation often treated as within-subgroup
14.4 Control Limit Estimation
Using Range Method:
$$ \hat{\sigma} = \frac{\bar{R}}{d_2} $$
Using Sample Standard Deviation:
$$ \hat{\sigma} = \frac{\bar{s}}{c_4} $$
Where $c_4$ is the unbiasing constant for standard deviation.
14.5 Short-Run SPC
For limited data (new process, low volume):
Z-MR charts using target:
$$ Z_i = \frac{x_i - T}{\sigma_0} $$
Q-charts (self-starting):
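$$ Q_i = \Phi^{-1}\left(F_{i-1}\left(\frac{x_i - \bar{x}_{i-1}}{s_{i-1}\sqrt{1 + 1/(i-1)}}\right)\right) $$
Where $\bar{x}_{i-1}$ and $s_{i-1}$ are the mean and standard deviation of the first $i-1$ observations, and $F_{i-1}$ is the Student's $t$ CDF based on those observations ($i-2$ degrees of freedom, so $i \geq 3$).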
15. Key Mathematical Relationships
Quick Reference Table
| Concept | Core Mathematics |
|---|---|
| Control Limits | $\mu \pm \frac{3\sigma}{\sqrt{n}}$ |
| Cp | $\frac{USL - LSL}{6\sigma}$ |
| Cpk | $\min\left[\frac{USL-\mu}{3\sigma}, \frac{\mu-LSL}{3\sigma}\right]$ |
| EWMA | $\lambda x_t + (1-\lambda)EWMA_{t-1}$ |
| CUSUM | $\max[0, x_t - (\mu_0 + K) + C_{t-1}]$ |
| Hotelling's T² | $n(\bar{\mathbf{x}}-\boldsymbol{\mu})'S^{-1}(\bar{\mathbf{x}}-\boldsymbol{\mu})$ |
| Gauge R&R | $\sigma^2_{total} = \sigma^2_{part} + \sigma^2_{gauge}$ |
| Yield (Poisson) | $Y = e^{-D_0 A}$ |
| ARL₀ (3σ) | $\frac{1}{0.0027} \approx 370$ |
| AR(1) Variance | $\frac{\sigma^2_\varepsilon}{1-\phi^2}$ |
Decision Guide: Which Chart to Use?
| Situation | Recommended Chart |
|---|---|
| Standard monitoring, subgroups | X̄-R or X̄-S |
| Individual measurements | I-MR |
| Detect small shifts ($< 1.5\sigma$) | EWMA or CUSUM |
| Multiple correlated parameters | Hotelling's T² or MEWMA |
| Autocorrelated data | Residual charts or modified EWMA |
| Short production runs | Q-charts or Z-MR |
Critical Success Factors
1. Validate measurement system first (Gauge R&R < 10%)
2. Ensure rational subgrouping captures meaningful variation
3. Check for autocorrelation before applying standard charts
4. Use appropriate capability indices (Cpk vs Ppk)
5. Decompose variance to target improvement efforts
6. Match chart sensitivity to required detection speed
Control Chart Constant Tables
Constants for X̄, R, and S Charts
| $n$ | $A_2$ | $A_3$ | $d_2$ | $d_3$ | $D_3$ | $D_4$ | $c_4$ | $B_3$ | $B_4$ |
|---|---|---|---|---|---|---|---|---|---|
| 2 | 1.880 | 2.659 | 1.128 | 0.853 | 0 | 3.267 | 0.7979 | 0 | 3.267 |
| 3 | 1.023 | 1.954 | 1.693 | 0.888 | 0 | 2.574 | 0.8862 | 0 | 2.568 |
| 4 | 0.729 | 1.628 | 2.059 | 0.880 | 0 | 2.282 | 0.9213 | 0 | 2.266 |
| 5 | 0.577 | 1.427 | 2.326 | 0.864 | 0 | 2.114 | 0.9400 | 0 | 2.089 |
| 6 | 0.483 | 1.287 | 2.534 | 0.848 | 0 | 2.004 | 0.9515 | 0.030 | 1.970 |
| 7 | 0.419 | 1.182 | 2.704 | 0.833 | 0.076 | 1.924 | 0.9594 | 0.118 | 1.882 |
| 8 | 0.373 | 1.099 | 2.847 | 0.820 | 0.136 | 1.864 | 0.9650 | 0.185 | 1.815 |
| 9 | 0.337 | 1.032 | 2.970 | 0.808 | 0.184 | 1.816 | 0.9693 | 0.239 | 1.761 |
| 10 | 0.308 | 0.975 | 3.078 | 0.797 | 0.223 | 1.777 | 0.9727 | 0.284 | 1.716 |
Standard Normal Distribution Critical Values
| Confidence | $z_{\alpha/2}$ |
|---|---|
| 90% | 1.645 |
| 95% | 1.960 |
| 99% | 2.576 |
| 99.73% | 3.000 |
Source: ChipFoundryServices