c-v curve,metrology
Capacitance-voltage characteristic.
Capacitance-voltage profiling determines dopant concentration depth profiles in semiconductors.
Optimization-based attack finding minimal perturbations.
Carlini & Wagner attack optimizes adversarial perturbations using specialized loss functions.
Standard for content authenticity metadata.
Early 3D CNN for video.
IBM flip-chip technology.
Categorical distributional RL.
Categorical DQN represents the return distribution over a fixed set of discrete atoms, improving performance through distributional RL.
Cache eviction policies determine which entries to remove when storage fills.
Cache hit rate measures the percentage of requests served from cache rather than recomputed.
Percentage of requests served from cache.
Remove stale entries from cache.
Cache warming preloads frequently needed items, improving initial response times.
Store frequent queries.
Store results to avoid recomputation.
Cache frequent LLM responses to reduce cost and latency. Use semantic similarity for cache lookup and Redis or Memcached for storage.
Create 3D CAD models.
Separate class token attention.
Perform exact arithmetic.
Calibrated recommendations ensure recommendation distributions match user preference distributions.
Match user's preference distribution.
Fit simulation parameters to match real data.
Document proving measurement accuracy.
Relationship between signal and concentration.
Calibration prompting adjusts output probabilities to reduce label bias.
Check calibration remains valid.
Calibration ensures predicted probabilities match actual outcome frequencies.
Adjust tool measurements to match known standards.
Calibration: predicted probabilities match actual frequencies. Important for decision-making.
Measuring tool for length and thickness.
Sentinels indicating degradation.
Gradually roll out new model to small percentage of traffic.
Candle is a Rust ML framework by Hugging Face. Fast, with minimal dependencies.
Condition on edge detection.
Find correlated components.
Find correlated components across views.
Cantilever probes use spring-loaded needles in probe cards that deflect upon contact, providing reliable electrical connections during wafer testing.
Bond cap for protection.
Corrective and Preventive Action systems manage nonconformance investigation and resolution.
# Semiconductor Etch Process Capability Mathematics
## 1. Fundamental Capability Indices
### 1.1 Basic Statistical Measures
- **Sample Mean ($\bar{x}$):**
$$
\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i
$$
- **Sample Standard Deviation ($s$):**
$$
s = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2}
$$
### 1.2 Process Capability (Cp)
The **potential capability** measures the process spread relative to specification width:
$$
C_p = \frac{USL - LSL}{6\sigma}
$$
Where:
- $USL$ = Upper Specification Limit
- $LSL$ = Lower Specification Limit
- $\sigma$ = Process standard deviation
**Interpretation:**
- $C_p = 1.0$ means the process $\pm 3\sigma$ exactly fills the spec window
- Higher $C_p$ indicates greater potential capability
### 1.3 Process Capability Index (Cpk)
The **actual capability** accounts for process centering:
$$
C_{pk} = \min\left(\frac{USL - \mu}{3\sigma}, \frac{\mu - LSL}{3\sigma}\right)
$$
**Key relationship:**
- $C_{pk} \leq C_p$ (always)
- $C_{pk} = C_p$ only when process is perfectly centered
### 1.4 Taguchi Capability Index (Cpm)
Penalizes deviation from target $T$, not merely being within spec:
$$
C_{pm} = \frac{USL - LSL}{6\sqrt{\sigma^2 + (\mu - T)^2}}
$$
### 1.5 Combined Index (Cpkm)
$$
C_{pkm} = \frac{C_{pk}}{\sqrt{1 + \left(\frac{\mu - T}{\sigma}\right)^2}}
$$
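The four indices above can be computed side by side. The following is a minimal sketch that substitutes the sample mean and sample standard deviation from Section 1.1 for $\mu$ and $\sigma$; the function name, the readings, and the spec limits are illustrative assumptions, not values from a real process.

```python
import math
import statistics


def capability_indices(data, lsl, usl, target):
    """Return Cp, Cpk, Cpm and Cpkm, estimating mu and sigma from the sample."""
    mu = statistics.mean(data)
    sigma = statistics.stdev(data)  # n-1 denominator, as in Section 1.1
    cp = (usl - lsl) / (6 * sigma)
    cpk = min((usl - mu) / (3 * sigma), (mu - lsl) / (3 * sigma))
    cpm = (usl - lsl) / (6 * math.sqrt(sigma**2 + (mu - target) ** 2))
    cpkm = cpk / math.sqrt(1 + ((mu - target) / sigma) ** 2)
    return {"Cp": cp, "Cpk": cpk, "Cpm": cpm, "Cpkm": cpkm}


# Hypothetical etch-depth readings in nm (illustrative only)
depths = [99.0, 100.0, 101.0, 102.0, 103.0]
idx = capability_indices(depths, lsl=94.0, usl=106.0, target=100.0)
for name, value in idx.items():
    print(f"{name} = {value:.3f}")
```

Because the sample mean (101 nm) sits above the target, $C_{pk}$, $C_{pm}$, and $C_{pkm}$ all fall below $C_p$. Five readings are far too few for a real capability study; the small sample only keeps the arithmetic easy to follow.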
### 1.6 Industry Targets for Semiconductor Etch
| Cpk Value | Sigma Level | Defect Rate | Typical Application |
|:---------:|:-----------:|:-----------:|:-------------------:|
| 1.00 | 3σ | 2,700 ppm | Minimum acceptable |
| 1.33 | 4σ | 63 ppm | Standard processes |
| 1.67 | 5σ | 0.57 ppm | Critical dimensions |
| 2.00 | 6σ | 0.002 ppm | Advanced nodes |
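The defect rates in this table follow from the normal tail area, assuming a centered, two-sided specification with no mean shift. As a rough check, the sketch below recomputes them; the function name is an assumption, and the exact sigma levels (for example $C_{pk} = 4/3$ rather than the rounded 1.33) are used so the output lines up with the table.

```python
import math


def ppm_out_of_spec(cpk):
    """Two-sided defect rate (ppm) for a centered normal process with Cp == Cpk."""
    z = 3.0 * cpk                               # distance from mean to each limit, in sigmas
    one_tail = 0.5 * math.erfc(z / math.sqrt(2))  # P(Z > z) for a standard normal
    return 2.0 * one_tail * 1e6


for sigma_level in (3, 4, 5, 6):
    cpk = sigma_level / 3
    print(f"{sigma_level} sigma (Cpk = {cpk:.2f}): {ppm_out_of_spec(cpk):.3g} ppm")
```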
## 2. Etch-Specific Uniformity Mathematics
### 2.1 Within-Wafer Uniformity (WIW)
- **Range-based method:**
$$
\%U_{WIW} = \frac{X_{max} - X_{min}}{2 \cdot \bar{X}} \times 100\%
$$
- **Standard deviation-based method (preferred):**
$$
\%U_{1\sigma} = \frac{s}{\bar{X}} \times 100\%
$$
- **Typical target:** $<1\%$ $(1\sigma)$ uniformity for etch rate
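Both uniformity definitions can be evaluated from the same set of site measurements. The sketch below is illustrative only: the function name and the five etch-rate readings are assumptions.

```python
import statistics


def wiw_uniformity(site_values):
    """Return (range-based %U, 1-sigma %U) for one wafer's site measurements."""
    mean = statistics.mean(site_values)
    range_pct = (max(site_values) - min(site_values)) / (2 * mean) * 100
    sigma_pct = statistics.stdev(site_values) / mean * 100
    return range_pct, sigma_pct


# Hypothetical etch rates (nm/min) at five sites on one wafer
rates = [498.0, 502.0, 500.0, 497.0, 503.0]
range_pct, sigma_pct = wiw_uniformity(rates)
print(f"range-based: {range_pct:.2f}%  1-sigma: {sigma_pct:.2f}%")
```

The range-based figure depends only on the two extreme sites, so a single outlier site can dominate it, which is one reason the standard-deviation form is often preferred.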
### 2.2 Wafer-to-Wafer Uniformity (WtW)
$$
\%U_{WtW} = \frac{s_{\text{wafer means}}}{\bar{X}_{\text{overall}}} \times 100\%
$$
### 2.3 Total Variance Decomposition
Via nested ANOVA:
$$
\sigma^2_{\text{total}} = \sigma^2_{WIW} + \sigma^2_{WtW} + \sigma^2_{LtL} + \sigma^2_{TtT}
$$
Where:
- $\sigma^2_{WIW}$ = Within-Wafer variance
- $\sigma^2_{WtW}$ = Wafer-to-Wafer variance
- $\sigma^2_{LtL}$ = Lot-to-Lot variance
- $\sigma^2_{TtT}$ = Tool-to-Tool (chamber-to-chamber) variance
## 3. Critical Dimension (CD) Control
### 3.1 CD Uniformity
$$
CD_{\text{uniformity}} = \frac{CD_{max} - CD_{min}}{CD_{target}} \times 100\%
$$
### 3.2 Etch Bias
$$
\text{Etch Bias} = CD_{\text{after etch}} - CD_{\text{after litho}}
$$
For a nominally anisotropic etch whose sidewalls deviate from vertical by an undercut angle $\theta$:
$$
\Delta CD = 2 \cdot d \cdot \tan(\theta)
$$
Where:
- $d$ = etch depth
- $\theta$ = undercut angle
- For ideal anisotropic etch: $\theta = 0 \Rightarrow \Delta CD = 0$
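As a worked instance of the $\Delta CD$ relation, the sketch below evaluates it for an assumed etch depth and a few undercut angles. The function name and the numbers are hypothetical.

```python
import math


def delta_cd(depth_nm, undercut_deg):
    """CD change from sloped sidewalls: 2 * d * tan(theta), both sidewalls counted."""
    return 2.0 * depth_nm * math.tan(math.radians(undercut_deg))


depth = 100.0  # hypothetical etch depth in nm
for angle in (0.0, 1.0, 2.0):
    print(f"theta = {angle:.0f} deg -> delta CD = {delta_cd(depth, angle):.2f} nm")
```

Under these assumed numbers, a 2° undercut on a 100 nm deep feature already shifts the CD by roughly 7 nm, which shows how sensitive the bias is to small sidewall angles.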
### 3.3 Iso-Dense Bias (IDB)
$$
IDB = CD_{\text{isolated}} - CD_{\text{dense}}
$$
**Capability for IDB:**
$$
C_{pk,IDB} = \min\left(\frac{IDB_{USL} - \overline{IDB}}{3s_{IDB}}, \frac{\overline{IDB} - IDB_{LSL}}{3s_{IDB}}\right)
$$
### 3.4 Line Edge Roughness (LER) / Line Width Roughness (LWR)
- **LER Definition:**
$$
LER = 3\sigma_{\text{edge position}}
$$
- **LWR Definition:**
$$
LWR = 3\sigma_{\text{line width}}
$$
- **One-sided capability (upper limit only):**
$$
C_{pk,LER} = \frac{USL_{LER} - \overline{LER}}{3s_{LER}}
$$
## 4. Selectivity Mathematics
### 4.1 Basic Selectivity Definition
$$
\text{Selectivity} = \frac{ER_{\text{target material}}}{ER_{\text{mask or stop layer}}}
$$
### 4.2 Selectivity Capability (One-Sided)
$$
C_{pk,sel} = \frac{\overline{Sel} - LSL_{Sel}}{3s_{Sel}}
$$
**Note:** Higher selectivity is generally better, so only a lower specification limit applies and the capability index is one-sided.
### 4.3 Common Selectivity Requirements
| Etch Type | Material System | Typical Selectivity |
|:----------|:----------------|:-------------------:|
| SAC Etch | Oxide:Nitride | >30:1 |
| Gate Etch | Poly-Si:Oxide | >50:1 |
| Metal Etch | Al:Resist | >5:1 |
| Via Etch | Oxide:TiN | >20:1 |
## 5. Variance Component Analysis
### 5.1 Mixed-Effects Model
$$
X_{ijkl} = \mu + W_i + L_j + T_k + S_{l(ijk)} + \epsilon_{ijkl}
$$
Where:
- $\mu$ = Grand mean
- $W_i$ = Wafer random effect
- $L_j$ = Lot random effect
- $T_k$ = Tool/chamber random effect
- $S_{l(ijk)}$ = Site (within-wafer) effect
- $\epsilon_{ijkl}$ = Residual measurement error
### 5.2 Variance Component Estimation
Via REML (Restricted Maximum Likelihood):
$$
\hat{\sigma}^2_{\text{total}} = \hat{\sigma}^2_W + \hat{\sigma}^2_L + \hat{\sigma}^2_T + \hat{\sigma}^2_S + \hat{\sigma}^2_\epsilon
$$
### 5.3 Percent Contribution
$$
\%\text{Contribution}_i = \frac{\hat{\sigma}^2_i}{\hat{\sigma}^2_{\text{total}}} \times 100\%
$$
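Once the variance components have been estimated, the percent contribution is simple arithmetic. A minimal sketch follows; the component values are made-up numbers standing in for REML output, and the function name is an assumption.

```python
def percent_contribution(components):
    """Map each variance component to its share of the total variance, in percent."""
    total = sum(components.values())
    return {name: 100.0 * var / total for name, var in components.items()}


# Hypothetical variance estimates (nm^2); real values would come from a REML fit
estimates = {"wafer": 0.40, "lot": 0.30, "tool": 0.20, "site": 0.08, "residual": 0.02}
shares = percent_contribution(estimates)
for name, pct in sorted(shares.items(), key=lambda kv: -kv[1]):
    print(f"{name:>8}: {pct:5.1f}%")
```

Contributions are taken on variances rather than standard deviations because only variances add up to the total.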
## 6. Response Surface Modeling for Etch
### 6.1 Second-Order Polynomial Model
$$
ER = \beta_0 + \sum_{i}\beta_i x_i + \sum_{i}\beta_{ii}x_i^2 + \sum_{i<j}\beta_{ij}x_i x_j + \epsilon
$$
Capability control limits what models can do, preventing misuse while enabling beneficial uses.
Capability elicitation discovers the full extent of model abilities through systematic testing.
Capability with pass/fail data.
Scaling without improvement.
Capability study duration must be sufficient to capture representative variation.
Test specific linguistic capabilities.
# Semiconductor Manufacturing Process SPC: Statistical Process Control Mathematics
## 1. Introduction
### Why SPC Mathematics Matters in Semiconductor Fabs
Semiconductor manufacturing operates at nanometer scales across hundreds of process steps, presenting unique challenges:
- **High Value**: A single wafer can be worth \$10,000 to \$100,000+
- **Tight Tolerances**: Process variations of a few nanometers cause yield collapse
- **Long Feedback Loops**: Days to weeks between process and measurement
- **Compounding Variation**: Multiple variance sources accumulate through the process flow
The mathematics of SPC provides the framework to:
- Detect process shifts before they cause defects
- Quantify and decompose sources of variation
- Maintain processes within nanometer-scale tolerances
- Optimize yield through statistical understanding
## 2. Fundamental Statistical Measures
### 2.1 Descriptive Statistics
For a sample of $n$ measurements $x_1, x_2, \ldots, x_n$:
| Measure | Formula | Description |
|---------|---------|-------------|
| **Sample Mean** | $\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$ | Central tendency |
| **Sample Variance** | $s^2 = \frac{\sum_{i=1}^{n}(x_i - \bar{x})^2}{n-1}$ | Spread (unbiased) |
| **Sample Std Dev** | $s = \sqrt{s^2}$ | Spread in original units |
| **Range** | $R = x_{max} - x_{min}$ | Total spread |
### 2.2 The Normal (Gaussian) Distribution
The mathematical backbone of classical SPC:
$$
f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)
$$
Where:
- $\mu$ = population mean
- $\sigma$ = population standard deviation
- $\sigma^2$ = population variance
### 2.3 Critical Probability Intervals
| Interval | Probability Contained | Application |
|----------|----------------------|-------------|
| $\pm 1\sigma$ | 68.27% | Typical variation |
| $\pm 2\sigma$ | 95.45% | Warning limits |
| $\pm 3\sigma$ | 99.73% | Control limits |
| $\pm 4\sigma$ | 99.9937% | Cpk = 1.33 |
| $\pm 5\sigma$ | 99.99994% | Cpk = 1.67 |
| $\pm 6\sigma$ | 99.9999998% | Six Sigma (3.4 DPMO under the 1.5σ shift convention) |
### 2.4 Standard Normal Transformation
Any normal variable can be standardized:
$$
Z = \frac{X - \mu}{\sigma}
$$
Where $Z \sim N(0, 1)$ (standard normal distribution).
## 3. Control Chart Mathematics
### 3.1 Shewhart X̄-R Charts
The workhorse of semiconductor SPC for monitoring subgroup data.
#### X̄ Chart (Monitoring Process Mean)
$$
\begin{aligned}
CL &= \bar{\bar{X}} \quad \text{(grand mean of subgroup means)} \\
UCL &= \bar{\bar{X}} + A_2 \bar{R} \\
LCL &= \bar{\bar{X}} - A_2 \bar{R}
\end{aligned}
$$
**Theoretical basis:**
$$
UCL / LCL = \mu \pm \frac{3\sigma}{\sqrt{n}}
$$
#### R Chart (Monitoring Process Spread)
$$
\begin{aligned}
CL &= \bar{R} \\
UCL &= D_4 \bar{R} \\
LCL &= D_3 \bar{R}
\end{aligned}
$$
#### Control Chart Constants
| $n$ | $A_2$ | $D_3$ | $D_4$ | $d_2$ |
|-----|-------|-------|-------|-------|
| 2 | 1.880 | 0 | 3.267 | 1.128 |
| 3 | 1.023 | 0 | 2.574 | 1.693 |
| 4 | 0.729 | 0 | 2.282 | 2.059 |
| 5 | 0.577 | 0 | 2.114 | 2.326 |
| 6 | 0.483 | 0 | 2.004 | 2.534 |
| 7 | 0.419 | 0.076 | 1.924 | 2.704 |
| 8 | 0.373 | 0.136 | 1.864 | 2.847 |
| 9 | 0.337 | 0.184 | 1.816 | 2.970 |
| 10 | 0.308 | 0.223 | 1.777 | 3.078 |
### 3.2 Individuals-Moving Range (I-MR) Charts
Common in semiconductor manufacturing when rational subgrouping isn't practical (e.g., one measurement per wafer lot).
#### Individuals Chart
$$
\begin{aligned}
CL &= \bar{X} \\
UCL &= \bar{X} + 3 \cdot \frac{\overline{MR}}{d_2} \\
LCL &= \bar{X} - 3 \cdot \frac{\overline{MR}}{d_2}
\end{aligned}
$$
Where $d_2 = 1.128$ for a moving range of span 2.
#### Moving Range Chart
$$
\begin{aligned}
CL &= \overline{MR} \\
UCL &= D_4 \cdot \overline{MR} = 3.267 \cdot \overline{MR} \\
LCL &= D_3 \cdot \overline{MR} = 0
\end{aligned}
$$
### 3.3 EWMA Charts (Exponentially Weighted Moving Average)
More sensitive to small, persistent shifts than Shewhart charts.
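To make the sensitivity claim concrete, here is a minimal sketch of an EWMA chart with time-varying limits, alongside the tabular CUSUM of Section 3.4, both applied to a noise-free sequence with a sustained 1.5σ shift. The function names, the data, and the parameter choices ($\lambda = 0.2$, $L = 3$, $k = 0.5$, $h = 5$) are assumptions made for illustration.

```python
import math


def ewma_chart(x, mu0, sigma, lam=0.2, L=3.0):
    """Return one (ewma, lcl, ucl, signal) tuple per observation."""
    rows, z = [], mu0  # EWMA_0 = mu0
    for t, xt in enumerate(x, start=1):
        z = lam * xt + (1 - lam) * z
        half = L * sigma * math.sqrt(lam / (2 - lam) * (1 - (1 - lam) ** (2 * t)))
        rows.append((z, mu0 - half, mu0 + half, abs(z - mu0) > half))
    return rows


def cusum_chart(x, mu0, sigma, k=0.5, h=5.0):
    """Tabular CUSUM with K = k*sigma and H = h*sigma; returns (C+, C-, signal) rows."""
    K, H = k * sigma, h * sigma
    c_plus = c_minus = 0.0
    rows = []
    for xt in x:
        c_plus = max(0.0, xt - (mu0 + K) + c_plus)
        c_minus = max(0.0, (mu0 - K) - xt + c_minus)
        rows.append((c_plus, c_minus, c_plus > H or c_minus > H))
    return rows


# Synthetic, noise-free data: 10 points on target, then a sustained +1.5 sigma shift
mu0, sigma = 100.0, 1.0
x = [100.0] * 10 + [101.5] * 20

first_ewma = next(i for i, row in enumerate(ewma_chart(x, mu0, sigma)) if row[3])
first_cusum = next(i for i, row in enumerate(cusum_chart(x, mu0, sigma)) if row[2])
shewhart_hits = [i for i, xt in enumerate(x) if abs(xt - mu0) > 3 * sigma]
print(first_ewma, first_cusum, shewhart_hits)
```

In this idealized sequence every point stays inside the ±3σ Shewhart limits, while both accumulating charts signal within a handful of points after the shift. Real data carry noise, so actual detection times vary from run to run.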
#### EWMA Statistic
$$
EWMA_t = \lambda x_t + (1-\lambda) EWMA_{t-1}
$$
Where:
- $\lambda$ = smoothing parameter ($0 < \lambda \leq 1$, typically 0.05–0.25)
- $EWMA_0 = \mu_0$ (target mean)
#### Control Limits (Time-Varying)
$$
UCL/LCL = \mu_0 \pm L\sigma\sqrt{\frac{\lambda}{2-\lambda}\left[1-(1-\lambda)^{2t}\right]}
$$
#### Asymptotic Control Limits
As $t \to \infty$:
$$
UCL/LCL = \mu_0 \pm L\sigma\sqrt{\frac{\lambda}{2-\lambda}}
$$
**Typical parameters:**
- $\lambda = 0.10$ to $0.20$
- $L = 2.5$ to $3.0$
#### EWMA Variance
$$
Var(EWMA_t) = \sigma^2 \cdot \frac{\lambda}{2-\lambda} \left[1 - (1-\lambda)^{2t}\right]
$$
### 3.4 CUSUM Charts (Cumulative Sum)
Accumulates deviations from target, which makes it well suited to detecting sustained shifts.
#### Tabular CUSUM
**Upper CUSUM (detecting upward shifts):**
$$
C_t^+ = \max\left[0, x_t - (\mu_0 + K) + C_{t-1}^+\right]
$$
**Lower CUSUM (detecting downward shifts):**
$$
C_t^- = \max\left[0, (\mu_0 - K) - x_t + C_{t-1}^-\right]
$$
**Signal condition:**
$$
C_t^+ > H \quad \text{or} \quad C_t^- > H
$$
Where:
- $K$ = allowable slack (reference value), typically $0.5\sigma$
- $H$ = decision interval, typically $4\sigma$ to $5\sigma$
- $C_0^+ = C_0^- = 0$
#### Standardized Form
For standardized observations $z_t = (x_t - \mu_0)/\sigma$:
$$
\begin{aligned}
S_t^+ &= \max(0, z_t - k + S_{t-1}^+) \\
S_t^- &= \max(0, -z_t - k + S_{t-1}^-)
\end{aligned}
$$
With $k = 0.5$ (half of a $1\sigma$ shift to be detected) and $h = 4$ or $5$.
## 4. Process Capability Indices
### 4.1 Cp (Potential Capability)
Measures the ratio of specification width to process spread:
$$
C_p = \frac{USL - LSL}{6\sigma}
$$
Where:
- $USL$ = Upper Specification Limit
- $LSL$ = Lower Specification Limit
- $\sigma$ = process standard deviation
**Interpretation:**
- $C_p$ does **not** account for centering
- Represents potential capability if the process were perfectly centered
### 4.2 Cpk (Actual Capability)
Accounts for off-center processes:
$$
C_{pk} = \min\left[\frac{USL - \mu}{3\sigma}, \frac{\mu - LSL}{3\sigma}\right]
$$
**Alternative formulation:**
$$
C_{pk} = C_p(1 - k)
$$
Where $k = \frac{|T - \mu|}{(USL - LSL)/2}$ and $T$ is the target (specification midpoint).
**Key property:** $C_{pk} \leq C_p$ always.
### 4.3 Cpm (Taguchi Capability Index)
Penalizes deviation from target $T$:
$$
C_{pm} = \frac{USL - LSL}{6\sqrt{\sigma^2 + (\mu - T)^2}}
$$
Or equivalently:
$$
C_{pm} = \frac{C_p}{\sqrt{1 + \left(\frac{\mu - T}{\sigma}\right)^2}}
$$
### 4.4 Pp and Ppk (Performance Indices)
Same formulas but use the **overall** standard deviation (including between-subgroup variation):
$$
P_p = \frac{USL - LSL}{6s_{overall}}
$$
$$
P_{pk} = \min\left[\frac{USL - \bar{x}}{3s_{overall}}, \frac{\bar{x} - LSL}{3s_{overall}}\right]
$$
**Relationship:**
- $C_p, C_{pk}$: Use within-subgroup $\sigma$ (short-term capability)
- $P_p, P_{pk}$: Use overall $s$ (long-term performance)
### 4.5 Relating Cpk to Defect Rates
| $C_{pk}$ | $\sigma$-level | DPMO | Yield |
|----------|----------------|------|-------|
| 0.33 | 1σ | 317,311 | 68.27% |
| 0.67 | 2σ | 45,500 | 95.45% |
| 1.00 | 3σ | 2,700 | 99.73% |
| 1.33 | 4σ | 63 | 99.9937% |
| 1.67 | 5σ | 0.57 | 99.99994% |
| 2.00 | 6σ | 0.002 | 99.9999998% |
> **Note:** With 1.5σ shift allowance (industry standard), 6σ = 3.4 DPMO.
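The distinction in Section 4.4 between within-subgroup and overall variation shows up clearly when subgroup means drift. The sketch below estimates $C_{pk}$ from $\bar{R}/d_2$ and $P_{pk}$ from the overall sample standard deviation. The function name, the data, and the spec limits are illustrative assumptions; the $d_2$ values are copied from the constants table in Section 3.1.

```python
import statistics

# d2 constants for subgroup sizes 2-5, from the table in Section 3.1
D2 = {2: 1.128, 3: 1.693, 4: 2.059, 5: 2.326}


def cpk_and_ppk(subgroups, lsl, usl):
    """Return (Cpk, Ppk) for equally sized subgroups."""
    n = len(subgroups[0])
    values = [x for sg in subgroups for x in sg]
    grand_mean = statistics.mean(values)
    r_bar = statistics.mean(max(sg) - min(sg) for sg in subgroups)
    sigma_within = r_bar / D2[n]           # short-term sigma estimate
    s_overall = statistics.stdev(values)   # long-term, includes drift between subgroups
    margin = min(usl - grand_mean, grand_mean - lsl)
    return margin / (3 * sigma_within), margin / (3 * s_overall)


# Hypothetical data: three subgroups whose means drift upward
subgroups = [[9.9, 10.0, 10.1], [10.1, 10.2, 10.3], [10.3, 10.4, 10.5]]
cpk, ppk = cpk_and_ppk(subgroups, lsl=9.5, usl=10.9)
print(f"Cpk = {cpk:.2f}, Ppk = {ppk:.2f}")
```

A large gap between the two numbers, as in this contrived drift example, suggests that between-subgroup variation rather than short-term noise is limiting performance.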
### 4.6 Confidence Intervals for Cpk
$$
\hat{C}_{pk} \pm z_{\alpha/2} \sqrt{\frac{1}{9n} + \frac{\hat{C}_{pk}^2}{2(n-1)}}
$$
Reliable capability estimates need $n \geq 30$, preferably $n \geq 50$.
## 5. Variance Components Analysis
### 5.1 Typical Variance Hierarchy in Semiconductor
$$
\sigma^2_{total} = \sigma^2_{lot} + \sigma^2_{wafer(lot)} + \sigma^2_{site(wafer)} + \sigma^2_{measurement}
$$
Each component represents:
- **Lot-to-lot ($\sigma^2_{lot}$)**: Variation between production lots
- **Wafer-to-wafer ($\sigma^2_{wafer(lot)}$)**: Variation between wafers within a lot
- **Within-wafer ($\sigma^2_{site(wafer)}$)**: Variation across measurement sites on a wafer
- **Measurement ($\sigma^2_{measurement}$)**: Gauge/metrology variation
### 5.2 One-Way ANOVA
For $k$ groups of $n$ observations each ($N = kn$ observations in total):
#### Sum of Squares Decomposition
$$
SS_T = SS_B + SS_W
$$
**Total Sum of Squares:**
$$
SS_T = \sum_{i=1}^{k}\sum_{j=1}^{n}(x_{ij} - \bar{x}_{..})^2
$$
**Between-Groups Sum of Squares:**
$$
SS_B = n\sum_{i=1}^{k}(\bar{x}_{i.} - \bar{x}_{..})^2
$$
**Within-Groups Sum of Squares:**
$$
SS_W = \sum_{i=1}^{k}\sum_{j=1}^{n}(x_{ij} - \bar{x}_{i.})^2
$$
#### Mean Squares
$$
\begin{aligned}
MS_B &= \frac{SS_B}{k-1} \\
MS_W &= \frac{SS_W}{N-k}
\end{aligned}
$$
#### F-Statistic
$$
F = \frac{MS_B}{MS_W} \sim F_{k-1, N-k}
$$
### 5.3 Variance Component Estimation
From mean squares:
$$
\begin{aligned}
\hat{\sigma}^2_{within} &= MS_W \\
\hat{\sigma}^2_{between} &= \frac{MS_B - MS_W}{n}
\end{aligned}
$$
If $MS_B < MS_W$, set $\hat{\sigma}^2_{between} = 0$.
### 5.4 Nested (Hierarchical) ANOVA
For the nested structure typical of semiconductor data (sites within wafers within lots):
$$
x_{ijk} = \mu + \alpha_i + \beta_{j(i)} + \varepsilon_{k(ij)}
$$
Where:
- $\alpha_i$ = lot effect (random)
- $\beta_{j(i)}$ = wafer effect nested within lot (random)
- $\varepsilon_{k(ij)}$ = site/measurement error
## 6. Measurement System Analysis (Gauge R&R)
### 6.1 Variance Decomposition
$$
\sigma^2_{total} = \sigma^2_{part} + \sigma^2_{gauge}
$$
$$
\sigma^2_{gauge} = \sigma^2_{repeatability} + \sigma^2_{reproducibility}
$$
Where:
- **Repeatability (Equipment Variation):** Same operator, same equipment, multiple measurements
- **Reproducibility (Appraiser Variation):** Different operators or equipment
### 6.2 ANOVA Method for Gauge R&R
#### Two-Factor Crossed Design
With $p$ parts, $o$ operators, and $r$ replicate measurements:
| Source | SS | df | MS | EMS |
|--------|----|----|----|----|
| Part (P) | $SS_P$ | $p-1$ | $MS_P$ | $\sigma^2_E + r\sigma^2_{OP} + or\sigma^2_P$ |
| Operator (O) | $SS_O$ | $o-1$ | $MS_O$ | $\sigma^2_E + r\sigma^2_{OP} + pr\sigma^2_O$ |
| P×O | $SS_{PO}$ | $(p-1)(o-1)$ | $MS_{PO}$ | $\sigma^2_E + r\sigma^2_{OP}$ |
| Error (E) | $SS_E$ | $po(r-1)$ | $MS_E$ | $\sigma^2_E$ |
#### Variance Component Estimates
$$
\begin{aligned}
\hat{\sigma}^2_{repeatability} &= MS_E \\
\hat{\sigma}^2_{operator} &= \frac{MS_O - MS_{PO}}{pr} \\
\hat{\sigma}^2_{interaction} &= \frac{MS_{PO} - MS_E}{r} \\
\hat{\sigma}^2_{reproducibility} &= \hat{\sigma}^2_{operator} + \hat{\sigma}^2_{interaction} \\
\hat{\sigma}^2_{part} &= \frac{MS_P - MS_{PO}}{or}
\end{aligned}
$$
### 6.3 Key Metrics
#### Percentage of Total Variation
$$
\%GRR = 100 \times \frac{\sigma_{gauge}}{\sigma_{total}}
$$
Or using study variation (5.15σ for 99%), which gives the same ratio:
$$
\%GRR = 100 \times \frac{5.15 \cdot \sigma_{gauge}}{5.15 \cdot \sigma_{total}}
$$
#### Precision-to-Tolerance Ratio (P/T)
$$
P/T = \frac{k \cdot \sigma_{gauge}}{USL - LSL}
$$
Where $k = 5.15$ (99%) or $k = 6$ (99.73%).
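The variance component estimates of Section 6.2 and the metrics above can be chained together. The following is a rough sketch, not a full ANOVA: it starts from mean squares that are assumed to be already computed, and it clips negative estimates to zero, which is a common convention but an assumption here. All names and numbers are hypothetical.

```python
import math


def gauge_rr(ms_part, ms_oper, ms_inter, ms_error, p, o, r, lsl, usl, k=6.0):
    """Variance components, %GRR and P/T from crossed-design mean squares."""

    def nonneg(v):
        # Assumption: negative variance estimates are set to zero
        return max(v, 0.0)

    var_repeat = ms_error
    var_oper = nonneg((ms_oper - ms_inter) / (p * r))
    var_inter = nonneg((ms_inter - ms_error) / r)
    var_part = nonneg((ms_part - ms_inter) / (o * r))
    var_gauge = var_repeat + var_oper + var_inter
    var_total = var_part + var_gauge
    return {
        "var_gauge": var_gauge,
        "var_part": var_part,
        "pct_grr": 100.0 * math.sqrt(var_gauge / var_total),
        "p_to_t": k * math.sqrt(var_gauge) / (usl - lsl),
    }


# Hypothetical study: 10 parts, 3 operators, 2 replicates
result = gauge_rr(ms_part=6.02, ms_oper=0.08, ms_inter=0.02, ms_error=0.01,
                  p=10, o=3, r=2, lsl=95.0, usl=105.0)
print(f"%GRR = {result['pct_grr']:.1f}%, P/T = {result['p_to_t']:.3f}")
```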
#### Number of Distinct Categories (ndc)
$$
ndc = 1.41 \cdot \frac{\sigma_{part}}{\sigma_{gauge}}
$$
### 6.4 Acceptance Criteria
| %GRR | Assessment | Action |
|------|------------|--------|
| < 10% | Excellent | Acceptable for all applications |
| 10–30% | Acceptable | May be acceptable depending on application |
| > 30% | Unacceptable | Measurement system needs improvement |

| ndc | Assessment |
|-----|------------|
| ≥ 5 | Acceptable |
| < 5 | Measurement system cannot distinguish parts |
## 7. Run Rules (Western Electric / Nelson Rules)
### 7.1 Standard Run Rules
Pattern detection beyond simple control limits:
| Rule | Pattern | Interpretation |
|------|---------|----------------|
| **1** | 1 point beyond ±3σ | Large shift or outlier |
| **2** | 9 consecutive points on same side of CL | Small sustained shift |
| **3** | 6 consecutive points steadily increasing or decreasing | Trend/drift |
| **4** | 14 consecutive points alternating up and down | Systematic oscillation (over-adjustment) |
| **5** | 2 of 3 consecutive points beyond ±2σ (same side) | Shift warning |
| **6** | 4 of 5 consecutive points beyond ±1σ (same side) | Shift warning |
| **7** | 15 consecutive points within ±1σ | Stratification (mixture) |
| **8** | 8 consecutive points beyond ±1σ (either side) | Mixture of populations |
### 7.2 Zone Definitions
Control charts are divided into zones:
- **Zone A:** Between 2σ and 3σ from center line
- **Zone B:** Between 1σ and 2σ from center line
- **Zone C:** Within 1σ of center line
### 7.3 False Alarm Probabilities
| Rule | Probability (per test) |
|------|----------------------|
| Rule 1 (±3σ) | 0.0027 |
| Rule 2 (9 same side) | 0.0039 |
| Rule 3 (6 trending) | 0.0028 |
| Rule 5 (2 of 3 in Zone A or beyond) | 0.0044 |

The **combined false alarm rate** increases when multiple rules are applied.
## 8. Average Run Length (ARL)
### 8.1 Definitions
- **ARL₀ (In-Control ARL):** Average number of samples until a false alarm when the process is in control (want high)
- **ARL₁ (Out-of-Control ARL):** Average number of samples to detect a shift (want low)
### 8.2 Shewhart Chart ARL
For 3σ limits:
$$
ARL_0 = \frac{1}{\alpha} = \frac{1}{0.0027} \approx 370
$$
For detecting a shift of $\delta$ standard deviations:
$$
ARL_1 = \frac{1}{P(\text{signal} | \text{shift})}
$$
$$
P(\text{signal}) = 1 - \Phi(3-\delta) + \Phi(-3-\delta)
$$
### 8.3 Comparison of Chart Performance
| Shift ($\delta\sigma$) | Shewhart ARL₁ | EWMA ARL₁ ($\lambda$=0.1) | CUSUM ARL₁ |
|------------------------|---------------|---------------------------|------------|
| 0.25 | 281 | 66 | 38 |
| 0.50 | 155 | 26 | 17 |
| 0.75 | 81 | 15 | 10 |
| 1.00 | 44 | 10 | 8 |
| 1.50 | 15 | 5 | 5 |
| 2.00 | 6 | 4 | 4 |
| 3.00 | 2 | 2 | 2 |

**Key insight:** EWMA and CUSUM are far superior for detecting small shifts ($\delta < 1.5\sigma$).
### 8.4 ARL Formulas for CUSUM
Approximate ARL (Siegmund's approximation) for a one-sided standardized CUSUM detecting a shift of size $\delta$, valid for $\Delta \neq 0$:
$$
ARL_1 \approx \frac{e^{-2\Delta b} + 2\Delta b - 1}{2\Delta^2}
$$
Where:
- $\Delta = \delta - k$ (excess over reference value)
- $b = h + 1.166$ (adjusted decision interval)
## 9. Multivariate SPC
### 9.1 Why Multivariate?
Semiconductor processes involve many correlated parameters.
Univariate charts on correlated variables:
- Increase false alarm rates
- Miss shifts in correlated directions
- Fail to detect process changes that affect multiple parameters simultaneously
### 9.2 Hotelling's T² Statistic
For $p$ variables measured on a sample of size $n$:
$$
T^2 = n(\bar{\mathbf{x}} - \boldsymbol{\mu}_0)' \mathbf{S}^{-1} (\bar{\mathbf{x}} - \boldsymbol{\mu}_0)
$$
Where:
- $\bar{\mathbf{x}}$ = sample mean vector ($p \times 1$)
- $\boldsymbol{\mu}_0$ = target mean vector ($p \times 1$)
- $\mathbf{S}$ = sample covariance matrix ($p \times p$)
### 9.3 Control Limit for T²
For individual observations ($n = 1$), with $m$ = number of historical samples:
**Phase I (establishing control):**
$$
UCL = \frac{(m-1)^2}{m} \beta_{\alpha, p/2, (m-p-1)/2}
$$
Where $\beta_{\alpha, p/2, (m-p-1)/2}$ is the upper $\alpha$ percentile of the beta distribution with parameters $p/2$ and $(m-p-1)/2$.
**Phase II (monitoring):**
$$
UCL = \frac{p(m+1)(m-1)}{m(m-p)} F_{\alpha, p, m-p}
$$
For large $m$:
$$
UCL \approx \chi^2_{\alpha, p}
$$
### 9.4 Multivariate EWMA (MEWMA)
$$
\mathbf{Z}_t = \Lambda\mathbf{X}_t + (\mathbf{I} - \Lambda)\mathbf{Z}_{t-1}
$$
Where $\Lambda = \text{diag}(\lambda_1, \lambda_2, \ldots, \lambda_p)$.
**Statistic:**
$$
T^2_t = \mathbf{Z}_t' \boldsymbol{\Sigma}_{\mathbf{Z}_t}^{-1} \mathbf{Z}_t
$$
**Covariance of MEWMA** (for equal smoothing, $\lambda_1 = \cdots = \lambda_p = \lambda$):
$$
\boldsymbol{\Sigma}_{\mathbf{Z}_t} = \frac{\lambda}{2-\lambda}\left[1 - (1-\lambda)^{2t}\right]\boldsymbol{\Sigma}
$$
### 9.5 Principal Component Analysis (PCA) for SPC
Decompose correlated variables into uncorrelated principal components:
$$
\mathbf{X} = \mathbf{T}\mathbf{P}' + \mathbf{E}
$$
Where:
- $\mathbf{T}$ = scores matrix
- $\mathbf{P}$ = loadings matrix
- $\mathbf{E}$ = residuals
**Hotelling's T² in PC space** (first $k$ retained components, with $\lambda_i$ the variance of component $i$):
$$
T^2 = \sum_{i=1}^{k} \frac{t_i^2}{\lambda_i}
$$
**Squared Prediction Error (SPE):**
$$
SPE = \mathbf{e}'\mathbf{e} = \sum_{i=k+1}^{p} t_i^2
$$
## 10. Autocorrelation Handling
### 10.1 The Problem
Semiconductor tool data often exhibits serial correlation, violating the independence assumption of standard SPC.
**Consequences of ignoring autocorrelation:**
- Actual ARL₀ << 370 (excessive false alarms)
- Control limits are too tight
- Patterns in data are misinterpreted
### 10.2 Autocorrelation Function (ACF)
**Population autocorrelation at lag $k$:**
$$
\rho_k = \frac{Cov(X_t, X_{t+k})}{Var(X_t)} = \frac{\gamma_k}{\gamma_0}
$$
**Sample autocorrelation:**
$$
r_k = \frac{\sum_{t=1}^{n-k}(x_t - \bar{x})(x_{t+k} - \bar{x})}{\sum_{t=1}^{n}(x_t - \bar{x})^2}
$$
### 10.3 AR(1) Process
The simplest autocorrelated model:
$$
X_t = \phi X_{t-1} + \varepsilon_t
$$
Where:
- $\phi$ = autoregressive parameter ($|\phi| < 1$ for stationarity)
- $\varepsilon_t \sim N(0, \sigma^2_\varepsilon)$ (white noise)
**Properties:**
$$
\begin{aligned}
Var(X_t) &= \frac{\sigma^2_\varepsilon}{1 - \phi^2} \\
\rho_k &= \phi^k
\end{aligned}
$$
### 10.4 Solutions for Autocorrelated Data
1. **Residual Charts:**
   - Fit time series model (AR, ARMA, etc.)
   - Apply SPC to residuals $\hat{\varepsilon}_t = X_t - \hat{X}_t$
2. **Modified Control Limits:**
$$
UCL/LCL = \mu \pm 3\sigma_X \sqrt{\frac{1 + \phi}{1 - \phi}}
$$
3. **EWMA with Adjusted Parameters:**
   - Use $\lambda = 1 - \phi$ for optimal smoothing
4. **Special Cause Charts:**
   - Designed specifically for autocorrelated processes
## 11. Run-to-Run (R2R) Process Control
### 11.1 Basic Concept
Active feedback control layered on SPC: adjust recipe parameters based on measured outputs.
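Before moving on to feedback control, the residual-chart approach of Section 10.4 can be sketched briefly. The code below computes the sample autocorrelation $r_k$ from Section 10.2, uses $r_1$ as a simple estimate of $\phi$ (an assumption: it is the Yule-Walker estimate for an AR(1) model), and forms residuals about the sample mean. The function names and the simulated data are illustrative only.

```python
import random
import statistics


def sample_acf(x, lag):
    """Sample autocorrelation r_k as defined in Section 10.2."""
    mean = statistics.mean(x)
    num = sum((x[t] - mean) * (x[t + lag] - mean) for t in range(len(x) - lag))
    den = sum((v - mean) ** 2 for v in x)
    return num / den


def ar1_residuals(x):
    """Estimate phi as r_1 and return (phi, one-step-ahead residuals)."""
    mean = statistics.mean(x)
    phi = sample_acf(x, 1)
    resid = [(x[t] - mean) - phi * (x[t - 1] - mean) for t in range(1, len(x))]
    return phi, resid


# Simulated AR(1) series with phi = 0.7 (synthetic data, fixed seed)
rng = random.Random(0)
series, prev = [], 0.0
for _ in range(500):
    prev = 0.7 * prev + rng.gauss(0.0, 1.0)
    series.append(prev)

phi_hat, residuals = ar1_residuals(series)
print(f"phi_hat = {phi_hat:.2f}")
print(f"lag-1 ACF of raw data:  {sample_acf(series, 1):.2f}")
print(f"lag-1 ACF of residuals: {sample_acf(residuals, 1):.2f}")
```

If the AR(1) fit is adequate, the residuals should be close to uncorrelated, so a standard Shewhart or EWMA chart can be applied to them instead of the raw series.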
### 11.2 EWMA Controller
**Prediction:**
$$
\hat{y}_{t+1} = \lambda y_t + (1-\lambda)\hat{y}_t
$$
**Recipe Adjustment:**
$$
u_{t+1} = u_t - G(\hat{y}_t - y_{target})
$$
Where:
- $G$ = controller gain
- $u$ = recipe parameter (e.g., etch time, dose)
- $y$ = output measurement (e.g., CD, thickness)
### 11.3 Double EWMA (for Drifting Processes)
Track both level and slope:
**Level estimate:**
$$
L_t = \lambda y_t + (1-\lambda)(L_{t-1} + T_{t-1})
$$
**Trend estimate:**
$$
T_t = \gamma(L_t - L_{t-1}) + (1-\gamma)T_{t-1}
$$
**Forecast:**
$$
\hat{y}_{t+1} = L_t + T_t
$$
### 11.4 Process Model Integration
For a process with known gain $\beta$:
$$
y_t = \alpha + \beta u_t + \varepsilon_t
$$
**Optimal control:**
$$
u_{t+1} = \frac{y_{target} - \hat{\alpha}_{t+1}}{\beta}
$$
## 12. Yield Modeling Mathematics
### 12.1 Defect Density
$$
D_0 = \frac{\text{Number of defects}}{\text{Area (cm}^2\text{)}}
$$
### 12.2 Poisson Model (Random Defects)
Assumes defects are randomly distributed:
$$
Y = e^{-D_0 A}
$$
Where:
- $D_0$ = defect density (defects/cm²)
- $A$ = die area (cm²)
**Probability of $k$ defects on a die:**
$$
P(k) = \frac{(D_0 A)^k e^{-D_0 A}}{k!}
$$
### 12.3 Murphy's Model (Distributed Defects)
Accounts for defect density variation across the wafer:
$$
Y = \left[\frac{1 - e^{-D_0 A}}{D_0 A}\right]^2
$$
### 12.4 Negative Binomial Model (Clustered Defects)
More realistic for semiconductor manufacturing:
$$
Y = \left(1 + \frac{D_0 A}{\alpha}\right)^{-\alpha}
$$
Where $\alpha$ = clustering parameter:
- $\alpha \to \infty$: Approaches Poisson (random)
- $\alpha$ small: Highly clustered
### 12.5 Seeds Model
$$
Y = e^{-D_0 A_s}
$$
Where $A_s$ = sensitive area (fraction of die area susceptible to defects).
### 12.6 Yield Loss Calculations
**Defect-Limited Yield:**
$$
Y_D = e^{-D_0 A}
$$
**Parametric Yield:**
$$
Y_P = \prod_{i} P(\text{parameter}_i \text{ in spec})
$$
**Total Yield:**
$$
Y_{total} = Y_D \times Y_P
$$
## 13. Spatial Statistics for Wafer Maps
### 13.1 Radial Uniformity
$$
\sigma_{radial} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}(x_i - f(r_i))^2}
$$
Where $f(r_i)$ is the fitted radial profile at radius $r_i$.
### 13.2 Wafer-Level Variation Components
$$
\sigma^2_{total} = \sigma^2_{W2W} + \sigma^2_{WIW}
$$
Within-wafer variation is often decomposed further:
$$
\sigma^2_{WIW} = \sigma^2_{systematic} + \sigma^2_{random}
$$
Where:
- **Systematic WIW:** Modeled and corrected (radial, azimuthal patterns)
- **Random WIW:** Inherent noise
### 13.3 Spatial Correlation Function
For locations $\mathbf{s}_i$ and $\mathbf{s}_j$:
$$
C(h) = Cov(X(\mathbf{s}_i), X(\mathbf{s}_j))
$$
Where $h = \|\mathbf{s}_i - \mathbf{s}_j\|$ (distance between points).
**Variogram:**
$$
\gamma(h) = \frac{1}{2}Var[X(\mathbf{s}_i) - X(\mathbf{s}_j)]
$$
### 13.4 Common Wafer Signatures
Mathematical models for common spatial patterns:
**Radial (bowl/dome):**
$$
f(r) = a_0 + a_1 r + a_2 r^2
$$
**Azimuthal:**
$$
f(\theta) = b_0 + b_1 \cos(\theta) + b_2 \sin(\theta)
$$
**Combined:**
$$
f(r, \theta) = \sum_{n,m} a_{nm} Z_n^m(r, \theta)
$$
Where $Z_n^m$ are Zernike polynomials.
## 14. Practical Implementation Considerations
### 14.1 Sample Size Effects
Uncertainty in estimated standard deviation:
$$
SE(\hat{\sigma}) \approx \frac{\sigma}{\sqrt{2(n-1)}}
$$
**For reliable capability estimates:**
- Minimum: $n \geq 30$
- Preferred: $n \geq 50$
- For critical processes: $n \geq 100$
### 14.2 Confidence Interval for σ
$$
\sqrt{\frac{(n-1)s^2}{\chi^2_{\alpha/2, n-1}}} \leq \sigma \leq \sqrt{\frac{(n-1)s^2}{\chi^2_{1-\alpha/2, n-1}}}
$$
### 14.3 Rational Subgrouping
**Principles:**
- Subgroups should capture short-term (within) variation
- Between-subgroup variation captures long-term drift
- Subgroup size $n$ is typically 3–5 for continuous data
**In semiconductor:**
- Subgroup = wafers from same lot, run, or time window
- Site-to-site variation often treated as within-subgroup
### 14.4 Control Limit Estimation
**Using Range Method:**
$$
\hat{\sigma} = \frac{\bar{R}}{d_2}
$$
**Using Sample Standard Deviation:**
$$
\hat{\sigma} = \frac{\bar{s}}{c_4}
$$
Where $c_4$ is the unbiasing constant for standard deviation.
### 14.5 Short-Run SPC
For limited data (new process, low volume):
**Z-MR charts using target:**
$$
Z_i = \frac{x_i - T}{\sigma_0}
$$
**Q-charts (self-starting):**
$$
Q_i = \Phi^{-1}\left(F_{i-2}\left(\frac{x_i - \bar{x}_{i-1}}{s_{i-1}\sqrt{1 + 1/(i-1)}}\right)\right)
$$
Where $F_{\nu}$ is the Student-$t$ CDF with $\nu$ degrees of freedom, and $\bar{x}_{i-1}$ and $s_{i-1}$ are computed from the first $i-1$ observations ($i \geq 3$).
## 15. Key Mathematical Relationships
### Quick Reference Table
| Concept | Core Mathematics |
|---------|------------------|
| **Control Limits** | $\mu \pm \frac{3\sigma}{\sqrt{n}}$ |
| **Cp** | $\frac{USL - LSL}{6\sigma}$ |
| **Cpk** | $\min\left[\frac{USL-\mu}{3\sigma}, \frac{\mu-LSL}{3\sigma}\right]$ |
| **EWMA** | $\lambda x_t + (1-\lambda)EWMA_{t-1}$ |
| **CUSUM** | $\max[0, x_t - (\mu_0 + K) + C_{t-1}]$ |
| **Hotelling's T²** | $n(\bar{\mathbf{x}}-\boldsymbol{\mu})'S^{-1}(\bar{\mathbf{x}}-\boldsymbol{\mu})$ |
| **Gauge R&R** | $\sigma^2_{total} = \sigma^2_{part} + \sigma^2_{gauge}$ |
| **Yield (Poisson)** | $Y = e^{-D_0 A}$ |
| **ARL₀ (3σ)** | $\frac{1}{0.0027} \approx 370$ |
| **AR(1) Variance** | $\frac{\sigma^2_\varepsilon}{1-\phi^2}$ |
### Decision Guide: Which Chart to Use?
| Situation | Recommended Chart |
|-----------|------------------|
| Standard monitoring, subgroups | X̄-R or X̄-S |
| Individual measurements | I-MR |
| Detect small shifts ($< 1.5\sigma$) | EWMA or CUSUM |
| Multiple correlated parameters | Hotelling's T² or MEWMA |
| Autocorrelated data | Residual charts or modified EWMA |
| Short production runs | Q-charts or Z-MR |
### Critical Success Factors
1. **Validate measurement system first** (Gauge R&R < 10%)
2. **Ensure rational subgrouping** captures meaningful variation
3. **Check for autocorrelation** before applying standard charts
4. **Use appropriate capability indices** (Cpk vs Ppk)
5. **Decompose variance** to target improvement efforts
6. **Match chart sensitivity** to required detection speed
## Control Chart Constant Tables
### Constants for X̄ and R Charts
| $n$ | $A_2$ | $A_3$ | $d_2$ | $d_3$ | $D_3$ | $D_4$ | $c_4$ | $B_3$ | $B_4$ |
|-----|-------|-------|-------|-------|-------|-------|-------|-------|-------|
| 2 | 1.880 | 2.659 | 1.128 | 0.853 | 0 | 3.267 | 0.7979 | 0 | 3.267 |
| 3 | 1.023 | 1.954 | 1.693 | 0.888 | 0 | 2.574 | 0.8862 | 0 | 2.568 |
| 4 | 0.729 | 1.628 | 2.059 | 0.880 | 0 | 2.282 | 0.9213 | 0 | 2.266 |
| 5 | 0.577 | 1.427 | 2.326 | 0.864 | 0 | 2.114 | 0.9400 | 0 | 2.089 |
| 6 | 0.483 | 1.287 | 2.534 | 0.848 | 0 | 2.004 | 0.9515 | 0.030 | 1.970 |
| 7 | 0.419 | 1.182 | 2.704 | 0.833 | 0.076 | 1.924 | 0.9594 | 0.118 | 1.882 |
| 8 | 0.373 | 1.099 | 2.847 | 0.820 | 0.136 | 1.864 | 0.9650 | 0.185 | 1.815 |
| 9 | 0.337 | 1.032 | 2.970 | 0.808 | 0.184 | 1.816 | 0.9693 | 0.239 | 1.761 |
| 10 | 0.308 | 0.975 | 3.078 | 0.797 | 0.223 | 1.777 | 0.9727 | 0.284 | 1.716 |
### Standard Normal Distribution Critical Values
| Confidence | $z_{\alpha/2}$ |
|------------|----------------|
| 90% | 1.645 |
| 95% | 1.960 |
| 99% | 2.576 |
| 99.73% | 3.000 |
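As a closing illustration, the yield models of Section 12 can be compared numerically for the same defect density and die area. This is a minimal sketch: the function names and the input values ($D_0 = 0.5$ defects/cm², $A = 1$ cm², $\alpha = 2$) are assumptions chosen for readability.

```python
import math


def yield_poisson(d0, area):
    """Poisson (random defect) yield: exp(-D0 * A)."""
    return math.exp(-d0 * area)


def yield_murphy(d0, area):
    """Murphy's model in the form given in Section 12.3."""
    x = d0 * area
    if x == 0.0:
        return 1.0  # limit of the expression as D0 * A -> 0
    return ((1.0 - math.exp(-x)) / x) ** 2


def yield_negative_binomial(d0, area, alpha):
    """Negative binomial (clustered defect) yield with clustering parameter alpha."""
    return (1.0 + d0 * area / alpha) ** (-alpha)


d0, area = 0.5, 1.0  # hypothetical defect density (1/cm^2) and die area (cm^2)
y_p = yield_poisson(d0, area)
y_m = yield_murphy(d0, area)
y_nb = yield_negative_binomial(d0, area, alpha=2.0)
print(f"Poisson: {y_p:.4f}  Murphy: {y_m:.4f}  Neg. binomial (alpha=2): {y_nb:.4f}")
```

For these inputs the Poisson model gives the lowest yield and the clustered model the highest, since clustering concentrates defects on fewer dies. With a very large $\alpha$ the negative binomial result converges to the Poisson value.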
Measure electrical properties of dielectrics and junctions.
Extract doping profile from C-V measurement.