AI Factory Glossary

9,967 technical terms and definitions

sourcegraph cody,code,context

Cody is Sourcegraph's AI coding assistant with codebase context. It understands your code.

space charge region, device physics

Region depleted of mobile carriers.

space-filling designs, doe

Cover parameter space evenly.

spacer formation, process integration

Spacer formation deposits and etches insulating material beside gate structures, controlling overlap and junction profiles during source-drain implantation.

span boundary objective, nlp

Predict span boundaries.

span masking, nlp

Mask contiguous spans of tokens.

span-based parsing, structured prediction

Span-based parsing predicts constituency trees by scoring all possible spans and their labels using neural representations.

spare parts inventory,operations

Stock of replacement components.

spare parts, manufacturing operations

Spare parts inventory supports rapid equipment repair, reducing downtime.

spark,big data,distributed

Apache Spark processes big data across a distributed cluster. PySpark is its Python API. Supports ML pipelines at scale.

sparse attention,transformer

Attention pattern where each token only attends to a subset of tokens rather than all.

sparse autoencoder,feature,decompose

Sparse autoencoders decompose activations into interpretable features. Scale to large models.

sparse autoencoder,sae,features

Sparse autoencoders find interpretable features in model activations. Tool for understanding what models learn.

sparse autoencoders for interpretability, explainable ai

Extract interpretable features.

sparse mapping, robotics

Use sparse feature points.

sparse mixture, llm architecture

Sparse mixture of experts activates a subset of experts per input.

sparse model,model architecture

Model where only a subset of parameters activates per input (MoE).

sparse retrieval, rag

Use keyword matching (BM25, TF-IDF).

sparse retrieval, rag

Sparse retrieval uses term matching like BM25 for document retrieval.

sparse training, model optimization

Sparse training maintains sparsity throughout training, never materializing dense networks.

sparse transformer patterns, transformer

Different sparsity patterns for attention.

sparse upcycling,model architecture

Convert dense model to MoE by splitting weights into experts.

sparse weight averaging, model optimization

Sparse weight averaging combines multiple sparse models, improving generalization.

sparse-to-sparse training,model training

Maintain sparsity throughout training.

sparsity,pruning,zero

Sparsity sets weights to zero. Structured or unstructured pruning. Reduces compute if the hardware supports it.

spatial attention, model optimization

Spatial attention focuses on informative regions by weighting spatial locations.

spatial autocorrelation, manufacturing operations

Spatial autocorrelation quantifies similarity between nearby die locations.

spatial correlation in yield, manufacturing

Yield patterns across wafer.

spatial correlation, yield enhancement

Spatial correlation in yield data indicates clustering of failures, suggesting common root causes.

spatial monitoring, spc

Monitor location-dependent data.

spatial reasoning in vision, computer vision

Understand spatial relationships.

spatial reasoning,reasoning

Understand spatial relationships.

spatial signature, advanced test & probe

Spatial signatures on wafer maps reveal systematic defects related to process equipment or materials.

spatial signature,metrology

Pattern of failures on wafer.

spatiotemporal detection,computer vision

Detect objects in space and time.

spc capability,process capability spc

Ability of a process to meet specifications.

spc process capability,capability spc

Measure of a process's ability to meet specs.

spc, statistical process control, control chart, shewhart, cpk, process capability, ewma, cusum, gauge r&r, run rules, process control

# Semiconductor Manufacturing Process SPC: Statistical Process Control Mathematics

## 1. Introduction

### Why SPC Mathematics Matters in Semiconductor Fabs

Semiconductor manufacturing operates at nanometer scales across hundreds of process steps, presenting unique challenges:

- **High Value**: A single wafer can be worth $10,000 to $100,000+
- **Tight Tolerances**: Process variations of a few nanometers cause yield collapse
- **Long Feedback Loops**: Days to weeks between process and measurement
- **Compounding Variation**: Multiple variance sources accumulate through the process flow

The mathematics of SPC provides the framework to:

- Detect process shifts before they cause defects
- Quantify and decompose sources of variation
- Maintain processes within nanometer-scale tolerances
- Optimize yield through statistical understanding

## 2. Fundamental Statistical Measures

### 2.1 Descriptive Statistics

For a sample of $n$ measurements $x_1, x_2, \ldots, x_n$:

| Measure | Formula | Description |
|---------|---------|-------------|
| **Sample Mean** | $\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$ | Central tendency |
| **Sample Variance** | $s^2 = \frac{\sum_{i=1}^{n}(x_i - \bar{x})^2}{n-1}$ | Spread (unbiased) |
| **Sample Std Dev** | $s = \sqrt{s^2}$ | Spread in original units |
| **Range** | $R = x_{max} - x_{min}$ | Total spread |

### 2.2 The Normal (Gaussian) Distribution

The mathematical backbone of classical SPC:

$$
f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)
$$

Where:

- $\mu$ = population mean
- $\sigma$ = population standard deviation
- $\sigma^2$ = population variance

### 2.3 Critical Probability Intervals

| Interval | Probability Contained | Application |
|----------|----------------------|-------------|
| $\pm 1\sigma$ | 68.27% | Typical variation |
| $\pm 2\sigma$ | 95.45% | Warning limits |
| $\pm 3\sigma$ | 99.73% | Control limits |
| $\pm 4\sigma$ | 99.9937% | Cpk = 1.33 |
| $\pm 5\sigma$ | 99.99994% | Cpk = 1.67 |
| $\pm 6\sigma$ | 99.9999998% | Cpk = 2.00 (Six Sigma; the 3.4 DPMO figure assumes a 1.5σ mean shift) |

### 2.4 Standard Normal Transformation

Any normal variable can be standardized:

$$
Z = \frac{X - \mu}{\sigma}
$$

Where $Z \sim N(0, 1)$ (standard normal distribution).

## 3. Control Chart Mathematics

### 3.1 Shewhart X̄-R Charts

The workhorse of semiconductor SPC for monitoring subgroup data.

#### X̄ Chart (Monitoring Process Mean)

$$
\begin{aligned}
CL &= \bar{\bar{X}} \quad \text{(grand mean of subgroup means)} \\
UCL &= \bar{\bar{X}} + A_2 \bar{R} \\
LCL &= \bar{\bar{X}} - A_2 \bar{R}
\end{aligned}
$$

**Theoretical basis:**

$$
UCL / LCL = \mu \pm \frac{3\sigma}{\sqrt{n}}
$$

With $\sigma$ estimated by $\bar{R}/d_2$, this gives $A_2 = \frac{3}{d_2\sqrt{n}}$.

#### R Chart (Monitoring Process Spread)

$$
\begin{aligned}
CL &= \bar{R} \\
UCL &= D_4 \bar{R} \\
LCL &= D_3 \bar{R}
\end{aligned}
$$

#### Control Chart Constants

| $n$ | $A_2$ | $D_3$ | $D_4$ | $d_2$ |
|-----|-------|-------|-------|-------|
| 2 | 1.880 | 0 | 3.267 | 1.128 |
| 3 | 1.023 | 0 | 2.574 | 1.693 |
| 4 | 0.729 | 0 | 2.282 | 2.059 |
| 5 | 0.577 | 0 | 2.114 | 2.326 |
| 6 | 0.483 | 0 | 2.004 | 2.534 |
| 7 | 0.419 | 0.076 | 1.924 | 2.704 |
| 8 | 0.373 | 0.136 | 1.864 | 2.847 |
| 9 | 0.337 | 0.184 | 1.816 | 2.970 |
| 10 | 0.308 | 0.223 | 1.777 | 3.078 |

### 3.2 Individuals-Moving Range (I-MR) Charts

Common in semiconductor manufacturing when rational subgrouping isn't practical (e.g., one measurement per wafer lot).

#### Individuals Chart

$$
\begin{aligned}
CL &= \bar{X} \\
UCL &= \bar{X} + 3 \cdot \frac{\overline{MR}}{d_2} \\
LCL &= \bar{X} - 3 \cdot \frac{\overline{MR}}{d_2}
\end{aligned}
$$

Where $d_2 = 1.128$ for moving range of span 2.

#### Moving Range Chart

$$
\begin{aligned}
CL &= \overline{MR} \\
UCL &= D_4 \cdot \overline{MR} = 3.267 \cdot \overline{MR} \\
LCL &= D_3 \cdot \overline{MR} = 0
\end{aligned}
$$

### 3.3 EWMA Charts (Exponentially Weighted Moving Average)

More sensitive to small, persistent shifts than Shewhart charts.
#### EWMA Statistic

$$
EWMA_t = \lambda x_t + (1-\lambda) EWMA_{t-1}
$$

Where:

- $\lambda$ = smoothing parameter ($0 < \lambda \leq 1$, typically 0.05–0.25)
- $EWMA_0 = \mu_0$ (target mean)

#### Control Limits (Time-Varying)

$$
UCL/LCL = \mu_0 \pm L\sigma\sqrt{\frac{\lambda}{2-\lambda}\left[1-(1-\lambda)^{2t}\right]}
$$

#### Asymptotic Control Limits

As $t \to \infty$:

$$
UCL/LCL = \mu_0 \pm L\sigma\sqrt{\frac{\lambda}{2-\lambda}}
$$

**Typical parameters:**

- $\lambda = 0.10$ to $0.20$
- $L = 2.5$ to $3.0$

#### EWMA Variance

$$
Var(EWMA_t) = \sigma^2 \cdot \frac{\lambda}{2-\lambda} \left[1 - (1-\lambda)^{2t}\right]
$$

### 3.4 CUSUM Charts (Cumulative Sum)

Accumulates deviations from target, which makes it well suited to detecting sustained shifts.

#### Tabular CUSUM

**Upper CUSUM (detecting upward shifts):**

$$
C_t^+ = \max\left[0, x_t - (\mu_0 + K) + C_{t-1}^+\right]
$$

**Lower CUSUM (detecting downward shifts):**

$$
C_t^- = \max\left[0, (\mu_0 - K) - x_t + C_{t-1}^-\right]
$$

**Signal condition:**

$$
C_t^+ > H \quad \text{or} \quad C_t^- > H
$$

Where:

- $K$ = allowable slack (reference value), typically $0.5\sigma$
- $H$ = decision interval, typically $4\sigma$ to $5\sigma$
- $C_0^+ = C_0^- = 0$

#### Standardized Form

For standardized observations $z_t = (x_t - \mu_0)/\sigma$:

$$
\begin{aligned}
S_t^+ &= \max(0, z_t - k + S_{t-1}^+) \\
S_t^- &= \max(0, -z_t - k + S_{t-1}^-)
\end{aligned}
$$

With $k = 0.5$ (half of a $1\sigma$ shift to detect) and $h = 4$ or $5$.

## 4. Process Capability Indices

### 4.1 Cp (Potential Capability)

Measures the ratio of specification width to process spread:

$$
C_p = \frac{USL - LSL}{6\sigma}
$$

Where:

- $USL$ = Upper Specification Limit
- $LSL$ = Lower Specification Limit
- $\sigma$ = process standard deviation

**Interpretation:**

- $C_p$ does **not** account for centering
- Represents potential capability if process were perfectly centered

### 4.2 Cpk (Actual Capability)

Accounts for off-center processes:

$$
C_{pk} = \min\left[\frac{USL - \mu}{3\sigma}, \frac{\mu - LSL}{3\sigma}\right]
$$

**Alternative formulation:**

$$
C_{pk} = C_p(1 - k)
$$

Where $k = \frac{|T - \mu|}{(USL - LSL)/2}$ and $T$ is the target (specification midpoint).

**Key property:** $C_{pk} \leq C_p$ always.

### 4.3 Cpm (Taguchi Capability Index)

Penalizes deviation from target $T$:

$$
C_{pm} = \frac{USL - LSL}{6\sqrt{\sigma^2 + (\mu - T)^2}}
$$

Or equivalently:

$$
C_{pm} = \frac{C_p}{\sqrt{1 + \left(\frac{\mu - T}{\sigma}\right)^2}}
$$

### 4.4 Pp and Ppk (Performance Indices)

Same formulas but use **overall** standard deviation (including between-subgroup variation):

$$
P_p = \frac{USL - LSL}{6s_{overall}}
$$

$$
P_{pk} = \min\left[\frac{USL - \bar{x}}{3s_{overall}}, \frac{\bar{x} - LSL}{3s_{overall}}\right]
$$

**Relationship:**

- $C_p, C_{pk}$: Use within-subgroup $\sigma$ (short-term capability)
- $P_p, P_{pk}$: Use overall $s$ (long-term performance)

### 4.5 Relating Cpk to Defect Rates

The figures below assume a centered, normally distributed process with two-sided specifications:

| $C_{pk}$ | $\sigma$-level | DPMO | Yield |
|----------|----------------|------|-------|
| 0.33 | 1σ | 317,311 | 68.27% |
| 0.67 | 2σ | 45,500 | 95.45% |
| 1.00 | 3σ | 2,700 | 99.73% |
| 1.33 | 4σ | 63 | 99.9937% |
| 1.67 | 5σ | 0.57 | 99.99994% |
| 2.00 | 6σ | 0.002 | 99.9999998% |

> **Note:** With the 1.5σ mean-shift allowance of the Six Sigma convention, 6σ corresponds to 3.4 DPMO.

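
The capability formulas and the defect-rate table can be checked numerically. The following is a minimal sketch, assuming a normally distributed process with known $\mu$ and $\sigma$; the function name `capability` and the example numbers are illustrative, not taken from any particular SPC package.

```python
from statistics import NormalDist


def capability(mu, sigma, lsl, usl):
    """Return (Cp, Cpk, expected DPMO) for a normal process.

    Assumes mu and sigma are known (or well estimated) and that the
    process output is normally distributed.
    """
    cp = (usl - lsl) / (6 * sigma)
    cpk = min((usl - mu) / (3 * sigma), (mu - lsl) / (3 * sigma))
    dist = NormalDist(mu, sigma)
    # Fraction falling outside either spec limit, scaled to parts per million
    dpmo = (dist.cdf(lsl) + (1.0 - dist.cdf(usl))) * 1e6
    return cp, cpk, dpmo


# Hypothetical film thickness: spec 47..53 nm, centered process with sigma = 1 nm
cp, cpk, dpmo = capability(mu=50.0, sigma=1.0, lsl=47.0, usl=53.0)
print(f"centered:   Cp={cp:.2f}  Cpk={cpk:.2f}  DPMO={dpmo:.0f}")

# Same spread, mean shifted by 1 nm: Cp is unchanged, Cpk drops
cp2, cpk2, dpmo2 = capability(mu=51.0, sigma=1.0, lsl=47.0, usl=53.0)
print(f"off-center: Cp={cp2:.2f}  Cpk={cpk2:.2f}  DPMO={dpmo2:.0f}")
```

The centered case reproduces the 3σ row of the table (about 2,700 DPMO), while the shifted case shows why $C_{pk}$ rather than $C_p$ tracks the actual defect rate.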
### 4.6 Confidence Intervals for Cpk

$$
\hat{C}_{pk} \pm z_{\alpha/2} \sqrt{\frac{1}{9n} + \frac{\hat{C}_{pk}^2}{2(n-1)}}
$$

For reliable capability estimates, need $n \geq 30$, preferably $n \geq 50$.

## 5. Variance Components Analysis

### 5.1 Typical Variance Hierarchy in Semiconductor

$$
\sigma^2_{total} = \sigma^2_{lot} + \sigma^2_{wafer(lot)} + \sigma^2_{site(wafer)} + \sigma^2_{measurement}
$$

Each component represents:

- **Lot-to-lot ($\sigma^2_{lot}$)**: Variation between production lots
- **Wafer-to-wafer ($\sigma^2_{wafer(lot)}$)**: Variation between wafers within a lot
- **Within-wafer ($\sigma^2_{site(wafer)}$)**: Variation across measurement sites on a wafer
- **Measurement ($\sigma^2_{measurement}$)**: Gauge/metrology variation

### 5.2 One-Way ANOVA

For $k$ groups of $n$ observations each ($N = kn$ observations in total):

#### Sum of Squares Decomposition

$$
SS_T = SS_B + SS_W
$$

**Total Sum of Squares:**

$$
SS_T = \sum_{i=1}^{k}\sum_{j=1}^{n}(x_{ij} - \bar{x}_{..})^2
$$

**Between-Groups Sum of Squares:**

$$
SS_B = n\sum_{i=1}^{k}(\bar{x}_{i.} - \bar{x}_{..})^2
$$

**Within-Groups Sum of Squares:**

$$
SS_W = \sum_{i=1}^{k}\sum_{j=1}^{n}(x_{ij} - \bar{x}_{i.})^2
$$

#### Mean Squares

$$
\begin{aligned}
MS_B &= \frac{SS_B}{k-1} \\
MS_W &= \frac{SS_W}{N-k}
\end{aligned}
$$

#### F-Statistic

$$
F = \frac{MS_B}{MS_W} \sim F_{k-1, N-k}
$$

### 5.3 Variance Component Estimation

From mean squares:

$$
\begin{aligned}
\hat{\sigma}^2_{within} &= MS_W \\
\hat{\sigma}^2_{between} &= \frac{MS_B - MS_W}{n}
\end{aligned}
$$

If $MS_B < MS_W$, set $\hat{\sigma}^2_{between} = 0$.

### 5.4 Nested (Hierarchical) ANOVA

For semiconductor's nested structure (sites within wafers within lots):

$$
x_{ijk} = \mu + \alpha_i + \beta_{j(i)} + \varepsilon_{k(ij)}
$$

Where:

- $\alpha_i$ = lot effect (random)
- $\beta_{j(i)}$ = wafer effect nested within lot (random)
- $\varepsilon_{k(ij)}$ = site/measurement error

## 6. Measurement System Analysis (Gauge R&R)

### 6.1 Variance Decomposition

$$
\sigma^2_{total} = \sigma^2_{part} + \sigma^2_{gauge}
$$

$$
\sigma^2_{gauge} = \sigma^2_{repeatability} + \sigma^2_{reproducibility}
$$

Where:

- **Repeatability (Equipment Variation):** Same operator, same equipment, multiple measurements
- **Reproducibility (Appraiser Variation):** Different operators or equipment

### 6.2 ANOVA Method for Gauge R&R

#### Two-Factor Crossed Design

With $p$ parts, $o$ operators, and $r$ repeat measurements per part-operator combination:

| Source | SS | df | MS | EMS |
|--------|----|----|----|-----|
| Part (P) | $SS_P$ | $p-1$ | $MS_P$ | $\sigma^2_E + r\sigma^2_{OP} + o\,r\,\sigma^2_P$ |
| Operator (O) | $SS_O$ | $o-1$ | $MS_O$ | $\sigma^2_E + r\sigma^2_{OP} + p\,r\,\sigma^2_O$ |
| P×O | $SS_{PO}$ | $(p-1)(o-1)$ | $MS_{PO}$ | $\sigma^2_E + r\sigma^2_{OP}$ |
| Error (E) | $SS_E$ | $po(r-1)$ | $MS_E$ | $\sigma^2_E$ |

#### Variance Component Estimates

$$
\begin{aligned}
\hat{\sigma}^2_{repeatability} &= MS_E \\
\hat{\sigma}^2_{operator} &= \frac{MS_O - MS_{PO}}{pr} \\
\hat{\sigma}^2_{interaction} &= \frac{MS_{PO} - MS_E}{r} \\
\hat{\sigma}^2_{reproducibility} &= \hat{\sigma}^2_{operator} + \hat{\sigma}^2_{interaction} \\
\hat{\sigma}^2_{part} &= \frac{MS_P - MS_{PO}}{or}
\end{aligned}
$$

### 6.3 Key Metrics

#### Percentage of Total Variation

$$
\%GRR = 100 \times \frac{\sigma_{gauge}}{\sigma_{total}}
$$

Or using study variation (5.15σ for 99%), where the multiplier cancels:

$$
\%GRR = 100 \times \frac{5.15 \cdot \sigma_{gauge}}{5.15 \cdot \sigma_{total}}
$$

#### Precision-to-Tolerance Ratio (P/T)

$$
P/T = \frac{k \cdot \sigma_{gauge}}{USL - LSL}
$$

Where $k = 5.15$ (99%) or $k = 6$ (99.73%).
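
As a rough illustration of how these two metrics follow from the variance components, here is a minimal sketch. It assumes the components have already been estimated (for example by the ANOVA method above); the function name `gauge_metrics` and the numbers are hypothetical.

```python
import math


def gauge_metrics(var_repeat, var_reprod, var_part, lsl, usl, k=6.0):
    """Return (%GRR, P/T) from already-estimated variance components.

    k is the study-variation multiplier (5.15 or 6 in the text above).
    """
    var_gauge = var_repeat + var_reprod   # sigma^2_gauge
    var_total = var_part + var_gauge      # sigma^2_total
    pct_grr = 100.0 * math.sqrt(var_gauge / var_total)
    p_to_t = k * math.sqrt(var_gauge) / (usl - lsl)
    return pct_grr, p_to_t


# Hypothetical CD metrology study (variances in nm^2, spec window 10 nm wide)
pct_grr, p_to_t = gauge_metrics(
    var_repeat=0.04, var_reprod=0.05, var_part=0.91, lsl=45.0, usl=55.0
)
print(f"%GRR = {pct_grr:.1f}%   P/T = {p_to_t:.2f}")
```

Note that %GRR is a ratio of standard deviations, not variances, so a gauge contributing 9% of the total variance already shows a %GRR of 30%.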
#### Number of Distinct Categories (ndc)

$$
ndc = 1.41 \cdot \frac{\sigma_{part}}{\sigma_{gauge}}
$$

### 6.4 Acceptance Criteria

| %GRR | Assessment | Action |
|------|------------|--------|
| < 10% | Excellent | Acceptable for all applications |
| 10–30% | Acceptable | May be acceptable depending on application |
| > 30% | Unacceptable | Measurement system needs improvement |

| ndc | Assessment |
|-----|------------|
| ≥ 5 | Acceptable |
| < 5 | Measurement system cannot distinguish parts |

## 7. Run Rules (Western Electric / Nelson Rules)

### 7.1 Standard Run Rules

Pattern detection beyond simple control limits:

| Rule | Pattern | Interpretation |
|------|---------|----------------|
| **1** | 1 point beyond ±3σ | Large shift or outlier |
| **2** | 9 consecutive points on same side of CL | Small sustained shift |
| **3** | 6 consecutive points steadily increasing or decreasing | Trend/drift |
| **4** | 14 consecutive points alternating up and down | Systematic oscillation (over-adjustment) |
| **5** | 2 of 3 consecutive points beyond ±2σ (same side) | Shift warning |
| **6** | 4 of 5 consecutive points beyond ±1σ (same side) | Shift warning |
| **7** | 15 consecutive points within ±1σ | Stratification (mixture) |
| **8** | 8 consecutive points beyond ±1σ (either side) | Mixture of populations |

### 7.2 Zone Definitions

Control charts are divided into zones:

- **Zone A:** Between 2σ and 3σ from center line
- **Zone B:** Between 1σ and 2σ from center line
- **Zone C:** Within 1σ of center line

### 7.3 False Alarm Probabilities

For an in-control, independent, normally distributed process:

| Rule | Probability (per test) |
|------|----------------------|
| Rule 1 (±3σ) | 0.0027 |
| Rule 2 (9 same side) | 0.0039 |
| Rule 3 (6 trending) | 0.0028 |
| Rule 5 (2 of 3 in Zone A or beyond) | 0.0031 |

**Combined false alarm rate** increases when multiple rules are applied.

## 8. Average Run Length (ARL)

### 8.1 Definitions

- **ARL₀ (In-Control ARL):** Average number of samples until false alarm when process is in control (want high)
- **ARL₁ (Out-of-Control ARL):** Average number of samples to detect a shift (want low)

### 8.2 Shewhart Chart ARL

For 3σ limits:

$$
ARL_0 = \frac{1}{\alpha} = \frac{1}{0.0027} \approx 370
$$

For detecting a shift of $\delta$ standard deviations:

$$
ARL_1 = \frac{1}{P(\text{signal} | \text{shift})}
$$

$$
P(\text{signal}) = 1 - \Phi(3-\delta) + \Phi(-3-\delta)
$$

### 8.3 Comparison of Chart Performance

| Shift $\delta$ (in units of $\sigma$) | Shewhart ARL₁ | EWMA ARL₁ ($\lambda$=0.1) | CUSUM ARL₁ |
|------------------------|---------------|---------------------------|------------|
| 0.25 | 281 | 66 | 38 |
| 0.50 | 155 | 26 | 17 |
| 0.75 | 81 | 15 | 10 |
| 1.00 | 44 | 10 | 8 |
| 1.50 | 15 | 5 | 5 |
| 2.00 | 6 | 4 | 4 |
| 3.00 | 2 | 2 | 2 |

**Key insight:** EWMA and CUSUM are far superior for detecting small shifts ($\delta < 1.5$).

### 8.4 ARL Formulas for CUSUM

Approximate ARL for a one-sided CUSUM detecting a shift of size $\delta$ (valid for $\Delta \neq 0$):

$$
ARL_1 \approx \frac{e^{-2\Delta b} + 2\Delta b - 1}{2\Delta^2}
$$

Where:

- $\Delta = \delta - k$ (excess over reference value)
- $b = h + 1.166$ (adjusted decision interval)

## 9. Multivariate SPC

### 9.1 Why Multivariate?

Semiconductor processes involve many correlated parameters.
Univariate charts on correlated variables:

- Increase false alarm rates
- Miss shifts in correlated directions
- Fail to detect process changes that affect multiple parameters simultaneously

### 9.2 Hotelling's T² Statistic

For $p$ variables measured on a sample of size $n$:

$$
T^2 = n(\bar{\mathbf{x}} - \boldsymbol{\mu}_0)' \mathbf{S}^{-1} (\bar{\mathbf{x}} - \boldsymbol{\mu}_0)
$$

Where:

- $\bar{\mathbf{x}}$ = sample mean vector ($p \times 1$)
- $\boldsymbol{\mu}_0$ = target mean vector ($p \times 1$)
- $\mathbf{S}$ = sample covariance matrix ($p \times p$)

### 9.3 Control Limit for T²

The limits below are for individual observations ($n = 1$), with the mean vector and covariance matrix estimated from $m$ historical samples.

**Phase I (establishing control):**

$$
UCL = \frac{(m-1)^2}{m} \beta_{\alpha, p/2, (m-p-1)/2}
$$

**Phase II (monitoring):**

$$
UCL = \frac{p(m+1)(m-1)}{m(m-p)} F_{\alpha, p, m-p}
$$

Where $m$ = number of historical samples, and $\beta_{\alpha, \cdot, \cdot}$ and $F_{\alpha, \cdot, \cdot}$ are the upper-$\alpha$ percentage points of the beta and F distributions.

For large $m$:

$$
UCL \approx \chi^2_{\alpha, p}
$$

### 9.4 Multivariate EWMA (MEWMA)

$$
\mathbf{Z}_t = \Lambda\mathbf{X}_t + (\mathbf{I} - \Lambda)\mathbf{Z}_{t-1}
$$

Where $\Lambda = \text{diag}(\lambda_1, \lambda_2, \ldots, \lambda_p)$.

**Statistic:**

$$
T^2_t = \mathbf{Z}_t' \boldsymbol{\Sigma}_{\mathbf{Z}_t}^{-1} \mathbf{Z}_t
$$

**Covariance of MEWMA** (for equal smoothing, $\lambda_1 = \cdots = \lambda_p = \lambda$):

$$
\boldsymbol{\Sigma}_{\mathbf{Z}_t} = \frac{\lambda}{2-\lambda}\left[1 - (1-\lambda)^{2t}\right]\boldsymbol{\Sigma}
$$

### 9.5 Principal Component Analysis (PCA) for SPC

Decompose correlated variables into uncorrelated principal components:

$$
\mathbf{X} = \mathbf{T}\mathbf{P}' + \mathbf{E}
$$

Where:

- $\mathbf{T}$ = scores matrix
- $\mathbf{P}$ = loadings matrix
- $\mathbf{E}$ = residuals

**Hotelling's T² in PC space** (with $k$ retained components, where $\lambda_i$ here denotes the variance, i.e. eigenvalue, of component $i$):

$$
T^2 = \sum_{i=1}^{k} \frac{t_i^2}{\lambda_i}
$$

**Squared Prediction Error (SPE):**

$$
SPE = \mathbf{e}'\mathbf{e} = \sum_{i=k+1}^{p} t_i^2
$$

## 10. Autocorrelation Handling

### 10.1 The Problem

Semiconductor tool data often exhibits serial correlation, violating the independence assumption of standard SPC.

**Consequences of ignoring autocorrelation:**

- Actual ARL₀ << 370 (excessive false alarms)
- Control limits are too tight
- Patterns in data are misinterpreted

### 10.2 Autocorrelation Function (ACF)

**Population autocorrelation at lag $k$:**

$$
\rho_k = \frac{Cov(X_t, X_{t+k})}{Var(X_t)} = \frac{\gamma_k}{\gamma_0}
$$

**Sample autocorrelation:**

$$
r_k = \frac{\sum_{t=1}^{n-k}(x_t - \bar{x})(x_{t+k} - \bar{x})}{\sum_{t=1}^{n}(x_t - \bar{x})^2}
$$

### 10.3 AR(1) Process

The simplest autocorrelated model:

$$
X_t = \phi X_{t-1} + \varepsilon_t
$$

Where:

- $\phi$ = autoregressive parameter ($|\phi| < 1$ for stationarity)
- $\varepsilon_t \sim N(0, \sigma^2_\varepsilon)$ (white noise)

**Properties:**

$$
\begin{aligned}
Var(X_t) &= \frac{\sigma^2_\varepsilon}{1 - \phi^2} \\
\rho_k &= \phi^k
\end{aligned}
$$

### 10.4 Solutions for Autocorrelated Data

1. **Residual Charts:**
   - Fit time series model (AR, ARMA, etc.)
   - Apply SPC to residuals $\hat{\varepsilon}_t = X_t - \hat{X}_t$

2. **Modified Control Limits:**

   $$
   UCL/LCL = \mu \pm 3\sigma_X \sqrt{\frac{1 + \phi}{1 - \phi}}
   $$

3. **EWMA with Adjusted Parameters:**
   - Use $\lambda = 1 - \phi$ for optimal smoothing

4. **Special Cause Charts:**
   - Designed specifically for autocorrelated processes

## 11. Run-to-Run (R2R) Process Control

### 11.1 Basic Concept

Active feedback control layered on SPC: adjust recipe parameters based on measured outputs.
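
To make the feedback idea concrete, here is a minimal simulation sketch. It assumes a linear process $y = \alpha + \beta u + \text{noise}$ whose offset $\alpha$ drifts slowly, and a controller that tracks the offset with an EWMA and re-solves for the recipe setting each run. The function name, drift rate, gain, and noise level are all illustrative assumptions, not values from a real tool.

```python
import random


def simulate_r2r(runs=50, lam=0.3, beta=2.0, target=100.0,
                 drift=0.05, noise_sd=0.1, control=True, seed=0):
    """Simulate y = alpha + beta*u + noise with a drifting offset alpha.

    Returns the absolute deviation from target for each run.
    """
    rng = random.Random(seed)
    alpha_true = 90.0         # hypothetical initial process offset
    alpha_hat = alpha_true    # controller starts with a correct estimate
    u = (target - alpha_hat) / beta
    errors = []
    for _ in range(runs):
        y = alpha_true + beta * u + rng.gauss(0.0, noise_sd)
        errors.append(abs(y - target))
        if control:
            # EWMA update of the offset estimate from the latest measurement
            alpha_hat = lam * (y - beta * u) + (1 - lam) * alpha_hat
            # Recipe for the next run aims the predicted output at target
            u = (target - alpha_hat) / beta
        alpha_true += drift   # tool drift between runs
    return errors


controlled = simulate_r2r(control=True)
uncontrolled = simulate_r2r(control=False)
print("mean |error| with R2R:   ", sum(controlled) / len(controlled))
print("mean |error| without R2R:", sum(uncontrolled) / len(uncontrolled))
```

With a steady drift, the uncontrolled output walks away from target, while the controlled output stays within a roughly constant lag of it. The size of that lag depends on the drift rate and on `lam`.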
### 11.2 EWMA Controller

**Prediction:**

$$
\hat{y}_{t+1} = \lambda y_t + (1-\lambda)\hat{y}_t
$$

**Recipe Adjustment:**

$$
u_{t+1} = u_t - G(\hat{y}_{t+1} - y_{target})
$$

Where:

- $G$ = controller gain
- $u$ = recipe parameter (e.g., etch time, dose)
- $y$ = output measurement (e.g., CD, thickness)

### 11.3 Double EWMA (for Drifting Processes)

Track both level and slope:

**Level estimate:**

$$
L_t = \lambda y_t + (1-\lambda)(L_{t-1} + T_{t-1})
$$

**Trend estimate:**

$$
T_t = \gamma(L_t - L_{t-1}) + (1-\gamma)T_{t-1}
$$

**Forecast:**

$$
\hat{y}_{t+1} = L_t + T_t
$$

### 11.4 Process Model Integration

For process with known gain $\beta$:

$$
y_t = \alpha + \beta u_t + \varepsilon_t
$$

**Optimal control:**

$$
u_{t+1} = \frac{y_{target} - \hat{\alpha}_{t+1}}{\beta}
$$

## 12. Yield Modeling Mathematics

### 12.1 Defect Density

$$
D_0 = \frac{\text{Number of defects}}{\text{Area (cm}^2\text{)}}
$$

### 12.2 Poisson Model (Random Defects)

Assumes defects are randomly distributed:

$$
Y = e^{-D_0 A}
$$

Where:

- $D_0$ = defect density (defects/cm²)
- $A$ = die area (cm²)

**Probability of $k$ defects on a die:**

$$
P(k) = \frac{(D_0 A)^k e^{-D_0 A}}{k!}
$$

### 12.3 Murphy's Model (Distributed Defects)

Accounts for defect density variation across wafer:

$$
Y = \left[\frac{1 - e^{-D_0 A}}{D_0 A}\right]^2
$$

### 12.4 Negative Binomial Model (Clustered Defects)

More realistic for semiconductor:

$$
Y = \left(1 + \frac{D_0 A}{\alpha}\right)^{-\alpha}
$$

Where $\alpha$ = clustering parameter:

- $\alpha \to \infty$: Approaches Poisson (random)
- $\alpha$ small: Highly clustered

### 12.5 Seeds Model

Assumes an exponential distribution of defect density:

$$
Y = \frac{1}{1 + D_0 A_s}
$$

Where $A_s$ = sensitive area (the portion of the die area susceptible to defects).

### 12.6 Yield Loss Calculations

**Defect-Limited Yield:**

$$
Y_D = e^{-D_0 A}
$$

**Parametric Yield:**

$$
Y_P = \prod_{i} P(\text{parameter}_i \text{ in spec})
$$

**Total Yield:**

$$
Y_{total} = Y_D \times Y_P
$$

## 13. Spatial Statistics for Wafer Maps

### 13.1 Radial Uniformity

$$
\sigma_{radial} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}(x_i - f(r_i))^2}
$$

Where $f(r_i)$ is the fitted radial profile at radius $r_i$.

### 13.2 Wafer-Level Variation Components

$$
\sigma^2_{total} = \sigma^2_{W2W} + \sigma^2_{WIW}
$$

Within-wafer variation often decomposed:

$$
\sigma^2_{WIW} = \sigma^2_{systematic} + \sigma^2_{random}
$$

Where:

- **Systematic WIW:** Modeled and corrected (radial, azimuthal patterns)
- **Random WIW:** Inherent noise

### 13.3 Spatial Correlation Function

For locations $\mathbf{s}_i$ and $\mathbf{s}_j$:

$$
C(h) = Cov(X(\mathbf{s}_i), X(\mathbf{s}_j))
$$

Where $h = \|\mathbf{s}_i - \mathbf{s}_j\|$ (distance between points).

**Variogram:**

$$
\gamma(h) = \frac{1}{2}Var[X(\mathbf{s}_i) - X(\mathbf{s}_j)]
$$

### 13.4 Common Wafer Signatures

Mathematical models for common spatial patterns:

**Radial (bowl/dome):**

$$
f(r) = a_0 + a_1 r + a_2 r^2
$$

**Azimuthal:**

$$
f(\theta) = b_0 + b_1 \cos(\theta) + b_2 \sin(\theta)
$$

**Combined:**

$$
f(r, \theta) = \sum_{n,m} a_{nm} Z_n^m(r, \theta)
$$

Where $Z_n^m$ are Zernike polynomials.

## 14. Practical Implementation Considerations

### 14.1 Sample Size Effects

Uncertainty in estimated standard deviation:

$$
SE(\hat{\sigma}) \approx \frac{\sigma}{\sqrt{2(n-1)}}
$$

**For reliable capability estimates:**

- Minimum: $n \geq 30$
- Preferred: $n \geq 50$
- For critical processes: $n \geq 100$

### 14.2 Confidence Interval for σ

$$
\sqrt{\frac{(n-1)s^2}{\chi^2_{\alpha/2, n-1}}} \leq \sigma \leq \sqrt{\frac{(n-1)s^2}{\chi^2_{1-\alpha/2, n-1}}}
$$

### 14.3 Rational Subgrouping

**Principles:**

- Subgroups should capture short-term (within) variation
- Between-subgroup variation captures long-term drift
- Subgroup size $n$ typically 3–5 for continuous data

**In semiconductor:**

- Subgroup = wafers from same lot, run, or time window
- Site-to-site variation often treated as within-subgroup

### 14.4 Control Limit Estimation

**Using Range Method:**

$$
\hat{\sigma} = \frac{\bar{R}}{d_2}
$$

**Using Sample Standard Deviation:**

$$
\hat{\sigma} = \frac{\bar{s}}{c_4}
$$

Where $c_4$ is the unbiasing constant for standard deviation.

### 14.5 Short-Run SPC

For limited data (new process, low volume):

**Z-MR charts using target:**

$$
Z_i = \frac{x_i - T}{\sigma_0}
$$

**Q-charts (self-starting):**

$$
Q_i = \Phi^{-1}\left(G_{i-2}\left(\frac{x_i - \bar{x}_{i-1}}{s_{i-1}\sqrt{1 + 1/(i-1)}}\right)\right)
$$

Where $\bar{x}_{i-1}$ and $s_{i-1}$ are the mean and standard deviation of the first $i-1$ observations, and $G_{i-2}$ is the Student-t CDF with $i-2$ degrees of freedom.

## 15. Key Mathematical Relationships

### Quick Reference Table

| Concept | Core Mathematics |
|---------|------------------|
| **Control Limits** | $\mu \pm \frac{3\sigma}{\sqrt{n}}$ |
| **Cp** | $\frac{USL - LSL}{6\sigma}$ |
| **Cpk** | $\min\left[\frac{USL-\mu}{3\sigma}, \frac{\mu-LSL}{3\sigma}\right]$ |
| **EWMA** | $\lambda x_t + (1-\lambda)EWMA_{t-1}$ |
| **CUSUM** | $\max[0, x_t - (\mu_0 + K) + C_{t-1}]$ |
| **Hotelling's T²** | $n(\bar{\mathbf{x}}-\boldsymbol{\mu})'S^{-1}(\bar{\mathbf{x}}-\boldsymbol{\mu})$ |
| **Gauge R&R** | $\sigma^2_{total} = \sigma^2_{part} + \sigma^2_{gauge}$ |
| **Yield (Poisson)** | $Y = e^{-D_0 A}$ |
| **ARL₀ (3σ)** | $\frac{1}{0.0027} \approx 370$ |
| **AR(1) Variance** | $\frac{\sigma^2_\varepsilon}{1-\phi^2}$ |

### Decision Guide: Which Chart to Use?

| Situation | Recommended Chart |
|-----------|------------------|
| Standard monitoring, subgroups | X̄-R or X̄-S |
| Individual measurements | I-MR |
| Detect small shifts ($< 1.5\sigma$) | EWMA or CUSUM |
| Multiple correlated parameters | Hotelling's T² or MEWMA |
| Autocorrelated data | Residual charts or modified EWMA |
| Short production runs | Q-charts or Z-MR |

### Critical Success Factors

1. **Validate measurement system first** (Gauge R&R < 10%)
2. **Ensure rational subgrouping** captures meaningful variation
3. **Check for autocorrelation** before applying standard charts
4. **Use appropriate capability indices** (Cpk vs Ppk)
5. **Decompose variance** to target improvement efforts
6. **Match chart sensitivity** to required detection speed

## Control Chart Constant Tables

### Constants for X̄ and R Charts

| $n$ | $A_2$ | $A_3$ | $d_2$ | $d_3$ | $D_3$ | $D_4$ | $c_4$ | $B_3$ | $B_4$ |
|-----|-------|-------|-------|-------|-------|-------|-------|-------|-------|
| 2 | 1.880 | 2.659 | 1.128 | 0.853 | 0 | 3.267 | 0.7979 | 0 | 3.267 |
| 3 | 1.023 | 1.954 | 1.693 | 0.888 | 0 | 2.574 | 0.8862 | 0 | 2.568 |
| 4 | 0.729 | 1.628 | 2.059 | 0.880 | 0 | 2.282 | 0.9213 | 0 | 2.266 |
| 5 | 0.577 | 1.427 | 2.326 | 0.864 | 0 | 2.114 | 0.9400 | 0 | 2.089 |
| 6 | 0.483 | 1.287 | 2.534 | 0.848 | 0 | 2.004 | 0.9515 | 0.030 | 1.970 |
| 7 | 0.419 | 1.182 | 2.704 | 0.833 | 0.076 | 1.924 | 0.9594 | 0.118 | 1.882 |
| 8 | 0.373 | 1.099 | 2.847 | 0.820 | 0.136 | 1.864 | 0.9650 | 0.185 | 1.815 |
| 9 | 0.337 | 1.032 | 2.970 | 0.808 | 0.184 | 1.816 | 0.9693 | 0.239 | 1.761 |
| 10 | 0.308 | 0.975 | 3.078 | 0.797 | 0.223 | 1.777 | 0.9727 | 0.284 | 1.716 |

### Standard Normal Distribution Critical Values

| Confidence | $z_{\alpha/2}$ |
|------------|----------------|
| 90% | 1.645 |
| 95% | 1.960 |
| 99% | 2.576 |
| 99.73% | 3.000 |
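
The $A_2$, $D_3$, and $D_4$ constants tabulated above feed directly into the X̄-R limit formulas. The following is a minimal sketch of that calculation for equal-size subgroups; the function name `xbar_r_limits`, the restriction to $n = 2$ to $5$, and the sample data are illustrative assumptions.

```python
# (A2, D3, D4) for subgroup sizes 2..5, copied from the constants table
CONSTANTS = {
    2: (1.880, 0.0, 3.267),
    3: (1.023, 0.0, 2.574),
    4: (0.729, 0.0, 2.282),
    5: (0.577, 0.0, 2.114),
}


def xbar_r_limits(subgroups):
    """Return (LCL, CL, UCL) tuples for the X-bar chart and the R chart."""
    n = len(subgroups[0])
    if any(len(g) != n for g in subgroups):
        raise ValueError("all subgroups must have the same size")
    a2, d3, d4 = CONSTANTS[n]
    means = [sum(g) / n for g in subgroups]
    ranges = [max(g) - min(g) for g in subgroups]
    grand_mean = sum(means) / len(means)   # X double-bar
    r_bar = sum(ranges) / len(ranges)      # average subgroup range
    return {
        "xbar": (grand_mean - a2 * r_bar, grand_mean, grand_mean + a2 * r_bar),
        "r": (d3 * r_bar, r_bar, d4 * r_bar),
    }


# Hypothetical thickness readings (nm), three wafers per subgroup
data = [[10.1, 9.9, 10.0], [10.2, 10.0, 9.8], [9.9, 10.1, 10.0]]
limits = xbar_r_limits(data)
print("X-bar chart (LCL, CL, UCL):", limits["xbar"])
print("R chart     (LCL, CL, UCL):", limits["r"])
```

Three subgroups keep the example short; in practice limits are usually set from 20 to 25 or more subgroups so that $\bar{R}$ is a stable estimate.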

speaker adaptation, audio & speech

Speaker adaptation personalizes ASR models to individual speakers using small amounts of enrollment data.

speaker beam, audio & speech

SpeakerBeam performs target speaker extraction using speaker embeddings as queries.

speaker diarization, audio & speech

Speaker diarization segments audio by speaker identity, answering who spoke when.

speaker diarization,audio

Identify who spoke when in multi-speaker audio.

speaker embedding, audio & speech

Speaker embeddings are fixed-dimensional representations capturing speaker identity, extracted from speech by neural networks (for example, x-vectors).

spearman correlation, quality & reliability

Spearman correlation assesses monotonic relationships using rank order.

specaugment, audio & speech

SpecAugment applies frequency and time masking to spectrograms for data augmentation in ASR.

special cause variation,spc

Variation from specific identifiable source.

special tokens, nlp

Reserved tokens for specific functions.

special tokens,nlp

Tokens with special meaning (BOS, EOS, PAD, MASK, SEP, CLS).

specialist agent, ai agents

Specialist agents focus on specific capabilities, contributing expertise to team efforts.

specialty gas, manufacturing operations

Specialty gases have high purity and specific compositions for processes.