AI Factory Glossary

72 technical terms and definitions

porosimetry, metrology

Measure porosity of low-k dielectrics.

positive resist,lithography

Exposed areas become soluble and are removed.

positron annihilation spectroscopy, pas, metrology

Detect voids and open volume.

post-apply bake (pab),post-apply bake,pab,lithography

Bake after coating resist.

post-exposure bake (peb),post-exposure bake,peb,lithography

Bake after exposure to complete chemical reactions.

post-mold cure, pmc, packaging

Additional curing after molding.

pot, packaging

Holds molding compound.

power spectral density analysis, psd, metrology

Frequency analysis of surface roughness.

precession electron diffraction, ped, metrology

Reduce dynamical effects in diffraction.

precision,metrology

Repeatability of measurements.

predictive metrology, metrology

Forecast future metrology results.

pressure sensor packaging, packaging

Special considerations for pressure sensors.

process monitor structures, metrology

Test structures used to track process performance.

process monitoring, semiconductor process control, spc, statistical process control, sensor data, fault detection, run-to-run control, process optimization

# Semiconductor Manufacturing Process Parameters Monitoring: Mathematical Modeling

## 1. The Fundamental Challenge

Modern semiconductor fabrication involves 500–1000+ sequential process steps, each with dozens of parameters requiring nanometer-scale precision.

### Key Process Types and Parameters

- **Lithography**: exposure dose, focus, overlay alignment, resist thickness
- **Etching (dry/wet)**: etch rate, selectivity, uniformity, plasma parameters (power, pressure, gas flows)
- **Deposition (CVD, PVD, ALD)**: deposition rate, film thickness, uniformity, stress, composition
- **CMP (Chemical Mechanical Polishing)**: removal rate, within-wafer non-uniformity, dishing, erosion
- **Implantation**: dose, energy, angle, uniformity
- **Thermal processes**: temperature uniformity, ramp rates, time

## 2. Statistical Process Control (SPC): The Foundation

### 2.1 Univariate Control Charts

For a process parameter $X$ with samples $x_1, x_2, \ldots, x_n$:

**Sample Mean:**

$$ \bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i $$

**Sample Standard Deviation** (used as the estimate of $\sigma$ below):

$$ \sigma = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2} $$

**Control Limits (3-sigma):**

$$ \text{UCL} = \bar{x} + 3\sigma $$

$$ \text{LCL} = \bar{x} - 3\sigma $$

### 2.2 Process Capability Indices

These quantify how well a process meets specifications:

- **$C_p$ (Potential Capability):**

  $$ C_p = \frac{USL - LSL}{6\sigma} $$

- **$C_{pk}$ (Actual Capability)**, which accounts for centering:

  $$ C_{pk} = \min\left[\frac{USL - \mu}{3\sigma}, \frac{\mu - LSL}{3\sigma}\right] $$

- **$C_{pm}$ (Taguchi Index)**, which penalizes deviation from target $T$:

  $$ C_{pm} = \frac{C_p}{\sqrt{1 + \left(\frac{\mu - T}{\sigma}\right)^2}} $$

Semiconductor fabs typically require $C_{pk} \geq 1.67$, corresponding to defect rates below ~1 ppm.

## 3. Multivariate Statistical Monitoring

Since process parameters are highly correlated, univariate methods miss interaction effects.
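A small numerical illustration of this point, as a sketch under assumed numbers (the correlation of 0.95 and the two readings are hypothetical): two standardized parameters can each sit inside their own 3-sigma limits while their difference, which should be small when they move together, is far outside its expected range.

```python
import math

def difference_z_score(x1, x2, sigma1, sigma2, rho):
    """z-score of (x1 - x2) for zero-mean variables with correlation rho."""
    var_diff = sigma1**2 + sigma2**2 - 2.0 * rho * sigma1 * sigma2
    return (x1 - x2) / math.sqrt(var_diff)

# Hypothetical standardized readings for two strongly correlated parameters.
sigma1 = sigma2 = 1.0
rho = 0.95
x1, x2 = 2.0, -2.0          # each is only 2 sigma from its own mean

z1 = x1 / sigma1
z2 = x2 / sigma2
z_diff = difference_z_score(x1, x2, sigma1, sigma2, rho)

univariate_alarm = abs(z1) > 3 or abs(z2) > 3   # separate 3-sigma charts
joint_alarm = abs(z_diff) > 3                   # chart on the difference

print(f"z1={z1:.1f}, z2={z2:.1f}, z_diff={z_diff:.1f}")
print("univariate alarm:", univariate_alarm, "| joint alarm:", joint_alarm)
```

Neither univariate chart fires, but the difference chart does, because the pair has moved against its usual correlation.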
### 3.1 Principal Component Analysis (PCA)

Given data matrix $\mathbf{X}$ ($n$ samples × $p$ variables), centered:

1. **Compute covariance matrix:**

   $$ \mathbf{S} = \frac{1}{n-1}\mathbf{X}^T\mathbf{X} $$

2. **Eigendecomposition:**

   $$ \mathbf{S} = \mathbf{V}\mathbf{\Lambda}\mathbf{V}^T $$

3. **Project to principal components:**

   $$ \mathbf{T} = \mathbf{X}\mathbf{V} $$

### 3.2 Monitoring Statistics

#### Hotelling's $T^2$ Statistic

Captures variation **within** the PCA model:

$$ T^2 = \sum_{i=1}^{k} \frac{t_i^2}{\lambda_i} $$

where $k$ is the number of retained components. Under normal operation, $T^2$ follows a scaled F-distribution.

#### Q-Statistic (Squared Prediction Error)

Captures variation **outside** the model:

$$ Q = \sum_{j=1}^{p}(x_j - \hat{x}_j)^2 = \|\mathbf{x} - \mathbf{x}\mathbf{V}_k\mathbf{V}_k^T\|^2 $$

> Often more sensitive to novel faults than $T^2$.

### 3.3 Partial Least Squares (PLS)

When relating process inputs $\mathbf{X}$ to quality outputs $\mathbf{Y}$:

$$ \mathbf{Y} = \mathbf{X}\mathbf{B} + \mathbf{E} $$

PLS finds latent variables that maximize covariance between $\mathbf{X}$ and $\mathbf{Y}$, providing both monitoring capability and a predictive model.

## 4. Virtual Metrology (VM) Models

Virtual metrology predicts physical measurement outcomes from process sensor data, enabling 100% wafer coverage without costly measurements.
### 4.1 Linear Models

For process parameters $\mathbf{x} \in \mathbb{R}^p$ and metrology target $y$:

- **Ordinary Least Squares (OLS):**

  $$ \hat{\boldsymbol{\beta}} = (\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^T\mathbf{y} $$

- **Ridge Regression** ($L_2$ regularization for collinearity):

  $$ \hat{\boldsymbol{\beta}} = (\mathbf{X}^T\mathbf{X} + \lambda\mathbf{I})^{-1}\mathbf{X}^T\mathbf{y} $$

- **LASSO** ($L_1$ regularization for sparsity/feature selection):

  $$ \min_{\boldsymbol{\beta}} \|\mathbf{y} - \mathbf{X}\boldsymbol{\beta}\|^2 + \lambda\|\boldsymbol{\beta}\|_1 $$

### 4.2 Nonlinear Models

#### Gaussian Process Regression (GPR)

$$ y \sim \mathcal{GP}(m(\mathbf{x}), k(\mathbf{x}, \mathbf{x}')) $$

**Posterior predictive distribution:**

- **Mean:**

  $$ \mu_* = \mathbf{K}_*^T(\mathbf{K} + \sigma_n^2\mathbf{I})^{-1}\mathbf{y} $$

- **Variance:**

  $$ \sigma_*^2 = K_{**} - \mathbf{K}_*^T(\mathbf{K} + \sigma_n^2\mathbf{I})^{-1}\mathbf{K}_* $$

GPs provide uncertainty quantification, which is critical for knowing when to trigger actual metrology.

#### Support Vector Regression (SVR)

$$ \min \frac{1}{2}\|\mathbf{w}\|^2 + C\sum_i(\xi_i + \xi_i^*) $$

subject to $\epsilon$-insensitive tube constraints. The kernel trick enables nonlinear modeling.

#### Neural Networks

- **MLPs**: Multi-layer perceptrons for general function approximation
- **CNNs**: Convolutional neural networks for wafer map pattern recognition
- **LSTMs**: Long Short-Term Memory networks for time-series FDC traces

## 5. Run-to-Run (R2R) Control

R2R control adjusts recipe setpoints between wafers/lots to compensate for drift and disturbances.
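To make the idea concrete, here is a minimal simulation sketch. It assumes a linear process $y = a_0 + a_1 u + \epsilon$ whose intercept drifts slowly, and a controller that keeps an EWMA estimate of that intercept and picks the next setpoint so the model prediction hits the target. All numbers (gain, drift rate, noise level, smoothing weight) and the function name `run_ewma_r2r` are hypothetical.

```python
import random

def run_ewma_r2r(n_runs=200, target=100.0, a1=2.0, lam=0.3,
                 drift_per_run=0.05, noise_sd=0.2, seed=0):
    """Simulate y = a0 + a1*u + noise with a drifting intercept a0.

    The controller keeps an EWMA estimate of a0 and chooses each
    setpoint u so that the model prediction equals the target.
    """
    rng = random.Random(seed)
    a0_true = 10.0      # true intercept (unknown to the controller)
    a0_hat = 10.0       # controller's running estimate of the intercept
    outputs = []
    for _ in range(n_runs):
        u = (target - a0_hat) / a1                   # recipe for this run
        y = a0_true + a1 * u + rng.gauss(0.0, noise_sd)
        outputs.append(y)
        # EWMA update from the observed offset y - a1*u
        a0_hat = lam * (y - a1 * u) + (1.0 - lam) * a0_hat
        a0_true += drift_per_run                     # slow tool drift
    return outputs

controlled = run_ewma_r2r()
settled = controlled[50:]                            # skip the warm-up runs
mean_error = sum(y - 100.0 for y in settled) / len(settled)
print(f"mean error after warm-up: {mean_error:.3f}")
```

Without control the output would drift by `drift_per_run * n_runs` (10 units here). With a single EWMA the error stays bounded, but under a steady linear drift it settles at an offset of roughly `drift_per_run / lam` rather than zero.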
### 5.1 EWMA Controller

For a process with model $y = a_0 + a_1 u + \epsilon$, the controller applies an EWMA filter to the observed offset $y_k - a_1 u_k$, which gives a running estimate $\hat{a}_{0,k}$ of the intercept.

**Offset update:**

$$ \hat{a}_{0,k+1} = \lambda (y_k - a_1 u_k) + (1-\lambda)\hat{a}_{0,k} $$

**Control action:**

$$ u_{k+1} = \frac{T - \hat{a}_{0,k+1}}{a_1} $$

where:

- $T$ is the target
- $\lambda \in (0,1)$ is the smoothing weight

### 5.2 Double EWMA (for Linear Drift)

When the process drifts linearly:

$$ \hat{y}_{k+1} = a_k + b_k $$

$$ a_k = \lambda y_k + (1-\lambda)(a_{k-1} + b_{k-1}) $$

$$ b_k = \gamma(a_k - a_{k-1}) + (1-\gamma)b_{k-1} $$

Here $a_k$ is the smoothed level and $b_k$ the smoothed drift per run (not the model coefficients $a_0$, $a_1$ of Section 5.1).

### 5.3 State-Space Formulation

A more general framework:

**State equation:**

$$ \mathbf{x}_{k+1} = \mathbf{A}\mathbf{x}_k + \mathbf{B}\mathbf{u}_k + \mathbf{w}_k $$

**Observation equation:**

$$ \mathbf{y}_k = \mathbf{C}\mathbf{x}_k + \mathbf{D}\mathbf{u}_k + \mathbf{v}_k $$

Use **Kalman filtering** for state estimation and **LQR/MPC** for optimal control.

### 5.4 Model Predictive Control (MPC)

**Objective function:**

$$ \min \sum_{i=1}^{N} \|\mathbf{y}_{k+i} - \mathbf{r}_{k+i}\|_\mathbf{Q}^2 + \sum_{j=0}^{N-1}\|\Delta\mathbf{u}_{k+j}\|_\mathbf{R}^2 $$

subject to the process model and operational constraints.

> MPC handles multivariable systems with constraints naturally.

## 6. Fault Detection and Classification (FDC)

### 6.1 Detection Methods

#### Mahalanobis Distance

$$ D^2 = (\mathbf{x} - \boldsymbol{\mu})^T\mathbf{S}^{-1}(\mathbf{x} - \boldsymbol{\mu}) $$

Under multivariate normality, $D^2$ follows a $\chi^2$ distribution with $p$ degrees of freedom.
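As a worked instance of the distance above, the sketch below evaluates $D^2$ for two sensors with the 2×2 inverse written out by hand, so it needs no linear-algebra library. The sensor names, mean, covariance, and test points are hypothetical; the threshold 9.21 is the 99% quantile of a $\chi^2$ distribution with 2 degrees of freedom.

```python
def mahalanobis_sq_2d(x, mu, cov):
    """Squared Mahalanobis distance for a 2-variable point.

    cov is [[s11, s12], [s12, s22]]; the 2x2 inverse is expanded
    explicitly instead of calling a matrix routine.
    """
    (s11, s12), (_, s22) = cov
    det = s11 * s22 - s12 * s12
    if det <= 0:
        raise ValueError("covariance matrix must be positive definite")
    d1, d2 = x[0] - mu[0], x[1] - mu[1]
    return (s22 * d1 * d1 - 2.0 * s12 * d1 * d2 + s11 * d2 * d2) / det

# Hypothetical healthy-state statistics: pressure (mTorr) and RF power (W),
# standard deviations 2 and 5, correlation 0.9.
mu = (50.0, 300.0)
cov = [[4.0, 9.0], [9.0, 25.0]]
CHI2_2DOF_99 = 9.21

with_trend = mahalanobis_sq_2d((53.0, 307.5), mu, cov)     # both rise together
against_trend = mahalanobis_sq_2d((53.0, 292.5), mu, cov)  # move in opposition

for name, d2 in (("with trend", with_trend), ("against trend", against_trend)):
    print(f"{name}: D^2 = {d2:.2f}, fault = {d2 > CHI2_2DOF_99}")
```

Both points are 1.5 standard deviations from the mean on each axis, yet only the one that breaks the usual correlation exceeds the threshold.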
#### Other Detection Methods

- **One-Class SVM**: Learn boundary of normal operation
- **Autoencoders**: Detect anomalies via reconstruction error

### 6.2 Classification Features

For trace data (time-series from sensors), extract features:

- **Statistical moments**: mean, variance, skewness, kurtosis
- **Frequency domain**: FFT coefficients, spectral power
- **Wavelet coefficients**: Multi-resolution analysis
- **DTW distances**: Dynamic Time Warping to reference signatures

### 6.3 Classification Algorithms

- Support Vector Machines (SVM)
- Random Forest
- CNNs for pattern recognition on wafer maps
- Gradient Boosting (XGBoost, LightGBM)

## 7. Spatial Modeling (Within-Wafer Variation)

Systematic spatial patterns require explicit modeling.

### 7.1 Polynomial Basis Expansion

#### Zernike Polynomials (common in lithography)

$$ z(\rho, \theta) = \sum_{n,m} a_n^m Z_n^m(\rho, \theta) $$

where $a_n^m$ are the fitted coefficients. The $Z_n^m$ form an orthogonal basis on the unit disk, capturing radial and azimuthal variation.

### 7.2 Gaussian Process Spatial Models

$$ y(\mathbf{s}) \sim \mathcal{GP}(\mu(\mathbf{s}), k(\mathbf{s}, \mathbf{s}')) $$

#### Common Covariance Kernels

- **Squared Exponential (RBF):**

  $$ k(\mathbf{s}, \mathbf{s}') = \sigma^2 \exp\left(-\frac{\|\mathbf{s} - \mathbf{s}'\|^2}{2\ell^2}\right) $$

- **Matérn** (more flexible smoothness):

  $$ k(r) = \sigma^2 \frac{2^{1-\nu}}{\Gamma(\nu)}\left(\frac{\sqrt{2\nu}r}{\ell}\right)^\nu K_\nu\left(\frac{\sqrt{2\nu}r}{\ell}\right) $$

  where $K_\nu$ is the modified Bessel function of the second kind.

## 8. Dynamic/Time-Series Modeling

For plasma processes, endpoint detection, and transient behavior.

### 8.1 Autoregressive Models

**AR(p) model:**

$$ x_t = \sum_{i=1}^{p} \phi_i x_{t-i} + \epsilon_t $$

ARIMA extends this to non-stationary series.

### 8.2 Dynamic PCA

Augment data with time-lagged values:

$$ \tilde{\mathbf{X}} = [\mathbf{X}(t), \mathbf{X}(t-1), \ldots, \mathbf{X}(t-l)] $$

Then apply standard PCA to capture temporal dynamics.
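The augmentation step can be sketched in a few lines. This is a minimal illustration that assumes the data arrive as a list of per-time-step rows; the helper name `lagged_matrix` is hypothetical, and the PCA step itself is left to whatever implementation is already in use.

```python
def lagged_matrix(rows, lags):
    """Build [X(t), X(t-1), ..., X(t-lags)] from a list of sample rows.

    rows[t] holds the variable values at time t. The first `lags` time
    steps are dropped because they do not have a full history.
    """
    if lags < 0 or lags >= len(rows):
        raise ValueError("lags must satisfy 0 <= lags < number of rows")
    augmented = []
    for t in range(lags, len(rows)):
        row = []
        for lag in range(lags + 1):
            row.extend(rows[t - lag])   # append X(t - lag)
        augmented.append(row)
    return augmented

# Four time steps of two sensors (toy values).
X = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [4.0, 40.0]]
X_aug = lagged_matrix(X, lags=2)
for row in X_aug:
    print(row)
```

With $n$ samples, $p$ variables, and $l$ lags, the augmented matrix has $n - l$ rows and $p(l + 1)$ columns, so the lag count trades temporal coverage against dimensionality.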
### 8.3 Deep Sequence Models

#### LSTM Networks

Gating mechanisms ($\sigma$ here is the logistic sigmoid):

- **Forget gate:** $f_t = \sigma(W_f \cdot [h_{t-1}, x_t] + b_f)$
- **Input gate:** $i_t = \sigma(W_i \cdot [h_{t-1}, x_t] + b_i)$
- **Output gate:** $o_t = \sigma(W_o \cdot [h_{t-1}, x_t] + b_o)$

**Candidate state:**

$$ \tilde{c}_t = \tanh(W_c \cdot [h_{t-1}, x_t] + b_c) $$

**Cell state update:**

$$ c_t = f_t \odot c_{t-1} + i_t \odot \tilde{c}_t $$

**Hidden state:**

$$ h_t = o_t \odot \tanh(c_t) $$

## 9. Model Maintenance and Adaptation

Semiconductor processes drift, so models must adapt.

### 9.1 Drift Detection Methods

#### CUSUM (Cumulative Sum)

$$ S_k = \max(0, S_{k-1} + (x_k - \mu_0) - K) $$

where $K$ is the reference (slack) value, typically about half the shift to be detected. Signal when $S_k$ exceeds a threshold $h$.

#### Page-Hinkley Test

For detecting an upward shift in the mean:

$$ m_k = \sum_{i=1}^{k}(x_i - \bar{x}_k - \delta) $$

$$ M_k = \min_{i \leq k} m_i $$

Alarm when $m_k - M_k > \lambda$. Here $\delta$ is the tolerated change magnitude and $\lambda$ the detection threshold.

#### ADWIN (Adaptive Windowing)

Automatically detects distribution changes and adjusts window size.

### 9.2 Online Model Updating

#### Recursive Least Squares (RLS)

$$ \hat{\boldsymbol{\beta}}_k = \hat{\boldsymbol{\beta}}_{k-1} + \mathbf{K}_k(y_k - \mathbf{x}_k^T\hat{\boldsymbol{\beta}}_{k-1}) $$

where $\mathbf{K}_k$ is the gain vector, updated together with the matrix $\mathbf{P}_k$ via the Riccati recursion:

$$ \mathbf{K}_k = \frac{\mathbf{P}_{k-1}\mathbf{x}_k}{\lambda + \mathbf{x}_k^T\mathbf{P}_{k-1}\mathbf{x}_k} $$

$$ \mathbf{P}_k = \frac{1}{\lambda}(\mathbf{P}_{k-1} - \mathbf{K}_k\mathbf{x}_k^T\mathbf{P}_{k-1}) $$

and $\lambda \in (0, 1]$ is the forgetting factor.

#### Just-in-Time (JIT) Learning

Build local models around each new prediction point using nearest historical samples.

## 10. Integrated Framework

A complete monitoring system layers these methods:

| Layer | Methods | Purpose |
|-------|---------|---------|
| **Preprocessing** | Cleaning, synchronization, normalization | Data quality |
| **Feature Engineering** | Domain features, wavelets, PCA | Dimensionality management |
| **Monitoring** | $T^2$, Q-statistic, control charts | Detect out-of-control states |
| **Virtual Metrology** | PLS, GPR, neural networks | Predict quality without measurement |
| **FDC** | Classification models | Diagnose fault root causes |
| **Control** | R2R, MPC | Compensate for drift/disturbances |
| **Adaptation** | Online learning, drift detection | Maintain model validity |

## 11. Key Mathematical Challenges

1. **High dimensionality**: hundreds of sensors, requiring regularization and dimension reduction
2. **Collinearity**: process variables are physically coupled
3. **Non-stationarity**: drift, maintenance events, recipe changes
4. **Small sample sizes**: new recipes have limited historical data (transfer learning, Bayesian methods help)
5. **Real-time constraints**: decisions needed in seconds
6. **Rare events**: faults are infrequent, creating class imbalance

## 12. Key Equations

### Process Capability

$$ C_{pk} = \min\left[\frac{USL - \mu}{3\sigma}, \frac{\mu - LSL}{3\sigma}\right] $$

### Multivariate Monitoring

$$ T^2 = \sum_{i=1}^{k} \frac{t_i^2}{\lambda_i}, \quad Q = \|\mathbf{x} - \hat{\mathbf{x}}\|^2 $$

### Virtual Metrology (Ridge Regression)

$$ \hat{\boldsymbol{\beta}} = (\mathbf{X}^T\mathbf{X} + \lambda\mathbf{I})^{-1}\mathbf{X}^T\mathbf{y} $$

### EWMA Control

$$ \hat{y}_{k+1} = \lambda y_k + (1-\lambda)\hat{y}_k $$

### Mahalanobis Distance

$$ D^2 = (\mathbf{x} - \boldsymbol{\mu})^T\mathbf{S}^{-1}(\mathbf{x} - \boldsymbol{\mu}) $$
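As a worked instance of the process capability equation, together with $C_p$ and $C_{pm}$ from Section 2.2, the sketch below computes all three from a small sample. The thickness readings and specification limits are hypothetical, and the sample mean and standard deviation stand in for $\mu$ and $\sigma$.

```python
import math
import statistics

def capability_indices(samples, lsl, usl, target=None):
    """Return (Cp, Cpk, Cpm) using the sample mean and standard deviation."""
    mu = statistics.mean(samples)
    sigma = statistics.stdev(samples)          # n-1 in the denominator
    if target is None:
        target = (usl + lsl) / 2.0             # default: center of the spec
    cp = (usl - lsl) / (6.0 * sigma)
    cpk = min(usl - mu, mu - lsl) / (3.0 * sigma)
    cpm = cp / math.sqrt(1.0 + ((mu - target) / sigma) ** 2)
    return cp, cpk, cpm

# Hypothetical film-thickness readings (nm) against a 99-101 nm spec.
thickness = [99.8, 100.1, 100.3, 99.9, 100.2, 100.0, 100.4, 99.7]
cp, cpk, cpm = capability_indices(thickness, lsl=99.0, usl=101.0)
print(f"Cp={cp:.3f}  Cpk={cpk:.3f}  Cpm={cpm:.3f}")
```

Because the sample mean sits slightly above the spec center, $C_{pk}$ and $C_{pm}$ both come out below $C_p$; with only eight readings the estimates carry wide uncertainty, so a real capability study would use far more data.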

process window analysis, lithography

Determine usable focus-dose range.

process window qualification, pwq, lithography

Verify adequate process window.

process window,exposure-defocus,bossung,depth of focus,dof,exposure latitude,cpk,lithography window,semiconductor process window

# Process Window

1. Fundamental

A process window is the region in parameter space where a manufacturing step yields acceptable results. Mathematically, for a response function $y(\mathbf{x})$ depending on parameter vector $\mathbf{x} = (x_1, x_2, \ldots, x_n)$:

$$ \text{Process Window} = \{\mathbf{x} : y_{\min} \leq y(\mathbf{x}) \leq y_{\max}\} $$

2. Single-Parameter Statistics

For a single parameter with lower and upper specification limits (LSL, USL):

Process Capability Indices

- $C_p$ (Process Capability): Measures window width relative to process variation

  $$ C_p = \frac{USL - LSL}{6\sigma} $$

- $C_{pk}$ (Process Capability Index): Accounts for process centering

  $$ C_{pk} = \min\left[\frac{USL - \mu}{3\sigma}, \frac{\mu - LSL}{3\sigma}\right] $$

Industry Standards

- $C_p \geq 1.0$: Process variation fits within specifications
- $C_{pk} \geq 1.33$: 4σ capability (standard requirement)
- $C_{pk} \geq 1.67$: 5σ capability (high-reliability applications)
- $C_{pk} \geq 2.0$: 6σ capability (Six Sigma standard)

3. Lithography: Exposure-Defocus (E-D) Window

The most critical and mathematically developed process window in semiconductor manufacturing.

3.1 Bossung Curve Model

Critical dimension (CD) as a function of exposure dose $E$ and defocus $F$:

$$ CD(E, F) = CD_0 + a_1 E + a_2 F + a_{11} E^2 + a_{22} F^2 + a_{12} EF + \ldots $$

The process window boundary is defined by:

$$ |CD(E, F) - CD_{\text{target}}| = \Delta CD_{\text{tolerance}} $$

3.2 Key Metrics

- Exposure Latitude (EL): Percentage dose range for acceptable CD

  $$ EL = \frac{E_{\max} - E_{\min}}{E_{\text{nominal}}} \times 100\% $$

- Depth of Focus (DOF): Focus range for acceptable CD (at given EL)

  $$ DOF = F_{\max} - F_{\min} $$

- Process Window Area: Total acceptable region

  $$ A_{PW} = \iint_{\text{acceptable}} dE \, dF $$

3.3 Rayleigh Equations

Resolution and DOF scale with wavelength $\lambda$ and numerical aperture $NA$:

- Resolution (minimum feature size):

  $$ R = k_1 \frac{\lambda}{NA} $$

- Depth of Focus:

  $$ DOF = \pm k_2 \frac{\lambda}{NA^2} $$

Critical insight: As $k_1$ decreases (smaller features), DOF shrinks as $(k_1)^2$, so process windows collapse rapidly at advanced nodes.

| Technology Node | $k_1$ Factor | Relative DOF |
| --- | --- | --- |
| 180nm | 0.6 | 1.0 |
| 65nm | 0.4 | 0.44 |
| 14nm | 0.3 | 0.25 |
| 5nm (EUV) | 0.25 | 0.17 |

4. Image Quality Metrics

4.1 Normalized Image Log-Slope (NILS)

$$ NILS = w \cdot \frac{1}{I} \left|\frac{dI}{dx}\right|_{\text{edge}} $$

Where:

- $w$ = feature width
- $I$ = aerial image intensity
- $\frac{dI}{dx}$ = intensity gradient at feature edge

For a coherent imaging system with partial coherence $\sigma$:

$$ NILS \approx \pi \cdot \frac{w}{\lambda/NA} \cdot \text{(contrast factor)} $$

Interpretation:

- Higher NILS → larger process window
- NILS > 2.0: Robust process
- NILS < 1.5: Marginal process window
- NILS < 1.0: Near resolution limit

4.2 Mask Error Enhancement Factor (MEEF)

$$ MEEF = \frac{\partial CD_{\text{wafer}}}{\partial CD_{\text{mask}}} $$

Characteristics:

- MEEF = 1: Ideal (1:1 transfer from mask to wafer)
- MEEF > 1: Mask errors are amplified on wafer
- Near resolution limit: MEEF typically 3–4 or higher
- Impacts effective process window: mask CD tolerance = wafer CD tolerance / MEEF

5. Multi-Parameter Process Windows

5.1 Ellipsoid Model

For $n$ interacting parameters, the window is often an $n$-dimensional ellipsoid:

$$ (\mathbf{x} - \mathbf{x}_0)^T \mathbf{A} (\mathbf{x} - \mathbf{x}_0) \leq 1 $$

Where:

- $\mathbf{x}$ = parameter vector $(x_1, x_2, \ldots, x_n)$
- $\mathbf{x}_0$ = optimal operating point (center of ellipsoid)
- $\mathbf{A}$ = positive definite matrix encoding parameter correlations

Geometric interpretation:

- Eigenvalues of $\mathbf{A}$: $\lambda_1, \lambda_2, \ldots, \lambda_n$
- Principal axes lengths: $a_i = 1/\sqrt{\lambda_i}$
- Eigenvectors: orientation of principal axes

5.2 Overlapping Windows

Real processes require multiple steps to work simultaneously:

$$ PW_{\text{total}} = \bigcap_{i=1}^{N} PW_i $$

Example: Combined lithography + etch window

$$ PW_{\text{combined}} = PW_{\text{litho}}(E, F) \cap PW_{\text{etch}}(P, W, T) $$

If individual windows are ellipsoids, their intersection is a more complex convex region, often computed numerically via:

- Linear programming
- Convex hull algorithms
- Monte Carlo sampling

6. Response Surface Methodology (RSM)

6.1 Quadratic Model

$$ y = \beta_0 + \sum_{i=1}^{n} \beta_i x_i + \sum_{i=1}^{n} \beta_{ii} x_i^2 + \sum_{i < j} \beta_{ij} x_i x_j $$

[…]

- Selectivity > 3–5 (typical)
- Selectivity > 10 (high aspect ratio features)
- Selectivity > 50 (critical etch stop layers)

13. CMP Process Windows

13.1 Preston Equation

$$ RR = K_p \cdot P \cdot V $$

Where:

- $RR$ = removal rate (nm/min or Å/min)
- $K_p$ = Preston coefficient (material/consumable dependent)
- $P$ = applied pressure (psi or kPa)
- $V$ = relative velocity (m/s)

13.2 Within-Wafer Non-Uniformity (WIWNU)

$$ WIWNU = \frac{\sigma_{RR}}{\mu_{RR}} \times 100\% $$

Target: WIWNU < 3–5%

13.3 Dishing and Erosion

- Dishing: Excess removal at center of wide features

  $$ \text{Dishing} = t_{\text{initial}} - t_{\text{center}} $$

- Erosion: Thinning of dielectric between metal lines

  $$ \text{Erosion} = t_{\text{field}} - t_{\text{local}} $$

14. Key Equations Summary Table

| Metric | Formula | Significance |
| --- | --- | --- |
| Resolution | $R = k_1 \frac{\lambda}{NA}$ | Minimum feature size |
| Depth of Focus | $DOF = \pm k_2 \frac{\lambda}{NA^2}$ | Focus tolerance |
| NILS | $NILS = \frac{w}{I} \left\|\frac{dI}{dx}\right\|$ | Image contrast at edge |
| MEEF | $MEEF = \frac{\partial CD_w}{\partial CD_m}$ | Mask error amplification |
| Process Capability | $C_{pk} = \frac{\min(USL-\mu, \mu-LSL)}{3\sigma}$ | Process capability |
| Exposure Latitude | $EL = \frac{E_{max} - E_{min}}{E_{nom}} \times 100\%$ | Dose tolerance |
| Stochastic LER | $LER \propto \frac{1}{\sqrt{Dose}}$ | Shot noise floor |
| Yield (Poisson) | $Y = e^{-DA}$ | Defect-limited yield |
| Preston Equation | $RR = K_p P V$ | CMP removal rate |

15. Modern Computational Approaches

15.1 Monte Carlo Simulation

Algorithm: Monte Carlo Yield Estimation

1. Define parameter distributions: x_i ~ N(μ_i, σ_i²)
2. For trial = 1 to N_trials:
   a. Sample x from joint distribution
   b. Evaluate y(x) for all responses
   c. Check if y ∈ [y_min, y_max] for all responses
   d. Record pass/fail
3. Yield = N_pass / N_trials
4. Confidence interval: Y ± z_α √(Y(1-Y)/N_trials)

15.2 Machine Learning Classification

- Support Vector Machine (SVM): Decision boundary defines process window
- Neural Networks: Complex, non-convex window shapes
- Random Forest: Ensemble method for robustness
- Gaussian Process: Probabilistic boundaries with uncertainty

15.3 Digital Twin Approach

$$ \hat{y}_{t+1} = f(y_t, \mathbf{x}_t, \boldsymbol{\theta}) $$

Where:

- $\hat{y}_{t+1}$ = predicted next-step output
- $y_t$ = current measured output
- $\mathbf{x}_t$ = current process parameters
- $\boldsymbol{\theta}$ = model parameters (updated via Bayesian inference)

16. Advanced Node Challenges

16.1 Process Window Shrinkage

At advanced nodes (sub-7nm), multiple factors compound:

$$ PW_{\text{effective}} = PW_{\text{optical}} \cap PW_{\text{stochastic}} \cap PW_{\text{overlay}} \cap PW_{\text{etch}} $$

16.2 Multi-Patterning Complexity

For N-patterning (e.g., SAQP with N=4), assuming independent step errors:

$$ \sigma_{\text{total}}^2 = \sum_{i=1}^{N} \sigma_{\text{step}_i}^2 $$

Error budget per step (equal allocation, so that $\sigma_{\text{total}} = \sigma_{\text{target}}$):

$$ \sigma_{\text{step}} = \frac{\sigma_{\text{target}}}{\sqrt{N}} $$

16.3 Design-Technology Co-Optimization (DTCO)

$$ \text{Objective: } \max_{\text{design}, \text{process}} \left[ \text{Performance} \times Y(\text{design}, \text{process}) \right] $$

Subject to:

- Design rules: $DR_i(\text{layout}) \geq 0$
- Process windows: $\mathbf{x} \in PW$
- Reliability: $MTTF \geq \text{target}$
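The Monte Carlo yield estimation procedure of Section 15.1 can be sketched as follows. This is an illustration under stated assumptions, not a calibrated model: the quadratic CD response `cd_model`, its coefficients, the dose and focus spreads, and the CD limits are all hypothetical, and dose and focus are sampled as independent normal variables rather than from a general joint distribution.

```python
import math
import random

def cd_model(dose, focus):
    """Hypothetical Bossung-style quadratic CD response in nm.

    dose is relative to nominal (1.0 = nominal); focus is defocus in um.
    """
    return 50.0 - 40.0 * (dose - 1.0) + 200.0 * focus ** 2

def monte_carlo_yield(n_trials=20000, dose_sd=0.02, focus_sd=0.05,
                      cd_min=45.0, cd_max=55.0, z=1.96, seed=1):
    """Return (yield estimate, confidence-interval half width)."""
    rng = random.Random(seed)
    n_pass = 0
    for _ in range(n_trials):
        dose = rng.gauss(1.0, dose_sd)        # step 2a: sample parameters
        focus = rng.gauss(0.0, focus_sd)
        cd = cd_model(dose, focus)            # step 2b: evaluate response
        if cd_min <= cd <= cd_max:            # steps 2c-2d: check, record
            n_pass += 1
    y = n_pass / n_trials                     # step 3: yield
    half_width = z * math.sqrt(y * (1.0 - y) / n_trials)   # step 4
    return y, half_width

yield_est, half_width = monte_carlo_yield()
print(f"yield = {yield_est:.4f} +/- {half_width:.4f}")
```

The normal-approximation interval in step 4 becomes unreliable when the yield is very close to 0 or 1, so for high-yield processes more trials or a different interval estimate would be needed.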

product representative structures, metrology

Test structures that match actual device features.

profilometry,metrology

Measure surface height profile mechanically or optically.

ptychography, metrology

Phase retrieval technique for imaging.

pvd,physical vapor deposition,what is pvd,sputtering,magnetron sputtering,ipvd,ionized pvd,evaporation

# Mathematical Modeling of Metal Deposition in Semiconductor Manufacturing 1. Overview: Metal Deposition Processes Metal deposition is a critical step in semiconductor fabrication, creating interconnects, contacts, barrier layers, and various metallic structures. The primary deposition methods require distinct mathematical treatments: | Process | Physics Domain | Key Mathematics | |---------|----------------|-----------------| | **PVD (Sputtering)** | Ballistic transport, plasma physics | Boltzmann transport, Monte Carlo | | **CVD/PECVD** | Gas-phase transport, surface reactions | Navier-Stokes, reaction-diffusion | | **ALD** | Self-limiting surface chemistry | Site-balance kinetics | | **Electroplating (ECD)** | Electrochemistry, mass transport | Butler-Volmer, Nernst-Planck | 2. Transport Phenomena Models 2.1 Gas-Phase Transport (CVD/PECVD) The precursor concentration field follows the **convection-diffusion-reaction equation**: $$ \frac{\partial C}{\partial t} + \mathbf{v} \cdot \nabla C = D \nabla^2 C + R_{gas} $$ Where: - $C$ — precursor concentration (mol/m³) - $\mathbf{v}$ — velocity field vector (m/s) - $D$ — diffusion coefficient (m²/s) - $R_{gas}$ — gas-phase reaction source term (mol/m³·s) 2.2 Flow Field Equations The **incompressible Navier-Stokes equations** govern the velocity field: $$ \rho \left( \frac{\partial \mathbf{v}}{\partial t} + \mathbf{v} \cdot \nabla \mathbf{v} \right) = -\nabla p + \mu \nabla^2 \mathbf{v} $$ With continuity equation: $$ \nabla \cdot \mathbf{v} = 0 $$ Where: - $\rho$ — gas density (kg/m³) - $p$ — pressure (Pa) - $\mu$ — dynamic viscosity (Pa·s) ### 2.3 Knudsen Number and Transport Regimes At low pressures, the **Knudsen number** determines the transport regime: $$ Kn = \frac{\lambda}{L} = \frac{k_B T}{\sqrt{2} \pi d^2 p L} $$ Where: - $\lambda$ — mean free path (m) - $L$ — characteristic length (m) - $k_B$ — Boltzmann constant ($1.38 \times 10^{-23}$ J/K) - $T$ — temperature (K) - $d$ — molecular diameter (m) - $p$ — 
pressure (Pa) **Transport regime classification:** - $Kn < 0.01$ — **Continuum regime** → Navier-Stokes CFD - $0.01 < Kn < 0.1$ — **Slip flow regime** → Modified NS with slip boundary conditions - $0.1 < Kn < 10$ — **Transitional regime** → DSMC, Boltzmann equation - $Kn > 10$ — **Free molecular regime** → Ballistic/Monte Carlo methods 3. Surface Reaction Kinetics 3.1 Langmuir-Hinshelwood Mechanism For bimolecular surface reactions (common in CVD): $$ r = \frac{k \cdot K_A K_B \cdot p_A p_B}{(1 + K_A p_A + K_B p_B)^2} $$ Where: - $r$ — reaction rate (mol/m²·s) - $k$ — surface reaction rate constant (mol/m²·s) - $K_A, K_B$ — adsorption equilibrium constants (Pa⁻¹) - $p_A, p_B$ — partial pressures of reactants A and B (Pa) 3.2 Sticking Coefficient Model The probability that an impinging molecule adsorbs on the surface: $$ S = S_0 \exp\left( -\frac{E_a}{k_B T} \right) \cdot f(\theta) $$ Where: - $S$ — sticking coefficient (dimensionless) - $S_0$ — pre-exponential sticking factor - $E_a$ — activation energy (J) - $f(\theta) = (1 - \theta)^n$ — site blocking function - $\theta$ — surface coverage (dimensionless, 0 to 1) - $n$ — order of site blocking 3.3 Arrhenius Temperature Dependence $$ k(T) = A \exp\left( -\frac{E_a}{RT} \right) $$ Where: - $A$ — pre-exponential factor (frequency factor) - $E_a$ — activation energy (J/mol) - $R$ — universal gas constant (8.314 J/mol·K) - $T$ — absolute temperature (K) 4. 
Film Growth Models 4.1 Continuum Surface Evolution Edwards-Wilkinson Equation (Linear Growth) $$ \frac{\partial h}{\partial t} = \nu \nabla^2 h + F + \eta(\mathbf{x}, t) $$ Kardar-Parisi-Zhang (KPZ) Equation (Nonlinear Growth) $$ \frac{\partial h}{\partial t} = \nu \nabla^2 h + \frac{\lambda}{2} |\nabla h|^2 + F + \eta $$ Where: - $h(\mathbf{x}, t)$ — surface height at position $\mathbf{x}$ and time $t$ - $\nu$ — surface diffusion coefficient (m²/s) - $\lambda$ — nonlinear growth parameter - $F$ — mean deposition flux (m/s) - $\eta$ — stochastic noise term (Gaussian white noise) 4.2 Scaling Relations Surface roughness evolves according to: $$ W(L, t) = L^\alpha f\left( \frac{t}{L^z} \right) $$ Where: - $W$ — interface width (roughness) - $L$ — system size - $\alpha$ — roughness exponent - $z$ — dynamic exponent - $f$ — scaling function 5. Step Coverage and Conformality 5.1 Thiele Modulus For high-aspect-ratio features, the **Thiele modulus** determines conformality: $$ \phi = L \sqrt{\frac{k_s}{D_{eff}}} $$ Where: - $\phi$ — Thiele modulus (dimensionless) - $L$ — feature depth (m) - $k_s$ — surface reaction rate constant (m/s) - $D_{eff}$ — effective diffusivity (m²/s) **Step coverage regimes:** - $\phi \ll 1$ — **Reaction-limited** → Excellent conformality - $\phi \gg 1$ — **Transport-limited** → Poor step coverage (bread-loafing) 5.2 Knudsen Diffusion in Trenches $$ D_K = \frac{w}{3} \sqrt{\frac{8 R T}{\pi M}} $$ Where: - $D_K$ — Knudsen diffusion coefficient (m²/s) - $w$ — trench width (m) - $R$ — universal gas constant (J/mol·K) - $T$ — temperature (K) - $M$ — molecular weight (kg/mol) 5.3 Feature-Scale Concentration Profile Solving for concentration in a trench with reactive walls: $$ D_{eff} \frac{d^2 C}{dy^2} = \frac{2 k_s C}{w} $$ General solution: $$ C(y) = C_0 \frac{\cosh\left( \phi \frac{L - y}{L} \right)}{\cosh(\phi)} $$ 6. 
Atomic Layer Deposition (ALD) Models 6.1 Self-Limiting Surface Kinetics Surface site balance equation: $$ \frac{d\theta}{dt} = k_a C (1 - \theta) - k_d \theta $$ Where: - $\theta$ — fractional surface coverage - $k_a$ — adsorption rate constant (m³/mol·s) - $k_d$ — desorption rate constant (s⁻¹) - $C$ — gas-phase precursor concentration (mol/m³) At equilibrium saturation: $$ \theta_{eq} = \frac{k_a C}{k_a C + k_d} \approx 1 \quad \text{(for strong chemisorption)} $$ 6.2 Growth Per Cycle (GPC) $$ \text{GPC} = \Gamma_0 \cdot \Omega \cdot \eta $$ Where: - $\Gamma_0$ — surface site density (sites/m²) - $\Omega$ — volume per deposited atom (m³) - $\eta$ — reaction efficiency (dimensionless) 6.3 Saturation Dose-Time Relationship $$ \theta(t) = 1 - \exp\left( -\frac{S \cdot \Phi \cdot t}{\Gamma_0} \right) $$ **Impingement flux** from kinetic theory: $$ \Phi = \frac{p}{\sqrt{2 \pi m k_B T}} $$ Where: - $\Phi$ — molecular impingement flux (molecules/m²·s) - $p$ — precursor partial pressure (Pa) - $m$ — molecular mass (kg) 7. 
Plasma Modeling (PVD/PECVD) 7.1 Plasma Sheath Physics **Child-Langmuir law** for ion current density: $$ J_{ion} = \frac{4 \varepsilon_0}{9} \sqrt{\frac{2e}{M_i}} \frac{V_s^{3/2}}{d_s^2} $$ Where: - $J_{ion}$ — ion current density (A/m²) - $\varepsilon_0$ — vacuum permittivity ($8.85 \times 10^{-12}$ F/m) - $e$ — elementary charge ($1.6 \times 10^{-19}$ C) - $M_i$ — ion mass (kg) - $V_s$ — sheath voltage (V) - $d_s$ — sheath thickness (m) 7.2 Ion Energy at Substrate $$ \varepsilon_{ion} \approx e V_s + \frac{1}{2} M_i v_{Bohm}^2 $$ **Bohm velocity:** $$ v_{Bohm} = \sqrt{\frac{k_B T_e}{M_i}} $$ Where: - $T_e$ — electron temperature (K or eV) 7.3 Sputtering Yield (Sigmund Formula) $$ Y(E) = \frac{3 \alpha}{4 \pi^2} \cdot \frac{4 M_1 M_2}{(M_1 + M_2)^2} \cdot \frac{E}{U_0} $$ Where: - $Y$ — sputtering yield (atoms/ion) - $\alpha$ — dimensionless factor (~0.2–0.4) - $M_1$ — incident ion mass - $M_2$ — target atom mass - $E$ — incident ion energy (eV) - $U_0$ — surface binding energy (eV) 7.4 Electron Energy Distribution Function (EEDF) The Boltzmann equation in energy space: $$ \frac{\partial f}{\partial t} + \mathbf{v} \cdot \nabla f + \frac{e \mathbf{E}}{m_e} \cdot \nabla_v f = C[f] $$ Where: - $f$ — electron energy distribution function - $\mathbf{E}$ — electric field - $m_e$ — electron mass - $C[f]$ — collision integral 8. 
MDP: Markov Decision Process for Process Control 8.1 MDP Formulation A Markov Decision Process is defined by the tuple: $$ \mathcal{M} = (S, A, P, R, \gamma) $$ **Components in semiconductor context:** - **State space $S$**: Film thickness, resistivity, uniformity, equipment state, wafer position - **Action space $A$**: Temperature, pressure, flow rates, RF power, deposition time - **Transition probability $P(s' | s, a)$**: Stochastic process model - **Reward function $R(s, a)$**: Yield, uniformity, throughput, quality metrics - **Discount factor $\gamma$**: Time preference (typically 0.9–0.99) 8.2 Bellman Optimality Equation $$ V^*(s) = \max_{a \in A} \left[ R(s, a) + \gamma \sum_{s'} P(s' | s, a) V^*(s') \right] $$ **Q-function formulation:** $$ Q^*(s, a) = R(s, a) + \gamma \sum_{s'} P(s' | s, a) \max_{a'} Q^*(s', a') $$ 8.3 Run-to-Run (R2R) Control Optimal recipe adjustment after each wafer: $$ \mathbf{u}_{k+1} = \mathbf{u}_k + \mathbf{K} (\mathbf{y}_{target} - \mathbf{y}_k) $$ Where: - $\mathbf{u}_k$ — process recipe parameters at run $k$ - $\mathbf{y}_k$ — measured output at run $k$ - $\mathbf{K}$ — controller gain matrix (from MDP policy optimization) 8.4 Reinforcement Learning Approaches | Method | Application | Characteristics | |--------|-------------|-----------------| | **Q-Learning** | Discrete parameter optimization | Model-free, tabular | | **Deep Q-Network (DQN)** | High-dimensional state spaces | Neural network approximation | | **Policy Gradient** | Continuous process control | Direct policy optimization | | **Actor-Critic (A2C/PPO)** | Complex control tasks | Combined value and policy | | **Model-Based RL** | Physics-informed control | Sample efficient | 9. 
Electrochemical Deposition (Copper Damascene)

9.1 Butler-Volmer Equation

$$
i = i_0 \left[ \exp\left( \frac{\alpha_a F \eta}{RT} \right) - \exp\left( -\frac{\alpha_c F \eta}{RT} \right) \right]
$$

Where:

- $i$ — current density (A/m²)
- $i_0$ — exchange current density (A/m²)
- $\alpha_a, \alpha_c$ — anodic and cathodic transfer coefficients
- $F$ — Faraday constant (96,485 C/mol)
- $\eta = E - E_{eq}$ — overpotential (V)
- $R$ — gas constant (J/mol·K)
- $T$ — temperature (K)

9.2 Mass Transport Limited Current

$$
i_L = \frac{n F D C_b}{\delta}
$$

Where:

- $i_L$ — limiting current density (A/m²)
- $n$ — number of electrons transferred
- $D$ — diffusion coefficient of Cu²⁺ (m²/s)
- $C_b$ — bulk concentration (mol/m³)
- $\delta$ — diffusion layer thickness (m)

9.3 Nernst-Planck Equation

$$
\mathbf{J}_i = -D_i \nabla C_i - \frac{z_i F D_i}{RT} C_i \nabla \phi + C_i \mathbf{v}
$$

Where:

- $\mathbf{J}_i$ — flux of species $i$
- $z_i$ — charge number
- $\phi$ — electric potential

The three terms represent diffusion, migration in the electric field, and convection with the fluid velocity $\mathbf{v}$.

9.4 Superfilling (Bottom-Up Fill)

The curvature-enhanced accelerator mechanism:

$$
v_n = v_0 (1 + \kappa \cdot \Gamma_{acc})
$$

Where:

- $v_n$ — local growth velocity normal to surface
- $v_0$ — baseline growth velocity
- $\kappa$ — local surface curvature (1/m)
- $\Gamma_{acc}$ — accelerator surface concentration

10. Multiscale Modeling Framework

10.1 Hierarchical Scale Integration

```
┌──────────────────────────────────────────────────────────────┐
│                        REACTOR SCALE                         │
│  CFD: Flow, temperature, concentration                       │
│  Time: seconds | Length: cm                                  │
└─────────────────────────┬────────────────────────────────────┘
                          │ Boundary fluxes
                          ▼
┌──────────────────────────────────────────────────────────────┐
│                        FEATURE SCALE                         │
│  Level-set / String method for surface evolution            │
│  Time: seconds | Length: μm                                  │
└─────────────────────────┬────────────────────────────────────┘
                          │ Local rates
                          ▼
┌──────────────────────────────────────────────────────────────┐
│                       MESOSCALE (kMC)                        │
│  Kinetic Monte Carlo: nucleation, island growth              │
│  Time: ms | Length: nm                                       │
└─────────────────────────┬────────────────────────────────────┘
                          │ Rate parameters
                          ▼
┌──────────────────────────────────────────────────────────────┐
│                      ATOMISTIC (MD/DFT)                      │
│  Molecular dynamics, ab initio: binding energies,            │
│  diffusion barriers, reaction paths                          │
│  Time: ps | Length: Å                                        │
└──────────────────────────────────────────────────────────────┘
```

10.2 Kinetic Monte Carlo (kMC)

Event rate from transition state theory:

$$
k_i = \nu_0 \exp\left( -\frac{E_{a,i}}{k_B T} \right)
$$

Total rate and time step:

$$
k_{total} = \sum_i k_i, \quad \Delta t = -\frac{\ln(r)}{k_{total}}
$$

Where $r \in (0, 1]$ is a uniform random number.

10.3 Molecular Dynamics

Newton's equations of motion:

$$
m_i \frac{d^2 \mathbf{r}_i}{dt^2} = -\nabla_i U(\mathbf{r}_1, \mathbf{r}_2, \ldots, \mathbf{r}_N)
$$

**Lennard-Jones potential:**

$$
U_{LJ}(r) = 4\varepsilon \left[ \left( \frac{\sigma}{r} \right)^{12} - \left( \frac{\sigma}{r} \right)^6 \right]
$$

**Embedded Atom Method (EAM) for metals:**

$$
U = \sum_i F_i(\rho_i) + \frac{1}{2} \sum_{i \neq j} \phi_{ij}(r_{ij})
$$

Where $\rho_i = \sum_{j \neq i} f_j(r_{ij})$ is the electron density at atom $i$.

11. Uniformity Modeling

11.1 Wafer-Scale Thickness Distribution (Sputtering)

For a circular magnetron target:

$$
t(r) = \int_{target} \frac{Y \cdot J_{ion} \cdot \cos\theta_t \cdot \cos\theta_w}{\pi R^2} \, dA
$$

Where:

- $t(r)$ — thickness at radial position $r$
- $\theta_t$ — emission angle from target
- $\theta_w$ — incidence angle at wafer

Here $Y$ is the sputter yield, $J_{ion}$ is the ion flux at the target, and $R$ is the distance from the target element $dA$ to the wafer point.

11.2 Uniformity Metrics

**Within-Wafer Uniformity (WIW):**

$$
\sigma_{WIW} = \frac{1}{\bar{t}} \sqrt{\frac{1}{N} \sum_{i=1}^{N} (t_i - \bar{t})^2} \times 100\%
$$

**Wafer-to-Wafer Uniformity (WTW):**

$$
\sigma_{WTW} = \frac{1}{\bar{t}_{avg}} \sqrt{\frac{1}{M} \sum_{j=1}^{M} (\bar{t}_j - \bar{t}_{avg})^2} \times 100\%
$$

**Target specifications:**

- $\sigma_{WIW} < 1\%$ for advanced nodes (≤7 nm)
- $\sigma_{WTW} < 0.5\%$ for high-volume manufacturing

12. Virtual Metrology and Statistical Models

12.1 Gaussian Process Regression (GPR)

$$
f(\mathbf{x}) \sim \mathcal{GP}(m(\mathbf{x}), k(\mathbf{x}, \mathbf{x}'))
$$

**Squared exponential (RBF) kernel:**

$$
k(\mathbf{x}, \mathbf{x}') = \sigma_f^2 \exp\left( -\frac{\|\mathbf{x} - \mathbf{x}'\|^2}{2\ell^2} \right)
$$

**Predictive distribution:**

$$
f_* | \mathbf{X}, \mathbf{y}, \mathbf{x}_* \sim \mathcal{N}(\bar{f}_*, \text{var}(f_*))
$$

12.2 Partial Least Squares (PLS)

$$
\mathbf{Y} = \mathbf{X} \mathbf{B} + \mathbf{E}
$$

Where:

- $\mathbf{X}$ — process parameter matrix
- $\mathbf{Y}$ — quality outcome matrix
- $\mathbf{B}$ — regression coefficient matrix
- $\mathbf{E}$ — residual matrix

12.3 Principal Component Analysis (PCA)

$$
\mathbf{X} = \mathbf{T} \mathbf{P}^T + \mathbf{E}
$$

**Hotelling's $T^2$ statistic for fault detection:**

$$
T^2 = \sum_{i=1}^{k} \frac{t_i^2}{\lambda_i}
$$

Where $t_i$ is the score on the $i$-th retained principal component and $\lambda_i$ is the variance (eigenvalue) of that component.

13. Process Optimization

13.1 Response Surface Methodology (RSM)

**Second-order polynomial model:**

$$
y = \beta_0 + \sum_{i=1}^{k} \beta_i x_i + \sum_{i=1}^{k} \beta_{ii} x_i^2 + \sum_{i < j} \beta_{ij} x_i x_j + \varepsilon
$$

13.2 Constrained Optimization

$$
\min_{\mathbf{x}} f(\mathbf{x}) \quad \text{subject to} \quad g_i(\mathbf{x}) \leq 0, \quad h_j(\mathbf{x}) = 0
$$

**Example constraints:**

- $g_1$: Non-uniformity ≤ 3%
- $g_2$: Resistivity within spec
- $g_3$: Throughput ≥ target
- $h_1$: Total film thickness = target

13.3 Pareto Multi-Objective Optimization

$$
\min_{\mathbf{x}} \left[ f_1(\mathbf{x}), f_2(\mathbf{x}), \ldots, f_m(\mathbf{x}) \right]
$$

Common trade-offs:

- Uniformity vs. throughput
- Film quality vs. cost
- Conformality vs. deposition rate

14. Summary: Mathematical Toolkit Reference

| Domain | Key Equations | Application |
|--------|---------------|-------------|
| **Transport** | Navier-Stokes, Convection-Diffusion | Gas flow, precursor delivery |
| **Kinetics** | Arrhenius, Langmuir-Hinshelwood | Reaction rates |
| **Surface Evolution** | KPZ, Level-set, Edwards-Wilkinson | Film morphology |
| **Plasma** | Boltzmann, Child-Langmuir | Ion/electron dynamics |
| **Electrochemistry** | Butler-Volmer, Nernst-Planck | Copper plating |
| **Control** | Bellman, MDP, RL algorithms | Recipe optimization |
| **Statistics** | GPR, PLS, PCA | Virtual metrology |
| **Multiscale** | MD, kMC, Continuum | Integrated simulation |

15. Key Physical Constants

| Constant | Symbol | Value | Units |
|----------|--------|-------|-------|
| Boltzmann constant | $k_B$ | $1.38 \times 10^{-23}$ | J/K |
| Gas constant | $R$ | $8.314$ | J/(mol·K) |
| Faraday constant | $F$ | $96{,}485$ | C/mol |
| Elementary charge | $e$ | $1.60 \times 10^{-19}$ | C |
| Vacuum permittivity | $\varepsilon_0$ | $8.85 \times 10^{-12}$ | F/m |
| Avogadro's number | $N_A$ | $6.02 \times 10^{23}$ | mol⁻¹ |
| Electron mass | $m_e$ | $9.11 \times 10^{-31}$ | kg |
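Several of the formulas above can be checked numerically with only the constants from the table in Section 15. The following Python sketch evaluates the Butler-Volmer current density (9.1), the mass-transport-limited current (9.2), a single kMC step (10.2), and the within-wafer non-uniformity metric (11.2). The function names, the default transfer coefficients of 0.5, the attempt frequency of 10¹³ s⁻¹, and every numeric input are illustrative assumptions rather than values from the text. The event-selection rule in `kmc_step` is the standard rejection-free choice (event $i$ with probability $k_i / k_{total}$), which the text does not spell out.

```python
import math
import random

# Constants taken from the table in Section 15.
F = 96485.0        # Faraday constant, C/mol
R = 8.314          # gas constant, J/(mol·K)
K_B = 1.38e-23     # Boltzmann constant, J/K
E_CHARGE = 1.60e-19  # elementary charge, C (used to convert eV to J)


def butler_volmer(eta, i0, alpha_a=0.5, alpha_c=0.5, temp=298.15):
    """Current density i (A/m^2) at overpotential eta (V), Section 9.1.

    alpha_a = alpha_c = 0.5 and temp = 298.15 K are assumed defaults.
    """
    f = F / (R * temp)
    return i0 * (math.exp(alpha_a * f * eta) - math.exp(-alpha_c * f * eta))


def limiting_current(n, diff, c_bulk, delta):
    """Limiting current density i_L = n F D C_b / delta (A/m^2), Section 9.2."""
    return n * F * diff * c_bulk / delta


def kmc_step(barriers_ev, temp, nu0=1e13, rng=random):
    """One kinetic Monte Carlo step, Section 10.2.

    barriers_ev: activation energies E_a,i in eV (hypothetical inputs).
    nu0: attempt frequency in 1/s (assumed value).
    Returns (index of the chosen event, time increment in seconds).
    """
    kt = K_B * temp / E_CHARGE  # k_B T expressed in eV
    rates = [nu0 * math.exp(-ea / kt) for ea in barriers_ev]
    k_total = sum(rates)

    # Select event i with probability k_i / k_total (assumed selection rule).
    target = rng.random() * k_total
    cumulative = 0.0
    chosen = len(rates) - 1  # fallback guards against floating-point round-off
    for idx, k in enumerate(rates):
        cumulative += k
        if target < cumulative:
            chosen = idx
            break

    r = 1.0 - rng.random()  # uniform on (0, 1], so log(r) is always finite
    dt = -math.log(r) / k_total
    return chosen, dt


def wiw_nonuniformity(thicknesses):
    """Within-wafer non-uniformity in percent, Section 11.2 (1/N normalization)."""
    n = len(thicknesses)
    mean = sum(thicknesses) / n
    variance = sum((t - mean) ** 2 for t in thicknesses) / n
    return 100.0 * math.sqrt(variance) / mean


if __name__ == "__main__":
    # Illustrative Cu plating numbers: 2 electrons, D = 5e-10 m^2/s,
    # C_b = 500 mol/m^3, delta = 50 um. All values are assumptions.
    print("i(50 mV) [A/m^2]:", butler_volmer(0.05, i0=10.0))
    print("i_L      [A/m^2]:", limiting_current(2, 5e-10, 500.0, 5e-5))
    print("kMC step:", kmc_step([0.5, 0.6, 0.7], temp=600.0))
    print("WIW [%]:", wiw_nonuniformity([99.0, 100.5, 101.0, 99.5]))
```

Comparing `butler_volmer` against `limiting_current` for the same bath gives a quick sense of whether a plating condition is kinetically or transport controlled: once the Butler-Volmer value approaches $i_L$, the kinetic expression alone overestimates the achievable current.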

pvd,thin film,physical vapor deposition

Deposit conductor films (Al, Cu, Ti, TiN, Ta, TaN, W, etc.).