AI Factory Glossary

751 technical terms and definitions

metapath2vec, graph neural networks

Metapath2vec learns embeddings in heterogeneous graphs through metapath-guided random walks.

metaphor detection, nlp

Identify metaphorical language.

metaqnn, neural architecture search

MetaQNN (Meta Q-Network) applies Q-learning to neural architecture search, representing each architecture as a sequence of states built up through a discrete action space of layer choices.

metastability, design & verification

Metastability occurs when a signal violates a flip-flop's setup or hold timing, leaving the output in an indeterminate state.

meteor, meteor, evaluation

Enhanced BLEU with synonyms.

meteor, meteor, evaluation

METEOR aligns hypothesis and reference using stems and synonyms.

meteor, evaluation

Translation metric using synonyms and stemming.

meter and rhythm, content creation

Control poetic rhythm.

method name prediction, code ai

Suggest method names from implementation.

metric logging, mlops

Track training and validation metrics.

metrics collection, mlops

Gather numerical measurements of system.

metrology lab, metrology

Controlled environment for precise measurements.

metrology science, metrology physics, ellipsometry, scatterometry, OCD metrology, CD-

# Semiconductor Manufacturing Process Metrology: Science, Mathematics, and Modeling

A comprehensive exploration of the physics, mathematics, and computational methods underlying nanoscale measurement in semiconductor fabrication.

## 1. The Fundamental Challenge

Modern semiconductor manufacturing produces structures with critical dimensions of just a few nanometers. At leading-edge nodes (3nm, 2nm), we are measuring features only **10–20 atoms wide**.

### Key Requirements

- **Sub-angstrom precision** in measurement
- **Complex 3D architectures**: FinFETs, Gate-All-Around (GAA) transistors, 3D NAND (200+ layers)
- **High throughput**: seconds per measurement in production
- **Multi-parameter extraction**: distinguish dozens of correlated parameters

### Metrology Techniques Overview

| Technique | Principle | Resolution | Throughput |
|-----------|-----------|------------|------------|
| Spectroscopic Ellipsometry (SE) | Polarization change | ~0.1 Å | High |
| Optical CD (OCD/Scatterometry) | Diffraction analysis | ~0.1 nm | High |
| CD-SEM | Electron imaging | ~1 nm | Medium |
| CD-SAXS | X-ray scattering | ~0.1 nm | Low |
| AFM | Probe scanning | ~0.1 nm | Low |
| TEM | Electron transmission | Atomic | Very Low |

## 2. Physics Foundation

### 2.1 Maxwell's Equations

At the heart of optical metrology lies the solution to Maxwell's equations:

$$ \nabla \times \mathbf{E} = -\frac{\partial \mathbf{B}}{\partial t} $$

$$ \nabla \times \mathbf{H} = \mathbf{J} + \frac{\partial \mathbf{D}}{\partial t} $$

$$ \nabla \cdot \mathbf{D} = \rho $$

$$ \nabla \cdot \mathbf{B} = 0 $$

Where:

- $\mathbf{E}$ = Electric field vector
- $\mathbf{H}$ = Magnetic field vector
- $\mathbf{D}$ = Electric displacement field
- $\mathbf{B}$ = Magnetic flux density
- $\mathbf{J}$ = Current density
- $\rho$ = Charge density

### 2.2 Constitutive Relations

For linear, isotropic media:

$$ \mathbf{D} = \varepsilon_0 \varepsilon_r \mathbf{E} = \varepsilon_0 (1 + \chi_e) \mathbf{E} $$

$$ \mathbf{B} = \mu_0 \mu_r \mathbf{H} $$

The complex dielectric function:

$$ \tilde{\varepsilon}(\omega) = \varepsilon_1(\omega) + i\varepsilon_2(\omega) = \tilde{n}^2 = (n + ik)^2 $$

Where:

- $n$ = Refractive index
- $k$ = Extinction coefficient

### 2.3 Fresnel Equations

At an interface between media with refractive indices $\tilde{n}_1$ and $\tilde{n}_2$:

**s-polarization (TE):**

$$ r_s = \frac{n_1 \cos\theta_i - n_2 \cos\theta_t}{n_1 \cos\theta_i + n_2 \cos\theta_t} $$

$$ t_s = \frac{2 n_1 \cos\theta_i}{n_1 \cos\theta_i + n_2 \cos\theta_t} $$

**p-polarization (TM):**

$$ r_p = \frac{n_2 \cos\theta_i - n_1 \cos\theta_t}{n_2 \cos\theta_i + n_1 \cos\theta_t} $$

$$ t_p = \frac{2 n_1 \cos\theta_i}{n_2 \cos\theta_i + n_1 \cos\theta_t} $$

With Snell's law:

$$ n_1 \sin\theta_i = n_2 \sin\theta_t $$

## 3. Mathematics of Inverse Problems

### 3.1 Problem Formulation

Metrology is fundamentally an **inverse problem**:

| Problem Type | Description | Well-Posed? |
|--------------|-------------|-------------|
| **Forward** | Structure parameters → Measured signal | Yes |
| **Inverse** | Measured signal → Structure parameters | Often No |

We seek parameters $\mathbf{p}$ that minimize the difference between model $M(\mathbf{p})$ and data $\mathbf{D}$:

$$ \min_{\mathbf{p}} \left\| M(\mathbf{p}) - \mathbf{D} \right\|^2 $$

Or with weighted least squares:

$$ \chi^2 = \sum_{k=1}^{N} \frac{\left( M_k(\mathbf{p}) - D_k \right)^2}{\sigma_k^2} $$

### 3.2 Levenberg-Marquardt Algorithm

The workhorse optimization algorithm interpolates between gradient descent and Gauss-Newton:

$$ \left( \mathbf{J}^T \mathbf{J} + \lambda \mathbf{I} \right) \delta\mathbf{p} = \mathbf{J}^T \left( \mathbf{D} - M(\mathbf{p}) \right) $$

Where:

- $\mathbf{J}$ = Jacobian matrix (sensitivity matrix)
- $\lambda$ = Damping parameter
- $\delta\mathbf{p}$ = Parameter update step

The Jacobian elements:

$$ J_{ij} = \frac{\partial M_i}{\partial p_j} $$

**Algorithm behavior:**

- Large $\lambda$ → Gradient descent (robust, slow)
- Small $\lambda$ → Gauss-Newton (fast near minimum)

### 3.3 Regularization Techniques

For ill-posed problems, regularization is essential:

**Tikhonov Regularization (L2):**

$$ \min_{\mathbf{p}} \left\| M(\mathbf{p}) - \mathbf{D} \right\|^2 + \alpha \left\| \mathbf{p} - \mathbf{p}_0 \right\|^2 $$

**LASSO Regularization (L1):**

$$ \min_{\mathbf{p}} \left\| M(\mathbf{p}) - \mathbf{D} \right\|^2 + \alpha \left\| \mathbf{p} \right\|_1 $$

**Bayesian Inference:**

$$ P(\mathbf{p} | \mathbf{D}) = \frac{P(\mathbf{D} | \mathbf{p}) \cdot P(\mathbf{p})}{P(\mathbf{D})} $$

Where:

- $P(\mathbf{p} | \mathbf{D})$ = Posterior probability
- $P(\mathbf{D} | \mathbf{p})$ = Likelihood
- $P(\mathbf{p})$ = Prior probability

## 4. Thin Film Optics

### 4.1 Ellipsometry Fundamentals

Ellipsometry measures the change in polarization state upon reflection:

$$ \rho = \tan(\Psi) \cdot e^{i\Delta} = \frac{r_p}{r_s} $$

Where:

- $\Psi$ = Amplitude ratio angle
- $\Delta$ = Phase difference
- $r_p, r_s$ = Complex reflection coefficients

### 4.2 Transfer Matrix Method

For multilayer stacks, the characteristic matrix for layer $j$:

$$ \mathbf{M}_j = \begin{pmatrix} \cos\delta_j & \frac{i \sin\delta_j}{\eta_j} \\ i\eta_j \sin\delta_j & \cos\delta_j \end{pmatrix} $$

Where the phase thickness:

$$ \delta_j = \frac{2\pi}{\lambda} \tilde{n}_j d_j \cos\theta_j $$

And the optical admittance:

$$ \eta_j = \begin{cases} \tilde{n}_j \cos\theta_j & \text{(s-pol)} \\ \frac{\tilde{n}_j}{\cos\theta_j} & \text{(p-pol)} \end{cases} $$

**Total system matrix:**

$$ \mathbf{M}_{total} = \mathbf{M}_1 \cdot \mathbf{M}_2 \cdot \ldots \cdot \mathbf{M}_N = \begin{pmatrix} m_{11} & m_{12} \\ m_{21} & m_{22} \end{pmatrix} $$

**Reflection coefficient:**

$$ r = \frac{\eta_0 m_{11} + \eta_0 \eta_s m_{12} - m_{21} - \eta_s m_{22}}{\eta_0 m_{11} + \eta_0 \eta_s m_{12} + m_{21} + \eta_s m_{22}} $$

### 4.3 Dispersion Models

**Lorentz Oscillator Model:**

$$ \varepsilon(\omega) = \varepsilon_\infty + \sum_j \frac{A_j}{\omega_j^2 - \omega^2 - i\gamma_j \omega} $$

**Tauc-Lorentz Model (for amorphous semiconductors):**

$$ \varepsilon_2(E) = \begin{cases} \frac{A E_0 C (E - E_g)^2}{(E^2 - E_0^2)^2 + C^2 E^2} \cdot \frac{1}{E} & E > E_g \\ 0 & E \leq E_g \end{cases} $$

With $\varepsilon_1$ obtained via Kramers-Kronig relations:

$$ \varepsilon_1(E) = \varepsilon_{1,\infty} + \frac{2}{\pi} \mathcal{P} \int_{E_g}^{\infty} \frac{\xi \varepsilon_2(\xi)}{\xi^2 - E^2} d\xi $$

## 5. Scatterometry and RCWA

### 5.1 Rigorous Coupled-Wave Analysis

For a grating with period $\Lambda$, electromagnetic fields are expanded in Fourier orders:

$$ E(x,z) = \sum_{m=-M}^{M} E_m(z) \exp(i k_{xm} x) $$

Where the diffracted wave vectors:

$$ k_{xm} = k_{x0} + \frac{2\pi m}{\Lambda} = k_0 \left( n_1 \sin\theta_i + \frac{m\lambda}{\Lambda} \right) $$

### 5.2 Eigenvalue Problem

In each layer, the field satisfies:

$$ \frac{d^2 \mathbf{E}}{dz^2} = \mathbf{\Omega}^2 \mathbf{E} $$

Where $\mathbf{\Omega}^2$ is a matrix determined by the Fourier components of the permittivity:

$$ \varepsilon(x) = \sum_n \varepsilon_n \exp\left( i \frac{2\pi n}{\Lambda} x \right) $$

The eigenvalue decomposition:

$$ \mathbf{\Omega}^2 = \mathbf{W} \mathbf{\Lambda} \mathbf{W}^{-1} $$

provides propagation constants (eigenvalues $\lambda_m$) and field profiles (eigenvectors in $\mathbf{W}$).

### 5.3 S-Matrix Formulation

For numerical stability, use the scattering matrix formulation:

$$ \begin{pmatrix} \mathbf{a}_1^- \\ \mathbf{a}_N^+ \end{pmatrix} = \mathbf{S} \begin{pmatrix} \mathbf{a}_1^+ \\ \mathbf{a}_N^- \end{pmatrix} $$

Where $\mathbf{a}^+$ and $\mathbf{a}^-$ represent forward and backward propagating waves.

The S-matrix is built recursively using the Redheffer star product $\star$:

$$ \mathbf{S}_{1 \to j+1} = \mathbf{S}_{1 \to j} \star \mathbf{S}_{j,j+1} $$

## 6. Statistical Process Control

### 6.1 Control Charts

**$\bar{X}$ Chart (Mean):**

$$ UCL = \bar{\bar{X}} + A_2 \bar{R} $$

$$ LCL = \bar{\bar{X}} - A_2 \bar{R} $$

**R Chart (Range):**

$$ UCL_R = D_4 \bar{R} $$

$$ LCL_R = D_3 \bar{R} $$

**EWMA (Exponentially Weighted Moving Average):**

$$ Z_t = \lambda X_t + (1 - \lambda) Z_{t-1} $$

With control limits:

$$ UCL = \mu_0 + L \sigma \sqrt{\frac{\lambda}{2 - \lambda} \left[ 1 - (1-\lambda)^{2t} \right]} $$

### 6.2 Process Capability Indices

**$C_p$ (Process Capability):**

$$ C_p = \frac{USL - LSL}{6\sigma} $$

**$C_{pk}$ (Centered Process Capability):**

$$ C_{pk} = \min \left( \frac{USL - \mu}{3\sigma}, \frac{\mu - LSL}{3\sigma} \right) $$

**$C_{pm}$ (Taguchi Capability):**

$$ C_{pm} = \frac{USL - LSL}{6\sqrt{\sigma^2 + (\mu - T)^2}} $$

Where:

- $USL$ = Upper Specification Limit
- $LSL$ = Lower Specification Limit
- $T$ = Target value
- $\mu$ = Process mean
- $\sigma$ = Process standard deviation

### 6.3 Gauge R&R Analysis

Total measurement variance decomposition:

$$ \sigma^2_{total} = \sigma^2_{part} + \sigma^2_{gauge} $$

$$ \sigma^2_{gauge} = \sigma^2_{repeatability} + \sigma^2_{reproducibility} $$

**Precision-to-Tolerance Ratio:**

$$ P/T = \frac{6 \sigma_{gauge}}{USL - LSL} \times 100\% $$

| P/T Ratio | Assessment |
|-----------|------------|
| < 10% | Excellent |
| 10-30% | Acceptable |
| > 30% | Unacceptable |

## 7. Uncertainty Quantification

### 7.1 Fisher Information Matrix

The Fisher Information Matrix for parameter estimation:

$$ F_{ij} = \sum_{k=1}^{N} \frac{1}{\sigma_k^2} \frac{\partial M_k}{\partial p_i} \frac{\partial M_k}{\partial p_j} $$

Or equivalently:

$$ F_{ij} = -E \left[ \frac{\partial^2 \ln L}{\partial p_i \partial p_j} \right] $$

Where $L$ is the likelihood function.
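
The sum in Section 7.1 is a noise-weighted product of the Jacobian with itself, $\mathbf{F} = \mathbf{J}^T \mathrm{diag}(1/\sigma_k^2)\, \mathbf{J}$. The following is a minimal sketch of that computation, assuming NumPy. The forward model `toy_model`, the finite-difference helper `jacobian_fd`, and the step size are hypothetical choices made for illustration, not part of the original text; a production tool would differentiate its own RCWA or thin-film model instead.

```python
import numpy as np

def jacobian_fd(model, p, rel_step=1e-6):
    """Central-difference Jacobian J[k, j] = dM_k / dp_j (illustrative helper)."""
    p = np.asarray(p, dtype=float)
    m0 = np.asarray(model(p), dtype=float)
    J = np.empty((m0.size, p.size))
    for j in range(p.size):
        h = rel_step * max(abs(p[j]), 1.0)   # step scaled to the parameter size
        step = np.zeros_like(p)
        step[j] = h
        J[:, j] = (model(p + step) - model(p - step)) / (2.0 * h)
    return J

def fisher_information(J, sigma):
    """F_ij = sum_k (1 / sigma_k^2) * J[k, i] * J[k, j]."""
    sigma = np.asarray(sigma, dtype=float)
    Jw = J / sigma[:, None]                  # divide each row k by sigma_k
    return Jw.T @ Jw

# Hypothetical toy forward model: a damped oscillation sampled at 50 points.
x = np.linspace(0.0, 1.0, 50)

def toy_model(p):
    amplitude, decay = p
    return amplitude * np.exp(-decay * x) * np.cos(10.0 * x)

p_nominal = np.array([1.0, 2.0])
sigma = np.full(x.size, 0.01)                # assumed per-point noise level
J = jacobian_fd(toy_model, p_nominal)
F = fisher_information(J, sigma)
print(F)
```

Scaling the rows of $\mathbf{J}$ by $1/\sigma_k$ before forming the product avoids building the full diagonal weight matrix.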
### 7.2 Cramér-Rao Lower Bound

The covariance matrix of any unbiased estimator is bounded:

$$ \text{Cov}(\hat{\mathbf{p}}) \geq \mathbf{F}^{-1} $$

For a single parameter:

$$ \text{Var}(\hat{\theta}) \geq \frac{1}{I(\theta)} $$

**Interpretation:**

- Diagonal elements of $\mathbf{F}^{-1}$ give minimum variance for each parameter
- Off-diagonal elements indicate parameter correlations
- Large condition number of $\mathbf{F}$ indicates ill-conditioning

### 7.3 Correlation Coefficient

$$ \rho_{ij} = \frac{F^{-1}_{ij}}{\sqrt{F^{-1}_{ii} F^{-1}_{jj}}} $$

| $\lvert\rho\rvert$ | Interpretation |
|--------|----------------|
| < 0.3 | Weak correlation |
| 0.3 – 0.7 | Moderate correlation |
| > 0.7 | Strong correlation |
| > 0.95 | Severe: consider fixing one parameter |

### 7.4 GUM Framework

According to the Guide to the Expression of Uncertainty in Measurement:

**Combined standard uncertainty:**

$$ u_c^2(y) = \sum_{i=1}^{N} \left( \frac{\partial f}{\partial x_i} \right)^2 u^2(x_i) + 2 \sum_{i=1}^{N-1} \sum_{j=i+1}^{N} \frac{\partial f}{\partial x_i} \frac{\partial f}{\partial x_j} u(x_i, x_j) $$

**Expanded uncertainty:**

$$ U = k \cdot u_c(y) $$

Where $k$ is the coverage factor (typically $k=2$ for 95% confidence).

## 8. Machine Learning in Metrology

### 8.1 Neural Network Surrogate Models

Replace expensive physics simulations with trained neural networks:

$$ M_{NN}(\mathbf{p}; \mathbf{W}) \approx M_{physics}(\mathbf{p}) $$

**Training objective:**

$$ \mathcal{L} = \frac{1}{N} \sum_{i=1}^{N} \left\| M_{NN}(\mathbf{p}_i) - M_{physics}(\mathbf{p}_i) \right\|^2 + \lambda \left\| \mathbf{W} \right\|^2 $$

**Speedup:** Typically $10^4$ – $10^6 \times$ faster than RCWA/FEM.
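
The training objective in Section 8.1 can be illustrated with a very small network. The sketch below assumes NumPy only, a one-hidden-layer `tanh` network, and plain full-batch gradient descent. The function `physics_model` is a hypothetical stand-in for an expensive solver such as RCWA, and the layer width, learning rate, and step count are arbitrary illustrative values rather than recommendations.

```python
import numpy as np

def physics_model(p):
    """Hypothetical stand-in for an expensive solver: (n, 1) -> (n, 2)."""
    return np.hstack([np.sin(3.0 * p), p**2])

def init_params(rng, n_in, n_hidden, n_out):
    return {
        "W1": rng.normal(0.0, 1.0 / np.sqrt(n_in), (n_in, n_hidden)),
        "b1": np.zeros(n_hidden),
        "W2": rng.normal(0.0, 1.0 / np.sqrt(n_hidden), (n_hidden, n_out)),
        "b2": np.zeros(n_out),
    }

def forward(params, p):
    h = np.tanh(p @ params["W1"] + params["b1"])     # hidden activations
    return h @ params["W2"] + params["b2"], h

def loss_and_grads(params, p, y, lam):
    """Mean squared error plus lam * ||W||^2 (weights only), with gradients."""
    n = p.shape[0]
    pred, h = forward(params, p)
    err = pred - y
    loss = np.sum(err**2) / n + lam * (
        np.sum(params["W1"]**2) + np.sum(params["W2"]**2))
    d_pred = 2.0 * err / n                           # dL/dpred
    d_h = (d_pred @ params["W2"].T) * (1.0 - h**2)   # backprop through tanh
    grads = {
        "W2": h.T @ d_pred + 2.0 * lam * params["W2"],
        "b2": d_pred.sum(axis=0),
        "W1": p.T @ d_h + 2.0 * lam * params["W1"],
        "b1": d_h.sum(axis=0),
    }
    return loss, grads

def train(p, y, n_hidden=32, lam=1e-6, lr=0.01, steps=3000, seed=0):
    rng = np.random.default_rng(seed)
    params = init_params(rng, p.shape[1], n_hidden, y.shape[1])
    history = []
    for _ in range(steps):
        loss, grads = loss_and_grads(params, p, y, lam)
        history.append(loss)
        for name in params:
            params[name] -= lr * grads[name]         # gradient descent step
    return params, history

# Training set: parameters sampled on [-1, 1], labels from the "physics" model.
p_train = np.linspace(-1.0, 1.0, 200).reshape(-1, 1)
y_train = physics_model(p_train)
params, history = train(p_train, y_train)
print("initial loss:", history[0], "final loss:", history[-1])
```

In practice the training set would come from a design of experiments over the real parameter space, and an automatic-differentiation framework would replace the hand-written gradients.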
### 8.2 Physics-Informed Neural Networks (PINNs)

Incorporate physical laws into the loss function:

$$ \mathcal{L}_{total} = \mathcal{L}_{data} + \lambda_{physics} \mathcal{L}_{physics} $$

Where:

$$ \mathcal{L}_{physics} = \left\| \nabla \times \mathbf{E} + \frac{\partial \mathbf{B}}{\partial t} \right\|^2 + \ldots $$

### 8.3 Gaussian Process Regression

A non-parametric Bayesian approach:

$$ f(\mathbf{x}) \sim \mathcal{GP}\left( m(\mathbf{x}), k(\mathbf{x}, \mathbf{x}') \right) $$

**Common kernel (RBF/Squared Exponential):**

$$ k(\mathbf{x}, \mathbf{x}') = \sigma_f^2 \exp\left( -\frac{\left\| \mathbf{x} - \mathbf{x}' \right\|^2}{2\ell^2} \right) $$

**Posterior prediction:**

$$ \mu_* = \mathbf{k}_*^T (\mathbf{K} + \sigma_n^2 \mathbf{I})^{-1} \mathbf{y} $$

$$ \sigma_*^2 = k_{**} - \mathbf{k}_*^T (\mathbf{K} + \sigma_n^2 \mathbf{I})^{-1} \mathbf{k}_* $$

**Advantages:**

- Provides uncertainty estimates naturally
- Works well with limited training data
- Interpretable hyperparameters

### 8.4 Virtual Metrology

Predict wafer properties from equipment sensor data:

$$ \hat{y} = f(FDC_1, FDC_2, \ldots, FDC_n) $$

Where $FDC_i$ are Fault Detection and Classification sensor readings.

**Common approaches:**

- Partial Least Squares (PLS) regression
- Random Forests
- Gradient Boosting (XGBoost, LightGBM)
- Deep neural networks

## 9. Advanced Topics and Frontiers

### 9.1 3D Metrology Challenges

Modern structures require 3D measurement:

| Structure | Complexity | Key Challenge |
|-----------|------------|---------------|
| FinFET | Moderate | Fin height, sidewall angle |
| GAA/Nanosheet | High | Sheet thickness, spacing |
| 3D NAND | Very High | 200+ layers, bowing, tilt |
| DRAM HAR | Extreme | 100:1 aspect ratio structures |

### 9.2 Hybrid Metrology

Combining multiple techniques to break parameter correlations:

$$ \chi^2_{total} = \sum_{techniques} w_t \chi^2_t $$

**Example combination:**

- OCD for periodic structure parameters
- Ellipsometry for film optical constants
- XRR for density and interface roughness

**Mathematical framework:**

$$ \mathbf{F}_{hybrid} = \sum_t \mathbf{F}_t $$

Reduces off-diagonal elements, improving condition number.

### 9.3 Atomic-Scale Considerations

At the 2nm node and beyond:

**Line Edge Roughness (LER):**

$$ \sigma_{LER} = \sqrt{\frac{1}{L} \int_0^L \left[ x(z) - \bar{x} \right]^2 dz} $$

**Power Spectral Density:**

$$ PSD(f) = \frac{\sigma^2 \xi}{1 + (2\pi f \xi)^{2(1+H)}} $$

Where:

- $\xi$ = Correlation length
- $H$ = Hurst exponent (roughness character)

**Quantum Effects:**

- Tunneling through thin barriers
- Discrete dopant effects
- Wave function penetration

### 9.4 Model-Measurement Circularity

A fundamental epistemological challenge:

```
┌──────────────┐      ┌──────────────┐
│   Physical   │ ───► │   Measured   │
│   Structure  │      │    Signal    │
└──────────────┘      └──────────────┘
       ▲                     │
       │                     ▼
       │              ┌──────────────┐
       │              │    Model     │
       └────────────◄─┤  Inversion   │
                      └──────────────┘
```

**Key questions:**

- How do we validate models when "truth" requires modeling?
- Reference metrology (TEM) also requires interpretation
- What does it mean to "know" a dimension at atomic scale?
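
To make the hybrid-metrology relation $\mathbf{F}_{hybrid} = \sum_t \mathbf{F}_t$ from Section 9.2 concrete, here is a small numerical sketch, assuming NumPy. The two Jacobians are invented for illustration: hypothetical technique A responds almost identically to both parameters, while hypothetical technique B responds to them with opposite signs. Neither matrix comes from a real tool.

```python
import numpy as np

def fisher(J, sigma):
    """Fisher matrix F = J^T diag(1/sigma^2) J for uncorrelated noise."""
    Jw = J / np.asarray(sigma, dtype=float)[:, None]
    return Jw.T @ Jw

def correlation(F):
    """Parameter correlation matrix derived from the inverse Fisher matrix."""
    C = np.linalg.inv(F)
    d = np.sqrt(np.diag(C))
    return C / np.outer(d, d)

# Hypothetical sensitivities (rows: measurement channels, columns: parameters).
J_a = np.array([[1.0, 0.98],
                [0.9, 0.90],
                [1.1, 1.05]])       # technique A: columns nearly parallel
J_b = np.array([[1.0, -0.95],
                [0.8, -0.85]])      # technique B: columns nearly anti-parallel

F_a = fisher(J_a, np.ones(3))
F_b = fisher(J_b, np.ones(2))
F_hybrid = F_a + F_b                # information from independent data adds

for name, F in [("A", F_a), ("B", F_b), ("hybrid", F_hybrid)]:
    print(name,
          "condition number:", np.linalg.cond(F),
          "correlation:", correlation(F)[0, 1])
```

Each technique alone leaves the two parameters almost perfectly correlated, but the correlations have opposite signs, so the off-diagonal terms largely cancel in the sum. Adding Fisher matrices in this way assumes the noise of the two techniques is statistically independent.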
## Key Symbols and Notation

| Symbol | Description | Units |
|--------|-------------|-------|
| $\lambda$ | Wavelength | nm |
| $\theta$ | Angle of incidence | degrees |
| $n$ | Refractive index | dimensionless |
| $k$ | Extinction coefficient | dimensionless |
| $d$ | Film thickness | nm |
| $\Lambda$ | Grating period | nm |
| $\Psi, \Delta$ | Ellipsometric angles | degrees |
| $\sigma$ | Standard deviation | varies |
| $\mathbf{J}$ | Jacobian matrix | varies |
| $\mathbf{F}$ | Fisher Information Matrix | varies |

## Computational Complexity

| Method | Complexity | Typical Time |
|--------|------------|--------------|
| Transfer Matrix | $O(N)$ | $\mu$s |
| RCWA | $O(M^3 \cdot L)$ | ms – s |
| FEM | $O(N^{1.5})$ | s – min |
| FDTD | $O(N \cdot T)$ | s – min |
| Monte Carlo (SEM) | $O(N_{electrons})$ | min – hr |
| Neural Network (inference) | $O(1)$ | $\mu$s |

Where:

- $N$ = Number of layers / mesh elements
- $M$ = Number of Fourier orders
- $L$ = Number of layers
- $T$ = Number of time steps
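
As an illustration of the $O(N)$ transfer-matrix entry in the table above, the sketch below implements the characteristic matrices and reflection coefficient of Section 4.2 and derives the ellipsometric angles of Section 4.1, assuming NumPy. The function names, the stack layout (`n_list` running from ambient to substrate), and the example film values are assumptions made for this sketch. Note one convention detail: with $\eta = \tilde{n}/\cos\theta$ for p-polarization, the bare-interface $r_p$ comes out with the opposite sign from the Fresnel expression in Section 2.3, which shifts $\Delta$ by 180°.

```python
import numpy as np

def reflection(n_list, d_list, wavelength, theta0, pol):
    """Complex reflection coefficient of a layer stack (transfer matrix method).

    n_list: complex indices [ambient, layer 1, ..., layer N, substrate]
    d_list: thicknesses of layers 1..N, same length unit as wavelength
    theta0: angle of incidence in the ambient, in radians
    pol:    "s" or "p"
    """
    n = np.asarray(n_list, dtype=complex)
    # Snell's law gives cos(theta_j) in every medium (principal square root).
    cos_t = np.sqrt(1.0 - (n[0] * np.sin(theta0) / n) ** 2)
    eta = n * cos_t if pol == "s" else n / cos_t     # optical admittances
    M = np.eye(2, dtype=complex)
    for j in range(1, len(n) - 1):
        delta = 2.0 * np.pi / wavelength * n[j] * d_list[j - 1] * cos_t[j]
        Mj = np.array([[np.cos(delta), 1j * np.sin(delta) / eta[j]],
                       [1j * eta[j] * np.sin(delta), np.cos(delta)]])
        M = M @ Mj                                   # M_1 * M_2 * ... * M_N
    eta0, etas = eta[0], eta[-1]
    num = eta0 * M[0, 0] + eta0 * etas * M[0, 1] - M[1, 0] - etas * M[1, 1]
    den = eta0 * M[0, 0] + eta0 * etas * M[0, 1] + M[1, 0] + etas * M[1, 1]
    return num / den

def psi_delta(n_list, d_list, wavelength, theta0):
    """Ellipsometric angles in degrees from rho = r_p / r_s = tan(Psi) exp(i Delta)."""
    rho = (reflection(n_list, d_list, wavelength, theta0, "p")
           / reflection(n_list, d_list, wavelength, theta0, "s"))
    return np.degrees(np.arctan(np.abs(rho))), np.degrees(np.angle(rho))

# Illustrative stack: 100 nm oxide-like film (n = 1.46) on a silicon-like
# substrate (n = 3.88 + 0.02i) at 633 nm and 70 degrees; values are assumed.
psi, delta = psi_delta([1.0, 1.46, 3.88 + 0.02j], [100.0], 633.0, np.radians(70.0))
print("Psi (deg):", psi, "Delta (deg):", delta)
```

The loop performs one 2×2 matrix product per layer, which is where the linear scaling in the number of layers comes from.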

metrology, scatterometry, ellipsometry, x-ray reflectometry, inverse problems, optimization, statistical inference, mathematical modeling

# Semiconductor Manufacturing Process Metrology: Mathematical Modeling

## 1. The Core Problem Structure

Semiconductor metrology faces a fundamental **inverse problem**: we make indirect measurements (optical spectra, scattered X-rays, electron signals) and must infer physical quantities (dimensions, compositions, defect states) that we cannot directly observe at the nanoscale.

### 1.1 Mathematical Formulation

The general measurement model:

$$ \mathbf{y} = \mathcal{F}(\mathbf{p}) + \boldsymbol{\epsilon} $$

**Variable Definitions:**

- $\mathbf{y}$: measured signal vector (spectrum, image intensity, scattered amplitude)
- $\mathbf{p}$: physical parameters of interest (CD, thickness, sidewall angle, composition)
- $\mathcal{F}$: forward model operator (physics of measurement process)
- $\boldsymbol{\epsilon}$: noise/uncertainty term

### 1.2 Key Mathematical Challenges

- **Nonlinearity:** $\mathcal{F}$ is typically highly nonlinear
- **Computational cost:** Forward model evaluation is expensive
- **Ill-posedness:** Inverse may be non-unique or unstable
- **High dimensionality:** Many parameters from limited measurements

## 2. Optical Critical Dimension (OCD) / Scatterometry

This is the most mathematically intensive metrology technique in high-volume manufacturing.

### 2.1 Forward Problem: Electromagnetic Scattering

For periodic structures (gratings, arrays), solve Maxwell's equations with Floquet-Bloch boundary conditions.
#### 2.1.1 Maxwell's Equations

$$ \nabla \times \mathbf{E} = -\frac{\partial \mathbf{B}}{\partial t} $$

$$ \nabla \times \mathbf{H} = \mathbf{J} + \frac{\partial \mathbf{D}}{\partial t} $$

#### 2.1.2 Rigorous Coupled Wave Analysis (RCWA)

**Field Expansion in Fourier Series:**

The electric field in layer $j$ with grating vector $\mathbf{K}$:

$$ \mathbf{E}(\mathbf{r}) = \sum_{n=-N}^{N} \mathbf{E}_n^{(j)} \exp\left(i(\mathbf{k}_n \cdot \mathbf{r})\right) $$

where the diffraction wave vectors are:

$$ \mathbf{k}_n = \mathbf{k}_0 + n\mathbf{K} $$

**Key Properties:**

- Converts PDEs to eigenvalue problem
- Matches boundary conditions at layer interfaces
- Computational complexity: $O(N^3)$ where $N$ = number of Fourier orders

### 2.2 Inverse Problem: Parameter Extraction

Given measured spectra $R(\lambda, \theta)$, find best-fit parameters $\mathbf{p}$.

#### 2.2.1 Optimization Formulation

$$ \hat{\mathbf{p}} = \arg\min_{\mathbf{p}} \left\| \mathbf{y}_{\text{meas}} - \mathcal{F}(\mathbf{p}) \right\|^2 + \lambda R(\mathbf{p}) $$

**Regularization Options:**

- **Tikhonov regularization:** $$ R(\mathbf{p}) = \left\| \mathbf{p} - \mathbf{p}_0 \right\|^2 $$
- **Sparsity-promoting (L1):** $$ R(\mathbf{p}) = \left\| \mathbf{p} \right\|_1 $$
- **Total variation:** $$ R(\mathbf{p}) = \int |\nabla \mathbf{p}| \, d\mathbf{x} $$

#### 2.2.2 Library-Based Approach

1. **Precomputation:** Generate forward model on dense parameter grid
2. **Storage:** Build library with millions of entries
3. **Search:** Find best match using regression methods

**Regression Methods:**

- Polynomial regression: fast but limited accuracy
- Neural networks: handle nonlinearity well
- Gaussian process regression: provides uncertainty estimates

### 2.3 Parameter Correlations and Uncertainty

#### 2.3.1 Fisher Information Matrix

$$ [\mathbf{I}(\mathbf{p})]_{ij} = \mathbb{E}\left[\frac{\partial \ln L}{\partial p_i}\frac{\partial \ln L}{\partial p_j}\right] $$

#### 2.3.2 Cramér-Rao Lower Bound

$$ \text{Var}(\hat{p}_i) \geq \left[\mathbf{I}^{-1}\right]_{ii} $$

**Physical Interpretation:** Strong correlations (e.g., height vs. sidewall angle) manifest as near-singular information matrices, a fundamental limit on independent resolution.

## 3. Thin Film Metrology: Ellipsometry

### 3.1 Physical Model

Ellipsometry measures polarization state change upon reflection:

$$ \rho = \frac{r_p}{r_s} = \tan(\Psi)\exp(i\Delta) $$

**Variables:**

- $r_p$: p-polarized reflection coefficient
- $r_s$: s-polarized reflection coefficient
- $\Psi$: amplitude ratio angle
- $\Delta$: phase difference

### 3.2 Transfer Matrix Formalism

For multilayer stacks:

$$ \mathbf{M} = \prod_{j=1}^{N} \mathbf{M}_j = \prod_{j=1}^{N} \begin{pmatrix} \cos\delta_j & \dfrac{i\sin\delta_j}{\eta_j} \\[10pt] i\eta_j\sin\delta_j & \cos\delta_j \end{pmatrix} $$

where the phase thickness is:

$$ \delta_j = \frac{2\pi}{\lambda} n_j d_j \cos(\theta_j) $$

**Parameters:**

- $n_j$: refractive index of layer $j$
- $d_j$: thickness of layer $j$
- $\theta_j$: angle of propagation in layer $j$
- $\eta_j$: optical admittance

### 3.3 Dispersion Models

#### 3.3.1 Cauchy Model (Transparent Materials)

$$ n(\lambda) = A + \frac{B}{\lambda^2} + \frac{C}{\lambda^4} $$

#### 3.3.2 Sellmeier Equation

$$ n^2(\lambda) = 1 + \sum_{i} \frac{B_i \lambda^2}{\lambda^2 - C_i} $$

#### 3.3.3 Tauc-Lorentz Model (Amorphous Semiconductors)

$$ \varepsilon_2(E) = \begin{cases} \dfrac{A E_0 C (E - E_g)^2}{(E^2 - E_0^2)^2 + C^2 E^2} \cdot \dfrac{1}{E} & E > E_g \\[10pt] 0 & E \leq E_g \end{cases} $$

with $\varepsilon_1$ derived via Kramers-Kronig relations:

$$ \varepsilon_1(E) = \varepsilon_{1\infty} + \frac{2}{\pi} \mathcal{P} \int_0^\infty \frac{\xi \varepsilon_2(\xi)}{\xi^2 - E^2} d\xi $$

#### 3.3.4 Drude Model (Metals/Conductors)

$$ \varepsilon(\omega) = \varepsilon_\infty - \frac{\omega_p^2}{\omega^2 + i\gamma\omega} $$

**Parameters:**

- $\omega_p$: plasma frequency
- $\gamma$: damping coefficient
- $\varepsilon_\infty$: high-frequency dielectric constant

## 4. X-ray Metrology Mathematics

### 4.1 X-ray Reflectivity (XRR)

#### 4.1.1 Parratt Recursion Formula

For specular reflection at grazing incidence:

$$ R_j = \frac{r_{j,j+1} + R_{j+1}\exp(2ik_{z,j+1}d_{j+1})}{1 + r_{j,j+1}R_{j+1}\exp(2ik_{z,j+1}d_{j+1})} $$

where $r_{j,j+1}$ is the Fresnel coefficient at interface $j$.

#### 4.1.2 Roughness Correction (Névot-Croce Factor)

$$ r'_{j,j+1} = r_{j,j+1} \exp\left(-2k_{z,j}k_{z,j+1}\sigma_j^2\right) $$

**Parameters:**

- $k_{z,j}$: perpendicular wave vector component in layer $j$
- $\sigma_j$: RMS roughness at interface $j$

### 4.2 CD-SAXS (Critical Dimension Small Angle X-ray Scattering)

#### 4.2.1 Scattering Intensity

For transmission scattering from 3D nanostructures:

$$ I(\mathbf{q}) = \left|\tilde{\rho}(\mathbf{q})\right|^2 = \left|\int \Delta\rho(\mathbf{r})\exp(-i\mathbf{q}\cdot\mathbf{r})d^3\mathbf{r}\right|^2 $$

#### 4.2.2 Form Factor for Simple Shapes

**Rectangular parallelepiped:**

$$ F(\mathbf{q}) = V \cdot \text{sinc}\left(\frac{q_x a}{2}\right) \cdot \text{sinc}\left(\frac{q_y b}{2}\right) \cdot \text{sinc}\left(\frac{q_z c}{2}\right) $$

**Cylinder:**

$$ F(\mathbf{q}) = 2\pi R^2 L \cdot \frac{J_1(q_\perp R)}{q_\perp R} \cdot \text{sinc}\left(\frac{q_z L}{2}\right) $$

where $J_1$ is the first-order Bessel function.

## 5. Statistical Process Control Mathematics

### 5.1 Virtual Metrology

Predict wafer properties from tool sensor data without direct measurement:

$$ y = f(\mathbf{x}) + \varepsilon $$

#### 5.1.1 Partial Least Squares (PLS)

Handles high-dimensional, correlated inputs:

1. Find latent variables: $\mathbf{T} = \mathbf{X}\mathbf{W}$
2. Maximize covariance with $y$
3. Model: $y = \mathbf{T}\mathbf{Q} + e$

**Optimization objective:**

$$ \max_{\mathbf{w}} \text{Cov}(\mathbf{X}\mathbf{w}, y)^2 \quad \text{subject to} \quad \|\mathbf{w}\| = 1 $$

#### 5.1.2 Gaussian Process Regression

$$ y(\mathbf{x}) \sim \mathcal{GP}\left(m(\mathbf{x}), k(\mathbf{x}, \mathbf{x}')\right) $$

**Common Kernel Functions:**

- **Squared Exponential (RBF):** $$ k(\mathbf{x}, \mathbf{x}') = \sigma_f^2 \exp\left(-\frac{\|\mathbf{x} - \mathbf{x}'\|^2}{2\ell^2}\right) $$
- **Matérn 5/2:** $$ k(r) = \sigma_f^2 \left(1 + \frac{\sqrt{5}r}{\ell} + \frac{5r^2}{3\ell^2}\right) \exp\left(-\frac{\sqrt{5}r}{\ell}\right) $$

### 5.2 Run-to-Run Control

#### 5.2.1 EWMA Controller

$$ \hat{d}_t = \lambda y_{t-1} + (1-\lambda)\hat{d}_{t-1} $$

$$ x_t = x_{\text{nom}} - \frac{\hat{d}_t}{\hat{\beta}} $$

**Parameters:**

- $\lambda$: smoothing factor (typically 0.2–0.4)
- $\hat{\beta}$: estimated process gain
- $x_{\text{nom}}$: nominal recipe setting

#### 5.2.2 Model Predictive Control (MPC)

$$ \min_{\mathbf{u}} \sum_{k=0}^{N} \left\| y_{t+k} - y_{\text{target}} \right\|_Q^2 + \left\| \Delta u_{t+k} \right\|_R^2 $$

subject to:

- Process dynamics: $\mathbf{x}_{t+1} = \mathbf{A}\mathbf{x}_t + \mathbf{B}\mathbf{u}_t$
- Output equation: $y_t = \mathbf{C}\mathbf{x}_t$
- Constraints: $\mathbf{u}_{\min} \leq \mathbf{u}_t \leq \mathbf{u}_{\max}$

### 5.3 Wafer-Level Spatial Modeling

#### 5.3.1 Zernike Polynomial Decomposition

$$ W(r,\theta) = \sum_{n=0}^{N} \sum_{m=-n}^{n} a_{nm} Z_n^m(r,\theta) $$

**First few Zernike polynomials:**

| Index | Name | Formula |
|-------|------|---------|
| $Z_0^0$ | Piston | $1$ |
| $Z_1^{-1}$ | Tilt Y | $2r\sin\theta$ |
| $Z_1^1$ | Tilt X | $2r\cos\theta$ |
| $Z_2^0$ | Defocus | $\sqrt{3}(2r^2-1)$ |
| $Z_2^{-2}$ | Oblique astigmatism | $\sqrt{6}r^2\sin2\theta$ |
| $Z_2^2$ | Vertical astigmatism | $\sqrt{6}r^2\cos2\theta$ |

#### 5.3.2 Gaussian Random Fields

For spatially correlated residuals:

$$ \text{Cov}\left(W(\mathbf{s}_1), W(\mathbf{s}_2)\right) = \sigma^2 \rho\left(\|\mathbf{s}_1 - \mathbf{s}_2\|; \phi\right) $$

**Common correlation functions:**

- **Exponential:** $$ \rho(h) = \exp\left(-\frac{h}{\phi}\right) $$
- **Gaussian:** $$ \rho(h) = \exp\left(-\frac{h^2}{\phi^2}\right) $$

## 6. Overlay Metrology Mathematics

### 6.1 Higher-Order Correction Models

Overlay error as polynomial expansion, with the sum running over second- and higher-order terms so that the linear terms are not counted twice:

$$ \delta x = T_x + M_x \cdot x + R_x \cdot y + \sum_{2 \leq i+j \leq n} c_{ij}^x x^i y^j $$

$$ \delta y = T_y + M_y \cdot y + R_y \cdot x + \sum_{2 \leq i+j \leq n} c_{ij}^y x^i y^j $$

**Physical interpretation of linear terms:**

- $T_x, T_y$: Translation
- $M_x, M_y$: Magnification
- $R_x, R_y$: Rotation

### 6.2 Sampling Strategy Optimization

#### 6.2.1 D-Optimal Design

$$ \mathbf{s}^* = \arg\max_{\mathbf{s}} \det\left(\mathbf{X}_s^T \mathbf{X}_s\right) $$

Minimizes the volume of the confidence ellipsoid for parameter estimates.

#### 6.2.2 Information-Theoretic Approach

Maximize expected information gain:

$$ I(\mathbf{s}) = H(\mathbf{p}) - \mathbb{E}_{\mathbf{y}}\left[H(\mathbf{p}|\mathbf{y})\right] $$

## 7. Machine Learning Integration

### 7.1 Physics-Informed Neural Networks (PINNs)

Combine data fitting with physical constraints:

$$ \mathcal{L} = \mathcal{L}_{\text{data}} + \lambda \mathcal{L}_{\text{physics}} $$

**Components:**

- **Data loss:** $$ \mathcal{L}_{\text{data}} = \frac{1}{N} \sum_{i=1}^{N} \left\| y_i - f_\theta(\mathbf{x}_i) \right\|^2 $$
- **Physics loss (example: Maxwell residual):** $$ \mathcal{L}_{\text{physics}} = \frac{1}{M} \sum_{j=1}^{M} \left\| \nabla \times \mathbf{E}_\theta - i\omega\mu\mathbf{H}_\theta \right\|^2 $$

### 7.2 Neural Network Surrogates

**Architecture for forward model approximation:**

- **Input:** Geometric parameters $\mathbf{p} \in \mathbb{R}^d$
- **Hidden layers:** Multiple fully-connected layers with ReLU/GELU activation
- **Output:** Simulated spectrum $\mathbf{y} \in \mathbb{R}^m$

**Speedup:** $10^4$ – $10^6\times$ over rigorous simulation

### 7.3 Deep Learning for Defect Detection

**Methods:**

- **CNNs**: Classification and localization
- **Autoencoders**: Anomaly detection via reconstruction error: $$ \text{Score}(\mathbf{x}) = \left\| \mathbf{x} - D(E(\mathbf{x})) \right\|^2 $$
- **Instance segmentation**: Precise defect boundary delineation

## 8. Uncertainty Quantification

### 8.1 GUM Framework (Guide to the Expression of Uncertainty in Measurement)

Combined standard uncertainty:

$$ u_c^2(y) = \sum_{i} \left(\frac{\partial f}{\partial x_i}\right)^2 u^2(x_i) + 2\sum_{i

Statistical Process Control ### 6.1 Control Charts **$\bar{X}$ Chart (Mean):** $$ UCL = \bar{\bar{X}} + A_2 \bar{R} $$ $$ LCL = \bar{\bar{X}} - A_2 \bar{R} $$ **R Chart (Range):** $$ UCL_R = D_4 \bar{R} $$ $$ LCL_R = D_3 \bar{R} $$ **EWMA (Exponentially Weighted Moving Average):** $$ Z_t = \lambda X_t + (1 - \lambda) Z_{t-1} $$ With control limits: $$ UCL = \mu_0 + L \sigma \sqrt{\frac{\lambda}{2 - \lambda} \left[ 1 - (1-\lambda)^{2t} \right]} $$ ### 6.2 Process Capability Indices **$C_p$ (Process Capability):** $$ C_p = \frac{USL - LSL}{6\sigma} $$ **$C_{pk}$ (Centered Process Capability):** $$ C_{pk} = \min \left( \frac{USL - \mu}{3\sigma}, \frac{\mu - LSL}{3\sigma} \right) $$ **$C_{pm}$ (Taguchi Capability):** $$ C_{pm} = \frac{USL - LSL}{6\sqrt{\sigma^2 + (\mu - T)^2}} $$ Where: - $USL$ = Upper Specification Limit - $LSL$ = Lower Specification Limit - $T$ = Target value - $\mu$ = Process mean - $\sigma$ = Process standard deviation ### 6.3 Gauge R&R Analysis Total measurement variance decomposition: $$ \sigma^2_{total} = \sigma^2_{part} + \sigma^2_{gauge} $$ $$ \sigma^2_{gauge} = \sigma^2_{repeatability} + \sigma^2_{reproducibility} $$ **Precision-to-Tolerance Ratio:** $$ P/T = \frac{6 \sigma_{gauge}}{USL - LSL} \times 100\% $$ | P/T Ratio | Assessment | |-----------|------------| | < 10% | Excellent | | 10-30% | Acceptable | | > 30% | Unacceptable | ## 7. Uncertainty Quantification ### 7.1 Fisher Information Matrix The Fisher Information Matrix for parameter estimation: $$ F_{ij} = \sum_{k=1}^{N} \frac{1}{\sigma_k^2} \frac{\partial M_k}{\partial p_i} \frac{\partial M_k}{\partial p_j} $$ Or equivalently: $$ F_{ij} = -E \left[ \frac{\partial^2 \ln L}{\partial p_i \partial p_j} \right] $$ Where $L$ is the likelihood function. 
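
The sum defining $F_{ij}$ in Section 7.1 is a weighted product of the Jacobian with itself, which the following minimal NumPy sketch makes explicit. The helper names (`fisher_information`, `numerical_jacobian`) and the straight-line toy forward model are assumptions made for illustration; a real forward model would be an RCWA or transfer-matrix solver, and its Jacobian might be computed analytically rather than by finite differences.

```python
import numpy as np

def fisher_information(jacobian, sigma):
    """F_ij = sum_k (1 / sigma_k^2) * dM_k/dp_i * dM_k/dp_j."""
    jacobian = np.asarray(jacobian, dtype=float)  # shape (N, P)
    sigma = np.asarray(sigma, dtype=float)        # shape (N,)
    weighted = jacobian / sigma[:, None]          # scale row k by 1 / sigma_k
    return weighted.T @ weighted                  # shape (P, P), symmetric

def numerical_jacobian(model, p, step=1e-6):
    """Central-difference estimate of J_kj = dM_k/dp_j (illustrative helper)."""
    p = np.asarray(p, dtype=float)
    columns = []
    for j in range(p.size):
        dp = np.zeros_like(p)
        dp[j] = step
        columns.append((model(p + dp) - model(p - dp)) / (2.0 * step))
    return np.stack(columns, axis=1)

# Hypothetical toy forward model: signal_k = a * x_k + b, with p = (a, b).
x = np.array([0.0, 1.0, 2.0])

def toy_model(p):
    return p[0] * x + p[1]

J = numerical_jacobian(toy_model, [2.0, 1.0])
F = fisher_information(J, sigma=np.ones(3))
print(F)  # approximately [[5, 3], [3, 3]]
```

With unit noise the result reduces to $\mathbf{J}^T \mathbf{J}$, the same matrix that appears on the left-hand side of the Levenberg-Marquardt update in Section 3.2.
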
### 7.2 Cramér-Rao Lower Bound

The covariance matrix of any unbiased estimator is bounded:

$$ \text{Cov}(\hat{\mathbf{p}}) \geq \mathbf{F}^{-1} $$

For a single parameter:

$$ \text{Var}(\hat{\theta}) \geq \frac{1}{I(\theta)} $$

**Interpretation:**

- Diagonal elements of $\mathbf{F}^{-1}$ give minimum variance for each parameter
- Off-diagonal elements indicate parameter correlations
- Large condition number of $\mathbf{F}$ indicates ill-conditioning

### 7.3 Correlation Coefficient

$$ \rho_{ij} = \frac{F^{-1}_{ij}}{\sqrt{F^{-1}_{ii} F^{-1}_{jj}}} $$

| $\lvert \rho \rvert$ | Interpretation |
|--------|----------------|
| < 0.3 | Weak correlation |
| 0.3 – 0.7 | Moderate correlation |
| > 0.7 | Strong correlation |
| > 0.95 | Severe: consider fixing one parameter |

### 7.4 GUM Framework

According to the Guide to the Expression of Uncertainty in Measurement:

**Combined standard uncertainty:**

$$ u_c^2(y) = \sum_{i=1}^{N} \left( \frac{\partial f}{\partial x_i} \right)^2 u^2(x_i) + 2 \sum_{i=1}^{N-1} \sum_{j=i+1}^{N} \frac{\partial f}{\partial x_i} \frac{\partial f}{\partial x_j} u(x_i, x_j) $$

**Expanded uncertainty:**

$$ U = k \cdot u_c(y) $$

Where $k$ is the coverage factor (typically $k=2$ for 95% confidence).

## 8. Machine Learning in Metrology

### 8.1 Neural Network Surrogate Models

Replace expensive physics simulations with trained neural networks:

$$ M_{NN}(\mathbf{p}; \mathbf{W}) \approx M_{physics}(\mathbf{p}) $$

**Training objective:**

$$ \mathcal{L} = \frac{1}{N} \sum_{i=1}^{N} \left\| M_{NN}(\mathbf{p}_i) - M_{physics}(\mathbf{p}_i) \right\|^2 + \lambda \left\| \mathbf{W} \right\|^2 $$

**Speedup:** Typically $10^4$ – $10^6 \times$ faster than RCWA/FEM.
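
As a rough sketch of the training objective in Section 8.1, the function below evaluates the mean squared mismatch between a surrogate and reference simulator outputs, plus the L2 weight penalty. Everything named here is an assumption for illustration: `surrogate_loss` and `linear_surrogate` are hypothetical helpers, the surrogate is a single affine layer rather than a real neural network, the penalty is applied to every array in `weights` (bias included), and the targets are made-up numbers standing in for $M_{physics}(\mathbf{p}_i)$.

```python
import numpy as np

def surrogate_loss(predict, weights, params, targets, lam):
    """(1/N) * sum_i ||M_NN(p_i) - M_physics(p_i)||^2 + lam * ||W||^2."""
    preds = np.stack([predict(p, weights) for p in params])  # shape (N, K)
    residual = preds - np.asarray(targets, dtype=float)
    data_term = np.mean(np.sum(residual ** 2, axis=1))       # average squared norm
    penalty = lam * sum(np.sum(w ** 2) for w in weights)     # L2 on all weight arrays
    return data_term + penalty

def linear_surrogate(p, weights):
    """Stand-in for M_NN: one affine layer (an assumption, not a real network)."""
    W, b = weights
    return W @ p + b

# Made-up training set: two parameter vectors and their reference signals.
params = np.array([[1.0, 2.0], [3.0, 4.0]])
targets = np.array([[1.0, 2.0], [3.0, 5.0]])
weights = (np.eye(2), np.zeros(2))

loss = surrogate_loss(linear_surrogate, weights, params, targets, lam=0.1)
print(loss)  # 0.5 (data term) + 0.2 (penalty), i.e. about 0.7
```

In practice this quantity would be minimized with an automatic-differentiation framework; the sketch only shows how the two terms of $\mathcal{L}$ are assembled.
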
### 8.2 Physics-Informed Neural Networks (PINNs)

Incorporate physical laws into the loss function:

$$ \mathcal{L}_{total} = \mathcal{L}_{data} + \lambda_{physics} \mathcal{L}_{physics} $$

Where:

$$ \mathcal{L}_{physics} = \left\| \nabla \times \mathbf{E} + \frac{\partial \mathbf{B}}{\partial t} \right\|^2 + \ldots $$

### 8.3 Gaussian Process Regression

A non-parametric Bayesian approach:

$$ f(\mathbf{x}) \sim \mathcal{GP}\left( m(\mathbf{x}), k(\mathbf{x}, \mathbf{x}') \right) $$

**Common kernel (RBF/Squared Exponential):**

$$ k(\mathbf{x}, \mathbf{x}') = \sigma_f^2 \exp\left( -\frac{\left\| \mathbf{x} - \mathbf{x}' \right\|^2}{2\ell^2} \right) $$

**Posterior prediction:**

$$ \mu_* = \mathbf{k}_*^T (\mathbf{K} + \sigma_n^2 \mathbf{I})^{-1} \mathbf{y} $$

$$ \sigma_*^2 = k_{**} - \mathbf{k}_*^T (\mathbf{K} + \sigma_n^2 \mathbf{I})^{-1} \mathbf{k}_* $$

**Advantages:**

- Provides uncertainty estimates naturally
- Works well with limited training data
- Interpretable hyperparameters

### 8.4 Virtual Metrology

Predict wafer properties from equipment sensor data:

$$ \hat{y} = f(FDC_1, FDC_2, \ldots, FDC_n) $$

Where $FDC_i$ are Fault Detection and Classification sensor readings.

**Common approaches:**

- Partial Least Squares (PLS) regression
- Random Forests
- Gradient Boosting (XGBoost, LightGBM)
- Deep neural networks

## 9. Advanced Topics and Frontiers

### 9.1 3D Metrology Challenges

Modern structures require 3D measurement:

| Structure | Complexity | Key Challenge |
|-----------|------------|---------------|
| FinFET | Moderate | Fin height, sidewall angle |
| GAA/Nanosheet | High | Sheet thickness, spacing |
| 3D NAND | Very High | 200+ layers, bowing, tilt |
| DRAM HAR | Extreme | 100:1 aspect ratio structures |

### 9.2 Hybrid Metrology

Combining multiple techniques to break parameter correlations:

$$ \chi^2_{total} = \sum_{techniques} w_t \chi^2_t $$

**Example combination:**

- OCD for periodic structure parameters
- Ellipsometry for film optical constants
- XRR for density and interface roughness

**Mathematical framework:**

$$ \mathbf{F}_{hybrid} = \sum_t \mathbf{F}_t $$

Reduces off-diagonal elements, improving condition number.

### 9.3 Atomic-Scale Considerations

At the 2nm node and beyond:

**Line Edge Roughness (LER):**

$$ \sigma_{LER} = \sqrt{\frac{1}{L} \int_0^L \left[ x(z) - \bar{x} \right]^2 dz} $$

**Power Spectral Density:**

$$ PSD(f) = \frac{\sigma^2 \xi}{1 + (2\pi f \xi)^{2(1+H)}} $$

Where:

- $\xi$ = Correlation length
- $H$ = Hurst exponent (roughness character)

**Quantum Effects:**

- Tunneling through thin barriers
- Discrete dopant effects
- Wave function penetration

### 9.4 Model-Measurement Circularity

A fundamental epistemological challenge:

```
┌──────────────┐      ┌──────────────┐
│   Physical   │ ───► │   Measured   │
│  Structure   │      │    Signal    │
└──────────────┘      └──────────────┘
        ▲                     │
        │                     ▼
        │             ┌──────────────┐
        │             │    Model     │
        └───────────◄─┤  Inversion   │
                      └──────────────┘
```

**Key questions:**

- How do we validate models when "truth" requires modeling?
- Reference metrology (TEM) also requires interpretation
- What does it mean to "know" a dimension at atomic scale?
## Key Symbols and Notation

| Symbol | Description | Units |
|--------|-------------|-------|
| $\lambda$ | Wavelength | nm |
| $\theta$ | Angle of incidence | degrees |
| $n$ | Refractive index | dimensionless |
| $k$ | Extinction coefficient | dimensionless |
| $d$ | Film thickness | nm |
| $\Lambda$ | Grating period | nm |
| $\Psi, \Delta$ | Ellipsometric angles | degrees |
| $\sigma$ | Standard deviation | varies |
| $\mathbf{J}$ | Jacobian matrix | varies |
| $\mathbf{F}$ | Fisher Information Matrix | varies |

## Computational Complexity

| Method | Complexity | Typical Time |
|--------|------------|--------------|
| Transfer Matrix | $O(N)$ | $\mu$s |
| RCWA | $O(M^3 \cdot L)$ | ms – s |
| FEM | $O(N^{1.5})$ | s – min |
| FDTD | $O(N \cdot T)$ | s – min |
| Monte Carlo (SEM) | $O(N_{electrons})$ | min – hr |
| Neural Network (inference) | $O(1)$ | $\mu$s |

Where:

- $N$ = Number of layers / mesh elements
- $M$ = Number of Fourier orders
- $L$ = Number of layers
- $T$ = Number of time steps

mewma, mewma, spc

Multivariate exponentially weighted moving average control chart.

micro bga, packaging

Very fine pitch BGA.

micro search space, neural architecture search

Micro search spaces focus on small components, such as the operations within cells, enabling efficient architecture optimization.

micro-batch, distributed training

Small batch processed at once.

micro-break,lithography

Small breaks in intended continuous features.

micro-bridging,lithography

Small unwanted connections between features.

micro-bump, business & strategy

Micro-bumps are fine-pitch solder connections between dies in 3D stacks.

micro-bumps, advanced packaging

Small solder bumps for 3D interconnect.

micro-ct, failure analysis advanced

Micro-computed tomography creates 3D reconstructions of package internals with micron-scale resolution.

micro-pl, metrology

PL with microscale resolution.

micro-xrf, metrology

High spatial resolution XRF.

microaggression detection,nlp

Identify subtle discriminatory language.

microchannel cooling, thermal

Fluid cooling through microchannels.

microchannel cooling, thermal management

Microchannel cooling uses narrow parallel channels etched in substrates for high heat flux removal through forced liquid convection.

micrograd,tiny,andrej karpathy

micrograd is a tiny autograd engine by Andrej Karpathy. Educational; implements scalar-level backpropagation.

microloading,etch

Etch rate depends on local pattern density.

micrometer,metrology

Precision length measurement tool.

micronet challenge, edge ai

Competition for efficient models.

microprobing,testing

Probe internal nodes of die for electrical debug.

microroughness, metrology

Surface roughness at micron scale.

microservices architecture,software engineering

Decompose system into independent services.

microwave impedance microscopy, metrology

Image electrical properties at nanoscale.

microwave photoconductivity decay, metrology

Non-contact lifetime measurement.

mid-gap work function,device physics

Work function near silicon bandgap center.

middle man, code ai

Class delegating everything.

middle-of-line process development, mol, process integration

Develop MOL integration scheme.

midjourney, multimodal ai

Midjourney generates artistic images from text prompts using proprietary diffusion-based models.

migration,upgrade,language

AI assists with language and framework migration, for example converting Python 2 code to Python 3.

mil-hdbk-217, business & standards

MIL-HDBK-217 provides models for electronic equipment reliability prediction.

milestone, quality & reliability

Milestones mark significant project events or deliverables.

milk run, supply chain & logistics

Milk run logistics uses regular routes to collect materials from multiple suppliers, reducing transportation costs.

miller indices, material science

Notation for crystal planes.

millisecond anneal,diffusion

Ultra-fast anneal using lasers or flash lamps.

milvus,vector db

Open-source vector database for similarity search.

milvus,vector,distributed

Milvus is a distributed vector database built for large-scale workloads.