
AI Factory Glossary

288 technical terms and definitions


metaformer,llm architecture

Abstract framework for transformer-like architectures.

metainit, meta-learning

Learn good initialization via meta-learning.

metal deposition,pvd,cvd,ald,sputtering,electroplating,film growth,copper plating,butler-volmer,nernst-planck,monte carlo,deposition modeling

# Mathematical Modeling of Metal Deposition in Semiconductor Manufacturing

## 1. Overview: Metal Deposition Processes

Metal deposition is a critical step in semiconductor fabrication, creating interconnects, contacts, barrier layers, and various metallic structures. The primary deposition methods require distinct mathematical treatments:

| Process | Physics Domain | Key Mathematics |
|---------|----------------|-----------------|
| **PVD (Sputtering)** | Ballistic transport, plasma physics | Boltzmann transport, Monte Carlo |
| **CVD/PECVD** | Gas-phase transport, surface reactions | Navier-Stokes, reaction-diffusion |
| **ALD** | Self-limiting surface chemistry | Site-balance kinetics |
| **Electroplating (ECD)** | Electrochemistry, mass transport | Butler-Volmer, Nernst-Planck |

## 2. Transport Phenomena Models

### 2.1 Gas-Phase Transport (CVD/PECVD)

The precursor concentration field follows the **convection-diffusion-reaction equation**:

$$ \frac{\partial C}{\partial t} + \mathbf{v} \cdot \nabla C = D \nabla^2 C + R_{gas} $$

Where:

- $C$: precursor concentration (mol/m³)
- $\mathbf{v}$: velocity field vector (m/s)
- $D$: diffusion coefficient (m²/s)
- $R_{gas}$: gas-phase reaction source term (mol/m³·s)

### 2.2 Flow Field Equations

The **incompressible Navier-Stokes equations** govern the velocity field:

$$ \rho \left( \frac{\partial \mathbf{v}}{\partial t} + \mathbf{v} \cdot \nabla \mathbf{v} \right) = -\nabla p + \mu \nabla^2 \mathbf{v} $$

With continuity equation:

$$ \nabla \cdot \mathbf{v} = 0 $$

Where:

- $\rho$: gas density (kg/m³)
- $p$: pressure (Pa)
- $\mu$: dynamic viscosity (Pa·s)

### 2.3 Knudsen Number and Transport Regimes

At low pressures, the **Knudsen number** determines the transport regime:

$$ Kn = \frac{\lambda}{L} = \frac{k_B T}{\sqrt{2} \pi d^2 p L} $$

Where:

- $\lambda$: mean free path (m)
- $L$: characteristic length (m)
- $k_B$: Boltzmann constant ($1.38 \times 10^{-23}$ J/K)
- $T$: temperature (K)
- $d$: molecular diameter (m)
- $p$: pressure (Pa)

**Transport regime classification:**

- $Kn < 0.01$: **Continuum regime** → Navier-Stokes CFD
- $0.01 < Kn < 0.1$: **Slip flow regime** → Modified NS with slip boundary conditions
- $0.1 < Kn < 10$: **Transitional regime** → DSMC, Boltzmann equation
- $Kn > 10$: **Free molecular regime** → Ballistic/Monte Carlo methods

## 3. Surface Reaction Kinetics

### 3.1 Langmuir-Hinshelwood Mechanism

For bimolecular surface reactions (common in CVD):

$$ r = \frac{k \cdot K_A K_B \cdot p_A p_B}{(1 + K_A p_A + K_B p_B)^2} $$

Where:

- $r$: reaction rate (mol/m²·s)
- $k$: surface reaction rate constant (mol/m²·s)
- $K_A, K_B$: adsorption equilibrium constants (Pa⁻¹)
- $p_A, p_B$: partial pressures of reactants A and B (Pa)

### 3.2 Sticking Coefficient Model

The probability that an impinging molecule adsorbs on the surface:

$$ S = S_0 \exp\left( -\frac{E_a}{k_B T} \right) \cdot f(\theta) $$

Where:

- $S$: sticking coefficient (dimensionless)
- $S_0$: pre-exponential sticking factor
- $E_a$: activation energy (J)
- $f(\theta) = (1 - \theta)^n$: site blocking function
- $\theta$: surface coverage (dimensionless, 0 to 1)
- $n$: order of site blocking

### 3.3 Arrhenius Temperature Dependence

$$ k(T) = A \exp\left( -\frac{E_a}{RT} \right) $$

Where:

- $A$: pre-exponential factor (frequency factor)
- $E_a$: activation energy (J/mol)
- $R$: universal gas constant (8.314 J/mol·K)
- $T$: absolute temperature (K)

## 4. Film Growth Models

### 4.1 Continuum Surface Evolution

#### Edwards-Wilkinson Equation (Linear Growth)

$$ \frac{\partial h}{\partial t} = \nu \nabla^2 h + F + \eta(\mathbf{x}, t) $$

#### Kardar-Parisi-Zhang (KPZ) Equation (Nonlinear Growth)

$$ \frac{\partial h}{\partial t} = \nu \nabla^2 h + \frac{\lambda}{2} |\nabla h|^2 + F + \eta $$

Where:

- $h(\mathbf{x}, t)$: surface height at position $\mathbf{x}$ and time $t$
- $\nu$: surface diffusion coefficient (m²/s)
- $\lambda$: nonlinear growth parameter
- $F$: mean deposition flux (m/s)
- $\eta$: stochastic noise term (Gaussian white noise)

### 4.2 Scaling Relations

Surface roughness evolves according to:

$$ W(L, t) = L^\alpha f\left( \frac{t}{L^z} \right) $$

Where:

- $W$: interface width (roughness)
- $L$: system size
- $\alpha$: roughness exponent
- $z$: dynamic exponent
- $f$: scaling function

## 5. Step Coverage and Conformality

### 5.1 Thiele Modulus

For high-aspect-ratio features, the **Thiele modulus** determines conformality. For a trench of width $w$ with reactive sidewalls (the geometry used in Section 5.3):

$$ \phi = L \sqrt{\frac{2 k_s}{w D_{eff}}} $$

Where:

- $\phi$: Thiele modulus (dimensionless)
- $L$: feature depth (m)
- $w$: trench width (m)
- $k_s$: surface reaction rate constant (m/s)
- $D_{eff}$: effective diffusivity (m²/s)

**Step coverage regimes:**

- $\phi \ll 1$: **Reaction-limited** → Excellent conformality
- $\phi \gg 1$: **Transport-limited** → Poor step coverage (bread-loafing)

### 5.2 Knudsen Diffusion in Trenches

$$ D_K = \frac{w}{3} \sqrt{\frac{8 R T}{\pi M}} $$

Where:

- $D_K$: Knudsen diffusion coefficient (m²/s)
- $w$: trench width (m)
- $R$: universal gas constant (J/mol·K)
- $T$: temperature (K)
- $M$: molecular weight (kg/mol)

### 5.3 Feature-Scale Concentration Profile

Solving for concentration in a trench with reactive walls:

$$ D_{eff} \frac{d^2 C}{dy^2} = \frac{2 k_s C}{w} $$

With $C(0) = C_0$ at the trench opening and zero flux at the trench bottom ($y = L$), the solution is:

$$ C(y) = C_0 \frac{\cosh\left( \phi \frac{L - y}{L} \right)}{\cosh(\phi)} $$

## 6. Atomic Layer Deposition (ALD) Models

### 6.1 Self-Limiting Surface Kinetics

Surface site balance equation:

$$ \frac{d\theta}{dt} = k_a C (1 - \theta) - k_d \theta $$

Where:

- $\theta$: fractional surface coverage
- $k_a$: adsorption rate constant (m³/mol·s)
- $k_d$: desorption rate constant (s⁻¹)
- $C$: gas-phase precursor concentration (mol/m³)

At equilibrium saturation:

$$ \theta_{eq} = \frac{k_a C}{k_a C + k_d} \approx 1 \quad \text{(for strong chemisorption)} $$

### 6.2 Growth Per Cycle (GPC)

$$ \text{GPC} = \Gamma_0 \cdot \Omega \cdot \eta $$

Where:

- $\Gamma_0$: surface site density (sites/m²)
- $\Omega$: volume per deposited atom (m³)
- $\eta$: reaction efficiency (dimensionless)

### 6.3 Saturation Dose-Time Relationship

$$ \theta(t) = 1 - \exp\left( -\frac{S \cdot \Phi \cdot t}{\Gamma_0} \right) $$

**Impingement flux** from kinetic theory:

$$ \Phi = \frac{p}{\sqrt{2 \pi m k_B T}} $$

Where:

- $\Phi$: molecular impingement flux (molecules/m²·s)
- $p$: precursor partial pressure (Pa)
- $m$: molecular mass (kg)

## 7. Plasma Modeling (PVD/PECVD)

### 7.1 Plasma Sheath Physics

**Child-Langmuir law** for ion current density:

$$ J_{ion} = \frac{4 \varepsilon_0}{9} \sqrt{\frac{2e}{M_i}} \frac{V_s^{3/2}}{d_s^2} $$

Where:

- $J_{ion}$: ion current density (A/m²)
- $\varepsilon_0$: vacuum permittivity ($8.85 \times 10^{-12}$ F/m)
- $e$: elementary charge ($1.6 \times 10^{-19}$ C)
- $M_i$: ion mass (kg)
- $V_s$: sheath voltage (V)
- $d_s$: sheath thickness (m)

### 7.2 Ion Energy at Substrate

$$ \varepsilon_{ion} \approx e V_s + \frac{1}{2} M_i v_{Bohm}^2 $$

**Bohm velocity:**

$$ v_{Bohm} = \sqrt{\frac{k_B T_e}{M_i}} $$

Where:

- $T_e$: electron temperature (K or eV)

### 7.3 Sputtering Yield (Sigmund Formula)

$$ Y(E) = \frac{3 \alpha}{4 \pi^2} \cdot \frac{4 M_1 M_2}{(M_1 + M_2)^2} \cdot \frac{E}{U_0} $$

Where:

- $Y$: sputtering yield (atoms/ion)
- $\alpha$: dimensionless factor (~0.2–0.4)
- $M_1$: incident ion mass
- $M_2$: target atom mass
- $E$: incident ion energy (eV)
- $U_0$: surface binding energy (eV)

### 7.4 Electron Energy Distribution Function (EEDF)

The EEDF follows from the Boltzmann equation for the electron distribution (electrons carry charge $-e$, hence the sign of the field term):

$$ \frac{\partial f}{\partial t} + \mathbf{v} \cdot \nabla f - \frac{e \mathbf{E}}{m_e} \cdot \nabla_v f = C[f] $$

Where:

- $f$: electron distribution function
- $\mathbf{E}$: electric field
- $m_e$: electron mass
- $C[f]$: collision integral

## 8. MDP: Markov Decision Process for Process Control

### 8.1 MDP Formulation

A Markov Decision Process is defined by the tuple:

$$ \mathcal{M} = (S, A, P, R, \gamma) $$

**Components in semiconductor context:**

- **State space $S$**: Film thickness, resistivity, uniformity, equipment state, wafer position
- **Action space $A$**: Temperature, pressure, flow rates, RF power, deposition time
- **Transition probability $P(s' | s, a)$**: Stochastic process model
- **Reward function $R(s, a)$**: Yield, uniformity, throughput, quality metrics
- **Discount factor $\gamma$**: Time preference (typically 0.9–0.99)

### 8.2 Bellman Optimality Equation

$$ V^*(s) = \max_{a \in A} \left[ R(s, a) + \gamma \sum_{s'} P(s' | s, a) V^*(s') \right] $$

**Q-function formulation:**

$$ Q^*(s, a) = R(s, a) + \gamma \sum_{s'} P(s' | s, a) \max_{a'} Q^*(s', a') $$

### 8.3 Run-to-Run (R2R) Control

Optimal recipe adjustment after each wafer:

$$ \mathbf{u}_{k+1} = \mathbf{u}_k + \mathbf{K} (\mathbf{y}_{target} - \mathbf{y}_k) $$

Where:

- $\mathbf{u}_k$: process recipe parameters at run $k$
- $\mathbf{y}_k$: measured output at run $k$
- $\mathbf{K}$: controller gain matrix (from MDP policy optimization)

### 8.4 Reinforcement Learning Approaches

| Method | Application | Characteristics |
|--------|-------------|-----------------|
| **Q-Learning** | Discrete parameter optimization | Model-free, tabular |
| **Deep Q-Network (DQN)** | High-dimensional state spaces | Neural network approximation |
| **Policy Gradient** | Continuous process control | Direct policy optimization |
| **Actor-Critic (A2C/PPO)** | Complex control tasks | Combined value and policy |
| **Model-Based RL** | Physics-informed control | Sample efficient |

## 9. Electrochemical Deposition (Copper Damascene)

### 9.1 Butler-Volmer Equation

$$ i = i_0 \left[ \exp\left( \frac{\alpha_a F \eta}{RT} \right) - \exp\left( -\frac{\alpha_c F \eta}{RT} \right) \right] $$

Where:

- $i$: current density (A/m²)
- $i_0$: exchange current density (A/m²)
- $\alpha_a, \alpha_c$: anodic and cathodic transfer coefficients
- $F$: Faraday constant (96,485 C/mol)
- $\eta = E - E_{eq}$: overpotential (V)
- $R$: gas constant (J/mol·K)
- $T$: temperature (K)

### 9.2 Mass Transport Limited Current

$$ i_L = \frac{n F D C_b}{\delta} $$

Where:

- $i_L$: limiting current density (A/m²)
- $n$: number of electrons transferred
- $D$: diffusion coefficient of Cu²⁺ (m²/s)
- $C_b$: bulk concentration (mol/m³)
- $\delta$: diffusion layer thickness (m)

### 9.3 Nernst-Planck Equation

$$ \mathbf{J}_i = -D_i \nabla C_i - \frac{z_i F D_i}{RT} C_i \nabla \phi + C_i \mathbf{v} $$

Where:

- $\mathbf{J}_i$: flux of species $i$
- $z_i$: charge number
- $\phi$: electric potential

### 9.4 Superfilling (Bottom-Up Fill)

The curvature-enhanced accelerator mechanism:

$$ v_n = v_0 (1 + \kappa \cdot \Gamma_{acc}) $$

Where:

- $v_n$: local growth velocity normal to surface
- $v_0$: baseline growth velocity
- $\kappa$: local surface curvature (1/m)
- $\Gamma_{acc}$: accelerator surface concentration

## 10. Multiscale Modeling Framework

### 10.1 Hierarchical Scale Integration

```
┌──────────────────────────────────────────────────────────────┐
│  REACTOR SCALE                                               │
│  CFD: Flow, temperature, concentration                       │
│  Time: seconds | Length: cm                                  │
└─────────────────────────┬────────────────────────────────────┘
                          │ Boundary fluxes
                          ▼
┌──────────────────────────────────────────────────────────────┐
│  FEATURE SCALE                                               │
│  Level-set / String method for surface evolution             │
│  Time: seconds | Length: μm                                  │
└─────────────────────────┬────────────────────────────────────┘
                          │ Local rates
                          ▼
┌──────────────────────────────────────────────────────────────┐
│  MESOSCALE (kMC)                                             │
│  Kinetic Monte Carlo: nucleation, island growth              │
│  Time: ms | Length: nm                                       │
└─────────────────────────┬────────────────────────────────────┘
                          │ Rate parameters
                          ▼
┌──────────────────────────────────────────────────────────────┐
│  ATOMISTIC (MD/DFT)                                          │
│  Molecular dynamics, ab initio: binding energies,            │
│  diffusion barriers, reaction paths                          │
│  Time: ps | Length: Å                                        │
└──────────────────────────────────────────────────────────────┘
```

### 10.2 Kinetic Monte Carlo (kMC)

Event rate from transition state theory:

$$ k_i = \nu_0 \exp\left( -\frac{E_{a,i}}{k_B T} \right) $$

Total rate and time step:

$$ k_{total} = \sum_i k_i, \quad \Delta t = -\frac{\ln(r)}{k_{total}} $$

Where $r \in (0, 1]$ is a uniform random number.

### 10.3 Molecular Dynamics

Newton's equations of motion:

$$ m_i \frac{d^2 \mathbf{r}_i}{dt^2} = -\nabla_i U(\mathbf{r}_1, \mathbf{r}_2, \ldots, \mathbf{r}_N) $$

**Lennard-Jones potential:**

$$ U_{LJ}(r) = 4\varepsilon \left[ \left( \frac{\sigma}{r} \right)^{12} - \left( \frac{\sigma}{r} \right)^6 \right] $$

**Embedded Atom Method (EAM) for metals:**

$$ U = \sum_i F_i(\rho_i) + \frac{1}{2} \sum_{i \neq j} \phi_{ij}(r_{ij}) $$

Where $\rho_i = \sum_{j \neq i} f_j(r_{ij})$ is the electron density at atom $i$.

## 11. Uniformity Modeling

### 11.1 Wafer-Scale Thickness Distribution (Sputtering)

For a circular magnetron target:

$$ t(r) = \int_{target} \frac{Y \cdot J_{ion} \cdot \cos\theta_t \cdot \cos\theta_w}{\pi R^2} \, dA $$

Where:

- $t(r)$: thickness at radial position $r$
- $\theta_t$: emission angle from target
- $\theta_w$: incidence angle at wafer
- $R$: distance from the target element $dA$ to the wafer point

### 11.2 Uniformity Metrics

**Within-Wafer Uniformity (WIW):**

$$ \sigma_{WIW} = \frac{1}{\bar{t}} \sqrt{\frac{1}{N} \sum_{i=1}^{N} (t_i - \bar{t})^2} \times 100\% $$

**Wafer-to-Wafer Uniformity (WTW):**

$$ \sigma_{WTW} = \frac{1}{\bar{t}_{avg}} \sqrt{\frac{1}{M} \sum_{j=1}^{M} (\bar{t}_j - \bar{t}_{avg})^2} \times 100\% $$

**Target specifications:**

- $\sigma_{WIW} < 1\%$ for advanced nodes (≤7 nm)
- $\sigma_{WTW} < 0.5\%$ for high-volume manufacturing

## 12. Virtual Metrology and Statistical Models

### 12.1 Gaussian Process Regression (GPR)

$$ f(\mathbf{x}) \sim \mathcal{GP}(m(\mathbf{x}), k(\mathbf{x}, \mathbf{x}')) $$

**Squared exponential (RBF) kernel:**

$$ k(\mathbf{x}, \mathbf{x}') = \sigma_f^2 \exp\left( -\frac{|\mathbf{x} - \mathbf{x}'|^2}{2\ell^2} \right) $$

**Predictive distribution:**

$$ f_* | \mathbf{X}, \mathbf{y}, \mathbf{x}_* \sim \mathcal{N}(\bar{f}_*, \text{var}(f_*)) $$

### 12.2 Partial Least Squares (PLS)

$$ \mathbf{Y} = \mathbf{X} \mathbf{B} + \mathbf{E} $$

Where:

- $\mathbf{X}$: process parameter matrix
- $\mathbf{Y}$: quality outcome matrix
- $\mathbf{B}$: regression coefficient matrix
- $\mathbf{E}$: residual matrix

### 12.3 Principal Component Analysis (PCA)

$$ \mathbf{X} = \mathbf{T} \mathbf{P}^T + \mathbf{E} $$

**Hotelling's $T^2$ statistic for fault detection:**

$$ T^2 = \sum_{i=1}^{k} \frac{t_i^2}{\lambda_i} $$

## 13. Process Optimization

### 13.1 Response Surface Methodology (RSM)

**Second-order polynomial model:**

$$ y = \beta_0 + \sum_{i=1}^{k} \beta_i x_i + \sum_{i=1}^{k} \beta_{ii} x_i^2 + \sum_{i < j} \beta_{ij} x_i x_j + \varepsilon $$

### 13.2 Constrained Optimization

$$ \min_{\mathbf{x}} f(\mathbf{x}) \quad \text{subject to} \quad g_i(\mathbf{x}) \leq 0, \quad h_j(\mathbf{x}) = 0 $$

**Example constraints:**

- $g_1$: Non-uniformity ≤ 3%
- $g_2$: Resistivity within spec
- $g_3$: Throughput ≥ target
- $h_1$: Total film thickness = target

### 13.3 Pareto Multi-Objective Optimization

$$ \min_{\mathbf{x}} \left[ f_1(\mathbf{x}), f_2(\mathbf{x}), \ldots, f_m(\mathbf{x}) \right] $$

Common trade-offs:

- Uniformity vs. throughput
- Film quality vs. cost
- Conformality vs. deposition rate

## 14. Mathematical Toolkit Reference

| Domain | Key Equations | Application |
|--------|---------------|-------------|
| **Transport** | Navier-Stokes, Convection-Diffusion | Gas flow, precursor delivery |
| **Kinetics** | Arrhenius, Langmuir-Hinshelwood | Reaction rates |
| **Surface Evolution** | KPZ, Level-set, Edwards-Wilkinson | Film morphology |
| **Plasma** | Boltzmann, Child-Langmuir | Ion/electron dynamics |
| **Electrochemistry** | Butler-Volmer, Nernst-Planck | Copper plating |
| **Control** | Bellman, MDP, RL algorithms | Recipe optimization |
| **Statistics** | GPR, PLS, PCA | Virtual metrology |
| **Multiscale** | MD, kMC, Continuum | Integrated simulation |

## 15. Physical Constants

| Constant | Symbol | Value | Units |
|----------|--------|-------|-------|
| Boltzmann constant | $k_B$ | $1.38 \times 10^{-23}$ | J/K |
| Gas constant | $R$ | $8.314$ | J/(mol·K) |
| Faraday constant | $F$ | $96{,}485$ | C/mol |
| Elementary charge | $e$ | $1.60 \times 10^{-19}$ | C |
| Vacuum permittivity | $\varepsilon_0$ | $8.85 \times 10^{-12}$ | F/m |
| Avogadro's number | $N_A$ | $6.02 \times 10^{23}$ | mol⁻¹ |
| Electron mass | $m_e$ | $9.11 \times 10^{-31}$ | kg |
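
As a quick numerical check of the transport-regime classification in Section 2.3, the sketch below evaluates $Kn$ from the formula given there and maps it to a regime name. The function names, the molecular diameter, and the example operating conditions are illustrative assumptions, not values taken from the text.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant, J/K


def knudsen_number(temp_k, pressure_pa, diameter_m, length_m):
    """Kn = k_B T / (sqrt(2) * pi * d^2 * p * L), as in Section 2.3."""
    mean_free_path = K_B * temp_k / (
        math.sqrt(2) * math.pi * diameter_m**2 * pressure_pa
    )
    return mean_free_path / length_m


def transport_regime(kn):
    """Map a Knudsen number to the regime names listed in Section 2.3."""
    if kn < 0.01:
        return "continuum"
    if kn < 0.1:
        return "slip flow"
    if kn < 10:
        return "transitional"
    return "free molecular"


# Hypothetical conditions: a 0.37 nm molecule at 500 K.
kn_reactor = knudsen_number(500.0, 1000.0, 3.7e-10, 0.1)  # 10 cm chamber, 1000 Pa
kn_trench = knudsen_number(500.0, 10.0, 3.7e-10, 1e-7)    # 100 nm trench, 10 Pa

print(transport_regime(kn_reactor), transport_regime(kn_trench))
```

Under these assumed conditions the same gas sits in the continuum regime at reactor scale and in the free molecular regime inside a narrow trench, which is why the reactor-scale and feature-scale models in Section 10 use different transport descriptions.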

metapath, graph neural networks

Metapaths are composite relations that connect nodes through a sequence of edge types in a heterogeneous graph; they are used for similarity search and embedding.

metapath2vec, graph neural networks

Heterogeneous graph embeddings.

metapath2vec, graph neural networks

Metapath2vec learns embeddings in heterogeneous graphs through metapath-guided random walks.
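
A minimal sketch of a metapath-guided random walk, assuming a toy author/paper graph stored as adjacency lists. The function name, graph, and node types are hypothetical; in metapath2vec the resulting walks would then be fed to a skip-gram model, which is not shown here.

```python
import random


def metapath_walk(graph, node_types, start, metapath, walk_length, rng):
    """Random walk whose next node must match the next type in the metapath.

    `metapath` is a cyclic type pattern such as ["author", "paper", "author"];
    its first and last types coincide so the pattern can repeat.
    """
    assert node_types[start] == metapath[0]
    walk = [start]
    step = 0
    while len(walk) < walk_length:
        # The last type equals the first, so cycle over len(metapath) - 1 positions.
        next_type = metapath[(step % (len(metapath) - 1)) + 1]
        candidates = [n for n in graph[walk[-1]] if node_types[n] == next_type]
        if not candidates:
            break  # dead end: no neighbor of the required type
        walk.append(rng.choice(candidates))
        step += 1
    return walk


# Toy heterogeneous graph (hypothetical data).
graph = {
    "a1": ["p1", "p2"], "a2": ["p1"], "a3": ["p2"],
    "p1": ["a1", "a2"], "p2": ["a1", "a3"],
}
node_types = {"a1": "author", "a2": "author", "a3": "author",
              "p1": "paper", "p2": "paper"}

walk = metapath_walk(graph, node_types, "a1",
                     ["author", "paper", "author"], 7, random.Random(0))
print(walk)
```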

metaqnn, neural architecture search

Meta Q-Network (MetaQNN) applies Q-learning to neural architecture search, representing an architecture as a sequence of states built through discrete layer-selection actions.

method name prediction, code ai

Suggest method names from implementation.

metrology, scatterometry, ellipsometry, x-ray reflectometry, inverse problems, optimization, statistical inference, mathematical modeling

# Semiconductor Manufacturing Process Metrology: Mathematical Modeling

## 1. The Core Problem Structure

Semiconductor metrology faces a fundamental **inverse problem**: we make indirect measurements (optical spectra, scattered X-rays, electron signals) and must infer physical quantities (dimensions, compositions, defect states) that we cannot directly observe at the nanoscale.

### 1.1 Mathematical Formulation

The general measurement model:

$$ \mathbf{y} = \mathcal{F}(\mathbf{p}) + \boldsymbol{\epsilon} $$

**Variable Definitions:**

- $\mathbf{y}$: measured signal vector (spectrum, image intensity, scattered amplitude)
- $\mathbf{p}$: physical parameters of interest (CD, thickness, sidewall angle, composition)
- $\mathcal{F}$: forward model operator (physics of measurement process)
- $\boldsymbol{\epsilon}$: noise/uncertainty term

### 1.2 Key Mathematical Challenges

- **Nonlinearity:** $\mathcal{F}$ is typically highly nonlinear
- **Computational cost:** Forward model evaluation is expensive
- **Ill-posedness:** Inverse may be non-unique or unstable
- **High dimensionality:** Many parameters from limited measurements

## 2. Optical Critical Dimension (OCD) / Scatterometry

This is the most mathematically intensive metrology technique in high-volume manufacturing.

### 2.1 Forward Problem: Electromagnetic Scattering

For periodic structures (gratings, arrays), solve Maxwell's equations with Floquet-Bloch boundary conditions.

#### 2.1.1 Maxwell's Equations

$$ \nabla \times \mathbf{E} = -\frac{\partial \mathbf{B}}{\partial t} $$

$$ \nabla \times \mathbf{H} = \mathbf{J} + \frac{\partial \mathbf{D}}{\partial t} $$

#### 2.1.2 Rigorous Coupled Wave Analysis (RCWA)

**Field Expansion in Fourier Series:**

The electric field in layer $j$ with grating vector $\mathbf{K}$:

$$ \mathbf{E}(\mathbf{r}) = \sum_{n=-N}^{N} \mathbf{E}_n^{(j)} \exp\left(i(\mathbf{k}_n \cdot \mathbf{r})\right) $$

where the diffraction wave vectors are:

$$ \mathbf{k}_n = \mathbf{k}_0 + n\mathbf{K} $$

**Key Properties:**

- Converts PDEs to eigenvalue problem
- Matches boundary conditions at layer interfaces
- Computational complexity: $O(N^3)$ where $N$ = number of Fourier orders

### 2.2 Inverse Problem: Parameter Extraction

Given measured spectra $R(\lambda, \theta)$, find best-fit parameters $\mathbf{p}$.

#### 2.2.1 Optimization Formulation

$$ \hat{\mathbf{p}} = \arg\min_{\mathbf{p}} \left\| \mathbf{y}_{\text{meas}} - \mathcal{F}(\mathbf{p}) \right\|^2 + \lambda R(\mathbf{p}) $$

**Regularization Options:**

- **Tikhonov regularization:** $$ R(\mathbf{p}) = \left\| \mathbf{p} - \mathbf{p}_0 \right\|^2 $$
- **Sparsity-promoting (L1):** $$ R(\mathbf{p}) = \left\| \mathbf{p} \right\|_1 $$
- **Total variation:** $$ R(\mathbf{p}) = \int |\nabla \mathbf{p}| \, d\mathbf{x} $$

#### 2.2.2 Library-Based Approach

1. **Precomputation:** Generate forward model on dense parameter grid
2. **Storage:** Build library with millions of entries
3. **Search:** Find best match using regression methods

**Regression Methods:**

- Polynomial regression: fast but limited accuracy
- Neural networks: handle nonlinearity well
- Gaussian process regression: provides uncertainty estimates

### 2.3 Parameter Correlations and Uncertainty

#### 2.3.1 Fisher Information Matrix

$$ [\mathbf{I}(\mathbf{p})]_{ij} = \mathbb{E}\left[\frac{\partial \ln L}{\partial p_i}\frac{\partial \ln L}{\partial p_j}\right] $$

#### 2.3.2 Cramér-Rao Lower Bound

$$ \text{Var}(\hat{p}_i) \geq \left[\mathbf{I}^{-1}\right]_{ii} $$

**Physical Interpretation:** Strong correlations (e.g., height vs. sidewall angle) manifest as near-singular information matrices, a fundamental limit on independent resolution.

## 3. Thin Film Metrology: Ellipsometry

### 3.1 Physical Model

Ellipsometry measures polarization state change upon reflection:

$$ \rho = \frac{r_p}{r_s} = \tan(\Psi)\exp(i\Delta) $$

**Variables:**

- $r_p$: p-polarized reflection coefficient
- $r_s$: s-polarized reflection coefficient
- $\Psi$: amplitude ratio angle
- $\Delta$: phase difference

### 3.2 Transfer Matrix Formalism

For multilayer stacks:

$$ \mathbf{M} = \prod_{j=1}^{N} \mathbf{M}_j = \prod_{j=1}^{N} \begin{pmatrix} \cos\delta_j & \dfrac{i\sin\delta_j}{\eta_j} \\[10pt] i\eta_j\sin\delta_j & \cos\delta_j \end{pmatrix} $$

where the phase thickness is:

$$ \delta_j = \frac{2\pi}{\lambda} n_j d_j \cos(\theta_j) $$

**Parameters:**

- $n_j$: refractive index of layer $j$
- $d_j$: thickness of layer $j$
- $\theta_j$: angle of propagation in layer $j$
- $\eta_j$: optical admittance

### 3.3 Dispersion Models

#### 3.3.1 Cauchy Model (Transparent Materials)

$$ n(\lambda) = A + \frac{B}{\lambda^2} + \frac{C}{\lambda^4} $$

#### 3.3.2 Sellmeier Equation

$$ n^2(\lambda) = 1 + \sum_{i} \frac{B_i \lambda^2}{\lambda^2 - C_i} $$

#### 3.3.3 Tauc-Lorentz Model (Amorphous Semiconductors)

$$ \varepsilon_2(E) = \begin{cases} \dfrac{A E_0 C (E - E_g)^2}{(E^2 - E_0^2)^2 + C^2 E^2} \cdot \dfrac{1}{E} & E > E_g \\[10pt] 0 & E \leq E_g \end{cases} $$

with $\varepsilon_1$ derived via Kramers-Kronig relations:

$$ \varepsilon_1(E) = \varepsilon_{1\infty} + \frac{2}{\pi} \mathcal{P} \int_0^\infty \frac{\xi \varepsilon_2(\xi)}{\xi^2 - E^2} d\xi $$

#### 3.3.4 Drude Model (Metals/Conductors)

$$ \varepsilon(\omega) = \varepsilon_\infty - \frac{\omega_p^2}{\omega^2 + i\gamma\omega} $$

**Parameters:**

- $\omega_p$: plasma frequency
- $\gamma$: damping coefficient
- $\varepsilon_\infty$: high-frequency dielectric constant

## 4. X-ray Metrology Mathematics

### 4.1 X-ray Reflectivity (XRR)

#### 4.1.1 Parratt Recursion Formula

For specular reflection at grazing incidence:

$$ R_j = \frac{r_{j,j+1} + R_{j+1}\exp(2ik_{z,j+1}d_{j+1})}{1 + r_{j,j+1}R_{j+1}\exp(2ik_{z,j+1}d_{j+1})} $$

where $r_{j,j+1}$ is the Fresnel coefficient at interface $j$.

#### 4.1.2 Roughness Correction (Névot-Croce Factor)

$$ r'_{j,j+1} = r_{j,j+1} \exp\left(-2k_{z,j}k_{z,j+1}\sigma_j^2\right) $$

**Parameters:**

- $k_{z,j}$: perpendicular wave vector component in layer $j$
- $\sigma_j$: RMS roughness at interface $j$

### 4.2 CD-SAXS (Critical Dimension Small Angle X-ray Scattering)

#### 4.2.1 Scattering Intensity

For transmission scattering from 3D nanostructures:

$$ I(\mathbf{q}) = \left|\tilde{\rho}(\mathbf{q})\right|^2 = \left|\int \Delta\rho(\mathbf{r})\exp(-i\mathbf{q}\cdot\mathbf{r})d^3\mathbf{r}\right|^2 $$

#### 4.2.2 Form Factor for Simple Shapes

**Rectangular parallelepiped:**

$$ F(\mathbf{q}) = V \cdot \text{sinc}\left(\frac{q_x a}{2}\right) \cdot \text{sinc}\left(\frac{q_y b}{2}\right) \cdot \text{sinc}\left(\frac{q_z c}{2}\right) $$

**Cylinder:**

$$ F(\mathbf{q}) = 2\pi R^2 L \cdot \frac{J_1(q_\perp R)}{q_\perp R} \cdot \text{sinc}\left(\frac{q_z L}{2}\right) $$

where $J_1$ is the first-order Bessel function.

## 5. Statistical Process Control Mathematics

### 5.1 Virtual Metrology

Predict wafer properties from tool sensor data without direct measurement:

$$ y = f(\mathbf{x}) + \varepsilon $$

#### 5.1.1 Partial Least Squares (PLS)

Handles high-dimensional, correlated inputs:

1. Find latent variables: $\mathbf{T} = \mathbf{X}\mathbf{W}$
2. Maximize covariance with $y$
3. Model: $y = \mathbf{T}\mathbf{Q} + e$

**Optimization objective:**

$$ \max_{\mathbf{w}} \text{Cov}(\mathbf{X}\mathbf{w}, y)^2 \quad \text{subject to} \quad \|\mathbf{w}\| = 1 $$

#### 5.1.2 Gaussian Process Regression

$$ y(\mathbf{x}) \sim \mathcal{GP}\left(m(\mathbf{x}), k(\mathbf{x}, \mathbf{x}')\right) $$

**Common Kernel Functions:**

- **Squared Exponential (RBF):** $$ k(\mathbf{x}, \mathbf{x}') = \sigma_f^2 \exp\left(-\frac{\|\mathbf{x} - \mathbf{x}'\|^2}{2\ell^2}\right) $$
- **Matérn 5/2:** $$ k(r) = \sigma_f^2 \left(1 + \frac{\sqrt{5}r}{\ell} + \frac{5r^2}{3\ell^2}\right) \exp\left(-\frac{\sqrt{5}r}{\ell}\right) $$

### 5.2 Run-to-Run Control

#### 5.2.1 EWMA Controller

$$ \hat{d}_t = \lambda y_{t-1} + (1-\lambda)\hat{d}_{t-1} $$

$$ x_t = x_{\text{nom}} - \frac{\hat{d}_t}{\hat{\beta}} $$

**Parameters:**

- $\lambda$: smoothing factor (typically 0.2–0.4)
- $\hat{\beta}$: estimated process gain
- $x_{\text{nom}}$: nominal recipe setting

#### 5.2.2 Model Predictive Control (MPC)

$$ \min_{\mathbf{u}} \sum_{k=0}^{N} \left\| y_{t+k} - y_{\text{target}} \right\|_Q^2 + \left\| \Delta u_{t+k} \right\|_R^2 $$

subject to:

- Process dynamics: $\mathbf{x}_{t+1} = \mathbf{A}\mathbf{x}_t + \mathbf{B}\mathbf{u}_t$
- Output equation: $y_t = \mathbf{C}\mathbf{x}_t$
- Constraints: $\mathbf{u}_{\min} \leq \mathbf{u}_t \leq \mathbf{u}_{\max}$

### 5.3 Wafer-Level Spatial Modeling

#### 5.3.1 Zernike Polynomial Decomposition

$$ W(r,\theta) = \sum_{n=0}^{N} \sum_{m=-n}^{n} a_{nm} Z_n^m(r,\theta) $$

**First few Zernike polynomials:**

| Index | Name | Formula |
|-------|------|---------|
| $Z_0^0$ | Piston | $1$ |
| $Z_1^{-1}$ | Tilt Y | $2r\sin\theta$ |
| $Z_1^1$ | Tilt X | $2r\cos\theta$ |
| $Z_2^0$ | Defocus | $\sqrt{3}(2r^2-1)$ |
| $Z_2^{-2}$ | Astigmatism | $\sqrt{6}r^2\sin2\theta$ |
| $Z_2^2$ | Astigmatism | $\sqrt{6}r^2\cos2\theta$ |

#### 5.3.2 Gaussian Random Fields

For spatially correlated residuals:

$$ \text{Cov}\left(W(\mathbf{s}_1), W(\mathbf{s}_2)\right) = \sigma^2 \rho\left(\|\mathbf{s}_1 - \mathbf{s}_2\|; \phi\right) $$

**Common correlation functions:**

- **Exponential:** $$ \rho(h) = \exp\left(-\frac{h}{\phi}\right) $$
- **Gaussian:** $$ \rho(h) = \exp\left(-\frac{h^2}{\phi^2}\right) $$

## 6. Overlay Metrology Mathematics

### 6.1 Higher-Order Correction Models

Overlay error as polynomial expansion, with the linear terms written out explicitly and the sum covering only the higher-order terms:

$$ \delta x = T_x + M_x \cdot x + R_x \cdot y + \sum_{2 \leq i+j \leq n} c_{ij}^x x^i y^j $$

$$ \delta y = T_y + M_y \cdot y + R_y \cdot x + \sum_{2 \leq i+j \leq n} c_{ij}^y x^i y^j $$

**Physical interpretation of linear terms:**

- $T_x, T_y$: Translation
- $M_x, M_y$: Magnification
- $R_x, R_y$: Rotation

### 6.2 Sampling Strategy Optimization

#### 6.2.1 D-Optimal Design

$$ \mathbf{s}^* = \arg\max_{\mathbf{s}} \det\left(\mathbf{X}_s^T \mathbf{X}_s\right) $$

Minimizes the volume of the confidence ellipsoid for parameter estimates.

#### 6.2.2 Information-Theoretic Approach

Maximize expected information gain:

$$ I(\mathbf{s}) = H(\mathbf{p}) - \mathbb{E}_{\mathbf{y}}\left[H(\mathbf{p}|\mathbf{y})\right] $$

## 7. Machine Learning Integration

### 7.1 Physics-Informed Neural Networks (PINNs)

Combine data fitting with physical constraints:

$$ \mathcal{L} = \mathcal{L}_{\text{data}} + \lambda \mathcal{L}_{\text{physics}} $$

**Components:**

- **Data loss:** $$ \mathcal{L}_{\text{data}} = \frac{1}{N} \sum_{i=1}^{N} \left\| y_i - f_\theta(\mathbf{x}_i) \right\|^2 $$
- **Physics loss (example: Maxwell residual):** $$ \mathcal{L}_{\text{physics}} = \frac{1}{M} \sum_{j=1}^{M} \left\| \nabla \times \mathbf{E}_\theta - i\omega\mu\mathbf{H}_\theta \right\|^2 $$

### 7.2 Neural Network Surrogates

**Architecture for forward model approximation:**

- **Input:** Geometric parameters $\mathbf{p} \in \mathbb{R}^d$
- **Hidden layers:** Multiple fully-connected layers with ReLU/GELU activation
- **Output:** Simulated spectrum $\mathbf{y} \in \mathbb{R}^m$

**Speedup:** $10^4$ – $10^6\times$ over rigorous simulation

### 7.3 Deep Learning for Defect Detection

**Methods:**

- **CNNs:** Classification and localization
- **Autoencoders:** Anomaly detection via reconstruction error: $$ \text{Score}(\mathbf{x}) = \left\| \mathbf{x} - D(E(\mathbf{x})) \right\|^2 $$
- **Instance segmentation:** Precise defect boundary delineation

## 8. Uncertainty Quantification

### 8.1 GUM Framework (Guide to the Expression of Uncertainty in Measurement)

Combined standard uncertainty:

$$ u_c^2(y) = \sum_{i} \left(\frac{\partial f}{\partial x_i}\right)^2 u^2(x_i) + 2\sum_{i

micro search space, neural architecture search

Micro search spaces focus on small components, such as the operations within a cell, enabling efficient architecture optimization.

micro-batch, distributed training

A slice of a mini-batch processed in a single forward/backward pass, as used in pipeline parallelism and gradient accumulation.

micro-ct, failure analysis advanced

Micro-computed tomography creates 3D reconstructions of package internals with micron-scale resolution.

micronet challenge, edge ai

Competition for efficient models.

middle man, code ai

Code smell: a class that delegates nearly all of its work to another class.

midjourney, multimodal ai

Midjourney generates artistic images from text prompts using proprietary diffusion-based models.

milk run, supply chain & logistics

Milk run logistics uses regular routes that collect materials from multiple suppliers, reducing transportation costs.

millisecond anneal,diffusion

Ultra-fast anneal using lasers or flash lamps.

min tokens, llm optimization

The min-tokens setting forces generation to continue until a minimum length is reached.

min-p sampling, llm optimization

Min-p sampling keeps only tokens whose probability is at least a fraction p of the top token's probability.
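
A minimal sketch of the min-p rule in plain Python: tokens whose probability falls below `min_p` times the top token's probability are dropped and the remainder is renormalized before sampling. The function names and example logits are illustrative assumptions, not any particular library's API.

```python
import math
import random


def min_p_filter(logits, min_p):
    """Return renormalized probabilities after min-p truncation."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]  # stable softmax
    total = sum(exps)
    probs = [e / total for e in exps]
    threshold = min_p * max(probs)            # cutoff scales with the top token
    kept = [p if p >= threshold else 0.0 for p in probs]
    kept_total = sum(kept)
    return [p / kept_total for p in kept]


def sample_min_p(logits, min_p, rng):
    """Draw one token index from the min-p filtered distribution."""
    probs = min_p_filter(logits, min_p)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


logits = [2.0, 1.0, 0.0, -3.0]
filtered = min_p_filter(logits, min_p=0.2)
token = sample_min_p(logits, 0.2, random.Random(0))
print(filtered, token)
```

Because the cutoff is relative to the top token, the filter keeps few tokens when the model is confident and more when the distribution is flat.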

mincut pool, graph neural networks

MinCut pooling learns cluster assignments by minimizing a normalized-cut objective, creating coarsened graphs with balanced communities.

mini-batch online learning,machine learning

Update with small batches of streaming data.

minigpt-4,multimodal ai

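Vision-language model that aligns a frozen visual encoder with the Vicuna LLM through a projection layer, aiming at GPT-4-like multimodal abilities.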

mip-nerf, multimodal ai

Mip-NeRF anti-aliases NeRF by integrating over conical frustums rather than points.

mirostat, llm optimization

Mirostat adaptively adjusts sampling truncation (top-k in the original formulation) through feedback to keep output perplexity near a target.

mish, neural architecture

Smooth activation x*tanh(softplus(x)).
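
The formula can be evaluated directly; the sketch below is a scalar, stdlib-only version with a numerically stable softplus. The function names are illustrative, and a real model would use a framework's vectorized implementation.

```python
import math


def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def mish(x):
    """Mish activation: x * tanh(softplus(x))."""
    return x * math.tanh(softplus(x))


for value in (-1.0, 0.0, 1.0, 5.0):
    print(value, mish(value))
```

Unlike ReLU, the function is smooth everywhere and lets small negative values through, while approaching the identity for large positive inputs.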

missing modality handling, multimodal ai

Handle incomplete multimodal data.

mistral,foundation model

Efficient open-source language model with sliding window attention.

mixed integer linear programming verification, milp, ai safety

Encode network as MILP for verification.

mixed model production, manufacturing operations

Mixed model production manufactures multiple products on the same line, enabling variety without dedicated resources.

mixed precision training,model training

Use lower precision (FP16) for some operations to speed up and save memory.

mixed-precision training, model optimization

Mixed-precision training uses different numeric precisions for different operations, balancing speed and accuracy.

mixmatch, advanced training

MixMatch unifies consistency regularization, entropy minimization, and MixUp for semi-supervised learning with unlabeled data.

mixtral,foundation model

Mixture of Experts version of Mistral.

mixture of depths (mod),mixture of depths,mod,llm architecture

Dynamic computation allocation across layers based on input complexity.

mixture of depths advanced, llm architecture

Dynamically allocate computation across transformer layers based on token importance.

mixture of depths, llm architecture

Mixture of depths dynamically allocates computation across layers per token.

mixture of experts (moe),mixture of experts,moe,model architecture

Route each input to a few specialized expert networks instead of all parameters.
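
A minimal sketch of top-k expert routing, assuming a linear gate and toy experts written as plain Python functions. All names, weights, and experts here are hypothetical; production MoE layers add batching, load balancing, and learned parameters.

```python
import math


def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def moe_forward(x, gate_weights, experts, top_k=2):
    """Route input vector x to the top_k experts picked by a linear gate."""
    # One gate logit per expert: dot product of x with that expert's gate row.
    logits = [sum(w * xi for w, xi in zip(row, x)) for row in gate_weights]
    chosen = sorted(range(len(experts)), key=lambda i: logits[i], reverse=True)[:top_k]
    # Renormalize the gate over the selected experts only.
    weights = softmax([logits[i] for i in chosen])
    output = [0.0] * len(x)
    for w, i in zip(weights, chosen):
        y = experts[i](x)  # only the selected experts are evaluated
        output = [o + w * yi for o, yi in zip(output, y)]
    return output, chosen


# Toy experts and gate (hypothetical).
experts = [
    lambda v: [2.0 * a for a in v],
    lambda v: [a + 1.0 for a in v],
    lambda v: [-a for a in v],
    lambda v: [0.0 for _ in v],
]
gate_weights = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]]

out, chosen = moe_forward([3.0, 1.0], gate_weights, experts, top_k=2)
print(chosen, out)
```

With four experts and `top_k=2`, half of the expert parameters are never touched for this input, which is the source of the compute savings.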

mixup text, advanced training

MixUp for text combines embeddings of two examples and interpolates their labels for training with continuous semantic augmentation.
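
A minimal sketch of the interpolation step, assuming sentence embeddings and one-hot labels are already available as lists of floats. The function name and toy vectors are hypothetical; drawing the mixing weight from a Beta(α, α) distribution follows the original MixUp recipe.

```python
import random


def mixup(emb_a, label_a, emb_b, label_b, lam):
    """Convex combination of two embeddings and of their label vectors."""
    mixed_emb = [lam * a + (1.0 - lam) * b for a, b in zip(emb_a, emb_b)]
    mixed_label = [lam * a + (1.0 - lam) * b for a, b in zip(label_a, label_b)]
    return mixed_emb, mixed_label


# Toy 2-d embeddings with one-hot labels (hypothetical data).
emb_pos, label_pos = [1.0, 0.0], [1.0, 0.0]
emb_neg, label_neg = [0.0, 1.0], [0.0, 1.0]

# Fixed mixing weight for a reproducible illustration.
mixed_emb, mixed_label = mixup(emb_pos, label_pos, emb_neg, label_neg, lam=0.25)

# In training, lam is usually sampled per pair, e.g. from Beta(0.4, 0.4).
lam_sampled = random.Random(0).betavariate(0.4, 0.4)
print(mixed_emb, mixed_label, lam_sampled)
```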

mlc llm,universal,compile

MLC LLM provides universal LLM deployment. Compile to any device.

mlops,model registry,rollback

MLOps flows cover model versioning, registries, canary deploys, rollback, and monitoring for drift.

mnasnet, neural architecture search

MnasNet performs mobile neural architecture search, optimizing accuracy and latency on target devices using reinforcement learning.

mobilenet, model optimization

MobileNet architecture uses depthwise separable convolutions for efficient mobile deployment.

mobilenetv2, model optimization

MobileNetV2 adds inverted residuals and linear bottlenecks, improving efficiency and accuracy.

mobilenetv3, model optimization

MobileNetV3 uses NAS-discovered architectures with squeeze-excitation and h-swish activation.

mobility modeling, simulation

Simulate carrier mobility.

mock generation, code ai

Generate mock objects for testing.

modality dropout, multimodal ai

Randomly drop modalities during training.

modality hallucination, multimodal ai

Generate missing modalities.

mode interpolation, model merging

Interpolate between model weights found at different optima.

model access control,security

Restrict who can use or modify models.