embeddings in diffusion, generative models
Learned concept representations.
132 technical terms and definitions
AI agents that interact with the physical world.
Urgent unplanned repairs.
Capabilities appearing at scale.
Capabilities that appear in large models but not small ones (e.g., reasoning through multi-step problems).
# Semiconductor Manufacturing Process: Emerging Mathematical Frontiers

## 1. Computational Lithography and Inverse Problems

### 1.1 Inverse Lithography Technology (ILT)

The fundamental problem: Given a desired wafer pattern $I_{\text{target}}(x,y)$, find the optimal mask pattern $M(x',y')$.

**Core Mathematical Formulation:**

$$ \min_{M} \mathcal{L}(M) = \int \left| I(x,y; M) - I_{\text{target}}(x,y) \right|^2 \, dx \, dy + \lambda \mathcal{R}(M) $$

Where:

- $I(x,y; M)$ = Aerial image intensity on wafer
- $I_{\text{target}}(x,y)$ = Desired pattern intensity
- $\mathcal{R}(M)$ = Regularization term (mask manufacturability)
- $\lambda$ = Regularization parameter

**Key Challenges:**

- **Dimensionality:** Full-chip optimization involves $N \sim 10^9$ to $10^{12}$ variables
- **Non-convexity:** The forward model $I(x,y; M)$ is highly nonlinear
- **Ill-posedness:** Multiple masks can produce similar images

**Hopkins Imaging Model:**

$$ I(x,y) = \sum_{k} \left| \int \int H_k(f_x, f_y) \cdot \tilde{M}(f_x, f_y) \cdot e^{2\pi i (f_x x + f_y y)} \, df_x \, df_y \right|^2 $$

Where:

- $H_k(f_x, f_y)$ = Transmission cross-coefficient (TCC) eigenfunctions
- $\tilde{M}(f_x, f_y)$ = Fourier transform of mask transmission

### 1.2 Source-Mask Optimization (SMO)

**Bilinear Optimization Problem:**

$$ \min_{S, M} \mathcal{L}(S, M) = \| I(S, M) - I_{\text{target}} \|^2 + \alpha \mathcal{R}_S(S) + \beta \mathcal{R}_M(M) $$

Where:

- $S$ = Source intensity distribution (illumination pupil)
- $M$ = Mask transmission function
- $\mathcal{R}_S$, $\mathcal{R}_M$ = Source and mask regularizers

**Alternating Minimization Approach:**

1. Fix $S^{(k)}$, solve: $M^{(k+1)} = \arg\min_M \mathcal{L}(S^{(k)}, M)$
2. Fix $M^{(k+1)}$, solve: $S^{(k+1)} = \arg\min_S \mathcal{L}(S, M^{(k+1)})$
3. Repeat until convergence

### 1.3 Stochastic Lithography Effects

At EUV wavelengths ($\lambda = 13.5$ nm), photon shot noise becomes critical.
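A rough photon count shows why shot noise matters at EUV. The sketch below is illustrative only: the dose of 30 mJ/cm² and the 1 nm × 1 nm pixel are hypothetical values not taken from this text, the function names are made up for the example, and resist absorption efficiency is ignored. It estimates the mean number of incident photons per pixel as dose × area / photon energy and the corresponding relative Poisson fluctuation $1/\sqrt{N}$.

```python
import math

PLANCK = 6.626e-34      # J*s
LIGHT_SPEED = 2.998e8   # m/s
EV = 1.602e-19          # J per eV


def photon_energy_ev(wavelength_nm):
    """Photon energy in eV for a given vacuum wavelength in nm."""
    return PLANCK * LIGHT_SPEED / (wavelength_nm * 1e-9) / EV


def mean_photons(dose_mj_cm2, pixel_nm, wavelength_nm=13.5):
    """Mean incident photons on a square pixel (hypothetical helper)."""
    area_cm2 = (pixel_nm * 1e-7) ** 2              # 1 nm = 1e-7 cm
    energy_j = dose_mj_cm2 * 1e-3 * area_cm2       # mJ/cm^2 -> J
    return energy_j / (photon_energy_ev(wavelength_nm) * EV)


def relative_shot_noise(n_mean):
    """Relative standard deviation of a Poisson count with mean n_mean."""
    return 1.0 / math.sqrt(n_mean)


if __name__ == "__main__":
    for wavelength in (13.5, 193.0):  # EUV vs. ArF, for comparison
        n = mean_photons(30.0, 1.0, wavelength)
        print(f"{wavelength:6.1f} nm: {photon_energy_ev(wavelength):6.1f} eV/photon, "
              f"N = {n:7.1f}, sigma/N = {relative_shot_noise(n):.1%}")
```

Under these assumed values the EUV pixel receives only about 20 photons, so the relative fluctuation is above 20%, while the same dose at 193 nm delivers more than ten times as many photons.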
**Photon Statistics:**

$$ N_{\text{photons}} \sim \text{Poisson}\left( \frac{E \cdot A}{h\nu} \right) $$

Where:

- $E$ = Exposure dose (mJ/cm²)
- $A$ = Pixel area
- $h\nu$ = Photon energy ($\approx 92$ eV for EUV)

**Line Edge Roughness (LER) Model:**

$$ \text{LER} = \sqrt{\sigma_{\text{shot}}^2 + \sigma_{\text{resist}}^2 + \sigma_{\text{acid}}^2} $$

**Stochastic Resist Development (Stochastic PDE):**

$$ \frac{\partial h}{\partial t} = -R(M, I, \xi) + \eta(x, y, t) $$

Where:

- $h(x,y,t)$ = Resist height
- $R$ = Development rate (depends on local deprotection $M$, inhibitor $I$)
- $\eta$ = Spatiotemporal noise term
- $\xi$ = Quenched disorder from shot noise

## 2. Physics-Informed Machine Learning

### 2.1 Physics-Informed Neural Networks (PINNs)

**Standard PINN Loss Function:**

$$ \mathcal{L}_{\text{PINN}} = \mathcal{L}_{\text{data}} + \lambda_{\text{PDE}} \mathcal{L}_{\text{PDE}} + \lambda_{\text{BC}} \mathcal{L}_{\text{BC}} $$

Where:

- $\mathcal{L}_{\text{data}} = \frac{1}{N_d} \sum_{i=1}^{N_d} |u_\theta(x_i) - u_i^{\text{obs}}|^2$
- $\mathcal{L}_{\text{PDE}} = \frac{1}{N_r} \sum_{j=1}^{N_r} |\mathcal{N}[u_\theta](x_j)|^2$
- $\mathcal{L}_{\text{BC}} = \frac{1}{N_b} \sum_{k=1}^{N_b} |\mathcal{B}[u_\theta](x_k) - g_k|^2$

**Key Mathematical Questions:**

- **Approximation Theory:** What function classes can $u_\theta$ represent under PDE constraints?
- **Generalization Bounds:** How does enforcing physics improve out-of-distribution performance?
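The three-term PINN loss can be sketched on a toy problem. Everything specific here is an assumption made for illustration: the boundary-value problem $u'' + \pi^2 \sin(\pi x) = 0$ on $[0,1]$ with $u(0)=u(1)=0$, the one-parameter trial function $u_\theta(x) = \theta \sin(\pi x)$ standing in for a neural network, and the analytic second derivative used in place of automatic differentiation. The point is only to show how the data, PDE-residual, and boundary terms combine.

```python
import math


def u_theta(theta, x):
    """One-parameter trial solution (a stand-in for a neural network)."""
    return theta * math.sin(math.pi * x)


def pde_residual(theta, x):
    """Residual N[u] = u'' + pi^2 sin(pi x); u'' is analytic for this trial form."""
    u_xx = -theta * math.pi ** 2 * math.sin(math.pi * x)
    return u_xx + math.pi ** 2 * math.sin(math.pi * x)


def pinn_loss(theta, data, residual_pts, boundary, lam_pde=1.0, lam_bc=1.0):
    """L_data + lam_pde * L_PDE + lam_bc * L_BC, each a mean of squared errors."""
    l_data = sum((u_theta(theta, x) - u) ** 2 for x, u in data) / len(data)
    l_pde = sum(pde_residual(theta, x) ** 2 for x in residual_pts) / len(residual_pts)
    l_bc = sum((u_theta(theta, x) - g) ** 2 for x, g in boundary) / len(boundary)
    return l_data + lam_pde * l_pde + lam_bc * l_bc


# Observations sampled from the exact solution u(x) = sin(pi x).
data = [(0.25, math.sin(math.pi * 0.25)), (0.5, 1.0)]
residual_pts = [i / 10 for i in range(1, 10)]   # interior collocation points
boundary = [(0.0, 0.0), (1.0, 0.0)]             # Dirichlet conditions (x, g)

if __name__ == "__main__":
    for theta in (0.5, 0.9, 1.0):
        print(theta, pinn_loss(theta, data, residual_pts, boundary))
```

The loss vanishes (to rounding) at $\theta = 1$, where the trial function coincides with the exact solution, and grows as $\theta$ moves away from it. In a real PINN the scalar $\theta$ would be the network weights and the minimization would use a gradient-based optimizer.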
### 2.2 Neural Operators

**Fourier Neural Operator (FNO):**

$$ v_{l+1}(x) = \sigma \left( W_l v_l(x) + \mathcal{F}^{-1}\left( R_l \cdot \mathcal{F}(v_l) \right)(x) \right) $$

Where:

- $\mathcal{F}$, $\mathcal{F}^{-1}$ = Fourier and inverse Fourier transforms
- $R_l$ = Learnable spectral weights
- $W_l$ = Local linear transformation
- $\sigma$ = Activation function

**DeepONet Architecture:**

$$ G_\theta(u)(y) = \sum_{k=1}^{p} b_k(u; \theta_b) \cdot t_k(y; \theta_t) $$

Where:

- $b_k$ = Branch network outputs (encode input function $u$)
- $t_k$ = Trunk network outputs (encode query location $y$)

### 2.3 Hybrid Physics-ML Architectures

**Residual Learning Framework:**

$$ u_{\text{full}}(x) = u_{\text{physics}}(x) + u_{\text{NN}}(x; \theta) $$

Where the neural network learns the "correction" to the physics model:

$$ u_{\text{NN}} \approx u_{\text{true}} - u_{\text{physics}} $$

**Constraint: Physics Consistency**

$$ \| \mathcal{N}[u_{\text{full}}] \|_2 \leq \epsilon $$

## 3. High-Dimensional Uncertainty Quantification

### 3.1 Polynomial Chaos Expansions (PCE)

**Generalized PCE Representation:**

$$ u(\mathbf{x}, \boldsymbol{\xi}) = \sum_{\boldsymbol{\alpha} \in \mathcal{A}} c_{\boldsymbol{\alpha}}(\mathbf{x}) \Psi_{\boldsymbol{\alpha}}(\boldsymbol{\xi}) $$

Where:

- $\boldsymbol{\xi} = (\xi_1, \ldots, \xi_d)$ = Random variables (process variations)
- $\Psi_{\boldsymbol{\alpha}}$ = Multivariate orthogonal polynomials
- $\boldsymbol{\alpha} = (\alpha_1, \ldots, \alpha_d)$ = Multi-index
- $\mathcal{A}$ = Index set (truncated)

**Orthogonality Condition:**

$$ \mathbb{E}[\Psi_{\boldsymbol{\alpha}} \Psi_{\boldsymbol{\beta}}] = \int \Psi_{\boldsymbol{\alpha}}(\boldsymbol{\xi}) \Psi_{\boldsymbol{\beta}}(\boldsymbol{\xi}) \rho(\boldsymbol{\xi}) \, d\boldsymbol{\xi} = \delta_{\boldsymbol{\alpha}\boldsymbol{\beta}} $$

**Curse of Dimensionality:**

- Total-degree truncation (order $p$): $|\mathcal{A}| = \binom{d + p}{p} \sim \frac{d^p}{p!}$ for $d \gg p$
- Full tensor product (order $p$ per dimension): $|\mathcal{A}| = (p+1)^d$
- Sparse grids: $\mathcal{O}(N \cdot (\log N)^{d-1})$ points, where $N$ is the number of points per dimension

### 3.2 Rare Event Simulation

**Importance Sampling:**

$$ P(Y > \gamma) = \mathbb{E}_P[\mathbf{1}_{Y > \gamma}] = \mathbb{E}_Q\left[ \mathbf{1}_{Y > \gamma} \cdot \frac{dP}{dQ} \right] $$

**Optimal Tilting Measure:**

$$ Q^*(\xi) \propto \mathbf{1}_{Y(\xi) > \gamma} \cdot P(\xi) $$

**Large Deviation Principle:**

$$ \lim_{n \to \infty} \frac{1}{n} \log P(S_n / n \in A) = -\inf_{x \in A} I(x) $$

Where $I(x)$ is the rate function (Legendre transform of cumulant generating function).

### 3.3 Distributionally Robust Optimization

**Wasserstein Ambiguity Set:**

$$ \mathcal{P} = \left\{ Q : W_p(Q, \hat{P}_n) \leq \epsilon \right\} $$

**DRO Formulation:**

$$ \min_{x} \sup_{Q \in \mathcal{P}} \mathbb{E}_Q[f(x, \xi)] $$

**Tractable Reformulation (for linear $f$):**

$$ \min_{x} \left\{ \frac{1}{n} \sum_{i=1}^{n} f(x, \hat{\xi}_i) + \epsilon \cdot \| \nabla_\xi f \|_* \right\} $$

## 4. Multiscale Mathematics

### 4.1 Scale Hierarchy in Semiconductor Manufacturing

| Scale | Size Range | Phenomena | Mathematical Tools |
|-------|------------|-----------|---------------------|
| Atomic | 0.1 - 1 nm | Dopant atoms, ALD | DFT, MD, KMC |
| Mesoscale | 1 - 10 nm | LER, grain structure | Phase field, SDE |
| Feature | 10 - 100 nm | Transistors, vias | Continuum PDEs |
| Die | 1 - 10 mm | Pattern loading | Effective medium |
| Wafer | 300 mm | Uniformity | Process models |

### 4.2 Homogenization Theory

**Two-Scale Expansion:**

$$ u^\epsilon(x) = u_0(x, x/\epsilon) + \epsilon u_1(x, x/\epsilon) + \epsilon^2 u_2(x, x/\epsilon) + \ldots $$

Where $y = x/\epsilon$ is the fast variable.
**Cell Problem:**

$$ -\nabla_y \cdot \left( A(y) \left( \nabla_y \chi^j + \mathbf{e}_j \right) \right) = 0 \quad \text{in } Y $$

**Effective (Homogenized) Coefficient:**

$$ A^*_{ij} = \frac{1}{|Y|} \int_Y A(y) \left( \mathbf{e}_i + \nabla_y \chi^i \right) \cdot \left( \mathbf{e}_j + \nabla_y \chi^j \right) \, dy $$

### 4.3 Phase Field Methods

**Allen-Cahn Equation (Interface Evolution):**

$$ \frac{\partial \phi}{\partial t} = -M \frac{\delta \mathcal{F}}{\delta \phi} = M \left( \epsilon^2 \nabla^2 \phi - f'(\phi) \right) $$

**Cahn-Hilliard Equation (Conserved Order Parameter):**

$$ \frac{\partial c}{\partial t} = \nabla \cdot \left( M \nabla \frac{\delta \mathcal{F}}{\delta c} \right) $$

**Free Energy Functional:**

$$ \mathcal{F}[\phi] = \int \left( \frac{\epsilon^2}{2} |\nabla \phi|^2 + f(\phi) \right) dV $$

Where $f(\phi) = \frac{1}{4}(\phi^2 - 1)^2$ (double-well potential).

### 4.4 Kinetic Monte Carlo (KMC)

**Master Equation:**

$$ \frac{dP(\sigma, t)}{dt} = \sum_{\sigma'} \left[ W(\sigma' \to \sigma) P(\sigma', t) - W(\sigma \to \sigma') P(\sigma, t) \right] $$

**Transition Rates (Arrhenius Form):**

$$ W_i = \nu_0 \exp\left( -\frac{E_a^{(i)}}{k_B T} \right) $$

**BKL Algorithm:**

1. Calculate total rate: $R_{\text{tot}} = \sum_i W_i$
2. Select event $i$ with probability: $p_i = W_i / R_{\text{tot}}$
3. Advance time: $\Delta t = -\frac{\ln(r)}{R_{\text{tot}}}$, where $r \sim U(0,1)$

## 5. Optimization at Unprecedented Scale

### 5.1 Bayesian Optimization

**Gaussian Process Prior:**

$$ f(\mathbf{x}) \sim \mathcal{GP}\left( m(\mathbf{x}), k(\mathbf{x}, \mathbf{x}') \right) $$

**Posterior Mean and Variance (for a zero prior mean, $m \equiv 0$):**

$$ \mu_n(\mathbf{x}) = \mathbf{k}_n(\mathbf{x})^T \mathbf{K}_n^{-1} \mathbf{y}_n $$

$$ \sigma_n^2(\mathbf{x}) = k(\mathbf{x}, \mathbf{x}) - \mathbf{k}_n(\mathbf{x})^T \mathbf{K}_n^{-1} \mathbf{k}_n(\mathbf{x}) $$

**Expected Improvement (EI), written for maximization of $f$:**

$$ \text{EI}(\mathbf{x}) = \mathbb{E}\left[ \max(0, f(\mathbf{x}) - f_{\text{best}}) \right] $$

$$ = \sigma_n(\mathbf{x}) \left[ z \Phi(z) + \phi(z) \right], \quad z = \frac{\mu_n(\mathbf{x}) - f_{\text{best}}}{\sigma_n(\mathbf{x})} $$

### 5.2 High-Dimensional Extensions

**Random Embeddings:**

$$ f(\mathbf{x}) \approx g(\mathbf{A}\mathbf{x}), \quad \mathbf{A} \in \mathbb{R}^{d_e \times D}, \quad d_e \ll D $$

**Additive Structure:**

$$ f(\mathbf{x}) = \sum_{j=1}^{J} f_j(\mathbf{x}_{S_j}) $$

Where $S_j \subset \{1, \ldots, D\}$ are (possibly overlapping) subsets.

**Trust Region Bayesian Optimization (TuRBO):**

- Maintain local GP models within trust regions
- Expand/contract regions based on success/failure
- Multiple trust regions for multimodal landscapes

### 5.3 Multi-Objective Optimization

**Pareto Optimality (minimization):** $\mathbf{x}^*$ is Pareto optimal if $\nexists \mathbf{x}$ such that:

$$ f_i(\mathbf{x}) \leq f_i(\mathbf{x}^*) \; \forall i \quad \text{and} \quad f_j(\mathbf{x}) < f_j(\mathbf{x}^*) \; \text{for some } j $$

**Expected Hypervolume Improvement (EHVI):**

$$ \text{EHVI}(\mathbf{x}) = \mathbb{E}\left[ \text{HV}(\mathcal{P} \cup \{f(\mathbf{x})\}) - \text{HV}(\mathcal{P}) \right] $$

Where $\mathcal{P}$ is the current Pareto front and HV is the hypervolume indicator.

## 6. Topological and Geometric Methods

### 6.1 Persistent Homology

**Simplicial Complex Filtration:**

$$ \emptyset = K_0 \subseteq K_1 \subseteq K_2 \subseteq \cdots \subseteq K_n = K $$

**Persistence Pairs:** For each topological feature (connected component, loop, void):

- **Birth time:** $b_i$ = scale at which feature appears
- **Death time:** $d_i$ = scale at which feature disappears
- **Persistence:** $\text{pers}_i = d_i - b_i$

**Persistence Diagram:**

$$ \text{Dgm}(K) = \{(b_i, d_i)\}_{i=1}^{N} \subset \mathbb{R}^2 $$

**Stability Theorem:**

$$ d_B(\text{Dgm}(f), \text{Dgm}(f')) \leq \| f - f' \|_\infty $$

Where $f$ and $f'$ are the functions whose sublevel sets define the two filtrations and $d_B$ is the bottleneck distance.

### 6.2 Optimal Transport

**Monge Problem:**

$$ \min_{T: T_\# \mu = \nu} \int c(x, T(x)) \, d\mu(x) $$

**Kantorovich (Relaxed) Formulation:**

$$ W_p(\mu, \nu) = \left( \inf_{\gamma \in \Gamma(\mu, \nu)} \int |x - y|^p \, d\gamma(x, y) \right)^{1/p} $$

**Applications in Semiconductor:**

- Comparing wafer defect maps
- Loss functions for lithography optimization
- Generative models for realistic defect distributions

### 6.3 Curvature-Driven Flows

**Mean Curvature Flow:**

$$ \frac{\partial \Gamma}{\partial t} = \kappa \mathbf{n} $$

Where $\kappa$ is the mean curvature and $\mathbf{n}$ is the unit normal, with signs chosen so that $\kappa \mathbf{n}$ points toward the center of curvature (convex shapes shrink).

**Level Set Formulation:**

$$ \frac{\partial \phi}{\partial t} + v_n |\nabla \phi| = 0 $$

Here $v_n$ is the speed along $\nabla \phi / |\nabla \phi|$. For mean curvature flow, $v_n = -\nabla \cdot \left( \frac{\nabla \phi}{|\nabla \phi|} \right)$, which gives $\partial_t \phi = |\nabla \phi| \, \nabla \cdot \left( \frac{\nabla \phi}{|\nabla \phi|} \right)$.

**Surface Diffusion (4th Order):**

$$ \frac{\partial \Gamma}{\partial t} = -\Delta_s \kappa \cdot \mathbf{n} $$

Where $\Delta_s$ is the surface Laplacian.

## 7. Control Theory and Real-Time Optimization

### 7.1 Run-to-Run Control

**State-Space Model:**

$$ \mathbf{x}_{k+1} = \mathbf{A} \mathbf{x}_k + \mathbf{B} \mathbf{u}_k + \mathbf{w}_k $$

$$ \mathbf{y}_k = \mathbf{C} \mathbf{x}_k + \mathbf{v}_k $$

**EWMA (Exponentially Weighted Moving Average) Controller:**

$$ \hat{y}_{k+1} = \lambda y_k + (1 - \lambda) \hat{y}_k $$

$$ u_{k+1} = u_k + \frac{T - \hat{y}_{k+1}}{\beta} $$

Where:

- $T$ = Target value
- $\lambda$ = EWMA weight ($0 < \lambda \leq 1$)
- $\beta$ = Process gain

### 7.2 Model Predictive Control (MPC)

**Optimization Problem at Each Step:**

$$ \min_{\mathbf{u}_{0:N-1}} \sum_{k=0}^{N-1} \left[ \| \mathbf{x}_k - \mathbf{x}_{\text{ref}} \|_Q^2 + \| \mathbf{u}_k \|_R^2 \right] + \| \mathbf{x}_N - \mathbf{x}_{\text{ref}} \|_P^2 $$

Subject to:

$$ \mathbf{x}_{k+1} = f(\mathbf{x}_k, \mathbf{u}_k) $$

$$ \mathbf{x}_k \in \mathcal{X}, \quad \mathbf{u}_k \in \mathcal{U} $$

**Robust MPC (Tube-Based):**

$$ \mathbf{x}_k = \bar{\mathbf{x}}_k + \mathbf{e}_k, \quad \mathbf{e}_k \in \mathcal{E} $$

Where $\bar{\mathbf{x}}_k$ is the nominal trajectory and $\mathcal{E}$ is the robust positively invariant set.

### 7.3 Kalman Filter

**Prediction Step:**

$$ \hat{\mathbf{x}}_{k|k-1} = \mathbf{A} \hat{\mathbf{x}}_{k-1|k-1} + \mathbf{B} \mathbf{u}_{k-1} $$

$$ \mathbf{P}_{k|k-1} = \mathbf{A} \mathbf{P}_{k-1|k-1} \mathbf{A}^T + \mathbf{Q} $$

**Update Step:**

$$ \mathbf{K}_k = \mathbf{P}_{k|k-1} \mathbf{C}^T \left( \mathbf{C} \mathbf{P}_{k|k-1} \mathbf{C}^T + \mathbf{R} \right)^{-1} $$

$$ \hat{\mathbf{x}}_{k|k} = \hat{\mathbf{x}}_{k|k-1} + \mathbf{K}_k \left( \mathbf{y}_k - \mathbf{C} \hat{\mathbf{x}}_{k|k-1} \right) $$

$$ \mathbf{P}_{k|k} = \left( \mathbf{I} - \mathbf{K}_k \mathbf{C} \right) \mathbf{P}_{k|k-1} $$

## 8. Metrology Inverse Problems

### 8.1 Scatterometry (Optical CD)

**Forward Problem (RCWA):**

$$ \frac{\partial}{\partial z} \begin{pmatrix} \mathbf{E}_\perp \\ \mathbf{H}_\perp \end{pmatrix} = \mathbf{M}(z) \begin{pmatrix} \mathbf{E}_\perp \\ \mathbf{H}_\perp \end{pmatrix} $$

**Inverse Problem:**

$$ \min_{\mathbf{p}} \| \mathbf{S}(\mathbf{p}) - \mathbf{S}_{\text{meas}} \|^2 + \lambda \mathcal{R}(\mathbf{p}) $$

Where:

- $\mathbf{p}$ = Geometric parameters (CD, height, sidewall angle)
- $\mathbf{S}$ = Mueller matrix elements
- $\mathcal{R}$ = Regularizer (e.g., Tikhonov, total variation)

### 8.2 Phase Retrieval

**Measurement Model:**

$$ I_m = |a_m^H x|^2, \quad m = 1, \ldots, M $$

Where $a_m$ is the $m$-th measurement vector.

**Wirtinger Flow:**

$$ x^{(k+1)} = x^{(k)} - \frac{\mu_k}{M} \sum_{m=1}^{M} \left( |a_m^H x^{(k)}|^2 - I_m \right) a_m a_m^H x^{(k)} $$

**Uniqueness Conditions:** For $x \in \mathbb{C}^n$, $M \geq 4n - 4$ generic measurements suffice for uniqueness (up to global phase).

### 8.3 Information-Theoretic Limits

**Cramér-Rao Lower Bound:**

$$ \text{Var}(\hat{\theta}_i) \geq \left[ \mathbf{I}(\boldsymbol{\theta})^{-1} \right]_{ii} $$

**Fisher Information Matrix:**

$$ [\mathbf{I}(\boldsymbol{\theta})]_{ij} = -\mathbb{E}\left[ \frac{\partial^2 \log p(y | \boldsymbol{\theta})}{\partial \theta_i \partial \theta_j} \right] $$

**Optimal Experimental Design:**

$$ \max_{\xi} \Phi(\mathbf{I}(\boldsymbol{\theta}; \xi)) $$

Where $\xi$ = experimental design, $\Phi$ = optimality criterion (D-optimal: $\Phi = \det(\mathbf{I})$; A-optimal: $\Phi = -\text{tr}(\mathbf{I}^{-1})$, i.e., minimize $\text{tr}(\mathbf{I}^{-1})$).

## 9. Quantum-Classical Boundaries

### 9.1 Non-Equilibrium Green's Functions (NEGF)

**Dyson Equation:**

$$ G^R(E) = \left[ (E + i\eta)I - H - \Sigma^R(E) \right]^{-1} $$

**Current Calculation:**

$$ I = \frac{2e}{h} \int_{-\infty}^{\infty} T(E) \left[ f_L(E) - f_R(E) \right] dE $$

**Transmission Function:**

$$ T(E) = \text{Tr}\left[ \Gamma_L G^R \Gamma_R G^A \right] $$

Where $\Gamma_{L,R} = i(\Sigma_{L,R}^R - \Sigma_{L,R}^A)$.
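The NEGF transmission formula can be checked on the simplest possible device. The sketch below assumes a 1D nearest-neighbor tight-binding chain (on-site energy 0, hopping $-t$) attached to two identical semi-infinite leads, for which the retarded lead self-energy inside the band $|E| < 2t$ has the closed form $\Sigma^R = (E - i\sqrt{4t^2 - E^2})/2$. The model, the function names, and the barrier height are illustrative choices, not part of the text above; NumPy is assumed to be available.

```python
import numpy as np


def lead_self_energy(energy, t=1.0):
    """Retarded self-energy of a semi-infinite 1D chain, valid for |E| < 2t."""
    return (energy - 1j * np.sqrt(4.0 * t ** 2 - energy ** 2)) / 2.0


def transmission(energy, n_sites=5, t=1.0, onsite=None, eta=1e-12):
    """T(E) = Tr[Gamma_L G^R Gamma_R G^A] for a tight-binding chain."""
    h = np.zeros((n_sites, n_sites), dtype=complex)
    for i in range(n_sites - 1):
        h[i, i + 1] = h[i + 1, i] = -t
    if onsite is not None:
        h += np.diag(np.asarray(onsite, dtype=complex))

    # Leads couple only to the first and last device sites.
    sigma_l = np.zeros_like(h)
    sigma_r = np.zeros_like(h)
    sigma_l[0, 0] = lead_self_energy(energy, t)
    sigma_r[-1, -1] = lead_self_energy(energy, t)

    # Dyson equation: G^R = [(E + i*eta) I - H - Sigma^R]^{-1}
    g_r = np.linalg.inv((energy + 1j * eta) * np.eye(n_sites) - h - sigma_l - sigma_r)
    g_a = g_r.conj().T

    gamma_l = 1j * (sigma_l - sigma_l.conj().T)
    gamma_r = 1j * (sigma_r - sigma_r.conj().T)
    return float(np.trace(gamma_l @ g_r @ gamma_r @ g_a).real)


if __name__ == "__main__":
    print("clean chain:", transmission(0.3))
    print("with barrier:", transmission(0.3, onsite=[0, 0, 2.0, 0, 0]))
```

For the clean chain the device is indistinguishable from the leads, so the transmission is 1 at every in-band energy; adding an on-site barrier on the middle site reduces it below 1.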
### 9.2 Density Functional Theory (DFT)

**Kohn-Sham Equations:**

$$ \left[ -\frac{\hbar^2}{2m} \nabla^2 + V_{\text{eff}}(\mathbf{r}) \right] \psi_i(\mathbf{r}) = \epsilon_i \psi_i(\mathbf{r}) $$

**Effective Potential:**

$$ V_{\text{eff}}(\mathbf{r}) = V_{\text{ext}}(\mathbf{r}) + V_H(\mathbf{r}) + V_{xc}(\mathbf{r}) $$

Where:

- $V_{\text{ext}}$ = External (ionic) potential
- $V_H = \int \frac{n(\mathbf{r}')}{|\mathbf{r} - \mathbf{r}'|} d\mathbf{r}'$ = Hartree potential
- $V_{xc} = \frac{\delta E_{xc}[n]}{\delta n}$ = Exchange-correlation potential

### 9.3 Semiclassical Approximations

**WKB Approximation:**

$$ \psi(x) \approx \frac{C}{\sqrt{p(x)}} \exp\left( \pm \frac{i}{\hbar} \int^x p(x') \, dx' \right) $$

Where $p(x) = \sqrt{2m(E - V(x))}$.

**Validity Criterion:**

$$ \left| \frac{d\lambda}{dx} \right| \ll 1, \quad \text{where } \lambda = \frac{h}{p} $$

**Tunneling Probability (WKB):**

$$ T \approx \exp\left( -\frac{2}{\hbar} \int_{x_1}^{x_2} |p(x)| \, dx \right) $$

## 10. Graph and Combinatorial Methods

### 10.1 Design Rule Checking (DRC)

**Constraint Satisfaction Problem (CSP):**

$$ \forall (i,j) \in E: \; d(p_i, p_j) \geq d_{\min}(t_i, t_j) $$

Where:

- $p_i, p_j$ = Polygon features
- $d$ = Distance function (min spacing, enclosure, etc.)
- $t_i, t_j$ = Layer/feature types

**SAT/SMT Encoding:**

$$ \bigwedge_{r \in \text{Rules}} \bigwedge_{(i,j) \in \text{Violations}(r)} \neg(x_i \land x_j) $$

### 10.2 Graph Neural Networks for Layout

**Message Passing Framework:**

$$ \mathbf{h}_v^{(k+1)} = \text{UPDATE}^{(k)} \left( \mathbf{h}_v^{(k)}, \text{AGGREGATE}^{(k)} \left( \left\{ \mathbf{h}_u^{(k)} : u \in \mathcal{N}(v) \right\} \right) \right) $$

**Graph Attention:**

$$ \alpha_{vu} = \frac{\exp\left( \text{LeakyReLU}(\mathbf{a}^T [\mathbf{W}\mathbf{h}_v \| \mathbf{W}\mathbf{h}_u]) \right)}{\sum_{w \in \mathcal{N}(v)} \exp\left( \text{LeakyReLU}(\mathbf{a}^T [\mathbf{W}\mathbf{h}_v \| \mathbf{W}\mathbf{h}_w]) \right)} $$

$$ \mathbf{h}_v' = \sigma\left( \sum_{u \in \mathcal{N}(v)} \alpha_{vu} \mathbf{W} \mathbf{h}_u \right) $$

### 10.3 Hypergraph Partitioning

**Min-Cut Objective:**

$$ \min_{\pi: V \to \{1, \ldots, k\}} \sum_{e \in E} w_e \cdot \mathbf{1}[\text{cut}(e, \pi)] $$

Subject to balance constraints:

$$ \left| |\pi^{-1}(i)| - \frac{|V|}{k} \right| \leq \epsilon \frac{|V|}{k} $$

## Cross-Cutting Mathematical Themes

### Theme 1: Curse of Dimensionality

**Tensor Train Decomposition:**

$$ \mathcal{T}(i_1, \ldots, i_d) = G_1(i_1) \cdot G_2(i_2) \cdots G_d(i_d) $$

- Storage: $\mathcal{O}(dnr^2)$ vs. $\mathcal{O}(n^d)$
- Where $r$ = TT-rank

### Theme 2: Inverse Problems Framework

$$ \mathbf{y} = \mathcal{A}(\mathbf{x}) + \boldsymbol{\eta} $$

**Regularized Solution:**

$$ \hat{\mathbf{x}} = \arg\min_{\mathbf{x}} \| \mathbf{y} - \mathcal{A}(\mathbf{x}) \|^2 + \lambda \mathcal{R}(\mathbf{x}) $$

Common regularizers:

- Tikhonov: $\mathcal{R}(\mathbf{x}) = \|\mathbf{x}\|_2^2$
- Total Variation: $\mathcal{R}(\mathbf{x}) = \|\nabla \mathbf{x}\|_1$
- Sparsity: $\mathcal{R}(\mathbf{x}) = \|\mathbf{x}\|_1$

### Theme 3: Certification and Trust

**PAC-Bayes Bound:**

$$ \mathbb{E}_{h \sim Q}[L(h)] \leq \mathbb{E}_{h \sim Q}[\hat{L}(h)] + \sqrt{\frac{\text{KL}(Q \| P) + \ln(2\sqrt{n}/\delta)}{2n}} $$

**Conformal Prediction:**

$$ C(x_{\text{new}}) = \{y : s(x_{\text{new}}, y) \leq \hat{q}\} $$

Where $\hat{q}$ = $(1-\alpha)$-quantile of calibration scores.

## Key Notation Summary

| Symbol | Meaning |
|--------|---------|
| $M(x,y)$ | Mask transmission function |
| $I(x,y)$ | Aerial image intensity |
| $\mathcal{F}$ | Fourier transform |
| $\nabla$ | Gradient operator |
| $\nabla^2$, $\Delta$ | Laplacian |
| $\mathbb{E}[\cdot]$ | Expectation |
| $\mathcal{GP}(m, k)$ | Gaussian process with mean $m$, covariance $k$ |
| $\mathcal{N}(\mu, \sigma^2)$ | Normal distribution |
| $W_p(\mu, \nu)$ | $p$-Wasserstein distance |
| $\text{Tr}(\cdot)$ | Matrix trace |
| $\|\cdot\|_p$ | $L^p$ norm |
| $\delta_{ij}$ | Kronecker delta |
| $\mathbf{1}_{A}$ | Indicator function of set $A$ |
Detect light from active devices.
Efficient Neural Architecture Search reduces computational cost by sharing weights among child models through a single directed acyclic graph.
Encoder-based inversion directly maps images to latent codes through learned networks.
Use encoder to find latent.
Full Transformer with both encoder and decoder (e.g., T5 and BART, used for seq2seq tasks).
Architecture with only encoder blocks (e.g., BERT, used for classification and understanding tasks).
Wearout failures.
Energy efficiency in fabs focuses on reducing power consumption per wafer through equipment optimization, process improvements, and facility design.
Energy-aware NAS optimizes architectures for minimal power consumption during inference.
Energy-based models assign scalar energy values to input-output configurations and make predictions by minimizing energy functions.
Models that assign energy to configurations.
Improved decoding for masked positions.
Improve MD sampling efficiency.
Ensemble Kalman Filter uses Monte Carlo samples to represent state distributions in high-dimensional systems.
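A minimal sketch of one Ensemble Kalman Filter analysis step, assuming the stochastic (perturbed-observation) variant, a linear observation operator, and NumPy; the function name `enkf_update` and the scalar test problem are illustrative choices, not something the definition above prescribes. The ensemble's sample covariance replaces the analytic covariance of the standard Kalman filter.

```python
import numpy as np


def enkf_update(ensemble, obs, obs_matrix, obs_cov, rng):
    """One stochastic EnKF analysis step.

    ensemble:   (n_state, n_members) forecast ensemble
    obs:        (n_obs,) observation vector
    obs_matrix: (n_obs, n_state) linear observation operator H
    obs_cov:    (n_obs, n_obs) observation-error covariance R
    """
    n_members = ensemble.shape[1]
    anomalies = ensemble - ensemble.mean(axis=1, keepdims=True)
    p_f = anomalies @ anomalies.T / (n_members - 1)        # sample covariance

    innovation_cov = obs_matrix @ p_f @ obs_matrix.T + obs_cov
    gain = p_f @ obs_matrix.T @ np.linalg.inv(innovation_cov)

    # Each member assimilates its own noisy copy of the observation.
    noise = rng.multivariate_normal(np.zeros(len(obs)), obs_cov, size=n_members).T
    perturbed_obs = obs[:, None] + noise
    return ensemble + gain @ (perturbed_obs - obs_matrix @ ensemble)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    prior = rng.normal(0.0, 1.0, size=(1, 5000))           # prior ~ N(0, 1)
    posterior = enkf_update(prior, np.array([2.0]), np.eye(1), np.eye(1), rng)
    print(posterior.mean(), posterior.var())
```

With a unit-variance prior centered at 0 and a unit-variance observation of 2, the updated ensemble should have mean near 1 and variance near 0.5, matching the exact Kalman update up to sampling error.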
Combine multiple models for better predictions.
Ensembling combines multiple models; common techniques are bagging, boosting, and stacking.
Enthalpy wheels transfer both sensible and latent heat, recovering moisture and thermal energy.
Encourage diverse predictions.
Engineer enzymes for specific reactions.
# Semiconductor Manufacturing Process: Epitaxy (Epi) Modeling

## 1. Introduction to Epitaxy

Epitaxy is the controlled growth of a crystalline thin film on a crystalline substrate, where the deposited layer inherits the crystallographic orientation of the substrate.

### 1.1 Types of Epitaxy

- **Homoepitaxy**
  - Same material deposited on substrate
  - Example: Silicon (Si) on Silicon (Si)
  - Maintains perfect lattice matching
  - Used for creating high-purity device layers
- **Heteroepitaxy**
  - Different material deposited on substrate
  - Examples:
    - Gallium Arsenide (GaAs) on Silicon (Si)
    - Silicon Germanium (SiGe) on Silicon (Si)
    - Gallium Nitride (GaN) on Sapphire ($\text{Al}_2\text{O}_3$)
  - Introduces lattice mismatch and strain
  - Enables bandgap engineering

## 2. Epitaxy Methods

### 2.1 Chemical Vapor Deposition (CVD) / Vapor Phase Epitaxy (VPE)

- **Characteristics:**
  - Most common method for silicon epitaxy
  - Operates at atmospheric or reduced pressure
  - Temperature range: $900°\text{C} - 1200°\text{C}$
- **Common Precursors:**
  - Silane: $\text{SiH}_4$
  - Dichlorosilane: $\text{SiH}_2\text{Cl}_2$ (DCS)
  - Trichlorosilane: $\text{SiHCl}_3$ (TCS)
  - Silicon tetrachloride: $\text{SiCl}_4$
- **Key Reactions:**

  $$\text{SiH}_4 \xrightarrow{\Delta} \text{Si}_{(s)} + 2\text{H}_2$$

  $$\text{SiH}_2\text{Cl}_2 \xrightarrow{\Delta} \text{Si}_{(s)} + 2\text{HCl}$$

### 2.2 Molecular Beam Epitaxy (MBE)

- **Characteristics:**
  - Ultra-high vacuum environment ($< 10^{-10}$ Torr)
  - Extremely precise thickness control (monolayer accuracy)
  - Lower growth temperatures than CVD
  - Slower growth rates: $\sim 1 \, \mu\text{m/hour}$
- **Applications:**
  - III-V compound semiconductors
  - Quantum well structures
  - Superlattices
  - Research and development

### 2.3 Metal-Organic CVD (MOCVD)

- **Characteristics:**
  - Standard for compound semiconductors
  - Uses metal-organic precursors
  - Higher throughput than MBE
- **Common Precursors:**
  - Trimethylgallium: $\text{Ga(CH}_3\text{)}_3$ (TMGa)
  - Trimethylaluminum: $\text{Al(CH}_3\text{)}_3$ (TMAl)
  - Ammonia: $\text{NH}_3$

### 2.4 Atomic Layer Epitaxy (ALE)

- **Characteristics:**
  - Self-limiting surface reactions
  - Digital control of film thickness
  - Excellent conformality
  - Growth rate: $\sim 1$ Å per cycle

## 3. Physics of Epi Modeling

### 3.1 Gas-Phase Transport

The transport of precursor gases to the substrate surface involves multiple phenomena:

- **Governing Equations:**
  - **Continuity Equation:**

    $$\frac{\partial \rho}{\partial t} + \nabla \cdot (\rho \mathbf{v}) = 0$$

  - **Navier-Stokes Equation:**

    $$\rho \left( \frac{\partial \mathbf{v}}{\partial t} + \mathbf{v} \cdot \nabla \mathbf{v} \right) = -\nabla p + \mu \nabla^2 \mathbf{v} + \rho \mathbf{g}$$

  - **Species Transport Equation:**

    $$\frac{\partial C_i}{\partial t} + \mathbf{v} \cdot \nabla C_i = D_i \nabla^2 C_i + R_i$$

  Where:
  - $\rho$ = fluid density
  - $\mathbf{v}$ = velocity vector
  - $p$ = pressure
  - $\mu$ = dynamic viscosity
  - $C_i$ = concentration of species $i$
  - $D_i$ = diffusion coefficient of species $i$
  - $R_i$ = reaction rate term
- **Boundary Layer:**
  - Stagnant gas layer above substrate
  - Thickness $\delta$ depends on flow conditions:

    $$\delta \propto \sqrt{\frac{\nu x}{u_\infty}}$$

  Where:
  - $\nu$ = kinematic viscosity
  - $x$ = distance from leading edge
  - $u_\infty$ = free stream velocity

### 3.2 Surface Kinetics

- **Adsorption Process:**
  - Physisorption (weak van der Waals forces)
  - Chemisorption (chemical bonding)
- **Langmuir Adsorption Isotherm:**

  $$\theta = \frac{K \cdot P}{1 + K \cdot P}$$

  Where:
  - $\theta$ = fractional surface coverage
  - $K$ = equilibrium constant
  - $P$ = partial pressure
- **Surface Diffusion:**

  $$D_s = D_0 \exp\left(-\frac{E_d}{k_B T}\right)$$

  Where:
  - $D_s$ = surface diffusion coefficient
  - $D_0$ = pre-exponential factor
  - $E_d$ = diffusion activation energy
  - $k_B$ = Boltzmann constant ($1.38 \times 10^{-23}$ J/K)
  - $T$ = absolute temperature

### 3.3 Crystal Growth Mechanisms

- **Step-Flow Growth (BCF Theory):**
  - Atoms attach at step edges
  - Steps advance across terraces
  - Dominant at high temperatures
- **2D Nucleation:**
  - New layers nucleate on terraces
  - Occurs when step density is low
  - Creates rougher surfaces
- **Terrace-Ledge-Kink (TLK) Model:**
  - Terrace: flat regions between steps
  - Ledge: step edges
  - Kink: incorporation sites at step edges

## 4. Mathematical Framework

### 4.1 Growth Rate Models

#### 4.1.1 Reaction-Limited Regime

At lower temperatures, surface reaction kinetics dominate:

$$G = \frac{k_s \cdot C_s}{N_s}$$

Dividing the surface reaction flux $k_s C_s$ by the atomic density $N_s$ converts it to a growth velocity, consistent with the expressions in Sections 4.1.2 and 4.1.3. The rate constant follows Arrhenius behavior:

$$k_s = k_0 \exp\left(-\frac{E_a}{k_B T}\right)$$

**Parameters:**

- $G$ = growth rate (nm/min or μm/hr)
- $k_s$ = surface reaction rate constant
- $C_s$ = surface concentration
- $k_0$ = pre-exponential factor
- $E_a$ = activation energy

#### 4.1.2 Mass-Transport Limited Regime

At higher temperatures, diffusion through the boundary layer limits growth:

$$G = \frac{h_g}{N_s} \cdot (C_g - C_s)$$

Where:

$$h_g = \frac{D}{\delta}$$

**Parameters:**

- $h_g$ = mass transfer coefficient
- $N_s$ = atomic density of solid ($\sim 5 \times 10^{22}$ atoms/cm³ for Si)
- $C_g$ = gas phase concentration
- $D$ = gas phase diffusivity
- $\delta$ = boundary layer thickness

#### 4.1.3 Combined Model (Grove Model)

For the general case combining both regimes:

$$G = \frac{h_g \cdot k_s}{N_s (h_g + k_s)} \cdot C_g$$

Or equivalently:

$$\frac{1}{G} = \frac{N_s}{k_s \cdot C_g} + \frac{N_s}{h_g \cdot C_g}$$

### 4.2 Strain in Heteroepitaxy

#### 4.2.1 Lattice Mismatch

$$f = \frac{a_s - a_f}{a_f}$$

Where:

- $f$ = lattice mismatch (dimensionless)
- $a_s$ = substrate lattice constant
- $a_f$ = film lattice constant (relaxed)

**Example Values (computed from the definition above):**

| System | $a_f$ (Å) | $a_s$ (Å) | Mismatch $f$ |
|--------|-----------|-----------|--------------|
| Si on Si | 5.431 | 5.431 | 0% |
| Ge on Si | 5.658 | 5.431 | -4.0% |
| GaAs on Si | 5.653 | 5.431 | -3.9% |
| InAs on GaAs | 6.058 | 5.653 | -6.7% |

#### 4.2.2 In-Plane Strain

For a coherently strained film:

$$\epsilon_{\parallel} = \frac{a_s - a_f}{a_f} = f$$

The out-of-plane strain (for cubic materials):

$$\epsilon_{\perp} = -\frac{2\nu}{1-\nu} \epsilon_{\parallel}$$

Where $\nu$ = Poisson's ratio

#### 4.2.3 Critical Thickness (Matthews-Blakeslee)

The critical thickness above which misfit dislocations form:

$$h_c = \frac{b}{8\pi |f| (1+\nu)} \left[ \ln\left(\frac{h_c}{b}\right) + 1 \right]$$

Where:

- $h_c$ = critical thickness
- $b$ = Burgers vector magnitude ($\approx \frac{a}{\sqrt{2}}$ for 60° dislocations)
- $f$ = lattice mismatch
- $\nu$ = Poisson's ratio

**Approximate Solution:** For small mismatch:

$$h_c \approx \frac{b}{8\pi |f|}$$

### 4.3 Dopant Incorporation

#### 4.3.1 Segregation Model

$$C_{film} = \frac{C_{gas}}{1 + k_{seg} \cdot (G/G_0)}$$

Where:

- $C_{film}$ = dopant concentration in film
- $C_{gas}$ = dopant concentration in gas phase
- $k_{seg}$ = segregation coefficient
- $G$ = growth rate
- $G_0$ = reference growth rate

#### 4.3.2 Dopant Profile with Segregation

The surface concentration evolves as:

$$C_s(t) = C_s^{eq} + (C_s(0) - C_s^{eq}) \exp\left(-\frac{G \cdot t}{\lambda}\right)$$

Where:

- $\lambda$ = segregation length
- $C_s^{eq}$ = equilibrium surface concentration

## 5. Modeling Approaches

### 5.1 Continuum Models

- **Scope:**
  - Reactor-scale simulations
  - Temperature and flow field prediction
  - Species concentration profiles
- **Methods:**
  - Computational Fluid Dynamics (CFD)
  - Finite Element Method (FEM)
  - Finite Volume Method (FVM)
- **Governing Physics:**
  - Coupled heat, mass, and momentum transfer
  - Homogeneous and heterogeneous reactions
  - Radiation heat transfer

### 5.2 Feature-Scale Models

- **Applications:**
  - Selective epitaxial growth (SEG)
  - Trench filling
  - Facet evolution
- **Key Phenomena:**
  - Local loading effects:

    $$G_{local} = G_0 \cdot \left(1 - \alpha \cdot \frac{A_{exposed}}{A_{total}}\right)$$

  - Orientation-dependent growth rates:

    $$\frac{G_{(110)}}{G_{(100)}} \approx 1.5 - 2.0$$

- **Methods:**
  - Level set methods
  - String methods
  - Cellular automata

### 5.3 Atomistic Models

#### 5.3.1 Kinetic Monte Carlo (KMC)

- **Process Events:**
  - Adsorption: rate $\propto P \cdot \exp(-E_{ads}/k_BT)$
  - Surface diffusion: rate $\propto \exp(-E_{diff}/k_BT)$
  - Desorption: rate $\propto \exp(-E_{des}/k_BT)$
  - Incorporation: rate $\propto \exp(-E_{inc}/k_BT)$
- **Master Equation:**

  $$\frac{dP_i}{dt} = \sum_j \left( W_{ji} P_j - W_{ij} P_i \right)$$

  Where:
  - $P_i$ = probability of state $i$
  - $W_{ij}$ = transition rate from state $i$ to $j$

#### 5.3.2 Molecular Dynamics (MD)

- **Newton's Equations:**

  $$m_i \frac{d^2 \mathbf{r}_i}{dt^2} = -\nabla_i U(\mathbf{r}_1, \mathbf{r}_2, ..., \mathbf{r}_N)$$

- **Interatomic Potentials:**
  - Tersoff potential (Si, C, Ge)
  - Stillinger-Weber potential (Si)
  - MEAM (metals and alloys)

#### 5.3.3 Ab Initio / DFT

- **Kohn-Sham Equations:**

  $$\left[ -\frac{\hbar^2}{2m} \nabla^2 + V_{eff}(\mathbf{r}) \right] \psi_i(\mathbf{r}) = \epsilon_i \psi_i(\mathbf{r})$$

- **Applications:**
  - Surface energies
  - Reaction barriers
  - Adsorption energies
  - Electronic structure

## 6. Specific Modeling Challenges

### 6.1 SiGe Epitaxy

- **Composition Control:**

  $$x_{Ge} = \frac{R_{Ge}}{R_{Si} + R_{Ge}}$$

  Where $R_{Si}$ and $R_{Ge}$ are partial growth rates
- **Strain Engineering:**
  - Compressive strain in SiGe on Si
  - Enhances hole mobility
  - Critical thickness depends on Ge content:

    $$h_c(x) \approx \frac{0.5}{0.042 \cdot x} \text{ nm}$$

    Here $0.042 \cdot x$ approximates the mismatch magnitude of Si$_{1-x}$Ge$_x$ measured relative to the Si lattice constant.

### 6.2 Selective Epitaxy

- **Growth Selectivity:**
  - Deposition only on exposed silicon
  - HCl addition for selectivity enhancement
- **Selectivity Condition:**

  $$\frac{\text{Growth on Si}}{\text{Growth on SiO}_2} > 100:1$$

- **Loading Effects:**
  - Pattern-dependent growth rate
  - Faceting at mask edges

### 6.3 III-V on Silicon

- **Major Challenges:**
  - Large lattice mismatch (4-8%)
  - Thermal expansion mismatch
  - Anti-phase domain boundaries (APDs)
  - High threading dislocation density
- **Mitigation Strategies:**
  - Aspect ratio trapping (ART)
  - Graded buffer layers
  - Selective area growth
  - Dislocation filtering

## 7. Applications and Tools

### 7.1 Industrial Applications

| Application | Material System | Key Parameters |
|-------------|-----------------|----------------|
| FinFET/GAA Source/Drain | Embedded SiGe, SiC | Strain, selectivity |
| SiGe HBT | SiGe:C | Profile abruptness |
| Power MOSFETs | SiC epitaxy | Defect density |
| LEDs/Lasers | GaN, InGaN | Composition uniformity |
| RF Devices | GaN on SiC | Buffer quality |

### 7.2 Simulation Software

- **Reactor-Scale CFD:**
  - ANSYS Fluent
  - COMSOL Multiphysics
  - OpenFOAM
- **TCAD Process Simulation:**
  - Synopsys Sentaurus Process
  - Silvaco Victory Process
  - Lumerical (for optoelectronics)
- **Atomistic Simulation:**
  - LAMMPS (MD)
  - VASP, Quantum ESPRESSO (DFT)
  - Custom KMC codes

### 7.3 Key Metrics for Process Development

- **Uniformity:**

  $$\text{Uniformity} = \frac{t_{max} - t_{min}}{2 \cdot t_{avg}} \times 100\%$$

- **Defect Density:**
  - Threading dislocations: target $< 10^6$ cm$^{-2}$
  - Stacking faults: target $< 10^3$ cm$^{-2}$
- **Profile Abruptness:**
  - Dopant transition width $< 3$ nm/decade

## 8. Emerging Directions

### 8.1 Machine Learning Integration

- **Applications:**
  - Surrogate models for process optimization
  - Real-time virtual metrology
  - Defect classification
  - Recipe optimization
- **Model Types:**
  - Neural networks for growth rate prediction
  - Gaussian process regression for uncertainty quantification
  - Reinforcement learning for process control

### 8.2 Multi-Scale Modeling

- **Hierarchical Approach:**

  ```
  Ab Initio (DFT)
      ↓ Reaction rates, energies
  Kinetic Monte Carlo
      ↓ Surface kinetics, morphology
  Feature-Scale Models
      ↓ Local growth behavior
  Reactor-Scale CFD
      ↓ Process conditions
  Device Simulation
  ```

### 8.3 Digital Twins

- **Components:**
  - Real-time sensor data integration
  - Physics-based + ML hybrid models
  - Predictive maintenance
  - Closed-loop process control

### 8.4 New Material Systems

- **2D Materials:**
  - Graphene via CVD
  - Transition metal dichalcogenides (TMDs)
  - Van der Waals epitaxy
- **Ultra-Wide Bandgap:**
  - $\beta$-Ga$_2$O$_3$ ($E_g \approx 4.8$ eV)
  - Diamond ($E_g \approx 5.5$ eV)
  - AlN ($E_g \approx 6.2$ eV)

## Common Constants and Conversions

| Constant | Symbol | Value |
|----------|--------|-------|
| Boltzmann constant | $k_B$ | $1.381 \times 10^{-23}$ J/K |
| Planck constant | $h$ | $6.626 \times 10^{-34}$ J·s |
| Avogadro number | $N_A$ | $6.022 \times 10^{23}$ mol$^{-1}$ |
| Si atomic density | $N_{Si}$ | $5.0 \times 10^{22}$ atoms/cm³ |
| Si lattice constant | $a_{Si}$ | 5.431 Å |
Train on sequences of few-shot tasks.
Episodic memory records specific events or experiences with temporal context.
Epistemic uncertainty arises from insufficient knowledge and is reducible with more data.
Uncertainty from lack of knowledge.
Epitaxial source-drain regions are grown selectively in recessed areas, controlling stress and reducing resistance.
# Epitaxy (Epi) Modeling: 1. Introduction to Epitaxy Epitaxy is the controlled growth of a crystalline thin film on a crystalline substrate, where the deposited layer inherits the crystallographic orientation of the substrate. 1.1 Types of Epitaxy • Homoepitaxy • Same material deposited on substrate • Example: Silicon (Si) on Silicon (Si) • Maintains perfect lattice matching • Used for creating high-purity device layers • Heteroepitaxy • Different material deposited on substrate • Examples: • Gallium Arsenide (GaAs) on Silicon (Si) • Silicon Germanium (SiGe) on Silicon (Si) • Gallium Nitride (GaN) on Sapphire ($\text{Al}_2\text{O}_3$) • Introduces lattice mismatch and strain • Enables bandgap engineering 2. Epitaxy Methods 2.1 Chemical Vapor Deposition (CVD) / Vapor Phase Epitaxy (VPE) • Characteristics: • Most common method for silicon epitaxy • Operates at atmospheric or reduced pressure • Temperature range: $900°\text{C} - 1200°\text{C}$ • Common Precursors: • Silane: $\text{SiH}_4$ • Dichlorosilane: $\text{SiH}_2\text{Cl}_2$ (DCS) • Trichlorosilane: $\text{SiHCl}_3$ (TCS) • Silicon tetrachloride: $\text{SiCl}_4$ • Key Reactions: $$\text{SiH}_4 \xrightarrow{\Delta} \text{Si}_{(s)} + 2\text{H}_2$$ $$\text{SiH}_2\text{Cl}_2 \xrightarrow{\Delta} \text{Si}_{(s)} + 2\text{HCl}$$ 2.2 Molecular Beam Epitaxy (MBE) • Characteristics: • Ultra-high vacuum environment ($< 10^{-10}$ Torr) • Extremely precise thickness control (monolayer accuracy) • Lower growth temperatures than CVD • Slower growth rates: $\sim 1 \, \mu\text{m/hour}$ • Applications: • III-V compound semiconductors • Quantum well structures • Superlattices • Research and development 2.3 Metal-Organic CVD (MOCVD) • Characteristics: • Standard for compound semiconductors • Uses metal-organic precursors • Higher throughput than MBE • Common Precursors: • Trimethylgallium: $\text{Ga(CH}_3\text{)}_3$ (TMGa) • Trimethylaluminum: $\text{Al(CH}_3\text{)}_3$ (TMAl) • Ammonia: $\text{NH}_3$ 2.4 Atomic Layer Epitaxy 
(ALE) • Characteristics: • Self-limiting surface reactions • Digital control of film thickness • Excellent conformality • Growth rate: $\sim 1$ Å per cycle 3. Physics of Epi Modeling 3.1 Gas-Phase Transport The transport of precursor gases to the substrate surface involves multiple phenomena: • Governing Equations: • Continuity Equation: $$\frac{\partial \rho}{\partial t} + \nabla \cdot (\rho \mathbf{v}) = 0$$ • Navier-Stokes Equation: $$\rho \left( \frac{\partial \mathbf{v}}{\partial t} + \mathbf{v} \cdot \nabla \mathbf{v} \right) = -\nabla p + \mu \nabla^2 \mathbf{v} + \rho \mathbf{g}$$ • Species Transport Equation: $$\frac{\partial C_i}{\partial t} + \mathbf{v} \cdot \nabla C_i = D_i \nabla^2 C_i + R_i$$ Where: • $\rho$ = fluid density • $\mathbf{v}$ = velocity vector • $p$ = pressure • $\mu$ = dynamic viscosity • $C_i$ = concentration of species $i$ • $D_i$ = diffusion coefficient of species $i$ • $R_i$ = reaction rate term • Boundary Layer: • Stagnant gas layer above substrate • Thickness $\delta$ depends on flow conditions: $$\delta \propto \sqrt{\frac{\nu x}{u_\infty}}$$ Where: • $\nu$ = kinematic viscosity • $x$ = distance from leading edge • $u_\infty$ = free stream velocity 3.2 Surface Kinetics • Adsorption Process: • Physisorption (weak van der Waals forces) • Chemisorption (chemical bonding) • Langmuir Adsorption Isotherm: $$\theta = \frac{K \cdot P}{1 + K \cdot P}$$ Where: - $\theta$ = fractional surface coverage - $K$ = equilibrium constant - $P$ = partial pressure • Surface Diffusion: $$D_s = D_0 \exp\left(-\frac{E_d}{k_B T}\right)$$ Where: - $D_s$ = surface diffusion coefficient - $D_0$ = pre-exponential factor - $E_d$ = diffusion activation energy - $k_B$ = Boltzmann constant ($1.38 \times 10^{-23}$ J/K) - $T$ = absolute temperature 3.3 Crystal Growth Mechanisms • Step-Flow Growth (BCF Theory): • Atoms attach at step edges • Steps advance across terraces • Dominant at high temperatures • 2D Nucleation: • New layers nucleate on terraces • Occurs 
when step density is low
  • Creates rougher surfaces
• Terrace-Ledge-Kink (TLK) Model:
  • Terrace: flat regions between steps
  • Ledge: step edges
  • Kink: incorporation sites at step edges

4. Mathematical Framework

4.1 Growth Rate Models

4.1.1 Reaction-Limited Regime

At lower temperatures, surface reaction kinetics dominate:

$$G = \frac{k_s \cdot C_s}{N_s}$$

Where the rate constant follows Arrhenius behavior:

$$k_s = k_0 \exp\left(-\frac{E_a}{k_B T}\right)$$

Parameters:
- $G$ = growth rate (nm/min or μm/hr)
- $k_s$ = surface reaction rate constant
- $C_s$ = surface concentration
- $N_s$ = atomic density of the solid
- $k_0$ = pre-exponential factor
- $E_a$ = activation energy

4.1.2 Mass-Transport Limited Regime

At higher temperatures, diffusion through the boundary layer limits growth:

$$G = \frac{h_g}{N_s} \cdot (C_g - C_s)$$

Where:

$$h_g = \frac{D}{\delta}$$

Parameters:
- $h_g$ = mass transfer coefficient
- $N_s$ = atomic density of solid ($\sim 5 \times 10^{22}$ atoms/cm³ for Si)
- $C_g$ = gas phase concentration
- $D$ = gas phase diffusivity
- $\delta$ = boundary layer thickness

4.1.3 Combined Model (Grove Model)

For the general case combining both regimes:

$$G = \frac{h_g \cdot k_s}{N_s (h_g + k_s)} \cdot C_g$$

Or equivalently:

$$\frac{1}{G} = \frac{N_s}{k_s \cdot C_g} + \frac{N_s}{h_g \cdot C_g}$$

4.2 Strain in Heteroepitaxy

4.2.1 Lattice Mismatch

$$f = \frac{a_s - a_f}{a_f}$$

Where:
- $f$ = lattice mismatch (dimensionless)
- $a_s$ = substrate lattice constant
- $a_f$ = film lattice constant (relaxed)

Example Values (computed from the definition above):

| System | $a_f$ (Å) | $a_s$ (Å) | Mismatch $f$ |
|--------|-----------|-----------|--------------|
| Si on Si | 5.431 | 5.431 | 0% |
| Ge on Si | 5.658 | 5.431 | -4.0% |
| GaAs on Si | 5.653 | 5.431 | -3.9% |
| InAs on GaAs | 6.058 | 5.653 | -6.7% |

4.2.2 In-Plane Strain

For a coherently strained film:

$$\epsilon_{\parallel} = \frac{a_s - a_f}{a_f} = f$$

The out-of-plane strain (for cubic materials, in the isotropic approximation):

$$\epsilon_{\perp} = -\frac{2\nu}{1-\nu} \epsilon_{\parallel}$$

Where $\nu$ = Poisson's ratio
4.2.3 Critical Thickness (Matthews-Blakeslee) The critical thickness above which misfit dislocations form: $$h_c = \frac{b}{8\pi f (1+\nu)} \left[ \ln\left(\frac{h_c}{b}\right) + 1 \right]$$ Where: - $h_c$ = critical thickness - $b$ = Burgers vector magnitude ($\approx \frac{a}{\sqrt{2}}$ for 60° dislocations) - $f$ = lattice mismatch - $\nu$ = Poisson's ratio Approximate Solution: For small mismatch: $$h_c \approx \frac{b}{8\pi |f|}$$ 4.3 Dopant Incorporation 4.3.1 Segregation Model $$C_{film} = \frac{C_{gas}}{1 + k_{seg} \cdot (G/G_0)}$$ Where: - $C_{film}$ = dopant concentration in film - $C_{gas}$ = dopant concentration in gas phase - $k_{seg}$ = segregation coefficient - $G$ = growth rate - $G_0$ = reference growth rate 4.3.2 Dopant Profile with Segregation The surface concentration evolves as: $$C_s(t) = C_s^{eq} + (C_s(0) - C_s^{eq}) \exp\left(-\frac{G \cdot t}{\lambda}\right)$$ Where: - $\lambda$ = segregation length - $C_s^{eq}$ = equilibrium surface concentration 5. Modeling Approaches 5.1 Continuum Models • Scope: • Reactor-scale simulations • Temperature and flow field prediction • Species concentration profiles • Methods: • Computational Fluid Dynamics (CFD) • Finite Element Method (FEM) • Finite Volume Method (FVM) • Governing Physics: • Coupled heat, mass, and momentum transfer • Homogeneous and heterogeneous reactions • Radiation heat transfer 5.2 Feature-Scale Models • Applications: • Selective epitaxial growth (SEG) • Trench filling • Facet evolution • Key Phenomena: • Local loading effects: $$G_{local} = G_0 \cdot \left(1 - \alpha \cdot \frac{A_{exposed}}{A_{total}}\right)$$ • Orientation-dependent growth rates: $$\frac{G_{(110)}}{G_{(100)}} \approx 1.5 - 2.0$$ • Methods: • Level set methods • String methods • Cellular automata 5.3 Atomistic Models 5.3.1 Kinetic Monte Carlo (KMC) • Process Events: • Adsorption: rate $\propto P \cdot \exp(-E_{ads}/k_BT)$ • Surface diffusion: rate $\propto \exp(-E_{diff}/k_BT)$ • Desorption: rate $\propto 
\exp(-E_{des}/k_BT)$ • Incorporation: rate $\propto \exp(-E_{inc}/k_BT)$ • Master Equation: $$\frac{dP_i}{dt} = \sum_j \left( W_{ji} P_j - W_{ij} P_i \right)$$ Where: - $P_i$ = probability of state $i$ - $W_{ij}$ = transition rate from state $i$ to $j$ 5.3.2 Molecular Dynamics (MD) • Newton's Equations: $$m_i \frac{d^2 \mathbf{r}_i}{dt^2} = -\nabla_i U(\mathbf{r}_1, \mathbf{r}_2, ..., \mathbf{r}_N)$$ • Interatomic Potentials: • Tersoff potential (Si, C, Ge) • Stillinger-Weber potential (Si) • MEAM (metals and alloys) 5.3.3 Ab Initio / DFT • Kohn-Sham Equations: $$\left[ -\frac{\hbar^2}{2m} \nabla^2 + V_{eff}(\mathbf{r}) \right] \psi_i(\mathbf{r}) = \epsilon_i \psi_i(\mathbf{r})$$ • Applications: • Surface energies • Reaction barriers • Adsorption energies • Electronic structure 6. Specific Modeling Challenges 6.1 SiGe Epitaxy • Composition Control: $$x_{Ge} = \frac{R_{Ge}}{R_{Si} + R_{Ge}}$$ Where $R_{Si}$ and $R_{Ge}$ are partial growth rates • Strain Engineering: • Compressive strain in SiGe on Si • Enhances hole mobility • Critical thickness depends on Ge content: $$h_c(x) \approx \frac{0.5}{0.042 \cdot x} \text{ nm}$$ 6.2 Selective Epitaxy • Growth Selectivity: • Deposition only on exposed silicon • HCl addition for selectivity enhancement • Selectivity Condition: $$\frac{\text{Growth on Si}}{\text{Growth on SiO}_2} > 100:1$$ • Loading Effects: • Pattern-dependent growth rate • Faceting at mask edges 6.3 III-V on Silicon • Major Challenges: • Large lattice mismatch (4-8%) • Thermal expansion mismatch • Anti-phase domain boundaries (APDs) • High threading dislocation density • Mitigation Strategies: • Aspect ratio trapping (ART) • Graded buffer layers • Selective area growth • Dislocation filtering 7. 
Applications and Tools 7.1 Industrial Applications | Application | Material System | Key Parameters | |-------------|-----------------|----------------| | FinFET/GAA Source/Drain | Embedded SiGe, SiC | Strain, selectivity | | SiGe HBT | SiGe:C | Profile abruptness | | Power MOSFETs | SiC epitaxy | Defect density | | LEDs/Lasers | GaN, InGaN | Composition uniformity | | RF Devices | GaN on SiC | Buffer quality | 7.2 Simulation Software • Reactor-Scale CFD: • ANSYS Fluent • COMSOL Multiphysics • OpenFOAM • TCAD Process Simulation: • Synopsys Sentaurus Process • Silvaco Victory Process • Lumerical (for optoelectronics) • Atomistic Simulation: • LAMMPS (MD) • VASP, Quantum ESPRESSO (DFT) • Custom KMC codes 7.3 Key Metrics for Process Development • Uniformity: $$\text{Uniformity} = \frac{t_{max} - t_{min}}{2 \cdot t_{avg}} \times 100\%$$ • Defect Density: • Threading dislocations: target $< 10^6$ cm$^{-2}$ • Stacking faults: target $< 10^3$ cm$^{-2}$ • Profile Abruptness: • Dopant transition width $< 3$ nm/decade 8. 
Emerging Directions

8.1 Machine Learning Integration
• Applications:
  • Surrogate models for process optimization
  • Real-time virtual metrology
  • Defect classification
  • Recipe optimization
• Model Types:
  • Neural networks for growth rate prediction
  • Gaussian process regression for uncertainty quantification
  • Reinforcement learning for process control

8.2 Multi-Scale Modeling
• Hierarchical Approach:

```text
┌─────────────────────────────────────────────┐
│ Ab Initio (DFT)                             │
│ ↓ Reaction rates, energies                  │
├─────────────────────────────────────────────┤
│ Kinetic Monte Carlo                         │
│ ↓ Surface kinetics, morphology              │
├─────────────────────────────────────────────┤
│ Feature-Scale Models                        │
│ ↓ Local growth behavior                     │
├─────────────────────────────────────────────┤
│ Reactor-Scale CFD                           │
│ ↓ Process conditions                        │
├─────────────────────────────────────────────┤
│ Device Simulation                           │
└─────────────────────────────────────────────┘
```

8.3 Digital Twins
• Components:
  • Real-time sensor data integration
  • Physics-based + ML hybrid models
  • Predictive maintenance
  • Closed-loop process control

8.4 New Material Systems
• 2D Materials:
  • Graphene via CVD
  • Transition metal dichalcogenides (TMDs)
  • Van der Waals epitaxy
• Ultra-Wide Bandgap:
  • $\beta$-Ga$_2$O$_3$ ($E_g \approx 4.8$ eV)
  • Diamond ($E_g \approx 5.5$ eV)
  • AlN ($E_g \approx 6.2$ eV)

Constants and Conversions

| Constant | Symbol | Value |
|----------|--------|-------|
| Boltzmann constant | $k_B$ | $1.381 \times 10^{-23}$ J/K |
| Planck constant | $h$ | $6.626 \times 10^{-34}$ J·s |
| Avogadro number | $N_A$ | $6.022 \times 10^{23}$ mol$^{-1}$ |
| Si atomic density | $N_{Si}$ | $5.0 \times 10^{22}$ atoms/cm³ |
| Si lattice constant | $a_{Si}$ | 5.431 Å |
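As a rough illustration of the Grove growth-rate model (Section 4.1.3) and the implicit Matthews-Blakeslee equation (Section 4.2.3) in the epitaxy notes above, the sketch below evaluates both. The function names, the fixed-point solver, and all numerical inputs are assumptions made for illustration; they are not taken from a particular simulator.

```python
import math

def grove_growth_rate(h_g, k_s, c_g, n_s=5.0e22):
    """G = h_g * k_s / (N_s * (h_g + k_s)) * C_g (cm/s for cm-based inputs)."""
    return h_g * k_s / (n_s * (h_g + k_s)) * c_g

def critical_thickness(b, mismatch, nu, tol=1e-12, max_iter=500):
    """Solve h_c = A * (ln(h_c / b) + 1) with A = b / (8 pi |f| (1 + nu)).

    Uses plain fixed-point iteration (an illustrative choice of solver).
    Returns None when A <= b, where the implicit equation has no usable root.
    """
    a = b / (8.0 * math.pi * abs(mismatch) * (1.0 + nu))
    if a <= b:
        return None
    h = 100.0 * b  # start above the unstable lower root, which lies below b
    for _ in range(max_iter):
        h_next = a * (math.log(h / b) + 1.0)
        if abs(h_next - h) < tol:
            return h_next
        h = h_next
    return h

# Hypothetical inputs: b = 0.384 nm, 1% mismatch, Poisson's ratio 0.28.
h_c = critical_thickness(0.384, 0.01, 0.28)

# Limiting regimes of the Grove model (arbitrary illustrative numbers).
g_reaction_limited = grove_growth_rate(h_g=10.0, k_s=0.01, c_g=1.0e17)
g_transport_limited = grove_growth_rate(h_g=10.0, k_s=1.0e4, c_g=1.0e17)
```

With $k_s \ll h_g$ the result approaches $k_s C_g / N_s$, and with $k_s \gg h_g$ it approaches $h_g C_g / N_s$. For mismatches of a few percent the simplified critical-thickness equation stops having a root, which is why the solver returns `None` in that case.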
One complete pass through the entire training dataset.
Epsilon parameter bounds maximum privacy loss in differential privacy.
Epsilon sampling truncates distribution at absolute probability threshold.
Model has same error rates across groups.
Equipment energy efficiency standards specify maximum power consumption for process tools.
Breakdowns causing downtime.
Verify equivariance properties.
3D-aware molecular generation.
Outputs transform correctly under symmetries.
Enterprise Resource Planning integrates business processes including procurement, production, inventory, finance, and human resources.
Error detection identifies failures or suboptimal actions during execution.
Accumulate compression errors.
# Semiconductor Manufacturing Error Propagation Mathematics ## 1. Fundamental Error Propagation Theory For a function $f(x_1, x_2, \ldots, x_n)$ where each variable $x_i$ has uncertainty $\sigma_i$, the propagated uncertainty follows: $$ \sigma_f^2 = \sum_{i=1}^{n} \left( \frac{\partial f}{\partial x_i} \right)^2 \sigma_i^2 + 2 \sum_{i < j} \frac{\partial f}{\partial x_i} \frac{\partial f}{\partial x_j} \, \text{cov}(x_i, x_j) $$ For **uncorrelated errors**, this simplifies to the **Root-Sum-of-Squares (RSS)** formula: $$ \sigma_f = \sqrt{\sum_{i=1}^{n} \left( \frac{\partial f}{\partial x_i} \right)^2 \sigma_i^2} $$ ### Applications in Semiconductor Manufacturing - **Critical Dimension (CD) variations**: Feature size deviations from target - **Overlay errors**: Misalignment between lithography layers - **Film thickness variations**: Deposition uniformity issues - **Doping concentration variations**: Implant dose and energy fluctuations ## 2. Process Chain Error Accumulation Semiconductor manufacturing involves hundreds of sequential process steps. Errors propagate through the chain in different modes: ### 2.1 Additive Error Accumulation Used for overlay alignment between layers: $$ E_{\text{total}} = \sum_{i=1}^{n} \varepsilon_i $$ $$ \sigma_{\text{total}}^2 = \sum_{i=1}^{n} \sigma_i^2 \quad \text{(if uncorrelated)} $$ ### 2.2 Multiplicative Error Accumulation Used for etch selectivity, deposition rates, and gain factors: $$ G_{\text{total}} = \prod_{i=1}^{n} G_i $$ $$ \frac{\sigma_G}{G} \approx \sqrt{\sum_{i=1}^{n} \left( \frac{\sigma_{G_i}}{G_i} \right)^2} $$ ### 2.3 Error Accumulation Modes - **Additive**: Errors sum directly (overlay, thickness) - **Multiplicative**: Errors compound through products (gain, selectivity) - **Compensating**: Rare cases where errors cancel - **Nonlinear interactions**: Complex dependencies requiring simulation ## 3. 
Hierarchical Variance Decomposition Total variation decomposes across spatial and temporal hierarchies: $$ \sigma_{\text{total}}^2 = \sigma_{\text{lot}}^2 + \sigma_{\text{wafer}}^2 + \sigma_{\text{die}}^2 + \sigma_{\text{within-die}}^2 $$ ### Variance Sources by Level | Level | Sources | |-------|---------| | **Lot-to-lot** | Incoming material, chamber conditioning, recipe drift | | **Wafer-to-wafer** | Slot position, thermal gradients, handling | | **Die-to-die** | Across-wafer uniformity, lens field distortion | | **Within-die** | Pattern density, microloading, proximity effects | ### Variance Component Analysis For $N$ measurements $y_{ijk}$ (lot $i$, wafer $j$, site $k$): $$ y_{ijk} = \mu + L_i + W_{ij} + \varepsilon_{ijk} $$ Where: - $\mu$ = grand mean - $L_i \sim N(0, \sigma_L^2)$ = lot effect - $W_{ij} \sim N(0, \sigma_W^2)$ = wafer effect - $\varepsilon_{ijk} \sim N(0, \sigma_\varepsilon^2)$ = residual ## 4. Yield Mathematics ### 4.1 Poisson Defect Model (Random Defects) $$ Y = e^{-D_0 A} $$ Where: - $D_0$ = defect density (defects/cm²) - $A$ = die area (cm²) ### 4.2 Negative Binomial Model (Clustered Defects) More realistic for actual manufacturing: $$ Y = \left( 1 + \frac{D_0 A}{\alpha} \right)^{-\alpha} $$ Where: - $\alpha$ = clustering parameter - $\alpha \to \infty$ recovers Poisson model - Smaller $\alpha$ = more clustering ### 4.3 Total Yield $$ Y_{\text{total}} = Y_{\text{defect}} \times Y_{\text{parametric}} $$ ### 4.4 Parametric Yield Integration over the multi-dimensional acceptable parameter space: $$ Y_{\text{parametric}} = \int \int \cdots \int_{\text{spec}} f(p_1, p_2, \ldots, p_n) \, dp_1 \, dp_2 \cdots dp_n $$ For Gaussian parameters with specs at $\pm k\sigma$: $$ Y_{\text{parametric}} \approx \left[ \text{erf}\left( \frac{k}{\sqrt{2}} \right) \right]^n $$ ## 5. 
Edge Placement Error (EPE) Critical metric at advanced nodes combining multiple error sources: $$ EPE^2 = \left( \frac{\Delta CD}{2} \right)^2 + OVL^2 + \left( \frac{LER}{2} \right)^2 $$ ### EPE Components - $\Delta CD$ = Critical dimension error - $OVL$ = Overlay error - $LER$ = Line edge roughness ### Extended EPE Model Including additional terms: $$ EPE^2 = \left( \frac{\Delta CD}{2} \right)^2 + OVL^2 + \left( \frac{LER}{2} \right)^2 + \sigma_{\text{mask}}^2 + \sigma_{\text{etch}}^2 $$ ## 6. Overlay Error Modeling Overlay at any point $(x, y)$ is modeled as: $$ OVL(x, y) = \vec{T} + R\theta + M \cdot \vec{r} + \text{HOT} $$ ### Overlay Components - $\vec{T} = (T_x, T_y)$ = Translation - $R\theta$ = Rotation - $M$ = Magnification - $\text{HOT}$ = Higher-Order Terms (lens distortions, wafer non-flatness) ### Overlay Budget (RSS) $$ OVL_{\text{budget}}^2 = OVL_{\text{tool}}^2 + OVL_{\text{process}}^2 + OVL_{\text{wafer}}^2 + OVL_{\text{mask}}^2 $$ ### 10-Parameter Overlay Model $$ \begin{aligned} dx &= T_x + R_x \cdot y + M_x \cdot x + N_x \cdot x \cdot y + \ldots \\ dy &= T_y + R_y \cdot x + M_y \cdot y + N_y \cdot x \cdot y + \ldots \end{aligned} $$ ## 7. Stochastic Effects in EUV Lithography At EUV wavelengths (13.5 nm), photon shot noise becomes fundamental. ### Photon Statistics Photons per pixel follow Poisson distribution: $$ N \sim \text{Poisson}(\bar{N}) $$ $$ \sigma_N = \sqrt{\bar{N}} $$ ### Relative Dose Fluctuation $$ \frac{\sigma_N}{\bar{N}} = \frac{1}{\sqrt{\bar{N}}} $$ ### Stochastic Failure Probability $$ P_{\text{fail}} \propto \exp\left( -\frac{E}{E_{\text{threshold}}} \right) $$ ### RLS Triangle Trade-off - **R**esolution - **L**ine edge roughness (LER) - **S**ensitivity (dose) $$ LER \propto \frac{1}{\sqrt{\text{Dose}}} \propto \frac{1}{\sqrt{N_{\text{photons}}}} $$ ## 8. Spatial Correlation Modeling Errors are spatially correlated. Modeled using variograms or correlation functions. 
### Variogram $$ \gamma(h) = \frac{1}{2} E\left[ (Z(x+h) - Z(x))^2 \right] $$ ### Correlation Function $$ \rho(h) = \frac{\text{cov}(Z(x+h), Z(x))}{\text{var}(Z(x))} $$ ### Common Correlation Models | Model | Formula | |-------|---------| | **Exponential** | $\rho(h) = \exp\left( -\frac{h}{\lambda} \right)$ | | **Gaussian** | $\rho(h) = \exp\left( -\left( \frac{h}{\lambda} \right)^2 \right)$ | | **Spherical** | $\rho(h) = 1 - \frac{3h}{2\lambda} + \frac{h^3}{2\lambda^3}$ for $h \leq \lambda$ | ### Implications - Nearby devices are more correlated → better matching for analog - Correlation length $\lambda$ determines effective samples per die - Extreme values are less severe than independent variation suggests ## 9. Process Capability and Tail Statistics ### Process Capability Index $$ C_{pk} = \min \left[ \frac{USL - \mu}{3\sigma}, \frac{\mu - LSL}{3\sigma} \right] $$ ### Defect Rates vs. Cpk (Gaussian) | $C_{pk}$ | PPM Outside Spec | Sigma Level | |----------|------------------|-------------| | 1.00 | ~2,700 | 3σ | | 1.33 | ~63 | 4σ | | 1.67 | ~0.6 | 5σ | | 2.00 | ~0.002 | 6σ | ### Extreme Value Statistics For $n$ independent samples from distribution $F(x)$, the maximum follows: $$ P(M_n \leq x) = [F(x)]^n $$ For large $n$, converges to Generalized Extreme Value (GEV): $$ G(x) = \exp\left\{ -\left[ 1 + \xi \left( \frac{x - \mu}{\sigma} \right) \right]^{-1/\xi} \right\} $$ ### Critical Insight For a chip with $10^{10}$ transistors: $$ P_{\text{chip fail}} = 1 - (1 - P_{\text{transistor fail}})^{10^{10}} \approx 10^{10} \cdot P_{\text{transistor fail}} $$ Even $P_{\text{transistor fail}} = 10^{-11}$ matters! ## 10. 
Sensitivity Analysis and Error Attribution ### Sensitivity Coefficient $$ S_i = \frac{\partial Y}{\partial \sigma_i} \times \frac{\sigma_i}{Y} $$ ### Variance Contribution $$ \text{Contribution}_i = \frac{\left( \frac{\partial f}{\partial x_i} \right)^2 \sigma_i^2}{\sigma_f^2} \times 100\% $$ ### Bayesian Root Cause Attribution $$ P(\text{cause} \mid \text{observation}) = \frac{P(\text{observation} \mid \text{cause}) \cdot P(\text{cause})}{P(\text{observation})} $$ ### Pareto Analysis Steps 1. Compute variance contribution from each source 2. Rank sources by contribution 3. Focus improvement on top contributors 4. Verify improvement with updated measurements ## 11. Monte Carlo Simulation Methods Due to complexity and nonlinearity, Monte Carlo methods are essential. ### Algorithm ``` FOR i = 1 to N_samples: 1. Sample process parameters: p_i ~ distributions 2. Simulate device/circuit: y_i = f(p_i) 3. Store result: Y[i] = y_i END FOR Compute statistics from Y[] ``` ### Key Advantages - Captures non-Gaussian behavior - Handles nonlinear transfer functions - Reveals correlations between outputs - Provides full distribution, not just moments ### Sample Size Requirements For estimating probability $p$ of rare events: $$ N \geq \frac{1 - p}{p \cdot \varepsilon^2} $$ Where $\varepsilon$ is the desired relative error. For $p = 10^{-6}$ with 10% error: $N \approx 10^8$ samples ## 12. Design-Technology Co-Optimization (DTCO) Error propagation feeds back into design rules: $$ \text{Design Margin} = k \times \sigma_{\text{total}} $$ Where $k$ depends on required yield and number of instances. ### Margin Calculation For yield $Y$ over $N$ instances: $$ k = \Phi^{-1}\left( Y^{1/N} \right) $$ Where $\Phi^{-1}$ is the inverse normal CDF. ### Example - Target yield: 99% - Number of gates: $10^9$ - Required: $k \approx 7\sigma$ per gate ## 13. 
Key Mathematical Insights ### Insight 1: RSS Dominates Budgets Uncorrelated errors add in quadrature: $$ \sigma_{\text{total}} = \sqrt{\sigma_1^2 + \sigma_2^2 + \cdots + \sigma_n^2} $$ **Implication**: Reducing the largest contributor gives the most improvement. ### Insight 2: Tails Matter More Than Means High-volume manufacturing lives in the $6\sigma$ tails where: - Gaussian assumptions break down - Extreme value statistics become essential - Rare events dominate yield loss ### Insight 3: Nonlinearity Creates Surprises Even Gaussian inputs produce non-Gaussian outputs: $$ Y = f(X) \quad \text{where } X \sim N(\mu, \sigma^2) $$ If $f$ is nonlinear, $Y$ is not Gaussian. ### Insight 4: Correlations Can Help or Hurt - **Positive correlations**: Worsen tail probabilities - **Negative correlations**: Can provide compensation - **Designed-in correlations**: Can dramatically improve yield ### Insight 5: Scaling Amplifies Relative Error $$ \text{Relative Error} = \frac{\sigma}{\text{Feature Size}} $$ A 1 nm variation: - 5% of 20 nm feature - 10% of 10 nm feature - 20% of 5 nm feature ## 14. Summary Equations ### Core Error Propagation $$ \sigma_f^2 = \sum_i \left( \frac{\partial f}{\partial x_i} \right)^2 \sigma_i^2 $$ ### Yield (Negative Binomial) $$ Y = \left( 1 + \frac{D_0 A}{\alpha} \right)^{-\alpha} $$ ### Edge Placement Error $$ EPE = \sqrt{\left( \frac{\Delta CD}{2} \right)^2 + OVL^2 + \left( \frac{LER}{2} \right)^2} $$ ### Process Capability $$ C_{pk} = \min \left[ \frac{USL - \mu}{3\sigma}, \frac{\mu - LSL}{3\sigma} \right] $$ ### Stochastic LER $$ LER \propto \frac{1}{\sqrt{N_{\text{photons}}}} $$
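The summary equations of the error-propagation notes above (RSS combination, the two defect-yield models, edge placement error, and $C_{pk}$) can be sketched in a few lines. Function names and example values are illustrative assumptions.

```python
import math

def rss(sigmas):
    """Root-sum-of-squares combination of uncorrelated error terms."""
    return math.sqrt(sum(s * s for s in sigmas))

def poisson_yield(d0, area):
    """Y = exp(-D0 * A) for randomly distributed defects."""
    return math.exp(-d0 * area)

def negative_binomial_yield(d0, area, alpha):
    """Y = (1 + D0 * A / alpha)^(-alpha) for clustered defects."""
    return (1.0 + d0 * area / alpha) ** (-alpha)

def edge_placement_error(delta_cd, overlay, ler):
    """EPE = sqrt((dCD / 2)^2 + OVL^2 + (LER / 2)^2)."""
    return rss([delta_cd / 2.0, overlay, ler / 2.0])

def cpk(mean, sigma, lsl, usl):
    """Process capability index for a two-sided specification."""
    return min((usl - mean) / (3.0 * sigma), (mean - lsl) / (3.0 * sigma))

# Hypothetical die: D0 = 0.5 defects/cm^2, A = 1 cm^2.
y_poisson = poisson_yield(0.5, 1.0)
y_clustered = negative_binomial_yield(0.5, 1.0, alpha=2.0)
epe_nm = edge_placement_error(delta_cd=2.0, overlay=1.5, ler=2.0)
```

For the same defect density, the clustered model predicts a higher yield than the Poisson model, and a very large $\alpha$ recovers the Poisson result.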
Educate personnel about ESD.
Eta sampling uses entropy-based threshold for adaptive truncation.
# Etch Film Stack Mathematical Modeling
1. Introduction and Problem Setup
A film stack in semiconductor manufacturing consists of multiple thin-film layers that must be precisely etched. Typical structures include:
- Photoresist (masking layer)
- Hard mask (SiN, SiO₂, or metal)
- Target film (material to be etched)
- Etch stop layer
- Substrate (Si wafer)
Objectives
- Remove target material at a controlled rate
- Stop precisely at interfaces (selectivity)
- Maintain profile fidelity (anisotropy, sidewall angle)
- Achieve uniformity across the wafer
2. Fundamental Etch Rate Models
2.1 Surface Reaction Kinetics
The Langmuir-Hinshelwood model captures competitive adsorption of reactive species:
$$
R = k \cdot \theta_A \cdot \theta_B = \frac{k \cdot K_A[A] \cdot K_B[B]}{\left(1 + K_A[A] + K_B[B]\right)^2}
$$
Where:
- $R$ = etch rate
- $k$ = reaction rate constant
- $\theta_A, \theta_B$ = fractional surface coverage of species A and B
- $K_A, K_B$ = adsorption equilibrium constants
- $[A], [B]$ = gas-phase concentrations
2.2 Temperature Dependence (Arrhenius)
$$
R = R_0 \exp\left(-\frac{E_a}{k_B T}\right)
$$
Where:
- $R_0$ = pre-exponential factor
- $E_a$ = activation energy
- $k_B$ = Boltzmann constant ($1.38 \times 10^{-23}$ J/K)
- $T$ = absolute temperature (K)
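To make the temperature sensitivity concrete, the sketch below evaluates the Arrhenius expression at two temperatures. The function name `arrhenius_rate` and the numbers ($R_0$, $E_a = 0.5$ eV) are illustrative assumptions; $k_B$ is expressed in eV/K so that $E_a$ can be given in eV.

```python
import math

K_B_EV = 8.617333e-5  # Boltzmann constant in eV/K (equivalent to 1.38e-23 J/K)

def arrhenius_rate(r0, ea_ev, temp_k):
    """Etch rate R = R0 * exp(-Ea / (kB * T)); r0 sets the units of the result."""
    if temp_k <= 0:
        raise ValueError("temperature must be positive (kelvin)")
    return r0 * math.exp(-ea_ev / (K_B_EV * temp_k))

# Hypothetical numbers: compare 300 K and 350 K for Ea = 0.5 eV.
r_cold = arrhenius_rate(1.0e9, 0.5, 300.0)
r_hot = arrhenius_rate(1.0e9, 0.5, 350.0)
speedup = r_hot / r_cold  # the ratio does not depend on r0
```

For these assumed values the rate rises by roughly a factor of 16 over a 50 K increase, which is why wafer temperature control matters for chemically driven etches.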
2.3 Ion-Enhanced Etching Model
Most plasma etching is synergistic, with ion bombardment enhancing the chemical reactions:
$$
R_{total} = R_{chem} + R_{phys} + R_{synergy}
$$
The ion-enhanced component (the synergy term, written $R_{ie}$ here) dominates in RIE/ICP:
$$
R_{ie} = Y(E, \theta) \cdot \Gamma_{ion} \cdot \Theta_{react}
$$
Where:
- $Y(E, \theta)$ = ion yield function (depends on energy $E$ and angle $\theta$)
- $\Gamma_{ion}$ = ion flux to surface (ions/cm²·s)
- $\Theta_{react}$ = fractional coverage of reactive species
3. Profile Evolution Mathematics
3.1 Level Set Method
The evolving surface is represented as the zero-contour of a level set function $\phi(\mathbf{x}, t)$:
$$
\frac{\partial \phi}{\partial t} + V(\mathbf{x}, t) \cdot |\nabla \phi| = 0
$$
Where:
- $\phi(\mathbf{x}, t)$ = level set function
- $V(\mathbf{x}, t)$ = local etch velocity (material and flux dependent)
- $\nabla \phi$ = gradient of the level set function
- $|\nabla \phi|$ = magnitude of the gradient
The surface normal is computed as:
$$
\hat{n} = \frac{\nabla \phi}{|\nabla \phi|}
$$
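A minimal one-dimensional sketch of the level set update, assuming a constant etch velocity and a first-order upwind (Godunov) discretization. The grid, time step, and function names are illustrative choices; production profile simulators work in 2D or 3D with higher-order schemes and periodic reinitialization.

```python
def advance_level_set_1d(phi, velocity, dx, dt, steps):
    """First-order upwind update of phi_t + V * |phi_x| = 0 for V >= 0."""
    phi = list(phi)
    n = len(phi)
    for _ in range(steps):
        new_phi = phi[:]
        for i in range(n):
            # One-sided differences; reuse the interior stencil at the domain edges.
            d_minus = (phi[i] - phi[i - 1]) / dx if i > 0 else (phi[1] - phi[0]) / dx
            d_plus = (phi[i + 1] - phi[i]) / dx if i < n - 1 else (phi[-1] - phi[-2]) / dx
            # Godunov gradient magnitude for a front moving along +normal (V >= 0).
            grad = (max(d_minus, 0.0) ** 2 + min(d_plus, 0.0) ** 2) ** 0.5
            new_phi[i] = phi[i] - dt * velocity * grad
        phi = new_phi
    return phi

def zero_crossing(phi, dx):
    """Locate the phi = 0 surface by linear interpolation between grid points."""
    for i in range(len(phi) - 1):
        if phi[i] <= 0.0 < phi[i + 1]:
            return dx * (i + phi[i] / (phi[i] - phi[i + 1]))
    return None

dx, dt, velocity = 1.0, 0.25, 2.0            # nm, s, nm/s (CFL number V*dt/dx = 0.5)
phi0 = [i * dx - 20.0 for i in range(101)]   # signed distance to a surface at x = 20 nm
phi = advance_level_set_1d(phi0, velocity, dx, dt, steps=40)
front = zero_crossing(phi, dx)               # surface position after 10 s
```

Here the region $\phi > 0$ is unetched material, so the surface advances in the direction of $\hat{n}$ at speed $V$; after 10 s at 2 nm/s it has moved 20 nm.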
3.2 Visibility and Shadowing Integrals
For a point $\mathbf{p}$ inside a feature, the effective flux is:
$$
\Gamma(\mathbf{p}) = \int_{\Omega_{visible}} f(\hat{\Omega}) \cdot (\hat{\Omega} \cdot \hat{n}) \, d\Omega
$$
Where:
- $\Omega_{visible}$ = solid angle visible from point $\mathbf{p}$
- $f(\hat{\Omega})$ = ion angular distribution function (IADF)
- $\hat{n}$ = local surface normal
3.3 Ion Angular Distribution Function (IADF)
Typically modeled as a Gaussian:
$$
f(\theta) = \frac{1}{\sqrt{2\pi}\sigma} \exp\left(-\frac{\theta^2}{2\sigma^2}\right)
$$
Where:
- $\theta$ = angle from surface normal
- $\sigma$ = angular spread (related to $T_i / T_e$ ratio)
4. Multi-Layer Stack Modeling
4.1 Interface Tracking
For a stack with $n$ layers at depths $z_1, z_2, \ldots, z_n$:
$$
\frac{dz_{etch}}{dt} = -R_i(t)
$$
Where $i$ indicates the current material being etched. Material transitions occur when $z_{etch}$ crosses an interface boundary.
4.2 Selectivity Definition
$$
S_{A:B} = \frac{R_A}{R_B}
$$
Design requirements:
- Mask selectivity: $S_{target:mask} \gg 1$ (mask erodes slowly relative to the target)
- Stop layer selectivity: $S_{target:stop} \gg 1$ (typically > 10:1)
4.3 Time-to-Clear Calculation
For layer thickness $d_i$ with etch rate $R_i$:
$$
t_{clear,i} = \frac{d_i}{R_i}
$$
Total etch time through multiple layers:
$$
t_{total} = \sum_{i=1}^{n} \frac{d_i}{R_i} + t_{overetch}
$$
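The interface-tracking, selectivity, and time-to-clear relations of Section 4 can be combined in a short bookkeeping sketch. The two-layer stack, rates, and helper names are hypothetical, and each layer is assumed to etch at a constant rate.

```python
def etch_stack(layers, overetch_time=0.0):
    """Return (clear_times, total_time) for layers given as (thickness, rate) pairs.

    Thickness and rate must share a length unit (e.g. nm and nm/min).
    """
    clear_times = []
    for thickness, rate in layers:
        if rate <= 0:
            raise ValueError("etch rate must be positive")
        clear_times.append(thickness / rate)  # t_clear,i = d_i / R_i
    return clear_times, sum(clear_times) + overetch_time

def layer_index_at(layers, t):
    """Index of the layer being etched at time t, or None once the stack has cleared."""
    elapsed = 0.0
    for i, (thickness, rate) in enumerate(layers):
        elapsed += thickness / rate
        if t < elapsed:
            return i
    return None

def mask_loss(etch_time, target_rate, selectivity_target_to_mask):
    """Mask thickness consumed when R_mask = R_target / S_{target:mask}."""
    return etch_time * target_rate / selectivity_target_to_mask

# Hypothetical stack: 200 nm at 100 nm/min, then 30 nm at 20 nm/min.
stack = [(200.0, 100.0), (30.0, 20.0)]
clear_times, total_time = etch_stack(stack, overetch_time=0.5)
lost_mask_nm = mask_loss(clear_times[0], 100.0, selectivity_target_to_mask=5.0)
```

With a 5:1 selectivity the mask loses one fifth of the depth etched into the first layer, which sets the minimum mask thickness for that step.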
5. Aspect Ratio Dependent Etching (ARDE)
5.1 General ARDE Model
Etch rate decreases with aspect ratio (AR = depth/width):
$$
R(AR) = R_0 \cdot f(AR)
$$
5.2 Neutral Transport Limited (Knudsen Regime)
$$
R(AR) = \frac{R_0}{1 + \alpha \cdot AR}
$$
The Knudsen diffusivity in a cylindrical feature:
$$
D_K = \frac{d}{3}\sqrt{\frac{8 k_B T}{\pi m}}
$$
Where:
- $d$ = feature diameter
- $m$ = molecular mass of neutral species
- $T$ = gas temperature
5.3 Clausing Factor for Molecular Flow
For a tube of length $L$ and radius $r$, the transmission probability (Clausing factor) is approximately:
$$
W = \frac{1}{1 + \frac{3L}{8r}}
$$
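The neutral-transport expressions in Sections 5.2 and 5.3 are simple enough to evaluate directly. In the sketch below the constants are standard SI values, while the feature size, gas temperature, species mass (atomic fluorine, about 19 amu), and $\alpha$ are illustrative assumptions.

```python
import math

K_B = 1.380649e-23    # Boltzmann constant, J/K
AMU = 1.66053907e-27  # atomic mass unit, kg

def knudsen_diffusivity(diameter_m, temp_k, mass_amu):
    """D_K = (d / 3) * sqrt(8 kB T / (pi m)), in m^2/s."""
    mean_speed = math.sqrt(8.0 * K_B * temp_k / (math.pi * mass_amu * AMU))
    return diameter_m / 3.0 * mean_speed

def arde_rate(r0, aspect_ratio, alpha):
    """Neutral-transport-limited rate R(AR) = R0 / (1 + alpha * AR)."""
    return r0 / (1.0 + alpha * aspect_ratio)

def clausing_factor(length, radius):
    """Approximate transmission probability W = 1 / (1 + 3L / (8r))."""
    return 1.0 / (1.0 + 3.0 * length / (8.0 * radius))

d_k = knudsen_diffusivity(50e-9, 300.0, 19.0)   # 50 nm hole, 300 K, F atoms
rate_ar10 = arde_rate(100.0, aspect_ratio=10.0, alpha=0.1)
w_tube = clausing_factor(length=8.0, radius=3.0)
```

For the assumed 50 nm feature the diffusivity is on the order of $10^{-5}$ m²/s, and both the ARDE expression and the Clausing factor halve the open-area value for the inputs chosen here.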
5.4 Ion Angular Distribution Limited
$$
R(AR) = R_0 \cdot \int_0^{\theta_{max}(AR)} f(\theta) \cos\theta \, d\theta
$$
Where $\theta_{max}$ is the maximum acceptance angle at the bottom center of a feature of width $w$ and depth $h$ (so $AR = h/w$):
$$
\theta_{max} = \arctan\left(\frac{w}{2h}\right)
$$
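The ion-limited ARDE integral can be evaluated numerically with the Gaussian IADF from Section 3.3. This is a sketch under two stated assumptions: the result is normalized by the integral up to $\pi/2$ so that the factor tends to 1 for an open surface, and the angular spread $\sigma = 0.1$ rad is an arbitrary example value.

```python
import math

def iadf(theta, sigma):
    """Gaussian ion angular distribution (theta and sigma in radians)."""
    return math.exp(-theta**2 / (2.0 * sigma**2)) / (math.sqrt(2.0 * math.pi) * sigma)

def _integrate(func, a, b, n=2000):
    """Composite trapezoidal rule on [a, b]."""
    h = (b - a) / n
    total = 0.5 * (func(a) + func(b))
    for i in range(1, n):
        total += func(a + i * h)
    return total * h

def ion_limited_factor(aspect_ratio, sigma):
    """Fraction of the open-field ion flux reaching the bottom center of a feature.

    Assumption: normalized by the integral taken up to pi/2.
    """
    def integrand(t):
        return iadf(t, sigma) * math.cos(t)

    # theta_max = arctan(w / (2h)) = arctan(1 / (2 * AR))
    theta_max = math.atan(1.0 / (2.0 * aspect_ratio)) if aspect_ratio > 0 else math.pi / 2
    return _integrate(integrand, 0.0, theta_max) / _integrate(integrand, 0.0, math.pi / 2)

factors = {ar: ion_limited_factor(ar, sigma=0.1) for ar in (1.0, 5.0, 20.0)}
```

The factor stays near 1 while $\theta_{max}$ is several $\sigma$ wide and falls once the acceptance angle becomes comparable to the angular spread.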
6. Plasma and Transport Modeling
6.1 Sheath Physics
Child-Langmuir Law (Collisionless Sheath)
$$
J = \frac{4\varepsilon_0}{9}\sqrt{\frac{2e}{M}}\frac{V_0^{3/2}}{d^2}
$$
Where:
- $J$ = ion current density
- $\varepsilon_0$ = permittivity of free space
- $e$ = electron charge
- $M$ = ion mass
- $V_0$ = sheath voltage
- $d$ = sheath thickness
Sheath Thickness (Matrix Sheath)
$$
s = \lambda_D \sqrt{\frac{2eV_0}{k_B T_e}}
$$
Where $\lambda_D$ is the Debye length:
$$
\lambda_D = \sqrt{\frac{\varepsilon_0 k_B T_e}{n_e e^2}}
$$
6.2 Ion Flux to Surface
At the sheath edge, ions reach the Bohm velocity:
$$
u_B = \sqrt{\frac{k_B T_e}{M_i}}
$$
Ion flux:
$$
\Gamma_i = n_s \cdot u_B = n_s \sqrt{\frac{k_B T_e}{M_i}}
$$
Where $n_s \approx 0.61 \cdot n_0$ (sheath edge density).
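The sheath and flux relations of Sections 6.1 and 6.2 can be checked with representative numbers. The sketch works in SI units (the flux comes out in m⁻² s⁻¹ rather than cm⁻² s⁻¹) and takes $T_e$ in eV, so that $k_B T_e = e \cdot T_e[\text{eV}]$. The plasma parameters (argon ions, $n = 10^{16}$ m⁻³, $T_e = 3$ eV, $V_0 = 300$ V) are assumptions chosen only for illustration.

```python
import math

EPS0 = 8.8541878128e-12     # permittivity of free space, F/m
E_CHARGE = 1.602176634e-19  # elementary charge, C
AMU = 1.66053907e-27        # atomic mass unit, kg

def debye_length(n_e, te_ev):
    """lambda_D = sqrt(eps0 kB Te / (n_e e^2)), with kB*Te = e * te_ev."""
    return math.sqrt(EPS0 * te_ev / (n_e * E_CHARGE))

def matrix_sheath_thickness(n_e, te_ev, v0):
    """s = lambda_D * sqrt(2 e V0 / (kB Te))."""
    return debye_length(n_e, te_ev) * math.sqrt(2.0 * v0 / te_ev)

def child_langmuir_current(v0, d, ion_mass_amu):
    """J = (4 eps0 / 9) * sqrt(2e / M) * V0^(3/2) / d^2, in A/m^2."""
    return (4.0 * EPS0 / 9.0) * math.sqrt(2.0 * E_CHARGE / (ion_mass_amu * AMU)) * v0**1.5 / d**2

def bohm_flux(n0, te_ev, ion_mass_amu):
    """Gamma_i = 0.61 * n0 * u_B with u_B = sqrt(kB Te / M_i), in m^-2 s^-1."""
    u_b = math.sqrt(te_ev * E_CHARGE / (ion_mass_amu * AMU))
    return 0.61 * n0 * u_b

lam_d = debye_length(1.0e16, 3.0)
sheath = matrix_sheath_thickness(1.0e16, 3.0, 300.0)
j_cl = child_langmuir_current(300.0, 2.0e-3, 40.0)
flux = bohm_flux(1.0e16, 3.0, 40.0)
```

For these inputs the Debye length is roughly 0.13 mm and the matrix sheath roughly 1.8 mm, far larger than any wafer feature, which is why ions arrive at the wafer nearly normal to the surface.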
6.3 Neutral Species Balance
Continuity equation for neutral species:
$$
\nabla \cdot (D \nabla n) + \sum_j k_j n_j n_e - k_{loss} n = 0
$$
Where:
- $D$ = diffusion coefficient
- $k_j$ = generation rate constants
- $k_{loss}$ = surface loss rate
7. Feature-Scale Monte Carlo Methods
7.1 Algorithm Overview
1. Sample particles from flux distributions at feature entrance
2. Track trajectories (ballistic for ions, random walk for neutrals)
3. Resolve surface interactions: react, reflect, or stick, each with an assigned probability
4. Accumulate statistics for local etch rates
5. Advance surface using accumulated rates
7.2 Reflection Probability Models
Specular Reflection
$$
\theta_{out} = \theta_{in}
$$
Diffuse (Cosine) Reflection
$$
P(\theta_{out}) \propto \cos(\theta_{out})
$$
Mixed Model
$$
P_{reflect} = (1 - s) \cdot P_{specular} + s \cdot P_{diffuse}
$$
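Where $s$ is the scattering coefficient, i.e. the fraction of reflections that are diffuse ($s = 0$: purely specular; $s = 1$: purely diffuse).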
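The mixed model can be sampled by drawing the reflection type first and the angle second. The sketch below assumes a two-dimensional geometry with angles measured from the surface normal in $(-\pi/2, \pi/2)$, so that $P(\theta) \propto \cos\theta$ corresponds to $\sin\theta$ being uniform on $[-1, 1]$.

```python
import math
import random


def sample_reflection_angle(theta_in, s, rng):
    """Return an outgoing angle (radians from the normal) for the mixed model."""
    if rng.random() < s:
        # Diffuse branch: P(theta) ~ cos(theta), sampled by inverting the CDF
        return math.asin(2.0 * rng.random() - 1.0)
    # Specular branch: outgoing angle equals incoming angle
    return theta_in


rng = random.Random(0)
angles = [sample_reflection_angle(math.radians(30.0), 0.5, rng) for _ in range(10)]
print([round(math.degrees(a), 1) for a in angles])
```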
7.3 Sticking Coefficient Model
$$
\gamma = \gamma_0 \cdot (1 - \Theta)^n
$$
Where:
- $\gamma_0$ = bare surface sticking coefficient
- $\Theta$ = surface coverage
- $n$ = reaction order
8. Loading Effects
8.1 Macroloading (Wafer Scale)
$$
R = \frac{R_0}{1 + \beta \cdot A_{exposed}}
$$
Where:
- $A_{exposed}$ = total exposed etchable area
- $\beta$ = loading coefficient
8.2 Microloading (Pattern Scale)
Local etch rate depends on pattern density $\rho$:
$$
R_{local} = R_0 \cdot \left(1 - \gamma \cdot \rho\right)
$$
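Here $\gamma$ is a microloading coefficient, distinct from the sticking coefficient of Section 7.3. Dense patterns etch more slowly because of local reactant depletion.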
8.3 Reactive Species Depletion Model
For a feature with etchable area $A$ in a cell of area $A_{cell}$, where $k_{etch}$ and $k_{supply}$ are the reactant consumption and supply rate constants:
$$
R = R_0 \cdot \frac{1}{1 + \frac{k_{etch} \cdot A}{k_{supply} \cdot A_{cell}}}
$$
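The three loading expressions in this section are simple enough to compare side by side. The parameter values below are arbitrary illustrations, not calibrated coefficients.

```python
def macroloading_rate(r0, beta, exposed_area):
    """Wafer-scale loading: R = R_0 / (1 + beta * A_exposed)."""
    return r0 / (1.0 + beta * exposed_area)


def microloading_rate(r0, gamma, density):
    """Pattern-scale loading: R_local = R_0 * (1 - gamma * rho)."""
    return r0 * (1.0 - gamma * density)


def depletion_rate(r0, k_etch, k_supply, area, cell_area):
    """Depletion model: R = R_0 / (1 + k_etch * A / (k_supply * A_cell))."""
    return r0 / (1.0 + (k_etch * area) / (k_supply * cell_area))


r0 = 100.0  # unloaded etch rate, arbitrary units
print(macroloading_rate(r0, beta=0.01, exposed_area=100.0))
print(microloading_rate(r0, gamma=0.3, density=0.5))
print(depletion_rate(r0, k_etch=1.0, k_supply=1.0, area=1.0, cell_area=4.0))
```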
9. Atomic Layer Etching (ALE) Models
9.1 Two-Step Process
Step 1 - Surface Modification:
$$
A_{(g)} + S_{(s)} \rightarrow A\text{-}S_{(s)}
$$
Step 2 - Removal:
$$
A\text{-}S_{(s)} + B_{(g/ion)} \rightarrow \text{volatile products}
$$
9.2 Self-Limiting Kinetics
Surface coverage during modification:
$$
\theta_{mod}(t) = 1 - \exp\left(-\Gamma_A \cdot s_A \cdot t\right)
$$
Where:
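- $\Gamma_A$ = flux of modifying species, expressed per surface site (flux divided by the site density) so that the exponent is dimensionless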
- $s_A$ = sticking probability
- $t$ = exposure time
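The saturation curve can be inverted to estimate the exposure time needed for a target coverage. The sketch below assumes the product $\Gamma_A \cdot s_A$ is supplied as a per-site rate in $\text{s}^{-1}$. The parameter name `site_rate` and its value are hypothetical.

```python
import math


def modified_coverage(t, site_rate):
    """theta_mod(t) = 1 - exp(-site_rate * t), with site_rate = Gamma_A * s_A."""
    return 1.0 - math.exp(-site_rate * t)


def time_to_coverage(target, site_rate):
    """Exposure time needed to reach a target coverage in [0, 1)."""
    if not 0.0 <= target < 1.0:
        raise ValueError("target coverage must be in [0, 1)")
    return -math.log(1.0 - target) / site_rate


site_rate = 2.0  # assumed per-site modification rate, 1/s
for target in (0.5, 0.9, 0.99):
    print(f"{target:.2f} coverage after {time_to_coverage(target, site_rate):.2f} s")
```

Each additional "nine" of coverage costs the same extra time, which is why ALE dose times are usually set by a saturation target rather than by full coverage.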
9.3 Etch Per Cycle (EPC)
$$
EPC = \theta_{sat} \cdot \delta_{ML}
$$
Where:
- $\theta_{sat}$ = saturation coverage (ideally 1.0)
- $\delta_{ML}$ = monolayer thickness (typically 0.1–0.5 nm)
9.4 Synergy Factor
$$
S_f = \frac{EPC_{ALE}}{EPC_{step1} + EPC_{step2}}
$$
Values $S_f > 1$ indicate synergistic enhancement.
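A small worked example of the EPC and synergy definitions above. The numbers are invented for illustration, with each single-step EPC taken as the etch per cycle measured when that step is run alone.

```python
def etch_per_cycle(theta_sat, monolayer_nm):
    """EPC = theta_sat * delta_ML, in nm per cycle."""
    return theta_sat * monolayer_nm


def synergy_factor(epc_ale, epc_step1, epc_step2):
    """S_f = EPC_ALE / (EPC_step1 + EPC_step2)."""
    return epc_ale / (epc_step1 + epc_step2)


epc = etch_per_cycle(theta_sat=1.0, monolayer_nm=0.25)
s_f = synergy_factor(epc, epc_step1=0.01, epc_step2=0.02)
print(f"EPC = {epc:.2f} nm/cycle, S_f = {s_f:.1f}")
```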
10. Process Window Modeling
10.1 Response Surface Methodology
$$
CD = \beta_0 + \sum_{i=1}^{k} \beta_i x_i + \sum_{i=1}^{k} \beta_{ii} x_i^2 + \sum_{i<j} \beta_{ij} x_i x_j
$$
# Semiconductor Manufacturing Process: Etch Modeling

## 1. Introduction

Etch modeling is one of the most complex and critical areas in semiconductor fabrication simulation. As device geometries shrink below $10\ \text{nm}$ and structures become increasingly three-dimensional, accurate prediction of etch behavior becomes essential for:

- **Process Development**: Predict outcomes before costly fab experiments
- **Yield Optimization**: Understand how variations propagate to device performance
- **OPC/EPC Extension**: Compensate for etch-induced pattern distortions in mask design
- **Design-Technology Co-Optimization (DTCO)**: Feed process effects back into design rules
- **Virtual Metrology**: Predict wafer results from equipment sensor data in real time

## 2. Fundamentals of Etching

### 2.1 What is Etching?

Etching selectively removes material from a wafer to transfer lithographically defined patterns into underlying layers: silicon, oxides, nitrides, metals, or complex stacks.

### 2.2 Types of Etching

- **Wet Etching**
  - Uses liquid chemicals (acids, bases, solvents)
  - Typically isotropic (etches equally in all directions)
  - Etch rate follows Arrhenius relationship:
    $$
    R = A \exp\left(-\frac{E_a}{k_B T}\right)
    $$
    where:
    - $R$ = etch rate
    - $A$ = pre-exponential factor
    - $E_a$ = activation energy
    - $k_B$ = Boltzmann constant ($1.381 \times 10^{-23}\ \text{J/K}$)
    - $T$ = temperature (K)
- **Dry/Plasma Etching**
  - Uses ionized gases (plasma)
  - Anisotropic (directional)
  - Dominant for modern processes ($< 100\ \text{nm}$ nodes)

### 2.3 Plasma Etching Mechanisms

1. **Physical Sputtering**
   - Ion bombardment physically removes atoms
   - Sputter yield $Y$ depends on ion energy $E_i$:
     $$
     Y(E_i) = A \left( \sqrt{E_i} - \sqrt{E_{th}} \right)
     $$
     where $E_{th}$ is the threshold energy
2. **Chemical Etching**
   - Reactive species form volatile products
   - Example: Silicon etching with fluorine
     $$
     \text{Si} + 4\text{F} \rightarrow \text{SiF}_4 \uparrow
     $$
3. **Ion-Enhanced Etching**
   - Synergy between ion bombardment and chemical reactions
   - Etch yield enhancement factor:
     $$
     \eta = \frac{Y_{ion+chem}}{Y_{ion} + Y_{chem}}
     $$

## 3. Hierarchy of Etch Models

### 3.1 Empirical Models

Data-driven, fast, used in production:

- **Etch Bias Models**
  - Simple offset correction:
    $$
    CD_{final} = CD_{litho} + \Delta_{etch}
    $$
  - Pattern-dependent bias:
    $$
    \Delta_{etch} = f(\text{pitch}, \text{density}, \text{orientation})
    $$
- **Etch Proximity Correction (EPC)**
  - Kernel-based convolution:
    $$
    \Delta(x,y) = \iint K(x-x', y-y') \cdot I(x', y') \, dx' dy'
    $$
  - Where $K$ is the etch kernel and $I$ is the pattern intensity
- **Machine Learning Models**
  - Neural networks trained on metrology data
  - Gaussian process regression for uncertainty quantification

### 3.2 Feature-Scale Models

Semi-empirical, balance speed and physics:

- **String/Segment Models**
  - Represent edges as connected nodes
  - Each node moves according to local etch rate vector:
    $$
    \frac{d\vec{r}_i}{dt} = R(\theta_i, \Gamma_{ion}, \Gamma_{n}) \cdot \hat{n}_i
    $$
  - Where:
    - $\vec{r}_i$ = position of node $i$
    - $\theta_i$ = local surface angle
    - $\Gamma_{ion}$, $\Gamma_n$ = ion and neutral fluxes
    - $\hat{n}_i$ = surface normal
- **Level-Set Methods**
  - Track surface as zero-contour of signed distance function $\phi$:
    $$
    \frac{\partial \phi}{\partial t} + R(\vec{x}) |\nabla \phi| = 0
    $$
  - Handles topology changes naturally (merging, splitting)
- **Cell-Based/Voxel Methods**
  - Discretize feature volume into cells
  - Apply probabilistic removal rules:
    $$
    P_{remove} = 1 - \exp\left( -\sum_j \sigma_j \Gamma_j \Delta t \right)
    $$
  - Where $\sigma_j$ is the reaction cross-section for species $j$

### 3.3 Physics-Based Plasma Models

Capture reactor-scale phenomena:

- **Plasma Bulk**
  - Electron energy distribution function (EEDF)
  - Boltzmann equation:
    $$
    \frac{\partial f}{\partial t} + \vec{v} \cdot \nabla f + \frac{q\vec{E}}{m} \cdot \nabla_v f = \left( \frac{\partial f}{\partial t} \right)_{coll}
    $$
- **Sheath Physics**
  - Child-Langmuir law for ion flux:
    $$
    J_{ion} = \frac{4\epsilon_0}{9} \sqrt{\frac{2e}{M}} \frac{V^{3/2}}{d^2}
    $$
  - Ion angular distribution at wafer surface
- **Transport**
  - Species continuity:
    $$
    \frac{\partial n_i}{\partial t} + \nabla \cdot (n_i \vec{v}_i) = S_i - L_i
    $$
  - Where $S_i$ and $L_i$ are source and loss terms

### 3.4 Atomistic Models

Fundamental understanding, computationally expensive:

- **Molecular Dynamics (MD)**
  - Newton's equations for all atoms:
    $$
    m_i \frac{d^2 \vec{r}_i}{dt^2} = -\nabla_i U(\{\vec{r}\})
    $$
  - Interatomic potentials: Tersoff, Stillinger-Weber, ReaxFF
- **Monte Carlo (MC) Methods**
  - Statistical sampling of ion trajectories
  - Binary collision approximation (BCA) for high energies
  - Acceptance probability:
    $$
    P = \min\left(1, \exp\left(-\frac{\Delta E}{k_B T}\right)\right)
    $$
- **Kinetic Monte Carlo (KMC)**
  - Sample reactive events with rates $k_i$:
    $$
    k_i = \nu_0 \exp\left(-\frac{E_{a,i}}{k_B T}\right)
    $$
  - Event selection: $\sum_{j < i} k_j < r \cdot K_{tot} \leq \sum_{j \leq i} k_j$

## 4. Key Physical Phenomena

### 4.1 Anisotropy

Ratio of vertical to lateral etch rate:

$$
A = 1 - \frac{R_{lateral}}{R_{vertical}}
$$

- $A = 1$: Perfectly anisotropic (vertical sidewalls)
- $A = 0$: Perfectly isotropic

**Mechanisms for achieving anisotropy:**

- Directional ion bombardment
- Sidewall passivation (polymer deposition)
- Low pressure operation (fewer collisions → more directional ions)
- Ion angular distribution characterized by:
  $$
  f(\theta) \propto \cos^n(\theta)
  $$
  where higher $n$ indicates more directional flux

### 4.2 Selectivity

Ratio of etch rates between materials:

$$
S_{A/B} = \frac{R_A}{R_B}
$$

- **Mask selectivity**: Target material vs. photoresist/hard mask
- **Stop layer selectivity**: Target material vs.
underlying layer

Example selectivities required:

| Process | Selectivity Required |
|---------|---------------------|
| Oxide/Nitride | $> 20:1$ |
| Poly-Si/Oxide | $> 50:1$ |
| SiGe/Si (channel release) | $> 100:1$ |

### 4.3 Loading Effects

#### Microloading

Local depletion of reactive species in dense pattern regions:

$$
R_{dense} = R_0 \cdot \frac{1}{1 + \beta \cdot \rho_{local}}
$$

where:

- $R_0$ = etch rate in isolated feature
- $\beta$ = loading coefficient
- $\rho_{local}$ = local pattern density

#### Macroloading

Wafer-scale depletion:

$$
R = R_0 \cdot \left(1 - \alpha \cdot A_{exposed}\right)
$$

where $A_{exposed}$ is total exposed area fraction

### 4.4 Aspect Ratio Dependent Etching (ARDE)

Deep, narrow features etch slower due to transport limitations:

$$
R(AR) = R_0 \cdot \exp\left(-\frac{AR}{AR_0}\right)
$$

where $AR = \text{depth}/\text{width}$

**Physical mechanisms:**

1. **Ion Shadowing**
   - Geometric shadowing angle:
     $$
     \theta_{shadow} = \arctan\left(\frac{1}{AR}\right)
     $$
2. **Neutral Transport**
   - Knudsen diffusion coefficient:
     $$
     D_K = \frac{d}{3} \sqrt{\frac{8 k_B T}{\pi m}}
     $$
   - where $d$ is feature diameter
3. **Byproduct Redeposition**
   - Sticking probability affects escape

### 4.5 Profile Anomalies

| Phenomenon | Description | Cause |
|------------|-------------|-------|
| **Bowing** | Lateral bulge in sidewall | Ion scattering off sidewalls |
| **Notching** | Lateral etching at interface | Charge buildup on insulators |
| **Microtrenching** | Localized trenches at bottom corners | Ions reflected off sidewalls toward the bottom corners |
| **Footing** | Residual material (foot) at feature bottom | Incomplete etch at the base of the sidewall |
| **Tapering** | Non-vertical sidewalls | Insufficient passivation |

## 5. Mathematical Foundations

### 5.1 Surface Evolution Equation

General form for surface height $h(x,y,t)$:

$$
\frac{\partial h}{\partial t} = -R_0 \cdot V(\theta) \cdot \sqrt{1 + |\nabla h|^2}
$$

where:

- $R_0$ = baseline etch rate
- $V(\theta)$ = visibility/flux function
- $\theta = \arctan(|\nabla h|)$

### 5.2 Ion Angular Distribution

At wafer surface, ion flux angular distribution:

$$
\Gamma(\theta, \phi) = \Gamma_0 \cdot f(\theta) \cdot g(E)
$$

Common models:

- **Gaussian distribution:**
  $$
  f(\theta) = \frac{1}{\sqrt{2\pi}\sigma_\theta} \exp\left(-\frac{\theta^2}{2\sigma_\theta^2}\right)
  $$
- **Thompson distribution** (for sputtered neutrals):
  $$
  f(E) \propto \frac{E}{(E + E_b)^3}
  $$

### 5.3 Visibility Calculation

For a point on the surface, visibility to incoming flux:

$$
V(\vec{r}) = \frac{1}{2\pi} \int_0^{2\pi} \int_0^{\theta_{max}(\phi)} f(\theta) \sin\theta \cos\theta \, d\theta \, d\phi
$$

where $\theta_{max}(\phi)$ is determined by local geometry (shadowing)

### 5.4 Surface Reaction Kinetics

Langmuir-Hinshelwood mechanism:

$$
R = k \cdot \theta_A \cdot \theta_B
$$

where surface coverages follow:

$$
\frac{d\theta_i}{dt} = s_i \Gamma_i (1 - \theta_{total}) - k_d \theta_i - k_r \theta_i
$$

- $s_i$ = sticking coefficient
- $k_d$ = desorption rate
- $k_r$ = reaction rate

### 5.5 Plasma-Surface Interaction Yield

Ion-enhanced etch yield:

$$
Y_{etch} = Y_0 + Y_1 \cdot \sqrt{E_{ion} - E_{th}} + Y_{chem} \cdot \frac{\Gamma_n}{\Gamma_{ion}}
$$

where:

- $Y_0$ = chemical baseline yield
- $Y_1$ = ion enhancement coefficient
- $E_{th}$ = threshold energy (~15-50 eV typically)
- $Y_{chem}$ = chemical enhancement factor

## 6. Modern Modeling Approaches

### 6.1 Hybrid Multi-Scale Frameworks

Coupling different scales:

```
┌─────────────────────────────────────────────────────────────┐
│  REACTOR SCALE                                              │
│  Plasma simulation (fluid or PIC)                           │
│  Output: Ion/neutral fluxes, energies, angular dist.        │
└────────────────────────┬────────────────────────────────────┘
                         │ Boundary conditions
                         ▼
┌─────────────────────────────────────────────────────────────┐
│  FEATURE SCALE                                              │
│  Level-set or Monte Carlo                                   │
│  Output: Profile evolution, etch rates                      │
└────────────────────────┬────────────────────────────────────┘
                         │ Parameter extraction
                         ▼
┌─────────────────────────────────────────────────────────────┐
│  ATOMISTIC SCALE                                            │
│  MD/KMC simulations                                         │
│  Output: Sticking coefficients, sputter yields              │
└─────────────────────────────────────────────────────────────┘
```

### 6.2 Machine Learning Integration

- **Surrogate Models**
  - Train neural network on physics simulation outputs:
    $$
    \hat{y} = f_{NN}(\vec{x}; \vec{w})
    $$
  - Loss function:
    $$
    \mathcal{L} = \frac{1}{N} \sum_{i=1}^{N} \|y_i - \hat{y}_i\|^2 + \lambda \|\vec{w}\|^2
    $$
- **Physics-Informed Neural Networks (PINNs)**
  - Embed physics constraints in loss:
    $$
    \mathcal{L}_{total} = \mathcal{L}_{data} + \alpha \mathcal{L}_{physics}
    $$
  - Where $\mathcal{L}_{physics}$ enforces governing equations
- **Virtual Metrology**
  - Predict CD, profile from chamber sensors:
    $$
    CD_{predicted} = g(P, T, V_{bias}, \text{OES}, ...)
    $$

### 6.3 Computational Lithography Integration

Major EDA tools couple lithography + etch:

1. Litho simulation → Resist profile $h_R(x,y)$
2. Etch simulation → Final pattern $h_F(x,y)$
3. Combined model:
   $$
   CD_{final} = CD_{design} + \Delta_{OPC} + \Delta_{litho} + \Delta_{etch}
   $$

## 7. Challenges at Advanced Nodes

### 7.1 FinFET / Gate-All-Around (GAA)

- **Fin Etch**
  - Sidewall angle uniformity: $90° \pm 1°$
  - Width control: $\pm 1\ \text{nm}$ at $W_{fin} < 10\ \text{nm}$
- **Channel Release**
  - Selective SiGe vs.
Si etching
  - Required selectivity: $> 100:1$
  - Etch rate:
    $$
    R_{SiGe} \gg R_{Si}
    $$
- **Inner Spacer Formation**
  - Isotropic lateral etch in confined geometry
  - Depth control: $\pm 0.5\ \text{nm}$

### 7.2 3D NAND

Extreme aspect ratio challenges:

| Generation | Layers | Aspect Ratio |
|------------|--------|--------------|
| 96L | 96 | ~60:1 |
| 128L | 128 | ~80:1 |
| 176L | 176 | ~100:1 |
| 232L+ | 232+ | ~150:1 |

Critical issues:

- ARDE variation across depth
- Bowing control
- Twisting in elliptical holes

### 7.3 EUV Patterning

- Very thin resists: $< 40\ \text{nm}$
- Hard mask stacks with multiple layers
- LER/LWR amplification:
  $$
  LER_{final} = \sqrt{LER_{litho}^2 + LER_{etch}^2}
  $$
- Target: $LER < 1.2\ \text{nm}$ ($3\sigma$)

### 7.4 Stochastic Effects

At small dimensions, statistical fluctuations dominate:

$$
\sigma_{CD} \propto \frac{1}{\sqrt{N_{events}}}
$$

where $N_{events}$ = number of etching events per feature

## 8. Industry Tools

### 8.1 Commercial Software

| Category | Tools |
|----------|-------|
| **TCAD/Process** | Synopsys Sentaurus Process, Silvaco Victory Process |
| **Virtual Fab** | Coventor SEMulator3D |
| **Equipment Vendor** | Lam Research, Applied Materials (proprietary) |
| **Computational Litho** | Synopsys S-Litho, Siemens Calibre |

### 8.2 Research Tools

- **MCFPM** (Monte Carlo Feature Profile Model) - University of Illinois
- **LAMMPS** - Molecular dynamics
- **SPARTA** - Direct Simulation Monte Carlo
- **OpenFOAM** - Plasma fluid modeling

## 9. Future Directions

### 9.1 Digital Twins

Real-time chamber models for closed-loop process control:

$$
\vec{u}_{control}(t) = \mathcal{K} \left[ y_{target} - y_{model}(t) \right]
$$

### 9.2 Atomistic-Continuum Coupling

Seamless multi-scale simulation using:

- Adaptive mesh refinement
- Concurrent coupling methods
- Machine-learned interscale bridging

### 9.3 New Materials

Modeling requirements for:

- 2D materials (graphene, MoS$_2$, WS$_2$)
- High-$\kappa$ dielectrics
- Ferroelectrics (HfZrO)
- High-mobility channels (InGaAs, Ge)

### 9.4 Uncertainty Quantification

Predicting distributions, not just means:

$$
P(CD) = \int P(CD | \vec{\theta}) P(\vec{\theta}) d\vec{\theta}
$$

Key metrics:

- Process capability: $C_{pk} = \frac{\min(USL - \mu, \mu - LSL)}{3\sigma}$
- Target: $C_{pk} > 1.67$ for production

## Summary

Etch modeling spans from atomic-scale surface reactions to reactor-scale plasma physics to fab-level empirical correlations. The art lies in choosing the right abstraction level:

| Application | Model Type | Speed | Accuracy |
|-------------|------------|-------|----------|
| Production OPC/EPC | Empirical/ML | ★★★★★ | ★★☆☆☆ |
| Process Development | Feature-scale | ★★★☆☆ | ★★★★☆ |
| Mechanism Research | Atomistic MD/MC | ★☆☆☆☆ | ★★★★★ |
| Equipment Design | Plasma + Feature | ★★☆☆☆ | ★★★★☆ |

As geometries shrink and structures become more 3D, accurate etch modeling becomes essential for first-time-right process development and continued yield improvement.
# Mathematical Modeling of Plasma Etching in Semiconductor Manufacturing

## Introduction

Plasma etching is a critical process in semiconductor manufacturing where reactive gases are ionized to create a plasma, which selectively removes material from a wafer surface. The mathematical modeling of this process spans multiple physics domains:

- **Electromagnetic theory**: RF power coupling and field distributions
- **Statistical mechanics**: Particle distributions and kinetic theory
- **Reaction kinetics**: Gas-phase and surface chemistry
- **Transport phenomena**: Species diffusion and convection
- **Surface science**: Etch mechanisms and selectivity

## Foundational Plasma Physics

### Boltzmann Transport Equation

The most fundamental description of plasma behavior is the **Boltzmann transport equation**, governing the evolution of the particle velocity distribution function $f(\mathbf{r}, \mathbf{v}, t)$:

$$
\frac{\partial f}{\partial t} + \mathbf{v} \cdot \nabla f + \frac{\mathbf{F}}{m} \cdot \nabla_v f = \left(\frac{\partial f}{\partial t}\right)_{\text{collision}}
$$

**Where:**

- $f(\mathbf{r}, \mathbf{v}, t)$: Velocity distribution function
- $\mathbf{v}$: Particle velocity
- $\mathbf{F}$: External force (electromagnetic)
- $m$: Particle mass
- RHS: Collision integral

### Fluid Moment Equations

For computational tractability, velocity moments of the Boltzmann equation yield fluid equations:

#### Continuity Equation (Mass Conservation)

$$
\frac{\partial n}{\partial t} + \nabla \cdot (n\mathbf{u}) = S - L
$$

**Where:**

- $n$: Species number density $[\text{m}^{-3}]$
- $\mathbf{u}$: Drift velocity $[\text{m/s}]$
- $S$: Source term (generation rate)
- $L$: Loss term (consumption rate)

#### Momentum Conservation

$$
\frac{\partial (nm\mathbf{u})}{\partial t} + \nabla \cdot (nm\mathbf{u}\mathbf{u}) + \nabla p = nq(\mathbf{E} + \mathbf{u} \times \mathbf{B}) - nm\nu_m \mathbf{u}
$$

**Where:**

- $p = nk_BT$: Pressure
- $q$: Particle charge
- $\mathbf{E}$, $\mathbf{B}$: Electric and magnetic fields
- $\nu_m$: Momentum transfer collision frequency $[\text{s}^{-1}]$

#### Energy Conservation

$$
\frac{\partial}{\partial t}\left(\frac{3}{2}nk_BT\right) + \nabla \cdot \mathbf{q} + p\nabla \cdot \mathbf{u} = Q_{\text{heating}} - Q_{\text{loss}}
$$

**Where:**

- $k_B = 1.38 \times 10^{-23}$ J/K: Boltzmann constant
- $\mathbf{q}$: Heat flux vector
- $Q_{\text{heating}}$: Power input (Joule heating, stochastic heating)
- $Q_{\text{loss}}$: Energy losses (collisions, radiation)

## Electromagnetic Field Coupling

### Maxwell's Equations

For capacitively coupled plasma (CCP) and inductively coupled plasma (ICP) reactors:

$$
\nabla \times \mathbf{E} = -\frac{\partial \mathbf{B}}{\partial t}
$$

$$
\nabla \times \mathbf{H} = \mathbf{J} + \frac{\partial \mathbf{D}}{\partial t}
$$

$$
\nabla \cdot \mathbf{D} = \rho
$$

$$
\nabla \cdot \mathbf{B} = 0
$$

### Plasma Conductivity

The plasma current density couples through the complex conductivity:

$$
\mathbf{J} = \sigma \mathbf{E}
$$

For RF plasmas, the **complex conductivity** is:

$$
\sigma = \frac{n_e e^2}{m_e(\nu_m + i\omega)}
$$

**Where:**

- $n_e$: Electron density
- $e = 1.6 \times 10^{-19}$ C: Elementary charge
- $m_e = 9.1 \times 10^{-31}$ kg: Electron mass
- $\omega$: RF angular frequency
- $\nu_m$: Electron-neutral collision frequency

### Power Deposition

Time-averaged power density deposited into the plasma:

$$
P = \frac{1}{2}\text{Re}(\mathbf{J} \cdot \mathbf{E}^*)
$$

**Typical values:**

- CCP: $0.1 - 1$ W/cm³
- ICP: $0.5 - 5$ W/cm³

## Plasma Sheath Physics

The sheath is a thin, non-neutral region at the plasma-wafer interface that accelerates ions toward the surface, enabling anisotropic etching.
### Bohm Criterion

Minimum ion velocity entering the sheath:

$$
u_i \geq u_B = \sqrt{\frac{k_B T_e}{M_i}}
$$

**Where:**

- $u_B$: Bohm velocity
- $T_e$: Electron temperature (typically 2–5 eV)
- $M_i$: Ion mass

**Example:** For Ar⁺ ions with $T_e = 3$ eV:

$$
u_B = \sqrt{\frac{3 \times 1.6 \times 10^{-19}}{40 \times 1.67 \times 10^{-27}}} \approx 2.7 \text{ km/s}
$$

### Child-Langmuir Law

For a collisionless sheath, the ion current density is:

$$
J = \frac{4\varepsilon_0}{9}\sqrt{\frac{2e}{M_i}} \cdot \frac{V_s^{3/2}}{d^2}
$$

**Where:**

- $\varepsilon_0 = 8.85 \times 10^{-12}$ F/m: Vacuum permittivity
- $V_s$: Sheath voltage drop (typically 10–500 V)
- $d$: Sheath thickness

### Sheath Thickness

The sheath thickness scales as:

$$
d \approx \lambda_D \left(\frac{2eV_s}{k_BT_e}\right)^{3/4}
$$

**Where** the Debye length is:

$$
\lambda_D = \sqrt{\frac{\varepsilon_0 k_B T_e}{n_e e^2}}
$$

### Ion Angular Distribution

Ions arrive at the wafer with an angular distribution:

$$
f(\theta) \propto \exp\left(-\frac{\theta^2}{2\sigma^2}\right)
$$

**Where:**

$$
\sigma \approx \arctan\left(\sqrt{\frac{k_B T_i}{eV_s}}\right)
$$

**Typical values:** $\sigma \approx 2°–5°$ for high-bias conditions.

## Electron Energy Distribution Function

### Non-Maxwellian Distributions

In low-pressure plasmas (1–100 mTorr), the EEDF deviates from Maxwellian.
#### Two-Term Approximation

The EEDF is expanded as:

$$
f(\varepsilon, \theta) = f_0(\varepsilon) + f_1(\varepsilon)\cos\theta
$$

The isotropic part $f_0$ satisfies:

$$
\frac{d}{d\varepsilon}\left[\varepsilon D \frac{df_0}{d\varepsilon} + \left(V + \frac{\varepsilon\nu_{\text{inel}}}{\nu_m}\right)f_0\right] = 0
$$

#### Common Distribution Functions

| Distribution | Functional Form | Applicability |
|-------------|-----------------|---------------|
| **Maxwellian** | $f(\varepsilon) \propto \sqrt{\varepsilon} \exp\left(-\frac{\varepsilon}{k_BT_e}\right)$ | High pressure, collisional |
| **Druyvesteyn** | $f(\varepsilon) \propto \sqrt{\varepsilon} \exp\left(-\left(\frac{\varepsilon}{k_BT_e}\right)^2\right)$ | Elastic collisions dominant |
| **Bi-Maxwellian** | Sum of two Maxwellians | Hot tail population |

### Generalized Form

$$
f(\varepsilon) \propto \sqrt{\varepsilon} \cdot \exp\left[-\left(\frac{\varepsilon}{k_BT_e}\right)^x\right]
$$

- $x = 1$ → Maxwellian
- $x = 2$ → Druyvesteyn

## Plasma Chemistry and Reaction Kinetics

### Species Balance Equation

For species $i$:

$$
\frac{\partial n_i}{\partial t} + \nabla \cdot \mathbf{\Gamma}_i = \sum_j R_j
$$

**Where:**

- $\mathbf{\Gamma}_i$: Species flux
- $R_j$: Reaction rates

### Electron-Impact Rate Coefficients

Rate coefficients are calculated by integration over the EEDF:

$$
k = \int_0^\infty \sigma(\varepsilon) v(\varepsilon) f(\varepsilon) \, d\varepsilon = \langle \sigma v \rangle
$$

**Where:**

- $\sigma(\varepsilon)$: Energy-dependent cross-section $[\text{m}^2]$
- $v(\varepsilon) = \sqrt{2\varepsilon/m_e}$: Electron velocity
- $f(\varepsilon)$: Normalized EEDF

### Heavy-Particle Reactions

Arrhenius kinetics for neutral reactions:

$$
k = A T^n \exp\left(-\frac{E_a}{k_BT}\right)
$$

**Where:**

- $A$: Pre-exponential factor
- $n$: Temperature exponent
- $E_a$: Activation energy

### Example: SF₆/O₂ Plasma Chemistry

#### Electron-Impact Reactions

| Reaction | Type | Threshold |
|----------|------|-----------|
| $e + \text{SF}_6 \rightarrow \text{SF}_5 + \text{F} + e$ | Dissociation | ~10 eV |
| $e + \text{SF}_6 \rightarrow \text{SF}_6^-$ | Attachment | ~0 eV |
| $e + \text{SF}_6 \rightarrow \text{SF}_5^+ + \text{F} + 2e$ | Ionization | ~16 eV |
| $e + \text{O}_2 \rightarrow \text{O} + \text{O} + e$ | Dissociation | ~6 eV |

#### Gas-Phase Reactions

- $\text{F} + \text{O} \rightarrow \text{FO}$ (reduces F atom density)
- $\text{SF}_5 + \text{F} \rightarrow \text{SF}_6$ (recombination)
- $\text{O} + \text{CF}_3 \rightarrow \text{COF}_2 + \text{F}$ (polymer removal)

#### Surface Reactions

- $\text{F} + \text{Si}(s) \rightarrow \text{SiF}_{(\text{ads})}$
- $\text{SiF}_{(\text{ads})} + 3\text{F} \rightarrow \text{SiF}_4(g)$ (volatile product)

## Transport Phenomena

### Drift-Diffusion Model

For charged species, the flux is:

$$
\mathbf{\Gamma} = \pm \mu n \mathbf{E} - D \nabla n
$$

**Where:**

- Upper sign: positive ions
- Lower sign: electrons
- $\mu$: Mobility $[\text{m}^2/(\text{V}\cdot\text{s})]$
- $D$: Diffusion coefficient $[\text{m}^2/\text{s}]$

### Einstein Relation

Connects mobility and diffusion:

$$
D = \frac{\mu k_B T}{e}
$$

### Ambipolar Diffusion

When quasi-neutrality holds ($n_e \approx n_i$):

$$
D_a = \frac{\mu_i D_e + \mu_e D_i}{\mu_i + \mu_e} \approx D_i\left(1 + \frac{T_e}{T_i}\right)
$$

Since $T_e \gg T_i$ typically: $D_a \approx D_i (1 + T_e/T_i) \approx 100 D_i$

### Neutral Transport

For reactive neutrals (radicals), Fickian diffusion:

$$
\frac{\partial n}{\partial t} = D\nabla^2 n + S - L
$$

#### Surface Boundary Condition

$$
-D\frac{\partial n}{\partial x}\bigg|_{\text{surface}} = \frac{1}{4}\gamma n v_{\text{th}}
$$

**Where:**

- $\gamma$: Sticking/reaction coefficient (0 to 1)
- $v_{\text{th}} = \sqrt{\frac{8k_BT}{\pi m}}$: Thermal velocity

### Knudsen Number

Determines the appropriate transport regime:

$$
\text{Kn} = \frac{\lambda}{L}
$$

**Where:**

- $\lambda$: Mean free path
- $L$: Characteristic length

| Kn Range | Regime | Model |
|----------|--------|-------|
| $< 0.01$ | Continuum | Navier-Stokes |
| $0.01–0.1$ | Slip flow | Modified N-S |
| $0.1–10$ | Transition | DSMC/BGK |
| $> 10$ | Free molecular | Ballistic |

## Surface Reaction Modeling

### Langmuir Adsorption Kinetics

For surface coverage $\theta$:

$$
\frac{d\theta}{dt} = k_{\text{ads}}(1-\theta)P - k_{\text{des}}\theta - k_{\text{react}}\theta
$$

**At steady state:**

$$
\theta = \frac{k_{\text{ads}}P}{k_{\text{ads}}P + k_{\text{des}} + k_{\text{react}}}
$$

### Ion-Enhanced Etching

The total etch rate combines multiple mechanisms:

$$
\text{ER} = Y_{\text{chem}} \Gamma_n + Y_{\text{phys}} \Gamma_i + Y_{\text{syn}} \Gamma_i f(\theta)
$$

**Where:**

- $Y_{\text{chem}}$: Chemical etch yield (isotropic)
- $Y_{\text{phys}}$: Physical sputtering yield
- $Y_{\text{syn}}$: Ion-enhanced (synergistic) yield
- $\Gamma_n$, $\Gamma_i$: Neutral and ion fluxes
- $f(\theta)$: Coverage-dependent function

### Ion Sputtering Yield

#### Energy Dependence

$$
Y(E) = A\left(\sqrt{E} - \sqrt{E_{\text{th}}}\right) \quad \text{for } E > E_{\text{th}}
$$

**Typical threshold energies:**

- Si: $E_{\text{th}} \approx 20$ eV
- SiO₂: $E_{\text{th}} \approx 30$ eV
- Si₃N₄: $E_{\text{th}} \approx 25$ eV

#### Angular Dependence

$$
Y(\theta) = Y(0) \cos^{-f}(\theta) \exp\left[-b\left(\frac{1}{\cos\theta} - 1\right)\right]
$$

**Behavior:**

- Increases from normal incidence
- Peaks at $\theta \approx 60°–70°$
- Decreases at grazing angles (reflection dominates)

## Feature-Scale Profile Evolution

### Level Set Method

The surface is represented as the zero contour of $\phi(\mathbf{x}, t)$:

$$
\frac{\partial \phi}{\partial t} + V_n |\nabla \phi| = 0
$$

**Where:**

- $\phi > 0$: Material
- $\phi < 0$: Void/vacuum
- $\phi = 0$: Surface
- $V_n$: Local normal etch velocity

### Local Etch Rate Calculation

The normal velocity $V_n$ depends on:

1. **Ion flux and angular distribution**
   $$\Gamma_i(\mathbf{x}) = \int f(\theta, E) \, d\Omega \, dE$$
2. **Neutral flux** (with shadowing)
   $$\Gamma_n(\mathbf{x}) = \Gamma_{n,0} \cdot \text{VF}(\mathbf{x})$$
   where VF is the view factor
3. **Surface chemistry state**
   $$V_n = f(\Gamma_i, \Gamma_n, \theta_{\text{coverage}}, T)$$

### Neutral Transport in High-Aspect-Ratio Features

#### Clausing Transmission Factor

For a tube of aspect ratio AR:

$$
K \approx \frac{1}{1 + 0.5 \cdot \text{AR}}
$$

#### View Factor Calculations

For surface element $dA_1$ seeing $dA_2$:

$$
F_{1 \rightarrow 2} = \frac{1}{\pi} \int \frac{\cos\theta_1 \cos\theta_2}{r^2} \, dA_2
$$

## Monte Carlo Methods

### Test-Particle Monte Carlo Algorithm

```
1. SAMPLE incident particle from flux distribution at feature opening
   - Ion: from IEDF and IADF
   - Neutral: from Maxwellian
2. TRACE trajectory through feature
   - Ion: ballistic, solve equation of motion
   - Neutral: random walk with wall collisions
3. DETERMINE reaction at surface impact
   - Sample from probability distribution
   - Update surface coverage if adsorption
4. UPDATE surface geometry
   - Remove material (etching)
   - Add material (deposition)
5. REPEAT for statistically significant sample
```

### Ion Trajectory Integration

Through the sheath/feature:

$$
m\frac{d^2\mathbf{r}}{dt^2} = q\mathbf{E}(\mathbf{r})
$$

**Numerical integration:** Velocity-Verlet or Boris algorithm

### Collision Sampling

Null-collision method for efficiency:

$$
P_{\text{collision}} = 1 - \exp(-\nu_{\text{max}} \Delta t)
$$

**Where** $\nu_{\text{max}}$ is the maximum possible collision frequency.
## Multi-Scale Modeling Framework

### Scale Hierarchy

| Scale | Length | Time | Physics | Method |
|-------|--------|------|---------|--------|
| **Reactor** | cm–m | ms–s | Plasma transport, EM fields | Fluid PDE |
| **Sheath** | µm–mm | µs–ms | Ion acceleration, EEDF | Kinetic/Fluid |
| **Feature** | nm–µm | ns–ms | Profile evolution | Level set/MC |
| **Atomic** | Å–nm | ps–ns | Reaction mechanisms | MD/DFT |

### Coupling Approaches

#### Hierarchical (One-Way)

```
Atomic scale → Surface parameters
     ↓
Feature scale ← Fluxes from reactor scale
     ↓
Reactor scale → Process outputs
```

#### Concurrent (Two-Way)

- Feature-scale results feed back to reactor scale
- Requires iterative solution
- Computationally expensive

## Numerical Methods and Challenges

### Stiff ODE Systems

Plasma chemistry involves timescales spanning many orders of magnitude:

| Process | Timescale |
|---------|-----------|
| Electron attachment | $\sim 10^{-10}$ s |
| Ion-molecule reactions | $\sim 10^{-6}$ s |
| Metastable decay | $\sim 10^{-3}$ s |
| Surface diffusion | $\sim 10^{-1}$ s |

#### Implicit Methods Required

**Backward Differentiation Formula (BDF):**

$$
y_{n+1} = \sum_{j=0}^{k-1} \alpha_j y_{n-j} + h\beta f(t_{n+1}, y_{n+1})
$$

### Spatial Discretization

#### Finite Volume Method

Ensures mass conservation:

$$
\int_V \frac{\partial n}{\partial t} dV + \oint_S \mathbf{\Gamma} \cdot d\mathbf{S} = \int_V S \, dV
$$

#### Mesh Requirements

- Sheath resolution: $\Delta x < \lambda_D$
- RF skin depth: $\Delta x < \delta$
- Adaptive mesh refinement (AMR) common

### EM-Plasma Coupling

**Iterative scheme:**

1. Solve Maxwell's equations for $\mathbf{E}$, $\mathbf{B}$
2. Update plasma transport (density, temperature)
3. Recalculate $\sigma$, $\varepsilon_{\text{plasma}}$
4. Repeat until convergence

## Advanced Topics

### Atomic Layer Etching (ALE)

Self-limiting reactions for atomic precision:

$$
\text{EPC} = \Theta \cdot d_{\text{ML}}
$$

**Where:**

- EPC: Etch per cycle
- $\Theta$: Modified layer coverage fraction
- $d_{\text{ML}}$: Monolayer thickness

#### ALE Cycle

1. **Modification step:** Reactive gas creates modified surface layer
   $$\frac{d\Theta}{dt} = k_{\text{mod}}(1-\Theta)P_{\text{gas}}$$
2. **Removal step:** Ion bombardment removes modified layer only
   $$\text{ER} = Y_{\text{mod}}\Gamma_i\Theta$$

### Pulsed Plasma Dynamics

Time-modulated RF introduces:

- **Active glow:** Plasma on, high ion/radical generation
- **Afterglow:** Plasma off, selective chemistry

#### Ion Energy Modulation

By pulsing bias:

$$
\langle E_i \rangle = \frac{1}{T}\left[\int_0^{t_{\text{on}}} E_{\text{high}}dt + \int_{t_{\text{on}}}^{T} E_{\text{low}}dt\right]
$$

### High-Aspect-Ratio Etching (HAR)

For AR > 50 (memory, 3D NAND):

**Challenges:**

- Ion angular broadening → bowing
- Neutral depletion at bottom
- Feature charging → twisting
- Mask erosion → tapering

**Ion Angular Distribution Broadening:**

$$
\sigma_{\text{effective}} = \sqrt{\sigma_{\text{sheath}}^2 + \sigma_{\text{scattering}}^2}
$$

**Neutral Flux at Bottom:**

$$
\Gamma_{\text{bottom}} \approx \Gamma_{\text{top}} \cdot K(\text{AR})
$$

### Machine Learning Integration

**Applications:**

- Surrogate models for fast prediction
- Process optimization (Bayesian)
- Virtual metrology
- Anomaly detection

**Physics-Informed Neural Networks (PINNs):**

$$
\mathcal{L} = \mathcal{L}_{\text{data}} + \lambda \mathcal{L}_{\text{physics}}
$$

Where $\mathcal{L}_{\text{physics}}$ enforces governing equations.
## Validation and Experimental Techniques

### Plasma Diagnostics

| Technique | Measurement | Typical Values |
|-----------|-------------|----------------|
| **Langmuir probe** | $n_e$, $T_e$, EEDF | $10^{9}–10^{12}$ cm⁻³, 1–5 eV |
| **OES** | Relative species densities | Qualitative/semi-quantitative |
| **APMS** | Ion mass, energy | 1–500 amu, 0–500 eV |
| **LIF** | Absolute radical density | $10^{11}–10^{14}$ cm⁻³ |
| **Microwave interferometry** | $n_e$ (line-averaged) | $10^{10}–10^{12}$ cm⁻³ |

### Etch Characterization

- **Profilometry:** Etch depth, uniformity
- **SEM/TEM:** Feature profiles, sidewall angle
- **XPS:** Surface composition
- **Ellipsometry:** Film thickness, optical properties

### Model Validation Workflow

1. **Plasma validation:** Match $n_e$, $T_e$, species densities
2. **Flux validation:** Compare ion/neutral fluxes to wafer
3. **Etch rate validation:** Blanket wafer etch rates
4. **Profile validation:** Patterned feature cross-sections

## Key Dimensionless Numbers Summary

| Number | Definition | Physical Meaning |
|--------|------------|------------------|
| **Knudsen** | $\text{Kn} = \lambda/L$ | Continuum vs. kinetic |
| **Damköhler** | $\text{Da} = \tau_{\text{transport}}/\tau_{\text{reaction}}$ | Transport vs. reaction limited |
| **Sticking coefficient** | $\gamma = \text{reactions}/\text{collisions}$ | Surface reactivity |
| **Aspect ratio** | $\text{AR} = \text{depth}/\text{width}$ | Feature geometry |
| **Debye number** | $N_D = n\lambda_D^3$ | Plasma ideality |

## Physical Constants

| Constant | Symbol | Value |
|----------|--------|-------|
| Elementary charge | $e$ | $1.602 \times 10^{-19}$ C |
| Electron mass | $m_e$ | $9.109 \times 10^{-31}$ kg |
| Proton mass | $m_p$ | $1.673 \times 10^{-27}$ kg |
| Boltzmann constant | $k_B$ | $1.381 \times 10^{-23}$ J/K |
| Vacuum permittivity | $\varepsilon_0$ | $8.854 \times 10^{-12}$ F/m |
| Vacuum permeability | $\mu_0$ | $4\pi \times 10^{-7}$ H/m |