Semiconductor Manufacturing: Multi-Scale Problems and Mathematical Modeling
1. The Multi-Scale Hierarchy
Semiconductor manufacturing spans roughly 12 orders of magnitude in length scale, each with distinct physics:
| Scale | Range | Phenomena | Mathematical Approach |
|-------|-------|-----------|----------------------|
| Quantum/Atomic | 0.1β1 nm | Bond formation, electron tunneling, reaction barriers | DFT, quantum chemistry |
| Molecular | 1β10 nm | Surface reactions, nucleation, atomic diffusion | Kinetic Monte Carlo, MD |
| Feature | 10 nm β 1 ΞΌm | Line edge roughness, profile evolution, grain structure | Level set, phase field |
| Device | 1β100 ΞΌm | Transistor variability, local stress | Continuum FEM |
| Die | 1β10 mm | Pattern density effects, thermal gradients | PDE-based continuum |
| Wafer | 300 mm | Global uniformity, edge effects | Equipment-scale models |
| Reactor | ~1 m | Plasma distribution, gas flow | CFD, plasma fluid models |
Fundamental Challenge
Physics at each scale influences adjacent scales, creating coupled nonlinear systems with vastly different characteristic times and lengths.
2. Key Processes and Mathematical Structure
2.1 Plasma Etching β The Most Complex Multi-Scale Problem
2.1.1 Reactor Scale (Continuum)
Electron density evolution:
$$
\frac{\partial n_e}{\partial t} +
abla \cdot \boldsymbol{\Gamma}_e = S_e - L_e
$$
Ion density evolution:
$$
\frac{\partial n_i}{\partial t} +
abla \cdot \boldsymbol{\Gamma}_i = S_i - L_i
$$
Poisson equation for electric potential:
$$
abla^2 \phi = -\frac{e}{\epsilon_0}(n_i - n_e)
$$
Where:
- $n_e$, $n_i$ = electron and ion densities
- $\boldsymbol{\Gamma}_e$, $\boldsymbol{\Gamma}_i$ = electron and ion fluxes
- $S_e$, $S_i$ = source terms (ionization)
- $L_e$, $L_i$ = loss terms (recombination)
- $\phi$ = electric potential
- $e$ = elementary charge
- $\epsilon_0$ = permittivity of free space
2.1.2 Feature Scale β Profile Evolution via Level Set
Level set equation:
$$
\frac{\partial \phi}{\partial t} + V_n |
abla \phi| = 0
$$
Where:
- $\phi(x,t) = 0$ defines the evolving surface
- $V_n$ = local etch rate (normal velocity)
The local etch rate $V_n$ depends on:
- Ion flux and angle distribution (from sheath physics)
- Neutral species flux (from transport)
- Surface chemistry (from atomic-scale kinetics)
2.1.3 The Coupling Problem
The feature-scale etch rate $V_n$ requires:
- Ion angular/energy distributions β from sheath models
- Sheath models β depend on plasma conditions
- Plasma conditions β affected by loading (total surface area being etched)
This creates a global-to-local-to-global feedback loop.
2.2 Chemical Vapor Deposition (CVD) / Atomic Layer Deposition (ALD)
2.2.1 Gas-Phase Transport (Continuum)
Navier-Stokes momentum equation:
$$
\rho\left(\frac{\partial \mathbf{u}}{\partial t} + \mathbf{u} \cdot
abla \mathbf{u}\right) = -
abla p + \mu
abla^2 \mathbf{u}
$$
Species transport equation:
$$
\frac{\partial C_k}{\partial t} + \mathbf{u} \cdot
abla C_k = D_k
abla^2 C_k + R_k
$$
Where:
- $\rho$ = gas density
- $\mathbf{u}$ = velocity field
- $p$ = pressure
- $\mu$ = dynamic viscosity
- $C_k$ = concentration of species $k$
- $D_k$ = diffusion coefficient
- $R_k$ = reaction rate
2.2.2 Surface Kinetics (Stochastic/Molecular)
Adsorption rate:
$$
r_{ads} = s_0 \cdot f(\theta) \cdot F
$$
Where:
- $s_0$ = sticking coefficient
- $f(\theta)$ = coverage-dependent function
- $F$ = incident flux
Surface diffusion hopping rate:
$$
u =
u_0 \exp\left(-\frac{E_a}{k_B T}\right)
$$
Where:
- $
u_0$ = attempt frequency
- $E_a$ = activation energy
- $k_B$ = Boltzmann constant
- $T$ = temperature
2.2.3 Mathematical Tension
Gas-phase transport is deterministic continuum; surface evolution involves discrete stochastic events. The boundary condition for the continuum problem depends on atomistic surface dynamics.
2.3 Lithography
2.3.1 Aerial Image Formation (Wave Optics)
Hopkins formulation for partially coherent imaging:
$$
I(\mathbf{r}) = \sum_j w_j \left| \iint M(f_x, f_y) H_j(f_x, f_y) e^{2\pi i(f_x x + f_y y)} \, df_x \, df_y \right|^2
$$
Where:
- $I(\mathbf{r})$ = image intensity at position $\mathbf{r}$
- $M(f_x, f_y)$ = mask spectrum (Fourier transform of mask pattern)
- $H_j(f_x, f_y)$ = pupil function for source point $j$
- $w_j$ = weight for source point $j$
2.3.2 Photoresist Chemistry
Exposure (photoactive compound destruction):
$$
\frac{\partial m}{\partial t} = -C \cdot I \cdot m
$$
Post-exposure bake diffusion (acid diffusion):
$$
\frac{\partial h}{\partial t} = D_h
abla^2 h
$$
Development rate (Mack model):
$$
R = R_0 \frac{(1-m)^n + \epsilon}{(1-m)^n + 1}
$$
Where:
- $m$ = normalized photoactive compound concentration
- $C$ = exposure rate constant
- $I$ = intensity
- $h$ = acid concentration
- $D_h$ = acid diffusion coefficient
- $R_0$ = maximum development rate
- $n$ = dissolution selectivity parameter
- $\epsilon$ = dissolution rate ratio
2.3.3 Stochastic Challenge at Advanced Nodes
At EUV wavelength (13.5 nm), photon shot noise becomes significant:
$$
\text{Fluctuation} \sim \frac{1}{\sqrt{N}}
$$
Where $N$ = number of photons per feature area.
This translates to line edge roughness (LER) of ~2-3 nm β comparable to feature dimensions.
2.4 Diffusion and Annealing
Classical Fick's law fails because:
- Diffusion is mediated by point defects (vacancies, interstitials)
- Defect concentrations depend on dopant concentration
- Stress affects diffusion
- Transient enhanced diffusion during implant damage annealing
Five-Stream Model
$$
\frac{\partial C_s}{\partial t} =
abla \cdot (D_s
abla C_s) + \text{reactions with } C_I, C_V, C_{As}, C_{AV}, \ldots
$$
Where:
- $C_s$ = substitutional dopant concentration
- $C_I$ = interstitial concentration
- $C_V$ = vacancy concentration
- $C_{As}$ = dopant-interstitial pair concentration
- $C_{AV}$ = dopant-vacancy pair concentration
This creates a coupled nonlinear system of 5+ PDEs with concentration-dependent coefficients spanning time scales from picoseconds to hours.
3. Mathematical Frameworks for Multi-Scale Coupling
3.1 Homogenization Theory
For problems with periodic microstructure at scale $\epsilon$:
$$
-
abla \cdot \left( A^\epsilon(x)
abla u^\epsilon \right) = f
$$
Where $A^\epsilon(x) = A(x/\epsilon)$ oscillates rapidly.
Two-Scale Expansion
$$
u^\epsilon(x) = u_0\left(x, \frac{x}{\epsilon}\right) + \epsilon \, u_1\left(x, \frac{x}{\epsilon}\right) + \epsilon^2 \, u_2\left(x, \frac{x}{\epsilon}\right) + \ldots
$$
This yields an effective coefficient $A^*$ that captures microscale physics in a macroscale equation.
Rigorous for linear elliptic problems; much harder for nonlinear, time-dependent cases in manufacturing.
3.2 Heterogeneous Multiscale Method (HMM)
Key Idea: Run microscale simulations only where/when needed to extract effective properties for the macroscale solver.
```
ββββββββββββββββββββββββββββββββββββββββββ
β MACRO SOLVER (continuum PDE) β
β Uses effective coefficients D, k β
ββββββββββββββββββββ¬ββββββββββββββββββββββ
β Query at macro points
βΌ
ββββββββββββββββββββββββββββββββββββββββββ
β MICRO SIMULATIONS (MD, KMC, etc.) β
β Constrained by local macro state β
β Returns averaged properties β
ββββββββββββββββββββββββββββββββββββββββββ
Mathematical Formulation
Macro equation:
$$
\frac{\partial U}{\partial t} = F\left(U, D^*(U)\right)
$$
Micro-to-macro coupling:
$$
D^*(U) = \langle d(u) \rangle_{\text{micro}}
$$
Where the micro simulation is constrained by the macroscopic state $U$.
3.3 Kinetic-Continuum Transition
Boltzmann Equation
$$
\frac{\partial f}{\partial t} + \mathbf{v} \cdot
abla_x f + \frac{\mathbf{F}}{m} \cdot
abla_v f = Q(f,f)
$$
Where:
- $f(\mathbf{x}, \mathbf{v}, t)$ = distribution function
- $\mathbf{v}$ = velocity
- $\mathbf{F}$ = external force
- $m$ = particle mass
- $Q(f,f)$ = collision operator
Chapman-Enskog Expansion
Derives Navier-Stokes equations in the limit:
$$
Kn \to 0
$$
Where the Knudsen number is defined as:
$$
Kn = \frac{\lambda}{L}
$$
- $\lambda$ = mean free path
- $L$ = characteristic length
Spatial Variation of Knudsen Number
| Region | Knudsen Number | Valid Model |
|--------|---------------|-------------|
| Bulk reactor | $Kn \ll 1$ | Continuum (Navier-Stokes) |
| Feature trenches | $Kn \sim 1$ | Transitional regime |
| Surfaces, small features | $Kn \gg 1$ | Kinetic (Boltzmann) |
3.4 Level Set and Phase Field Methods
3.4.1 Level Set Method
Interface definition: $\{\mathbf{x} : \phi(\mathbf{x},t) = 0\}$
Evolution equation:
$$
\frac{\partial \phi}{\partial t} + V_n |
abla \phi| = 0
$$
Advantages:
- Handles topology changes naturally (merging, splitting)
- Implicit representation avoids mesh issues
Challenges:
- Maintaining $|
abla \phi| = 1$ (signed distance property)
- Velocity extension from interface to entire domain
3.4.2 Phase Field Method
Diffuse interface evolution:
$$
\frac{\partial \phi}{\partial t} = M\left[\epsilon^2
abla^2 \phi - f'(\phi) + \lambda g'(\phi)\right]
$$
Where:
- $M$ = mobility
- $\epsilon$ = interface width parameter
- $f(\phi)$ = double-well potential
- $g(\phi)$ = driving force
- $\lambda$ = coupling constant
Advantages:
- No explicit interface tracking required
- Natural handling of complex morphologies
Challenges:
- Resolving thin interface requires fine mesh
- Selecting appropriate interface width $\epsilon$
4. Fundamental Mathematical Challenges
4.1 Stiffness and Time-Scale Separation
| Process | Characteristic Time |
|---------|-------------------|
| Electron dynamics | $10^{-12}$ s |
| Surface reactions | $10^{-9}$ β $10^{-6}$ s |
| Gas transport | $10^{-3}$ s |
| Feature evolution | $1$ β $10^{2}$ s |
| Wafer processing | $10^{2}$ β $10^{4}$ s |
Time scale ratio: $\sim 10^{16}$ between fastest and slowest processes.
Direct simulation is impossible.
Solution Strategies
- Implicit time integration with adaptive stepping
- Quasi-steady state approximations for fast variables
- Operator splitting: Treat different physics on different time scales
- Averaging/homogenization to eliminate fast oscillations
4.2 High Dimensionality
The kinetic description $f(\mathbf{x}, \mathbf{v}, t)$ lives in 6D phase space.
Adding internal energy states and multiple species β intractable.
Reduction Strategies
- Moment methods: Track $\langle 1, v, v^2, \ldots \rangle_v$ rather than full $f$
- Monte Carlo: Sample from distribution rather than discretizing
- Proper Orthogonal Decomposition (POD): Find low-dimensional subspace
- Neural network surrogates: Learn mapping from inputs to outputs
4.3 Stochastic Effects at Nanoscale
At sub-10nm, continuum assumptions fail due to:
- Discreteness of atoms: Can't average over enough atoms
- Shot noise: Finite number of photons, ions, molecules
- Line edge roughness: Atomic-scale randomness in edge positions
Mathematical Treatment
Stochastic PDEs (Langevin form):
$$
du = \mathcal{L}u \, dt + \sigma \, dW
$$
Where $dW$ is a Wiener process increment.
Master equation:
$$
\frac{dP_n}{dt} = \sum_m \left( W_{nm} P_m - W_{mn} P_n \right)
$$
Where:
- $P_n$ = probability of state $n$
- $W_{nm}$ = transition rate from state $m$ to state $n$
Kinetic Monte Carlo: Direct simulation of discrete events with proper time advancement.
4.4 Inverse Problems and Control
Forward problem: Given process parameters β predict outcome
Inverse problem: Given desired outcome β find parameters
Manufacturing Requirements
- Recipe optimization
- Run-to-run control
- Fault detection/classification
Mathematical Challenges
- Ill-posedness: Multiple solutions, sensitivity to noise
- High dimensionality of parameter space
- Real-time constraints for feedback control
Approaches
- Regularization: Tikhonov, sparse methods
- Bayesian inference: Uncertainty quantification
- Optimal control theory: Adjoint methods
- Surrogate-based optimization: Using ML models
5. Current Frontiers
5.1 Physics-Informed Machine Learning
Loss Function Structure
$$
\mathcal{L} = \mathcal{L}_{\text{data}} + \lambda_{\text{physics}} \mathcal{L}_{\text{PDE}} + \lambda_{\text{BC}} \mathcal{L}_{\text{boundary}}
$$
Where:
- $\mathcal{L}_{\text{data}}$ = data fitting loss
- $\mathcal{L}_{\text{PDE}}$ = physics constraint (PDE residual)
- $\mathcal{L}_{\text{boundary}}$ = boundary condition constraint
- $\lambda$ = weighting hyperparameters
Methods
- Physics-Informed Neural Networks (PINNs): Embed governing equations as soft constraints
- Neural operators (DeepONet, FNO): Learn mappings between function spaces
- Hybrid models: Combine physics-based and data-driven components
Challenges Specific to Semiconductor Manufacturing
- Sparse experimental data (wafers are expensive)
- Extrapolation to new process conditions
- Interpretability requirements for process understanding
- Certification for high-reliability applications
5.2 Uncertainty Quantification at Scale
Manufacturing requires predicting distributions, not just means:
- What is $P(\text{yield} > 0.95)$?
- What is the 99th percentile of line width variation?
Polynomial Chaos Expansion
$$
u(\mathbf{x}, \boldsymbol{\xi}) = \sum_{k} u_k(\mathbf{x}) \Psi_k(\boldsymbol{\xi})
$$
Where:
- $\boldsymbol{\xi}$ = random input parameters
- $\Psi_k$ = orthogonal polynomial basis functions
- $u_k(\mathbf{x})$ = deterministic coefficient functions
Challenge: Curse of Dimensionality
50+ random input parameters is common in semiconductor manufacturing.
Solutions
- Sparse polynomial chaos
- Active subspaces (dimension reduction)
- Multi-fidelity methods (combine cheap/accurate models)
5.3 Quantum Effects at Sub-Nanometer Scale
As features approach ~1 nm:
- Quantum tunneling through gate oxides
- Quantum confinement affects electron states
- Atomistic variability in dopant positions β device-to-device variation
Non-Equilibrium Green's Function (NEGF) Method
For quantum transport:
$$
G^R(E) = \left[ (E + i\eta)I - H - \Sigma^R \right]^{-1}
$$
Where:
- $G^R$ = retarded Green's function
- $E$ = energy
- $H$ = Hamiltonian
- $\Sigma^R$ = self-energy (contact + scattering)
- $\eta$ = infinitesimal positive number
6. Conceptual Framework
Unified View of Multi-Scale Modeling
```
ATOMISTIC MESOSCALE CONTINUUM EQUIPMENT
(QM/MD/KMC) (Phase field, (CFD, FEM, (Reactor-scale
Level set) Drift-diff) transport)
β β β β
β Coarse β Averaging β Lumped β
ββββgrainingβββββΊββββββββββββββββββββΊββββparametersββββΊβ
β β β β
ββββBoundary βββββ€βββEffective βββββββ€βββBoundaryβββββββ€
β conditions β coefficients β conditions β
β β β β
ββββββ΄βββββββββββββββββ΄ββββββββββββββββββββ΄ββββββββββββββββββ΄βββββ
Information flow (bidirectional coupling)
Key Mathematical Requirements
- Consistency: Coarse-grained models recover fine-scale physics in appropriate limits
- Conservation: Mass, momentum, energy preserved across scales
- Efficiency: Computational cost scales with information content, not raw degrees of freedom
- Adaptivity: Automatically refine where and when needed
7. Open Mathematical Problems
| Problem | Current State | Mathematical Need |
|---------|--------------|-------------------|
| Stochastic feature-scale modeling | KMC possible but expensive | Fast stochastic PDE methods |
| Plasma-surface coupling | Often one-way coupling | Consistent two-way coupling with rigorous error bounds |
| Real-time model-predictive control | Simplified ROMs | Fast surrogates with guaranteed accuracy |
| Variability prediction | Expensive Monte Carlo | Efficient UQ for high-dimensional inputs |
| Atomic-to-device coupling | Sequential handoff | Concurrent adaptive methods |
| Inverse design | Local optimization | Global optimization in high dimensions |
Key Equations Summary
Transport Equations
$$
\text{Continuity:} \quad \frac{\partial \rho}{\partial t} +
abla \cdot (\rho \mathbf{u}) = 0
$$
$$
\text{Momentum:} \quad \rho \frac{D\mathbf{u}}{Dt} = -
abla p + \mu
abla^2 \mathbf{u} + \mathbf{f}
$$
$$
\text{Energy:} \quad \rho c_p \frac{DT}{Dt} = k
abla^2 T + \dot{q}
$$
$$
\text{Species:} \quad \frac{\partial C_k}{\partial t} +
abla \cdot (C_k \mathbf{u}) = D_k
abla^2 C_k + R_k
$$
Interface Evolution
$$
\text{Level Set:} \quad \frac{\partial \phi}{\partial t} + V_n |
abla \phi| = 0
$$
$$
\text{Phase Field:} \quad \tau \frac{\partial \phi}{\partial t} = \epsilon^2
abla^2 \phi - f'(\phi)
$$
Kinetic Theory
$$
\text{Boltzmann:} \quad \frac{\partial f}{\partial t} + \mathbf{v} \cdot
abla_x f + \frac{\mathbf{F}}{m} \cdot
abla_v f = Q(f,f)
$$
$$
\text{Knudsen Number:} \quad Kn = \frac{\lambda}{L}
$$
Stochastic Modeling
$$
\text{Langevin SDE:} \quad dX = a(X,t) \, dt + b(X,t) \, dW
$$
$$
\text{Fokker-Planck:} \quad \frac{\partial p}{\partial t} = -
abla \cdot (a \, p) + \frac{1}{2}
abla^2 (b^2 p)
$$
Nomenclature
| Symbol | Description | Units |
|--------|-------------|-------|
| $\rho$ | Density | kg/mΒ³ |
| $\mathbf{u}$ | Velocity vector | m/s |
| $p$ | Pressure | Pa |
| $T$ | Temperature | K |
| $C_k$ | Concentration of species $k$ | mol/mΒ³ |
| $D_k$ | Diffusion coefficient | mΒ²/s |
| $\phi$ | Level set function or phase field | β |
| $V_n$ | Normal interface velocity | m/s |
| $f$ | Distribution function | β |
| $Kn$ | Knudsen number | β |
| $\lambda$ | Mean free path | m |
| $E_a$ | Activation energy | J/mol |
| $k_B$ | Boltzmann constant | J/K |