Semiconductor Manufacturing: Multi-Scale Problems and Mathematical Modeling

Semiconductor Manufacturing: Multi-Scale Problems and Mathematical Modeling

1. The Multi-Scale Hierarchy

Semiconductor manufacturing spans roughly 12 orders of magnitude in length scale, each with distinct physics:

| Scale | Range | Phenomena | Mathematical Approach |
|-------|-------|-----------|----------------------|
| Quantum/Atomic | 0.1–1 nm | Bond formation, electron tunneling, reaction barriers | DFT, quantum chemistry |
| Molecular | 1–10 nm | Surface reactions, nucleation, atomic diffusion | Kinetic Monte Carlo, MD |
| Feature | 10 nm – 1 μm | Line edge roughness, profile evolution, grain structure | Level set, phase field |
| Device | 1–100 μm | Transistor variability, local stress | Continuum FEM |
| Die | 1–10 mm | Pattern density effects, thermal gradients | PDE-based continuum |
| Wafer | 300 mm | Global uniformity, edge effects | Equipment-scale models |
| Reactor | ~1 m | Plasma distribution, gas flow | CFD, plasma fluid models |

Fundamental Challenge

Physics at each scale influences adjacent scales, creating coupled nonlinear systems with vastly different characteristic times and lengths.

2. Key Processes and Mathematical Structure

2.1 Plasma Etching — The Most Complex Multi-Scale Problem

2.1.1 Reactor Scale (Continuum)

Electron density evolution:

$$
\frac{\partial n_e}{\partial t} +
abla \cdot \boldsymbol{\Gamma}_e = S_e - L_e
$$

Ion density evolution:

$$
\frac{\partial n_i}{\partial t} +
abla \cdot \boldsymbol{\Gamma}_i = S_i - L_i
$$

Poisson equation for electric potential:

abla^2 \phi = -\frac{e}{\epsilon_0}(n_i - n_e)
$$

Where:

- $n_e$, $n_i$ = electron and ion densities
- $\boldsymbol{\Gamma}_e$, $\boldsymbol{\Gamma}_i$ = electron and ion fluxes
- $S_e$, $S_i$ = source terms (ionization)
- $L_e$, $L_i$ = loss terms (recombination)
- $\phi$ = electric potential
- $e$ = elementary charge
- $\epsilon_0$ = permittivity of free space

2.1.2 Feature Scale — Profile Evolution via Level Set

Level set equation:

$$
\frac{\partial \phi}{\partial t} + V_n |
abla \phi| = 0
$$

Where:

- $\phi(x,t) = 0$ defines the evolving surface
- $V_n$ = local etch rate (normal velocity)

The local etch rate $V_n$ depends on:

- Ion flux and angle distribution (from sheath physics)
- Neutral species flux (from transport)
- Surface chemistry (from atomic-scale kinetics)

2.1.3 The Coupling Problem

The feature-scale etch rate $V_n$ requires:

- Ion angular/energy distributions → from sheath models
- Sheath models → depend on plasma conditions
- Plasma conditions → affected by loading (total surface area being etched)

This creates a global-to-local-to-global feedback loop.

2.2 Chemical Vapor Deposition (CVD) / Atomic Layer Deposition (ALD)

2.2.1 Gas-Phase Transport (Continuum)

Navier-Stokes momentum equation:

$$
\rho\left(\frac{\partial \mathbf{u}}{\partial t} + \mathbf{u} \cdot
abla \mathbf{u}\right) = -
abla p + \mu
abla^2 \mathbf{u}
$$

Species transport equation:

$$
\frac{\partial C_k}{\partial t} + \mathbf{u} \cdot
abla C_k = D_k
abla^2 C_k + R_k
$$

Where:

- $\rho$ = gas density
- $\mathbf{u}$ = velocity field
- $p$ = pressure
- $\mu$ = dynamic viscosity
- $C_k$ = concentration of species $k$
- $D_k$ = diffusion coefficient
- $R_k$ = reaction rate

2.2.2 Surface Kinetics (Stochastic/Molecular)

Adsorption rate:

$$
r_{ads} = s_0 \cdot f(\theta) \cdot F
$$

Where:

- $s_0$ = sticking coefficient
- $f(\theta)$ = coverage-dependent function
- $F$ = incident flux

Surface diffusion hopping rate:

u =
u_0 \exp\left(-\frac{E_a}{k_B T}\right)
$$

Where:

- $
u_0$ = attempt frequency
- $E_a$ = activation energy
- $k_B$ = Boltzmann constant
- $T$ = temperature

2.2.3 Mathematical Tension

Gas-phase transport is deterministic continuum; surface evolution involves discrete stochastic events. The boundary condition for the continuum problem depends on atomistic surface dynamics.

2.3 Lithography

2.3.1 Aerial Image Formation (Wave Optics)

Hopkins formulation for partially coherent imaging:

$$
I(\mathbf{r}) = \sum_j w_j \left| \iint M(f_x, f_y) H_j(f_x, f_y) e^{2\pi i(f_x x + f_y y)} \, df_x \, df_y \right|^2
$$

Where:

- $I(\mathbf{r})$ = image intensity at position $\mathbf{r}$
- $M(f_x, f_y)$ = mask spectrum (Fourier transform of mask pattern)
- $H_j(f_x, f_y)$ = pupil function for source point $j$
- $w_j$ = weight for source point $j$

2.3.2 Photoresist Chemistry

Exposure (photoactive compound destruction):

$$
\frac{\partial m}{\partial t} = -C \cdot I \cdot m
$$

Post-exposure bake diffusion (acid diffusion):

$$
\frac{\partial h}{\partial t} = D_h
abla^2 h
$$

Development rate (Mack model):

$$
R = R_0 \frac{(1-m)^n + \epsilon}{(1-m)^n + 1}
$$

Where:

- $m$ = normalized photoactive compound concentration
- $C$ = exposure rate constant
- $I$ = intensity
- $h$ = acid concentration
- $D_h$ = acid diffusion coefficient
- $R_0$ = maximum development rate
- $n$ = dissolution selectivity parameter
- $\epsilon$ = dissolution rate ratio

2.3.3 Stochastic Challenge at Advanced Nodes

At EUV wavelength (13.5 nm), photon shot noise becomes significant:

$$
\text{Fluctuation} \sim \frac{1}{\sqrt{N}}
$$

Where $N$ = number of photons per feature area.

This translates to line edge roughness (LER) of ~2-3 nm — comparable to feature dimensions.

2.4 Diffusion and Annealing

Classical Fick's law fails because:

- Diffusion is mediated by point defects (vacancies, interstitials)
- Defect concentrations depend on dopant concentration
- Stress affects diffusion
- Transient enhanced diffusion during implant damage annealing

Five-Stream Model

$$
\frac{\partial C_s}{\partial t} =
abla \cdot (D_s
abla C_s) + \text{reactions with } C_I, C_V, C_{As}, C_{AV}, \ldots
$$

Where:

- $C_s$ = substitutional dopant concentration
- $C_I$ = interstitial concentration
- $C_V$ = vacancy concentration
- $C_{As}$ = dopant-interstitial pair concentration
- $C_{AV}$ = dopant-vacancy pair concentration

This creates a coupled nonlinear system of 5+ PDEs with concentration-dependent coefficients spanning time scales from picoseconds to hours.

3. Mathematical Frameworks for Multi-Scale Coupling

3.1 Homogenization Theory

For problems with periodic microstructure at scale $\epsilon$:

$$
-
abla \cdot \left( A^\epsilon(x)
abla u^\epsilon \right) = f
$$

Where $A^\epsilon(x) = A(x/\epsilon)$ oscillates rapidly.

Two-Scale Expansion

$$
u^\epsilon(x) = u_0\left(x, \frac{x}{\epsilon}\right) + \epsilon \, u_1\left(x, \frac{x}{\epsilon}\right) + \epsilon^2 \, u_2\left(x, \frac{x}{\epsilon}\right) + \ldots
$$

This yields an effective coefficient $A^*$ that captures microscale physics in a macroscale equation.

Rigorous for linear elliptic problems; much harder for nonlinear, time-dependent cases in manufacturing.

3.2 Heterogeneous Multiscale Method (HMM)

Key Idea: Run microscale simulations only where/when needed to extract effective properties for the macroscale solver.

``┌────────────────────────────────────────┐ │ MACRO SOLVER (continuum PDE) │ │ Uses effective coefficients D, k │ └──────────────────┬─────────────────────┘ │ Query at macro points ▼ ┌────────────────────────────────────────┐ │ MICRO SIMULATIONS (MD, KMC, etc.) │ │ Constrained by local macro state │ │ Returns averaged properties │ └────────────────────────────────────────┘`

Mathematical Formulation

Macro equation:

$$ \frac{\partial U}{\partial t} = F\left(U, D^*(U)\right) $$

Micro-to-macro coupling:

$$ D^*(U) = \langle d(u) \rangle_{\text{micro}} $$

Where the micro simulation is constrained by the macroscopic state $U$.

3.3 Kinetic-Continuum Transition

Boltzmann Equation

$$ \frac{\partial f}{\partial t} + \mathbf{v} \cdot abla_x f + \frac{\mathbf{F}}{m} \cdot abla_v f = Q(f,f) $$

Where:

- $f(\mathbf{x}, \mathbf{v}, t)$ = distribution function - $\mathbf{v}$ = velocity - $\mathbf{F}$ = external force - $m$ = particle mass - $Q(f,f)$ = collision operator

Chapman-Enskog Expansion

Derives Navier-Stokes equations in the limit:

$$ Kn \to 0 $$

Where the Knudsen number is defined as:

$$ Kn = \frac{\lambda}{L} $$

- $\lambda$ = mean free path - $L$ = characteristic length

Spatial Variation of Knudsen Number

| Region | Knudsen Number | Valid Model | |--------|---------------|-------------| | Bulk reactor | $Kn \ll 1$ | Continuum (Navier-Stokes) | | Feature trenches | $Kn \sim 1$ | Transitional regime | | Surfaces, small features | $Kn \gg 1$ | Kinetic (Boltzmann) |

3.4 Level Set and Phase Field Methods

3.4.1 Level Set Method

Interface definition: $\{\mathbf{x} : \phi(\mathbf{x},t) = 0\}$

Evolution equation:

$$ \frac{\partial \phi}{\partial t} + V_n | abla \phi| = 0 $$

Advantages:

- Handles topology changes naturally (merging, splitting) - Implicit representation avoids mesh issues

Challenges:

- Maintaining $| abla \phi| = 1$ (signed distance property) - Velocity extension from interface to entire domain

3.4.2 Phase Field Method

Diffuse interface evolution:

$$ \frac{\partial \phi}{\partial t} = M\left[\epsilon^2 abla^2 \phi - f'(\phi) + \lambda g'(\phi)\right] $$

Where:

- $M$ = mobility - $\epsilon$ = interface width parameter - $f(\phi)$ = double-well potential - $g(\phi)$ = driving force - $\lambda$ = coupling constant

Advantages:

- No explicit interface tracking required - Natural handling of complex morphologies

Challenges:

- Resolving thin interface requires fine mesh - Selecting appropriate interface width $\epsilon$

4. Fundamental Mathematical Challenges

4.1 Stiffness and Time-Scale Separation

| Process | Characteristic Time | |---------|-------------------| | Electron dynamics | $10^{-12}$ s | | Surface reactions | $10^{-9}$ – $10^{-6}$ s | | Gas transport | $10^{-3}$ s | | Feature evolution | $1$ – $10^{2}$ s | | Wafer processing | $10^{2}$ – $10^{4}$ s |

Time scale ratio: $\sim 10^{16}$ between fastest and slowest processes.

Direct simulation is impossible.

Solution Strategies

- Implicit time integration with adaptive stepping - Quasi-steady state approximations for fast variables - Operator splitting: Treat different physics on different time scales - Averaging/homogenization to eliminate fast oscillations

4.2 High Dimensionality

The kinetic description $f(\mathbf{x}, \mathbf{v}, t)$ lives in 6D phase space.

Adding internal energy states and multiple species → intractable.

Reduction Strategies

- Moment methods: Track $\langle 1, v, v^2, \ldots \rangle_v$ rather than full $f$ - Monte Carlo: Sample from distribution rather than discretizing - Proper Orthogonal Decomposition (POD): Find low-dimensional subspace - Neural network surrogates: Learn mapping from inputs to outputs

4.3 Stochastic Effects at Nanoscale

At sub-10nm, continuum assumptions fail due to:

- Discreteness of atoms: Can't average over enough atoms - Shot noise: Finite number of photons, ions, molecules - Line edge roughness: Atomic-scale randomness in edge positions

Mathematical Treatment

Stochastic PDEs (Langevin form):

$$ du = \mathcal{L}u \, dt + \sigma \, dW $$

Where $dW$ is a Wiener process increment.

Master equation:

$$ \frac{dP_n}{dt} = \sum_m \left( W_{nm} P_m - W_{mn} P_n \right) $$

Where:

- $P_n$ = probability of state $n$ - $W_{nm}$ = transition rate from state $m$ to state $n$

Kinetic Monte Carlo: Direct simulation of discrete events with proper time advancement.

4.4 Inverse Problems and Control

Forward problem: Given process parameters → predict outcome

Inverse problem: Given desired outcome → find parameters

Manufacturing Requirements

- Recipe optimization - Run-to-run control - Fault detection/classification

Mathematical Challenges

- Ill-posedness: Multiple solutions, sensitivity to noise - High dimensionality of parameter space - Real-time constraints for feedback control

Approaches

- Regularization: Tikhonov, sparse methods - Bayesian inference: Uncertainty quantification - Optimal control theory: Adjoint methods - Surrogate-based optimization: Using ML models

5. Current Frontiers

5.1 Physics-Informed Machine Learning

Loss Function Structure

$$ \mathcal{L} = \mathcal{L}_{\text{data}} + \lambda_{\text{physics}} \mathcal{L}_{\text{PDE}} + \lambda_{\text{BC}} \mathcal{L}_{\text{boundary}} $$

Where:

- $\mathcal{L}_{\text{data}}$ = data fitting loss - $\mathcal{L}_{\text{PDE}}$ = physics constraint (PDE residual) - $\mathcal{L}_{\text{boundary}}$ = boundary condition constraint - $\lambda$ = weighting hyperparameters

Methods

- Physics-Informed Neural Networks (PINNs): Embed governing equations as soft constraints - Neural operators (DeepONet, FNO): Learn mappings between function spaces - Hybrid models: Combine physics-based and data-driven components

Challenges Specific to Semiconductor Manufacturing

- Sparse experimental data (wafers are expensive) - Extrapolation to new process conditions - Interpretability requirements for process understanding - Certification for high-reliability applications

5.2 Uncertainty Quantification at Scale

Manufacturing requires predicting distributions, not just means:

- What is $P(\text{yield} > 0.95)$? - What is the 99th percentile of line width variation?

Polynomial Chaos Expansion

$$ u(\mathbf{x}, \boldsymbol{\xi}) = \sum_{k} u_k(\mathbf{x}) \Psi_k(\boldsymbol{\xi}) $$

Where:

- $\boldsymbol{\xi}$ = random input parameters - $\Psi_k$ = orthogonal polynomial basis functions - $u_k(\mathbf{x})$ = deterministic coefficient functions

Challenge: Curse of Dimensionality

50+ random input parameters is common in semiconductor manufacturing.

Solutions

- Sparse polynomial chaos - Active subspaces (dimension reduction) - Multi-fidelity methods (combine cheap/accurate models)

5.3 Quantum Effects at Sub-Nanometer Scale

As features approach ~1 nm:

- Quantum tunneling through gate oxides - Quantum confinement affects electron states - Atomistic variability in dopant positions → device-to-device variation

Non-Equilibrium Green's Function (NEGF) Method

For quantum transport:

$$ G^R(E) = \left[ (E + i\eta)I - H - \Sigma^R \right]^{-1} $$

Where:

- $G^R$ = retarded Green's function - $E$ = energy - $H$ = Hamiltonian - $\Sigma^R$ = self-energy (contact + scattering) - $\eta$ = infinitesimal positive number

6. Conceptual Framework

Unified View of Multi-Scale Modeling

`ATOMISTIC MESOSCALE CONTINUUM EQUIPMENT (QM/MD/KMC) (Phase field, (CFD, FEM, (Reactor-scale Level set) Drift-diff) transport) │ │ │ │ │ Coarse │ Averaging │ Lumped │ ├───graining────►├──────────────────►├───parameters───►│ │ │ │ │ │◄──Boundary ────┤◄──Effective ──────┤◄──Boundary──────┤ │ conditions │ coefficients │ conditions │ │ │ │ │ ─────┴────────────────┴───────────────────┴─────────────────┴───── Information flow (bidirectional coupling)``

Key Mathematical Requirements

- Consistency: Coarse-grained models recover fine-scale physics in appropriate limits
- Conservation: Mass, momentum, energy preserved across scales
- Efficiency: Computational cost scales with information content, not raw degrees of freedom
- Adaptivity: Automatically refine where and when needed

7. Open Mathematical Problems

| Problem | Current State | Mathematical Need |
|---------|--------------|-------------------|
| Stochastic feature-scale modeling | KMC possible but expensive | Fast stochastic PDE methods |
| Plasma-surface coupling | Often one-way coupling | Consistent two-way coupling with rigorous error bounds |
| Real-time model-predictive control | Simplified ROMs | Fast surrogates with guaranteed accuracy |
| Variability prediction | Expensive Monte Carlo | Efficient UQ for high-dimensional inputs |
| Atomic-to-device coupling | Sequential handoff | Concurrent adaptive methods |
| Inverse design | Local optimization | Global optimization in high dimensions |

Key Equations Summary

Transport Equations

$$
\text{Continuity:} \quad \frac{\partial \rho}{\partial t} +
abla \cdot (\rho \mathbf{u}) = 0
$$

$$
\text{Momentum:} \quad \rho \frac{D\mathbf{u}}{Dt} = -
abla p + \mu
abla^2 \mathbf{u} + \mathbf{f}
$$

$$
\text{Energy:} \quad \rho c_p \frac{DT}{Dt} = k
abla^2 T + \dot{q}
$$

$$
\text{Species:} \quad \frac{\partial C_k}{\partial t} +
abla \cdot (C_k \mathbf{u}) = D_k
abla^2 C_k + R_k
$$

Interface Evolution

$$
\text{Level Set:} \quad \frac{\partial \phi}{\partial t} + V_n |
abla \phi| = 0
$$

$$
\text{Phase Field:} \quad \tau \frac{\partial \phi}{\partial t} = \epsilon^2
abla^2 \phi - f'(\phi)
$$

Kinetic Theory

$$
\text{Boltzmann:} \quad \frac{\partial f}{\partial t} + \mathbf{v} \cdot
abla_x f + \frac{\mathbf{F}}{m} \cdot
abla_v f = Q(f,f)
$$

$$
\text{Knudsen Number:} \quad Kn = \frac{\lambda}{L}
$$

Stochastic Modeling

$$
\text{Langevin SDE:} \quad dX = a(X,t) \, dt + b(X,t) \, dW
$$

$$
\text{Fokker-Planck:} \quad \frac{\partial p}{\partial t} = -
abla \cdot (a \, p) + \frac{1}{2}
abla^2 (b^2 p)
$$

Nomenclature

| Symbol | Description | Units |
|--------|-------------|-------|
| $\rho$ | Density | kg/m³ |
| $\mathbf{u}$ | Velocity vector | m/s |
| $p$ | Pressure | Pa |
| $T$ | Temperature | K |
| $C_k$ | Concentration of species $k$ | mol/m³ |
| $D_k$ | Diffusion coefficient | m²/s |
| $\phi$ | Level set function or phase field | — |
| $V_n$ | Normal interface velocity | m/s |
| $f$ | Distribution function | — |
| $Kn$ | Knudsen number | — |
| $\lambda$ | Mean free path | m |
| $E_a$ | Activation energy | J/mol |
| $k_B$ | Boltzmann constant | J/K |

Semiconductor Manufacturing: Multi-Scale Problems and Mathematical Modeling

Want to learn more?