Chemical Mechanical Planarization (CMP) Modeling in Semiconductor Manufacturing
Keywords: CMP modeling, chemical mechanical polishing, CMP simulation, planarization, dishing, erosion
Chemical Mechanical Planarization (CMP) Modeling in Semiconductor Manufacturing
1. Fundamentals of CMP
1.1 Definition and Principle
Chemical Mechanical Planarization (CMP) is a hybrid process combining:
- Chemical etching: Reactive slurry chemistry modifies surface properties
- Mechanical abrasion: Physical removal via abrasive particles and pad
The fundamental material removal can be expressed as:
$$ \text{Material Removal} = f(\text{Chemical Reaction}, \text{Mechanical Abrasion}) $$
1.2 Process Components
| Component | Function | Key Parameters |
|---|---|---|
| Wafer | Substrate to be planarized | Material type, pattern density |
| Polishing Pad | Provides mechanical action | Hardness, porosity, asperity distribution |
| Slurry | Chemical + abrasive medium | pH, oxidizer, particle size/concentration |
| Carrier | Holds and rotates wafer | Down force, rotation speed |
| Platen | Rotates polishing pad | Rotation speed, temperature |
1.3 Key Process Parameters
- Down Force ($F$): Pressure applied to wafer, typically $1-7$ psi
- Platen Speed ($\omega_p$): Pad rotation, typically $20-100$ rpm
- Carrier Speed ($\omega_c$): Wafer rotation, typically $20-100$ rpm
- Slurry Flow Rate ($Q$): Typically $100-300$ mL/min
- Temperature ($T$): Typically $20-50Β°C$
2. Classical Physical Models
2.1 Preston Equation (Foundational Model)
The foundational model for CMP is the Preston equation (1927):
$$ \boxed{MRR = k_p \cdot P \cdot v} $$
Where:
- $MRR$ = Material Removal Rate $[\text{nm/min}]$
- $k_p$ = Preston's coefficient $[\text{m}^2/\text{N}]$
- $P$ = Applied pressure $[\text{Pa}]$
- $v$ = Relative velocity $[\text{m/s}]$
The relative velocity between wafer and pad:
$$ v = \sqrt{(\omega_p r_p)^2 + (\omega_c r_c)^2 - 2\omega_p \omega_c r_p r_c \cos(\theta)} $$
Where:
- $\omega_p, \omega_c$ = Angular velocities of platen and carrier
- $r_p, r_c$ = Radial positions
- $\theta$ = Phase angle
2.2 Modified Preston Models
2.2.1 Pressure-Velocity Product Modification
$$ MRR = k_p \cdot P^a \cdot v^b $$
Where $a, b$ are empirical exponents (typically $0.5 < a, b < 1.5$)
2.2.2 Chemical Enhancement Factor
$$ MRR = k_p \cdot P \cdot v \cdot f(C, T, pH) $$
Where $f(C, T, pH)$ represents chemical effects:
- $C$ = Oxidizer concentration
- $T$ = Temperature
- $pH$ = Slurry pH
2.2.3 Arrhenius-Modified Preston Equation
$$ MRR = k_0 \cdot \exp\left(-\frac{E_a}{RT}\right) \cdot P \cdot v $$
Where:
- $k_0$ = Pre-exponential factor
- $E_a$ = Activation energy $[\text{J/mol}]$
- $R$ = Gas constant $= 8.314$ J/(mol$\cdot$K)
- $T$ = Temperature $[\text{K}]$
2.3 Tribocorrosion Model
For metal CMP (e.g., tungsten, copper):
$$ MRR = \frac{M}{z F \rho} \cdot \left( i_{corr} + \frac{Q_{pass}}{A \cdot t_{pass}} \right) \cdot f_{mech} $$
Where:
- $M$ = Molar mass of metal
- $z$ = Number of electrons transferred
- $F$ = Faraday constant $= 96485$ C/mol
- $\rho$ = Density
- $i_{corr}$ = Corrosion current density
- $Q_{pass}$ = Passivation charge
- $f_{mech}$ = Mechanical factor
2.4 Contact Mode Classification
| Mode | Condition | Preston Constant | Friction Coefficient |
|---|---|---|---|
| Contact | $\frac{\eta v_R}{p} < (\frac{\eta v_R}{p})_c$ | High, constant | High ($\mu > 0.3$) |
| Mixed | $\frac{\eta v_R}{p} \approx (\frac{\eta v_R}{p})_c$ | Transitional | Medium |
| Hydroplaning | $\frac{\eta v_R}{p} > (\frac{\eta v_R}{p})_c$ | Low, variable | Low ($\mu < 0.1$) |
Where:
- $\eta$ = Slurry viscosity
- $v_R$ = Relative velocity
- $p$ = Pressure
3. Pattern Density Models
3.1 Effective Pattern Density Model (Stine Model)
The local material removal rate depends on effective pattern density:
$$ \frac{dz}{dt} = -\frac{K}{\rho_{eff}(x, y)} $$
Where:
- $z$ = Surface height
- $K$ = Blanket removal rate $= k_p \cdot P \cdot v$
- $\rho_{eff}$ = Effective pattern density
3.1.1 Effective Density Calculation
$$ \rho_{eff}(x, y) = \iint_{-\infty}^{\infty} \rho_0(x', y') \cdot W(x - x', y - y') \, dx' \, dy' $$
Where:
- $\rho_0(x, y)$ = Local pattern density
- $W(x, y)$ = Weighting function (planarization kernel)
3.1.2 Elliptical Weighting Function
$$ W(x, y) = \frac{1}{\pi L_x L_y} \cdot \exp\left(-\frac{x^2}{L_x^2} - \frac{y^2}{L_y^2}\right) $$
Where $L_x, L_y$ are planarization lengths in x and y directions.
3.2 Step Height Evolution Model
For oxide CMP with step height $h$:
$$ \frac{dh}{dt} = -K \cdot \left(1 - \frac{h_{contact}}{h}\right) \quad \text{for } h > h_{contact} $$
$$ \frac{dh}{dt} = 0 \quad \text{for } h \leq h_{contact} $$
Where $h_{contact}$ is the pad contact threshold height.
3.3 Integrated Density-Step Height Model
Combined model for oxide thickness evolution:
$$ z(x, y, t) = z_0 - K \cdot t \cdot \frac{1}{\rho_{eff}(x, y)} \cdot g(h) $$
Where $g(h)$ is the step-height dependent function:
$$ g(h) = \begin{cases} 1 & \text{if } h > h_c \\ \frac{h}{h_c} & \text{if } h \leq h_c \end{cases} $$
4. Dishing and Erosion Models
4.1 Copper Dishing Model
Dishing depth $D$ for copper lines:
$$ D = K_{Cu} \cdot t_{over} \cdot f(w) $$
Where:
- $K_{Cu}$ = Copper removal rate
- $t_{over}$ = Overpolish time
- $w$ = Line width
- $f(w)$ = Width-dependent function
Empirical relationship:
$$ D = D_0 \cdot \left(1 - \exp\left(-\frac{w}{w_c}\right)\right) $$
Where:
- $D_0$ = Maximum dishing depth
- $w_c$ = Critical line width
4.2 Oxide Erosion Model
Erosion $E$ in dense pattern regions:
$$ E = K_{ox} \cdot t_{over} \cdot \rho_{metal} $$
Where:
- $K_{ox}$ = Oxide removal rate
- $\rho_{metal}$ = Local metal pattern density
4.3 Combined Dishing-Erosion
Total copper thickness loss:
$$ \Delta z_{Cu} = D + E \cdot \frac{\rho_{metal}}{1 - \rho_{metal}} $$
4.4 Pattern Density Effects
| Pattern Density | Dishing Behavior | Erosion Behavior |
|---|---|---|
| Low ($< 20\%$) | Minimal | Minimal |
| Medium ($20-50\%$) | Moderate | Increasing |
| High ($> 50\%$) | Saturates | Severe |
5. Contact Mechanics Models
5.1 Pad Asperity Contact Model
Assuming Gaussian asperity height distribution:
$$ P(z) = \frac{1}{\sigma_s \sqrt{2\pi}} \exp\left(-\frac{(z - \bar{z})^2}{2\sigma_s^2}\right) $$
Where:
- $\sigma_s$ = Standard deviation of asperity heights
- $\bar{z}$ = Mean asperity height
5.2 Real Contact Area
$$ A_r = \pi n \int_{d}^{\infty} R(z - d) \cdot P(z) \, dz $$
Where:
- $n$ = Number of asperities per unit area
- $R$ = Asperity tip radius
- $d$ = Separation distance
For Gaussian distribution:
$$ A_r = \pi n R \sigma_s \cdot F_1\left(\frac{d}{\sigma_s}\right) $$
Where $F_1$ is a statistical function.
5.3 Hertzian Contact
For elastic contact between abrasive particle and wafer:
$$ a = \left(\frac{3FR}{4E^*}\right)^{1/3} $$
$$ \delta = \frac{a^2}{R} = \left(\frac{9F^2}{16RE^{*2}}\right)^{1/3} $$
Where:
- $a$ = Contact radius
- $F$ = Normal force
- $R$ = Particle radius
- $\delta$ = Indentation depth
- $E^*$ = Effective elastic modulus
$$ \frac{1}{E^*} = \frac{1 - u_1^2}{E_1} + \frac{1 - u_2^2}{E_2} $$
5.4 Material Removal by Single Abrasive
Volume removed per abrasive per pass:
$$ V = K_{wear} \cdot \frac{F_n \cdot L}{H} $$
Where:
- $K_{wear}$ = Wear coefficient
- $F_n$ = Normal force on particle
- $L$ = Sliding distance
- $H$ = Hardness of wafer material
5.5 Multi-Scale Model Framework
-
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
β WAFER SCALE (mm-cm) β
β Pressure distribution, global uniformity β
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ€
β DIE SCALE ($\mu$m-mm) β
β Pattern density effects, planarization β
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ€
β FEATURE SCALE (nm-$\mu$m) β
β Dishing, erosion, step height evolution β
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ€
β PARTICLE SCALE (nm) β
β Abrasive-surface interactions β
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ€
β MOLECULAR SCALE (Γ
) β
β Chemical reactions, atomic removal β
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
6. Machine Learning and Neural Network Models
6.1 Overview of ML Approaches
Machine learning methods for CMP modeling:
- Supervised Learning
- Artificial Neural Networks (ANN)
- Convolutional Neural Networks (CNN)
- Support Vector Machines (SVM)
- Random Forests / Gradient Boosting
- Deep Learning
- Deep Belief Networks (DBN)
- Long Short-Term Memory (LSTM)
- Generative Adversarial Networks (GAN)
- Transfer Learning
- Pre-trained models adapted to new process conditions
6.2 Neural Network Architecture for CMP
6.2.1 Input Features
$$ \mathbf{x} = [P, v, t, \rho, w, s, pH, C_{ox}, T, ...]^T $$
Where:
- $P$ = Pressure
- $v$ = Velocity
- $t$ = Polish time
- $\rho$ = Pattern density
- $w$ = Feature width
- $s$ = Feature spacing
- $pH$ = Slurry pH
- $C_{ox}$ = Oxidizer concentration
- $T$ = Temperature
6.2.2 Multi-Layer Perceptron (MLP)
$$ \mathbf{h}^{(1)} = \sigma(\mathbf{W}^{(1)} \mathbf{x} + \mathbf{b}^{(1)}) $$
$$ \mathbf{h}^{(2)} = \sigma(\mathbf{W}^{(2)} \mathbf{h}^{(1)} + \mathbf{b}^{(2)}) $$
$$ \hat{y} = \mathbf{W}^{(out)} \mathbf{h}^{(2)} + \mathbf{b}^{(out)} $$
Where:
- $\sigma$ = Activation function (ReLU, tanh, sigmoid)
- $\mathbf{W}^{(i)}$ = Weight matrices
- $\mathbf{b}^{(i)}$ = Bias vectors
6.2.3 Activation Functions
| Function | Formula | Use Case |
|---|---|---|
| ReLU | $\sigma(x) = \max(0, x)$ | Hidden layers |
| Sigmoid | $\sigma(x) = \frac{1}{1 + e^{-x}}$ | Output (binary) |
| Tanh | $\sigma(x) = \frac{e^x - e^{-x}}{e^x + e^{-x}}$ | Hidden layers |
| Softmax | $\sigma(x_i) = \frac{e^{x_i}}{\sum_j e^{x_j}}$ | Classification |
6.3 CNN-Based CMP Modeling (CmpCNN)
6.3.1 Architecture
Input: Layout Image (Binary) + Density Map
β
Conv2D Layer (3Γ3 kernel, 32 filters)
β
MaxPooling2D (2Γ2)
β
Conv2D Layer (3Γ3 kernel, 64 filters)
β
MaxPooling2D (2Γ2)
β
Flatten
β
Dense Layer (256 units)
β
Dense Layer (128 units)
β
Output: Post-CMP Height Map
6.3.2 Convolution Operation
$$ (I * K)(i, j) = \sum_m \sum_n I(i+m, j+n) \cdot K(m, n) $$
Where:
- $I$ = Input image (layout)
- $K$ = Convolution kernel
- $(i, j)$ = Output position
6.4 Loss Functions
6.4.1 Mean Squared Error (MSE)
$$ \mathcal{L}_{MSE} = \frac{1}{N} \sum_{i=1}^{N} (y_i - \hat{y}_i)^2 $$
6.4.2 Root Mean Square Error (RMSE)
$$ RMSE = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (y_i - \hat{y}_i)^2} $$
6.4.3 Mean Absolute Percentage Error (MAPE)
$$ MAPE = \frac{100\%}{N} \sum_{i=1}^{N} \left| \frac{y_i - \hat{y}_i}{y_i} \right| $$
6.5 Transfer Learning Framework
For adapting models across process nodes:
$$ \mathcal{L}_{transfer} = \mathcal{L}_{target} + \lambda \cdot \mathcal{L}_{domain} $$
Where:
- $\mathcal{L}_{target}$ = Target domain loss
- $\mathcal{L}_{domain}$ = Domain adaptation loss
- $\lambda$ = Regularization parameter
6.6 Performance Metrics
| Metric | Formula | Target | ||
|---|---|---|---|---|
| $R^2$ | $1 - \frac{\sum(y_i - \hat{y}_i)^2}{\sum(y_i - \bar{y})^2}$ | $> 0.95$ | ||
| RMSE | $\sqrt{\frac{1}{N}\sum(y_i - \hat{y}_i)^2}$ | $< 5$ Γ | ||
| MAE | $\frac{1}{N}\sum | y_i - \hat{y}_i | $ | $< 3$ Γ |
7. Slurry Chemistry Modeling
7.1 Kaufman Mechanism
Cyclic passivation-depassivation process:
$$ \text{Metal} \xrightarrow{\text{Oxidizer}} \text{Metal Oxide} \xrightarrow{\text{Abrasion}} \text{Removal} $$
7.2 Electrochemical Reactions
7.2.1 Copper CMP
Oxidation: $$ \text{Cu} \rightarrow \text{Cu}^{2+} + 2e^- $$
Passivation (with BTA): $$ \text{Cu} + \text{BTA} \rightarrow \text{Cu-BTA}_{film} $$
Complexation: $$ \text{Cu}^{2+} + n\text{L} \rightarrow [\text{CuL}_n]^{2+} $$
Where L = chelating agent (e.g., glycine, citrate)
7.2.2 Tungsten CMP
Oxidation: $$ \text{W} + 3\text{H}_2\text{O} \rightarrow \text{WO}_3 + 6\text{H}^+ + 6e^- $$
With hydrogen peroxide: $$ \text{W} + 3\text{H}_2\text{O}_2 \rightarrow \text{WO}_3 + 3\text{H}_2\text{O} $$
7.3 Pourbaix Diagram Integration
Stability regions defined by:
$$ E = E^0 - \frac{RT}{nF} \ln Q - \frac{RT}{F} \cdot m \cdot pH $$
Where:
- $E$ = Electrode potential
- $E^0$ = Standard potential
- $Q$ = Reaction quotient
- $m$ = Number of HβΊ in reaction
7.4 Abrasive Particle Effects
7.4.1 Particle Size Distribution (PSD)
Log-normal distribution:
$$ f(d) = \frac{1}{d \sigma \sqrt{2\pi}} \exp\left(-\frac{(\ln d - \mu)^2}{2\sigma^2}\right) $$
Where:
- $d$ = Particle diameter
- $\mu$ = Mean of $\ln(d)$
- $\sigma$ = Standard deviation of $\ln(d)$
7.4.2 Zeta Potential
$$ \zeta = \frac{4\pi \eta \mu_e}{\varepsilon} $$
Where:
- $\eta$ = Viscosity
- $\mu_e$ = Electrophoretic mobility
- $\varepsilon$ = Dielectric constant
7.5 Slurry Components Summary
| Component | Function | Typical Materials |
|---|---|---|
| Abrasive | Mechanical removal | SiOβ, CeOβ, AlβOβ |
| Oxidizer | Surface modification | HβOβ, KIOβ, Fe(NOβ)β |
| Complexant | Metal dissolution | Glycine, citric acid |
| Inhibitor | Corrosion protection | BTA, BBI |
| Surfactant | Particle dispersion | CTAB, SDS |
| Buffer | pH control | Phosphate, citrate |
8. Chip-Scale and Full-Chip Models
8.1 Within-Wafer Non-Uniformity (WIWNU)
$$ WIWNU = \frac{\sigma_{thickness}}{\bar{thickness}} \times 100\% $$
Where:
- $\sigma_{thickness}$ = Standard deviation of thickness
- $\bar{thickness}$ = Mean thickness
8.2 Pressure Distribution Model
For a flexible carrier:
$$ P(r) = P_0 + \sum_{i=1}^{n} P_i \cdot J_0\left(\frac{\alpha_i r}{R}\right) $$
Where:
- $P_0$ = Base pressure
- $J_0$ = Bessel function of first kind
- $\alpha_i$ = Bessel zeros
- $R$ = Wafer radius
8.3 Multi-Zone Pressure Control
For zone $i$:
$$ MRR_i = k_p \cdot P_i \cdot v_i $$
Target uniformity achieved when:
$$ MRR_1 = MRR_2 = ... = MRR_n $$
8.4 Full-Chip Simulation Flow
-
βββββββββββββββββββββββ
β Design Layout (GDS)β
ββββββββββββ¬βββββββββββ
β
βββββββββββββββββββββββ
β Density Extraction β
β Ο(x,y) for each β
β metal/dielectric β
ββββββββββββ¬βββββββββββ
β
βββββββββββββββββββββββ
β Effective Density β
β Ο_eff = Ο * W β
ββββββββββββ¬βββββββββββ
β
βββββββββββββββββββββββ
β CMP Simulation β
β z(t) evolution β
ββββββββββββ¬βββββββββββ
β
βββββββββββββββββββββββ
β Post-CMP Topography β
β Dishing/Erosion Map β
ββββββββββββ¬βββββββββββ
β
βββββββββββββββββββββββ
β Hotspot Detection β
β Design Rule Check β
βββββββββββββββββββββββ
9. Process Control Applications
9.1 Run-to-Run (R2R) Control
9.1.1 EWMA Controller
$$ \hat{y}_{k+1} = \lambda y_k + (1 - \lambda) \hat{y}_k $$
Where:
- $\hat{y}_{k+1}$ = Predicted output for next run
- $y_k$ = Current measured output
- $\lambda$ = Smoothing factor $(0 < \lambda < 1)$
9.1.2 Recipe Adjustment
$$ u_{k+1} = u_k + G^{-1} (y_{target} - \hat{y}_{k+1}) $$
Where:
- $u$ = Process recipe (time, pressure, etc.)
- $G$ = Process gain matrix
- $y_{target}$ = Target output
9.2 Virtual Metrology
$$ \hat{y} = f_{VM}(\mathbf{x}_{FDC}) $$
Where:
- $\hat{y}$ = Predicted wafer quality
- $\mathbf{x}_{FDC}$ = Fault Detection and Classification sensor data
9.3 Endpoint Detection
9.3.1 Motor Current Monitoring
$$ I(t) = I_0 + \Delta I \cdot H(t - t_{endpoint}) $$
Where $H$ is the Heaviside step function.
9.3.2 Optical Endpoint
$$ R(\lambda, t) = R_{film}(\lambda, d(t)) $$
Where reflectance $R$ changes as film thickness $d$ decreases.
10. Current Challenges and Future Directions
10.1 Key Challenges
- Sub-5nm nodes: Atomic-scale precision required
- Thickness variation target: $< 5$ Γ (3Ο)
- Defect density target: $< 0.01$ defects/cmΒ²
- New materials integration:
- Low-ΞΊ dielectrics ($\kappa < 2.5$)
- Cobalt interconnects
- Ruthenium barrier layers
- 3D integration:
- Through-Silicon Via (TSV) CMP
- Hybrid bonding surface preparation
- Wafer-level packaging
10.2 Future Model Development
- Physics-informed neural networks (PINNs):
$$ \mathcal{L} = \mathcal{L}_{data} + \lambda_{physics} \cdot \mathcal{L}_{physics} $$
Where: $$ \mathcal{L}_{physics} = \left\| \frac{\partial z}{\partial t} + \frac{K}{\rho_{eff}} \right\|^2 $$
- Digital twins for real-time process optimization
- Federated learning across multiple fabs
10.3 Industry Requirements
| Node | Thickness Uniformity | Defect Density | Dishing Limit |
|---|---|---|---|
| 7nm | $< 10$ Γ | $< 0.05$/cmΒ² | $< 200$ Γ |
| 5nm | $< 7$ Γ | $< 0.03$/cmΒ² | $< 150$ Γ |
| 3nm | $< 5$ Γ | $< 0.01$/cmΒ² | $< 100$ Γ |
| 2nm | $< 3$ Γ | $< 0.005$/cmΒ² | $< 50$ Γ |
Symbol Glossary
| Symbol | Description | Units |
|---|---|---|
| $MRR$ | Material Removal Rate | nm/min |
| $k_p$ | Preston coefficient | mΒ²/N |
| $P$ | Pressure | Pa, psi |
| $v$ | Relative velocity | m/s |
| $\rho$ | Pattern density | dimensionless |
| $\rho_{eff}$ | Effective pattern density | dimensionless |
| $L$ | Planarization length | $\mu$m |
| $D$ | Dishing depth | Γ , nm |
| $E$ | Erosion depth | Γ , nm |
| $w$ | Feature width | nm, $\mu$m |
| $h$ | Step height | nm |
| $t$ | Polish time | s, min |
| $T$ | Temperature | K, Β°C |
| $\eta$ | Viscosity | Pa$\cdot$s |
| $\mu$ | Friction coefficient | dimensionless |
Key Equations
Preston Equation $$ MRR = k_p \cdot P \cdot v $$
Effective Density $$ \rho_{eff}(x,y) = \iint \rho_0(x',y') \cdot W(x-x', y-y') \, dx' dy' $$
Material Removal (Density Model) $$ \frac{dz}{dt} = -\frac{K}{\rho_{eff}(x,y)} $$
Dishing Model $$ D = D_0 \cdot \left(1 - e^{-w/w_c}\right) $$
Erosion Model $$ E = K_{ox} \cdot t_{over} \cdot \rho_{metal} $$
Neural Network $$ \hat{y} = \sigma(\mathbf{W}^{(n)} \cdot ... \cdot \sigma(\mathbf{W}^{(1)} \mathbf{x} + \mathbf{b}^{(1)}) + \mathbf{b}^{(n)}) $$
Source: ChipFoundryServices β Search this topic β Ask CFSGPT
Explore 500+ Semiconductor & AI Topics
From EUV lithography to CUDA optimization β search the full knowledge base or chat with our AI assistant.