
AI Factory Glossary

186 technical terms and definitions


adversarial training, ai safety, robustness, defense

Adversarial training includes adversarial examples in the training set, making models more robust to attack at significant extra compute cost.
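A minimal NumPy sketch of the idea, using FGSM-style input perturbations on a toy logistic-regression problem (all data and hyperparameters here are illustrative):

```python
import numpy as np

# Toy linearly separable data (illustrative, not a benchmark)
rng = np.random.default_rng(0)
X = rng.normal(size=(64, 5))
w_true = np.array([1.0, -2.0, 0.5, 0.0, 1.5])
y = (X @ w_true > 0).astype(float)

w = np.zeros(5)
eps, lr = 0.1, 0.5

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

for _ in range(200):
    p = sigmoid(X @ w)
    # Gradient of the BCE loss w.r.t. the inputs: (p - y) * w
    grad_x = (p - y)[:, None] * w
    # FGSM: perturb inputs along the sign of the input gradient
    X_adv = X + eps * np.sign(grad_x)
    # Train on the adversarial batch (gradient w.r.t. the weights)
    p_adv = sigmoid(X_adv @ w)
    grad_w = X_adv.T @ (p_adv - y) / len(y)
    w -= lr * grad_w

acc = ((sigmoid(X @ w) > 0.5) == y).mean()
```

The extra cost is visible even here: every step needs an additional gradient pass to craft the perturbed batch.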

adversarial weight perturbation, awp, ai safety

Adversarially perturbs model weights during training to improve robustness.

adverse event detection, healthcare ai

Identifies medication side effects and other adverse events in clinical text.

aft, attention free transformer, llm architecture

Attention Free Transformer replaces dot-product attention with element-wise operations.

agent approval, ai agents

Agent approval requires human confirmation before executing high-stakes actions.

agent benchmarking, ai agents

Agent benchmarking evaluates performance across standardized task suites.

agent communication, ai agents

Agent communication protocols enable information exchange and coordination between agents.

agent debugging, ai agents

Agent debugging identifies and resolves issues in planning and execution logic.

agent feedback loop, ai agents

Feedback loops allow humans to correct and guide agent behavior iteratively.

agent handoff, ai agents

Agent handoff transfers responsibility for tasks between agents smoothly.

agent logging, ai agents

Agent logging records decisions, actions, and reasoning for debugging and auditing.

agent loop, ai agents

Agent loops repeatedly observe, plan, act, and update until objectives are achieved.
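A minimal sketch of such a loop, with a toy environment and a trivially simple planner (all names here are illustrative):

```python
class CounterEnv:
    """Toy environment: the goal is to raise a counter to a target value."""
    def __init__(self, target=3):
        self.value, self.target = 0, target
    def observe(self):
        return self.value
    def goal_reached(self, obs):
        return obs >= self.target
    def act(self, action):
        self.value += action
        return self.value

def plan(obs, history):
    return 1  # trivial planner: always step toward the goal

def agent_loop(env, max_steps=10):
    history = []
    for _ in range(max_steps):
        obs = env.observe()          # observe
        if env.goal_reached(obs):    # stopping criterion
            break
        action = plan(obs, history)  # plan
        result = env.act(action)     # act
        history.append((obs, action, result))  # update memory
    return history

trace = agent_loop(CounterEnv())
```

The `max_steps` bound and `goal_reached` check play the role of agent stopping criteria from the entry below.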

agent memory, ai agents

Agent memory maintains conversation history, observations, and learned information across interactions.

agent negotiation, ai agents

Agent negotiation resolves conflicts through offers, counteroffers, and compromise.

agent protocol, ai agents

Agent protocols standardize interfaces for agent interoperability.

agent stopping criteria, ai agents

Stopping criteria define conditions when agents should terminate execution.

agent-based modeling, digital manufacturing

Model fab using autonomous agents.

agentbench, ai agents

AgentBench provides a comprehensive evaluation framework for LLM-based agents.

aggregate functions, graph neural networks

Aggregate functions in GNNs combine neighbor information using operations like sum, mean, max, or attention.
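A small NumPy sketch of sum/mean/max aggregation on a toy graph (the features and adjacency here are made up):

```python
import numpy as np

# Node features for a 3-node toy graph
features = np.array([[1.0, 0.0],
                     [0.0, 2.0],
                     [3.0, 1.0]])
neighbors = {0: [1, 2], 1: [0], 2: [0, 1]}

def aggregate(node, how="mean"):
    """Combine messages from a node's neighbors with a permutation-invariant op."""
    msgs = features[neighbors[node]]
    if how == "sum":
        return msgs.sum(axis=0)
    if how == "mean":
        return msgs.mean(axis=0)
    if how == "max":
        return msgs.max(axis=0)
    raise ValueError(how)

agg0_mean = aggregate(0, "mean")  # mean of nodes 1 and 2
agg1_sum = aggregate(1, "sum")    # just node 0's features
```

Permutation invariance is the key property: any neighbor ordering gives the same result.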

ai act,regulation,eu

The EU AI Act regulates AI systems by risk level; high-risk systems carry compliance obligations. Part of a global regulatory trend.

ai bill of rights,ethics

Framework for protecting people from algorithmic harm.

ai feedback, ai, training techniques

AI feedback uses model-generated evaluations to train or align other models.

ai supercomputers, ai, infrastructure

Purpose-built systems for AI training.

aider,pair,programming

Aider is an AI pair-programming tool that runs in the terminal, editing files in your repo with an LLM.

aims, aerial image measurement system, lithography

Aerial Image Measurement System: a tool that emulates scanner imaging to inspect photomask aerial images.

air bearing table,metrology

Ultra-stable surface for metrology.

air changes per hour (ach),air changes per hour,ach,facility

Number of times cleanroom air is completely replaced per hour.

air gap,beol

Use air (k=1) as insulator between metal lines for lowest capacitance.
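A back-of-envelope parallel-plate sketch of the capacitance benefit, assuming a low-k reference dielectric of k ≈ 2.7 (an illustrative value):

```python
# Parallel-plate capacitance C = k * eps0 * A / d scales linearly with k,
# so replacing an assumed low-k dielectric (k = 2.7) with air (k = 1)
# cuts line-to-line capacitance proportionally.
EPS0 = 8.854e-12  # vacuum permittivity, F/m

def line_capacitance(k, area_m2, gap_m):
    return k * EPS0 * area_m2 / gap_m

c_lowk = line_capacitance(2.7, area_m2=1e-12, gap_m=50e-9)
c_air = line_capacitance(1.0, area_m2=1e-12, gap_m=50e-9)
reduction = 1 - c_air / c_lowk  # ~63% lower capacitance
```

Real interconnect capacitance involves fringing fields and partial air-gap fill, so the actual gain is smaller than this idealized ratio.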

air shower,facility

Enclosed space that blows high-velocity air to remove particles before cleanroom entry.

airborne molecular contamination, amc, contamination

Gaseous chemical contaminants in cleanroom air that can degrade wafers, optics, and equipment.

airflow,orchestration,dag

Apache Airflow orchestrates data pipelines. DAGs define dependencies. Standard for ETL.

airgap, process integration

Airgaps introduce air (k = 1) between metal lines, providing the lowest possible dielectric constant and reducing capacitance and crosstalk.

airl, adversarial inverse reinforcement learning, inverse rl, imitation learning, reward recovery, expert demonstrations, adversarial training

# Adversarial Inverse Reinforcement Learning (AIRL)

**AIRL** (Adversarial Inverse Reinforcement Learning) is an advanced algorithm that combines inverse reinforcement learning with adversarial training to recover reward functions from expert demonstrations.

## The Core Problem AIRL Solves

Traditional **Inverse Reinforcement Learning (IRL)** aims to recover a reward function from expert demonstrations. The fundamental challenges include:

- **Reward ambiguity**: many different reward functions can explain the same observed behavior
- **Computational expense**: requires solving an RL problem in an inner loop
- **Poor scalability**: struggles with high-dimensional problems
- **Dynamics dependence**: learned rewards often don't transfer to new environments

## Mathematical Formulation

### Discriminator Architecture

The discriminator in AIRL has a specifically structured form:

$$
D_\theta(s, a, s') = \frac{\exp(f_\theta(s, a, s'))}{\exp(f_\theta(s, a, s')) + \pi(a|s)}
$$

Where:

- $s$ = current state
- $a$ = action taken
- $s'$ = next state
- $\pi(a|s)$ = policy probability
- $f_\theta$ = learned function (detailed below)

### Reward-Shaping Decomposition

The function $f_\theta$ is decomposed as:

$$
f_\theta(s, a, s') = g_\theta(s, a) + \gamma h_\phi(s') - h_\phi(s)
$$

| Component | Description | Role |
|-----------|-------------|------|
| $g_\theta(s, a)$ | Reward approximator | Transferable reward signal |
| $h_\phi(s)$ | Shaping potential | Captures dynamics-dependent info |
| $\gamma$ | Discount factor | Temporal discounting (typically 0.99) |

### State-Only Reward Variant

For better transfer, use state-only rewards:

$$
f_\theta(s, s') = g_\theta(s) + \gamma h_\phi(s') - h_\phi(s)
$$

## Training Algorithm

### Objective Functions

**Discriminator Loss** (minimize):

$$
\mathcal{L}_D = -\mathbb{E}_{\tau_E}\left[\log D_\theta(s, a, s')\right] - \mathbb{E}_{\tau_\pi}\left[\log(1 - D_\theta(s, a, s'))\right]
$$

Where:

- $\tau_E$ = expert trajectories
- $\tau_\pi$ = policy-generated trajectories

**Generator (Policy) Objective** (maximize):

$$
\mathcal{L}_\pi = \mathbb{E}_{\tau_\pi}\left[\sum_{t=0}^{T} \gamma^t \log D_\theta(s_t, a_t, s_{t+1})\right]
$$

### Training Loop Pseudocode

```python
# AIRL training loop
for iteration in range(max_iterations):
    # Step 1: sample trajectories from the current policy
    policy_trajectories = sample_trajectories(policy, env, n_samples)

    # Step 2: update the discriminator
    for d_step in range(discriminator_steps):
        expert_batch = sample_batch(expert_demonstrations)
        policy_batch = sample_batch(policy_trajectories)

        # Discriminator predictions
        D_expert = discriminator(expert_batch)
        D_policy = discriminator(policy_batch)

        # Binary cross-entropy loss
        loss_D = -torch.mean(torch.log(D_expert)) \
                 - torch.mean(torch.log(1 - D_policy))

        optimizer_D.zero_grad()
        loss_D.backward()
        optimizer_D.step()

    # Step 3: compute rewards for the policy update
    rewards = torch.log(D_policy) - torch.log(1 - D_policy)

    # Step 4: update the policy (using PPO, TRPO, etc.)
    policy.update(policy_trajectories, rewards)
```

## Theoretical Properties

### 1. Reward Recovery Guarantees

At optimality, under ergodicity and sufficient expressiveness:

$$
g_\theta(s, a) \rightarrow A^*(s, a) = Q^*(s, a) - V^*(s)
$$

Or for state-only rewards:

$$
g_\theta(s) \rightarrow r^*(s)
$$

This recovers the **ground-truth reward** up to a constant.

### 2. Disentanglement Theorem

The decomposition separates:

$$
\underbrace{f_\theta(s, a, s')}_{\text{Full signal}} = \underbrace{g_\theta(s, a)}_{\text{Reward (transferable)}} + \underbrace{\gamma h_\phi(s') - h_\phi(s)}_{\text{Shaping (dynamics-dependent)}}
$$

**Key insight**: potential-based shaping ($\gamma h(s') - h(s)$) does not change the optimal policy, so $g_\theta$ captures the "true" reward.

### 3. Connection to Maximum Entropy IRL

AIRL approximates MaxEnt IRL:

$$
\max_\theta \mathbb{E}_{\tau_E}\left[\sum_t r_\theta(s_t, a_t)\right] + \mathcal{H}(\pi)
$$

Where $\mathcal{H}(\pi)$ is the policy entropy. AIRL achieves this without the expensive inner-loop policy optimization.

## Comparison

| Method | Recovers Reward | Dynamics-Invariant | Scalable | Sample Efficiency |
|--------|-----------------|--------------------|----------|-------------------|
| Behavioral Cloning | ❌ No | N/A | ✅ Yes | ✅ High |
| GAIL | ❌ No (policy only) | ❌ No | ✅ Yes | ⚠️ Medium |
| MaxEnt IRL | ✅ Yes | ⚠️ Partially | ❌ No | ❌ Low |
| **AIRL** | ✅ **Yes** | ✅ **Yes** | ✅ **Yes** | ⚠️ Medium |

### GAIL vs AIRL

**GAIL Discriminator**:

$$
D_\theta^{GAIL}(s, a) = \sigma(f_\theta(s, a))
$$

**AIRL Discriminator**:

$$
D_\theta^{AIRL}(s, a, s') = \frac{\exp(f_\theta(s, a, s'))}{\exp(f_\theta(s, a, s')) + \pi(a|s)}
$$

The key difference: AIRL's structure enables reward recovery; GAIL's does not.

## Implementation Details

### Network Architecture

```python
import torch
import torch.nn as nn

class AIRLDiscriminator(nn.Module):
    """AIRL discriminator with reward-shaping decomposition."""

    def __init__(self, state_dim, action_dim, hidden_dim=256,
                 gamma=0.99, state_only=True):
        super().__init__()
        self.gamma = gamma
        self.state_only = state_only

        # Reward network g(s) or g(s, a)
        g_in_dim = state_dim if state_only else state_dim + action_dim
        self.g_net = nn.Sequential(
            nn.Linear(g_in_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, 1),
        )

        # Shaping potential h(s)
        self.h_net = nn.Sequential(
            nn.Linear(state_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, 1),
        )

    def get_reward(self, states, actions=None):
        """Extract the learned reward g(s) or g(s, a)."""
        if self.state_only:
            return self.g_net(states)
        sa = torch.cat([states, actions], dim=-1)
        return self.g_net(sa)

    def forward(self, states, actions, next_states, log_pi, dones):
        """Compute f(s, a, s') = g(s, a) + gamma * h(s') - h(s).

        Args:
            states: current states [batch, state_dim]
            actions: actions taken [batch, action_dim]
            next_states: next states [batch, state_dim]
            log_pi: log probability of actions [batch, 1]
            dones: episode termination flags [batch, 1]

        Returns:
            D(s, a, s'): discriminator output [batch, 1], plus f and g.
        """
        # Reward component
        g = self.get_reward(states, actions)

        # Shaping component, masked at terminal states
        h_s = self.h_net(states)
        h_s_next = self.h_net(next_states)
        shaping = self.gamma * (1 - dones) * h_s_next - h_s
        f = g + shaping

        # D(s,a,s') = exp(f) / (exp(f) + pi(a|s));
        # in log space this is D = sigmoid(f - log_pi)
        log_D = f - log_pi
        D = torch.sigmoid(log_D)
        return D, f, g
```

### Hyperparameters

```python
# Recommended hyperparameters for AIRL
config = {
    # Environment
    "gamma": 0.99,                  # Discount factor
    # Networks
    "hidden_dim": 256,              # Hidden layer size
    "n_hidden_layers": 2,           # Number of hidden layers
    "state_only_reward": True,      # Use g(s) instead of g(s, a)
    # Training
    "batch_size": 256,              # Batch size for updates
    "discriminator_lr": 3e-4,       # Discriminator learning rate
    "policy_lr": 3e-4,              # Policy learning rate
    "discriminator_steps": 1,       # D updates per policy update
    # Regularization
    "gradient_penalty_coef": 10.0,  # Gradient penalty (optional)
    "entropy_coef": 0.01,           # Policy entropy bonus
    # Data
    "n_expert_trajectories": 50,    # Number of expert demos
    "samples_per_iteration": 2048,  # Policy samples per iteration
}
```

## Practical Considerations

### Advantages

- **Reward transfer**: learned $g_\theta$ transfers to new dynamics
- **Interpretability**: explicit reward function for analysis
- **Data efficiency**: better than BC with limited demonstrations
- **Theoretical grounding**: provable reward recovery guarantees

### Challenges

- **Training instability**: GAN-like adversarial dynamics
- **Hyperparameter sensitivity**: requires careful tuning
- **Discriminator overfitting**: can memorize expert data
- **Absorbing states**: terminal states need special handling

### Stability Tricks

```python
# 1. Gradient penalty (from WGAN-GP)
def gradient_penalty(discriminator, expert_data, policy_data):
    alpha = torch.rand(expert_data.size(0), 1)
    interpolated = alpha * expert_data + (1 - alpha) * policy_data
    interpolated.requires_grad_(True)
    d_interpolated = discriminator(interpolated)
    gradients = torch.autograd.grad(
        outputs=d_interpolated,
        inputs=interpolated,
        grad_outputs=torch.ones_like(d_interpolated),
        create_graph=True,
    )[0]
    gradient_norm = gradients.norm(2, dim=1)
    penalty = ((gradient_norm - 1) ** 2).mean()
    return penalty

# 2. Spectral normalization
from torch.nn.utils import spectral_norm
layer = spectral_norm(nn.Linear(256, 256))

# 3. Label smoothing
expert_labels = 0.9  # instead of 1.0
policy_labels = 0.1  # instead of 0.0
```

## Extensions and Variants

### 1. FAIRL (Forward Adversarial IRL)

Corrects for state distribution shift:

$$
r_{FAIRL}(s, a) = r_{AIRL}(s, a) - \log \pi(a|s)
$$

### 2. Off-Policy AIRL

Uses a replay buffer for sample efficiency:

$$
\mathcal{L}_D = -\mathbb{E}_{\tau_E}[\log D] - \mathbb{E}_{\mathcal{B}}[\rho(s,a) \log(1-D)]
$$

Where $\rho(s,a)$ is an importance weight.

### 3. Multi-Task AIRL

Learns shared reward structure across tasks:

$$
g_\theta(s, a) = g_{shared}(s, a) + g_{task}(s, a)
$$

## When to Use AIRL

### Good Fit ✅

- Need the **reward function**, not just the policy
- Want to **transfer behavior** to different dynamics
- Have **limited but high-quality** demonstrations
- **Interpretability** of learned behavior matters

### Consider Alternatives

- Only need to **match behavior** → use GAIL (simpler)
- Have **abundant demonstrations** → BC might suffice
- **Reward function is known** → use standard RL
- Need **real-time performance** → BC is faster

## Summary

AIRL provides a principled approach to learning **transferable reward functions** from demonstrations by:

1. Using a **structured discriminator** that separates reward from dynamics
2. Leveraging **adversarial training** for scalability
3. Providing **theoretical guarantees** on reward recovery
4. Enabling **reward transfer** across different environments

The key equation to remember:

$$
\boxed{f_\theta(s, a, s') = g_\theta(s, a) + \gamma h_\phi(s') - h_\phi(s)}
$$

Where $g_\theta$ is your transferable reward signal.
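The disentanglement argument above can be checked numerically: a potential-based shaping term telescopes out of the discounted return, shifting every trajectory's return by $\gamma^T h(s_T) - h(s_0)$, so trajectories that share endpoints keep their relative ordering. A small sketch with made-up potentials and rewards:

```python
# Potential-based shaping check: adding gamma*h(s') - h(s) to each reward
# shifts a trajectory's discounted return by gamma^T h(s_T) - h(s_0) only.
gamma = 0.9
h = {0: 0.0, 1: 2.0, 2: -1.0, 3: 0.5}  # arbitrary shaping potential

def discounted_return(traj, shaped=False):
    total = 0.0
    for t, (s, r, s_next) in enumerate(traj):
        if shaped:
            r = r + gamma * h[s_next] - h[s]
        total += gamma ** t * r
    return total

# Two trajectories with the same start (0), end (3), and length
traj_a = [(0, 1.0, 1), (1, 0.0, 3)]
traj_b = [(0, 0.0, 2), (2, 2.0, 3)]

shift_a = discounted_return(traj_a, True) - discounted_return(traj_a)
shift_b = discounted_return(traj_b, True) - discounted_return(traj_b)
# Both shifts equal gamma^2 * h[3] - h[0] = 0.405
```

This is why $g_\theta$, not the shaping term $h_\phi$, carries the transferable reward signal.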

albert,foundation model

A lighter BERT that uses cross-layer parameter sharing and factorized embeddings.

aleatoric uncertainty, ai safety

Aleatoric uncertainty comes from inherent randomness irreducible even with perfect knowledge.
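A quick NumPy illustration: even when predicting with the exact underlying function, the residual variance equals the injected noise variance and cannot be reduced:

```python
import numpy as np

# y = 2x + noise; the noise is the aleatoric component.
rng = np.random.default_rng(1)
x = rng.uniform(-1, 1, size=100_000)
noise_std = 0.5
y = 2.0 * x + rng.normal(0, noise_std, size=x.shape)

# Predict with the true function 2x ("perfect knowledge")
residual_var = np.var(y - 2.0 * x)  # ~ noise_std**2 = 0.25, irreducible
```

Epistemic uncertainty, by contrast, shrinks with more data; this floor does not.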


alias-free gan, multimodal ai

Alias-free GANs eliminate coordinate-dependent artifacts through continuous signal processing.

alibi (attention with linear biases),alibi,attention with linear biases,transformer

Simple relative position encoding that adds a linear distance penalty to attention scores, enabling length extrapolation.
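A NumPy sketch of the bias matrix ALiBi adds to attention logits (the slope value here is illustrative; real models use head-specific slopes, and the causal mask is applied separately):

```python
import numpy as np

def alibi_bias(seq_len, slope):
    """Linear distance penalty added to attention logits: -slope * (i - j)."""
    i = np.arange(seq_len)[:, None]  # query positions
    j = np.arange(seq_len)[None, :]  # key positions
    return -slope * np.maximum(i - j, 0)  # penalize more distant past keys

# Add the bias to (here all-zero) attention scores for a length-4 sequence
scores = np.zeros((4, 4)) + alibi_bias(4, slope=0.5)
```

Because the bias depends only on distance, the same matrix pattern extends to sequences longer than any seen in training.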

all-reduce operation, distributed training

Collective operation that efficiently aggregates (typically sums) tensors across all nodes, leaving every device with the result.
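A toy simulation of the semantics (real systems use bandwidth-optimal algorithms such as ring all-reduce):

```python
# Simulate all-reduce over per-device gradient vectors: every "device"
# ends up holding the element-wise sum of all devices' vectors.
def all_reduce_sum(per_device):
    total = [sum(vals) for vals in zip(*per_device)]
    return [list(total) for _ in per_device]

devices = [[1, 2], [3, 4], [5, 6]]   # one gradient vector per device
reduced = all_reduce_sum(devices)    # every device now holds [9, 12]
```

This is the core step of data-parallel training: each worker contributes its local gradients and receives the global sum.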

all-to-all communication, distributed training

Collective operation in which every device exchanges distinct data with every other device.

allegro, chemistry ai

Fast equivariant neural network.

allegro, graph neural networks

Allegro achieves fast equivariant message passing through strict locality and efficient tensor operations.

alpaca, training techniques

Alpaca demonstrates instruction following by fine-tuning LLaMA on instruction data distilled from a stronger model.

alphacode,code ai

DeepMind's competitive programming model.

alphafold,healthcare ai

DeepMind's protein structure prediction system.

altair,declarative,visualization

Altair is a declarative visualization library for Python, built on Vega-Lite.

alternative chemistries, environmental & sustainability

Alternative chemistries develop less hazardous process chemicals maintaining performance while reducing environmental impact.

aluminum etch,al metal etch,aluminum metal etch modeling,al etch modeling,aluminum chlorine etch,alcl3,metal etch plasma,aluminum plasma etch,bcl3 etch

# Aluminum Metal Etch Mathematical Modeling

## 1. Overview

### 1.1 Why Aluminum Etch Modeling Is Complex

Aluminum etching (typically using $\text{Cl}_2/\text{BCl}_3$ plasmas) involves multiple coupled physical and chemical phenomena:

- Plasma generation and transport → determines species fluxes to the wafer
- Ion-surface interactions → physical and chemical mechanisms
- Surface reactions → Langmuir-Hinshelwood kinetics
- Feature-scale evolution → profile development inside trenches/vias
- Redeposition and passivation → sidewall chemistry

### 1.2 Fundamental Reaction

The basic aluminum chlorination reaction:

$$
\text{Al} + 3\text{Cl} \rightarrow \text{AlCl}_3 \uparrow
$$

Complications requiring sophisticated modeling:

- Breaking through the native $\text{Al}_2\text{O}_3$ layer (15-30 Å)
- Maintaining profile anisotropy
- Controlling selectivity to mask and underlayers
- Managing Cu residues in Al-Cu alloys

## 2. Kinetic and Chemical Rate Modeling

### 2.1 General Etch Rate Formulation

A comprehensive etch rate model combines three primary mechanisms:

$$
ER = \underbrace{k_{th} \cdot \Gamma_{Cl} \cdot f(\theta)}_{\text{thermal chemical}} + \underbrace{Y_s \cdot \Gamma_{ion} \cdot \sqrt{E_{ion}}}_{\text{physical sputtering}} + \underbrace{\beta \cdot \Gamma_{ion}^a \cdot \Gamma_{Cl}^b \cdot E_{ion}^c}_{\text{ion-enhanced (synergistic)}}
$$

Parameter definitions:

| Symbol | Description | Units |
|--------|-------------|-------|
| $\Gamma_{Cl}$ | Neutral chlorine flux | $\text{cm}^{-2}\text{s}^{-1}$ |
| $\Gamma_{ion}$ | Ion flux | $\text{cm}^{-2}\text{s}^{-1}$ |
| $E_{ion}$ | Ion energy | eV |
| $\theta$ | Surface coverage of reactive species | dimensionless |
| $Y_s$ | Physical sputtering yield | atoms/ion |
| $\beta$ | Synergy coefficient | varies |
| $a, b, c$ | Exponents (typically 0.5-1) | dimensionless |

### 2.2 Surface Coverage Dynamics

The reactive site balance follows Langmuir-Hinshelwood kinetics:

$$
\frac{d\theta}{dt} = k_{ads} \cdot \Gamma_{Cl} \cdot (1-\theta) - k_{des} \cdot \theta \cdot \exp\left(-\frac{E_d}{k_B T}\right) - Y_{react}(\theta, E_{ion}) \cdot \Gamma_{ion} \cdot \theta
$$

Term-by-term breakdown:

- Term 1: $k_{ads} \cdot \Gamma_{Cl} \cdot (1-\theta)$ — adsorption rate (proportional to empty sites)
- Term 2: $k_{des} \cdot \theta \cdot \exp(-E_d/k_B T)$ — thermal desorption (Arrhenius)
- Term 3: $Y_{react} \cdot \Gamma_{ion} \cdot \theta$ — ion-induced reaction/removal

Steady-state solution ($d\theta/dt = 0$):

$$
\theta_{ss} = \frac{k_{ads} \cdot \Gamma_{Cl}}{k_{ads} \cdot \Gamma_{Cl} + k_{des} \cdot e^{-E_d/k_B T} + Y_{react} \cdot \Gamma_{ion}}
$$

### 2.3 Temperature Dependence

All rate constants follow Arrhenius behavior:

$$
k_i(T) = A_i \cdot \exp\left(-\frac{E_{a,i}}{k_B T}\right)
$$

Typical activation energies for aluminum etching:

- Ion-enhanced reactions: $E_a \approx 0.1 - 0.3 \text{ eV}$
- Purely thermal processes: $E_a \approx 0.5 - 1.0 \text{ eV}$
- Chlorine desorption: $E_d \approx 0.3 - 0.5 \text{ eV}$

### 2.4 Complete Etch Rate Expression

Combining all terms with explicit dependencies:

$$
ER(T, \Gamma_{ion}, \Gamma_{Cl}, E_{ion}) = A_1 e^{-E_1/k_B T} \Gamma_{Cl} \theta + Y_0 \Gamma_{ion} \sqrt{E_{ion}} + A_2 e^{-E_2/k_B T} \Gamma_{ion}^{0.5} \Gamma_{Cl}^{0.5} E_{ion}^{0.5}
$$

## 3. Ion-Surface Interaction Physics

### 3.1 Ion Energy Distribution Function (IEDF)

For RF-biased electrodes, the IEDF is approximately bimodal:

$$
f(E) \propto \frac{1}{\sqrt{|E - E_{dc}|}} \quad \text{for } E_{dc} - E_{rf} < E < E_{dc} + E_{rf}
$$

Key parameters:

- $E_{dc} = e \cdot V_{dc}$ — DC self-bias energy
- $E_{rf} = e \cdot V_{rf}$ — RF amplitude energy
- Peak separation: $\Delta E = 2 E_{rf}$

Collisional effects: in collisional sheaths, charge-exchange collisions broaden the distribution:

$$
f(E) \propto \exp\left(-\frac{E}{\bar{E}}\right) \cdot \left[1 + \text{erf}\left(\frac{E - E_{dc}}{\sigma_E}\right)\right]
$$

### 3.2 Ion Angular Distribution Function (IADF)

The angular spread is approximately Gaussian:

$$
f(\theta) = \frac{1}{\sqrt{2\pi}\sigma_\theta} \exp\left(-\frac{\theta^2}{2\sigma_\theta^2}\right)
$$

Angular spread calculation:

$$
\sigma_\theta \approx \sqrt{\frac{k_B T_i}{e V_{sheath}}} \approx \arctan\left(\sqrt{\frac{T_i}{V_{sheath}}}\right)
$$

Typical values:

- Ion temperature: $T_i \approx 0.05 - 0.5 \text{ eV}$
- Sheath voltage: $V_{sheath} \approx 50 - 500 \text{ V}$
- Angular spread: $\sigma_\theta \approx 2° - 5°$

### 3.3 Physical Sputtering Yield

Yamamura formula (angular dependence):

$$
Y(\theta) = Y(0°) \cdot \cos^{-f}(\theta) \cdot \exp\left[b\left(1 - \frac{1}{\cos\theta}\right)\right]
$$

Parameters for aluminum:

- $f \approx 1.5 - 2.0$
- $b \approx 0.1 - 0.3$ (depends on ion/target mass ratio)
- Maximum yield typically at $\theta \approx 60° - 70°$

Sigmund theory (energy dependence):

$$
Y(E) = \frac{0.042 \cdot Q \cdot \alpha(M_2/M_1) \cdot S_n(E)}{U_s}
$$

Where:

- $S_n(E)$ = nuclear stopping power (Thomas-Fermi)
- $U_s = 3.4 \text{ eV}$ (surface binding energy for Al)
- $Q$ = dimensionless factor ($\approx 1$ for metals)
- $\alpha$ = mass-dependent parameter
- $M_1, M_2$ = projectile and target masses

Nuclear stopping power:

$$
S_n(\epsilon) = \frac{0.5 \ln(1 + 1.2288\epsilon)}{\epsilon + 0.1728\sqrt{\epsilon} + 0.008\epsilon^{0.1504}}
$$

With reduced energy:

$$
\epsilon = \frac{M_2}{M_1 + M_2} \cdot \frac{a_{TF}\, E}{Z_1 Z_2 e^2}
$$

Where $a_{TF}$ is the Thomas-Fermi screening length.

### 3.4 Ion-Enhanced Etching Yield

The total etch yield combines mechanisms:

$$
Y_{total} = Y_{physical} + Y_{chemical} + Y_{synergistic}
$$

Synergistic enhancement factor:

$$
\eta = \frac{Y_{total}}{Y_{physical} + Y_{chemical}} > 1
$$

For Al/Cl₂ systems, $\eta$ can exceed 10 under optimal conditions.

## 4. Plasma Modeling (Reactor Scale)

### 4.1 Species Continuity Equations

For each species $i$ (electrons, ions, neutrals):

$$
\frac{\partial n_i}{\partial t} + \nabla \cdot \vec{\Gamma}_i = S_i - L_i
$$

Flux expressions:

- Drift-diffusion: $\vec{\Gamma}_i = -D_i \nabla n_i + \mu_i n_i \vec{E}$
- Full momentum: $\vec{\Gamma}_i = n_i \vec{v}_i$ with a momentum equation

Source/sink terms:

$$
S_i = \sum_j k_{ij} n_j n_e \quad \text{(ionization, dissociation)}
$$

$$
L_i = \sum_j k_{ij}^{loss} n_i n_j \quad \text{(recombination, attachment)}
$$

### 4.2 Electron Energy Balance

$$
\frac{\partial}{\partial t}\left(\frac{3}{2} n_e k_B T_e\right) + \nabla \cdot \vec{Q}_e = P_{abs} - P_{loss}
$$

Heat flux:

$$
\vec{Q}_e = \frac{5}{2} k_B T_e \vec{\Gamma}_e - \kappa_e \nabla T_e
$$

Power absorption (ICP):

$$
P_{abs} = \frac{1}{2} \text{Re}(\sigma_p) |E|^2
$$

Collisional losses:

$$
P_{loss} = \sum_j n_e n_j k_j \varepsilon_j
$$

Where $\varepsilon_j$ is the energy loss per collision event $j$.

### 4.3 Plasma Conductivity

$$
\sigma_p = \frac{n_e e^2}{m_e(\nu_m + i\omega)}
$$

Skin depth:

$$
\delta = \sqrt{\frac{2}{\omega \mu_0 \text{Re}(\sigma_p)}}
$$

### 4.4 Electromagnetic Field Equations

Maxwell's equations (frequency domain):

$$
\nabla \times \vec{E} = -i\omega \vec{B}
$$

$$
\nabla \times \vec{B} = \mu_0 \sigma_p \vec{E} + i\omega \mu_0 \epsilon_0 \vec{E}
$$

Wave equation:

$$
\nabla^2 \vec{E} + \left(\frac{\omega^2}{c^2} - i\omega\mu_0\sigma_p\right)\vec{E} = 0
$$

### 4.5 Sheath Physics

Child-Langmuir law (collisionless sheath):

$$
J_{ion} = \frac{4\epsilon_0}{9}\sqrt{\frac{2e}{M}} \cdot \frac{V_s^{3/2}}{s^2}
$$

Where:

- $J_{ion}$ = ion current density
- $V_s$ = sheath voltage
- $s$ = sheath thickness
- $M$ = ion mass

Bohm criterion: ions must enter the sheath with velocity

$$
v_{Bohm} = \sqrt{\frac{k_B T_e}{M}}
$$

Ion flux at the sheath edge:

$$
\Gamma_{ion} = n_s \cdot v_{Bohm} = 0.61 \cdot n_0 \sqrt{\frac{k_B T_e}{M}}
$$

Sheath thickness:

$$
s \approx \lambda_D \cdot \left(\frac{2 e V_s}{k_B T_e}\right)^{3/4}
$$

Debye length:

$$
\lambda_D = \sqrt{\frac{\epsilon_0 k_B T_e}{n_e e^2}}
$$

## 5. Feature-Scale Profile Evolution

### 5.1 Level Set Method

The surface is represented implicitly by $\phi(\vec{r}, t) = 0$:

$$
\frac{\partial \phi}{\partial t} + V_n |\nabla \phi| = 0
$$

Normal velocity calculation:

$$
V_n(\vec{r}) = \int_0^{E_{max}} \int_0^{\theta_{max}} Y(E, \theta_{local}) \cdot f_{IEDF}(E) \cdot f_{IADF}(\theta) \cdot \Gamma_{ion}(\vec{r}) \, dE \, d\theta
$$

Plus contributions from:

- Neutral chemical etching
- Redeposition
- Surface diffusion

### 5.2 Hamilton-Jacobi Formulation

$$
\frac{\partial \phi}{\partial t} + H(\nabla \phi, \vec{r}, t) = 0
$$

Hamiltonian for etch:

$$
H = V_n \sqrt{\phi_x^2 + \phi_y^2 + \phi_z^2}
$$

With $V_n$ dependent on:

- Local surface normal: $\hat{n} = -\nabla\phi / |\nabla\phi|$
- Local fluxes: $\Gamma(\vec{r})$
- Local angles: $\theta = \arccos(\hat{n} \cdot \hat{z})$

### 5.3 Visibility and View Factors

Direct flux: the flux reaching a point inside a feature depends on solid angle visibility:

$$
\Gamma_{direct}(\vec{r}) = \int_{\Omega_{visible}} \Gamma_0 \cdot \cos\theta \cdot \frac{d\Omega}{\pi}
$$

Reflected/reemitted flux: for neutrals with sticking coefficient $s$:

$$
\Gamma_{total}(\vec{r}) = \Gamma_{direct}(\vec{r}) + (1-s) \cdot \Gamma_{reflected}(\vec{r})
$$

This leads to coupled integral equations:

$$
\Gamma(\vec{r}) = \Gamma_{plasma}(\vec{r}) + (1-s) \int_{S'} K(\vec{r}, \vec{r'}) \Gamma(\vec{r'}) dS'
$$

Kernel function:

$$
K(\vec{r}, \vec{r'}) = \frac{\cos\theta \cos\theta'}{\pi |\vec{r} - \vec{r'}|^2} \cdot V(\vec{r}, \vec{r'})
$$

Where $V(\vec{r}, \vec{r'})$ is the visibility function (1 if visible, 0 otherwise).

### 5.4 Aspect Ratio Dependent Etching (ARDE)

Empirical model:

$$
\frac{ER(AR)}{ER_0} = \frac{1}{1 + (AR/AR_c)^n}
$$

Where:

- $AR = \text{depth}/\text{width}$ (aspect ratio)
- $AR_c$ = critical aspect ratio (process-dependent)
- $n \approx 1 - 2$

Knudsen transport model:

$$
\Gamma_{neutral}(z) = \Gamma_0 \cdot \frac{W}{W + \alpha \cdot z}
$$

Where:

- $z$ = feature depth
- $W$ = feature width
- $\alpha$ = Clausing factor (depends on geometry and sticking)

Clausing factor for a cylinder:

$$
\alpha = \frac{8}{3} \cdot \frac{1 - s}{s}
$$

## 6. Aluminum-Specific Phenomena

### 6.1 Native Oxide Breakthrough

$\text{Al}_2\text{O}_3$ (15-30 Å native oxide) requires physical sputtering:

$$
ER_{oxide} \approx Y_{\text{BCl}_3^+}(E) \cdot \Gamma_{ion}
$$

Why BCl₃ is critical:

1. Heavy $\text{BCl}_3^+$ ions provide efficient momentum transfer
2. BCl₃ scavenges oxygen chemically:

$$
2\text{BCl}_3 + \text{Al}_2\text{O}_3 \rightarrow 2\text{AlCl}_3 \uparrow + \text{B}_2\text{O}_3
$$

Breakthrough time:

$$
t_{breakthrough} = \frac{d_{oxide}}{ER_{oxide}} = \frac{d_{oxide}}{Y_{BCl_3^+} \cdot \Gamma_{ion}}
$$

### 6.2 Sidewall Passivation Dynamics

Anisotropic profiles require passivation of sidewalls:

$$
\frac{d\tau_{pass}}{dt} = R_{dep}(\Gamma_{redeposition}, s_{stick}) - R_{removal}(\Gamma_{ion}, \theta_{sidewall})
$$

Deposition sources:

- $\text{AlCl}_x$ redeposition from etch products
- Photoresist erosion products (C, H, O, N)
- Intentional additives: $\text{N}_2 \rightarrow \text{AlN}$ formation

Why sidewalls are protected — at grazing incidence ($\theta \approx 85° - 90°$):

- Ion flux geometric factor: $\Gamma_{sidewall} = \Gamma_0 \cdot \cos(90° - \alpha) \approx \Gamma_0 \cdot \sin\alpha$
- For $\alpha = 5°$: $\Gamma_{sidewall} \approx 0.09 \cdot \Gamma_0$
- Sputtering yield at grazing incidence approaches zero
- Net passivation accumulates → blocks lateral etching

### 6.3 Notching and Charging Effects

At dielectric interfaces, differential charging causes ion deflection.

Surface charge evolution:

$$
\frac{d\sigma}{dt} = J_{ion} - J_{electron}
$$

Where:

- $\sigma$ = surface charge density (C/cm²)
- $J_{ion}$ = ion current (always positive)
- $J_{electron}$ = electron current (depends on local potential)

Local electric field:

$$
\vec{E}_{charging} = -\nabla V_{charging}
$$

Laplace equation in the feature:

$$
\nabla^2 V = -\frac{\rho}{\epsilon_0} \quad \text{(with } \rho = 0 \text{ in vacuum)}
$$

Modified ion trajectory:

$$
m \frac{d^2\vec{r}}{dt^2} = e\left(\vec{E}_{sheath} + \vec{E}_{charging}\right)
$$

Result: ions deflect toward charged surfaces → notching at the feature bottom.

Mitigation strategies:

- Pulsed plasmas (allow electron neutralization)
- Low-frequency bias (time for charge equilibration)
- Conductive underlayers

### 6.4 Copper Residue Formation (Al-Cu Alloys)

Al-Cu alloys (0.5-4% Cu) leave Cu residues because Cu chlorides are less volatile.

Volatility comparison:

| Species | Sublimation/Boiling Point |
|---------|---------------------------|
| $\text{AlCl}_3$ | 180°C (sublimes) |
| $\text{CuCl}$ | 430°C (sublimes) |
| $\text{CuCl}_2$ | 300°C (decomposes) |

Residue accumulation rate:

$$
\frac{d[\text{Cu}]_{surface}}{dt} = x_{Cu} \cdot ER_{Al} - ER_{Cu}
$$

Where:

- $x_{Cu}$ = Cu atomic fraction in the alloy
- At low temperature: $ER_{Cu} \ll x_{Cu} \cdot ER_{Al}$

Solutions:

- Elevated substrate temperature ($>$150°C)
- Increased BCl₃ fraction
- Post-etch treatments

## 7. Numerical Methods

### 7.1 Level Set Discretization

Upwind finite differences, using Hamilton-Jacobi ENO (Essentially Non-Oscillatory) schemes:

$$
\phi_i^{n+1} = \phi_i^n - \Delta t \cdot H(\phi_x^-, \phi_x^+, \phi_y^-, \phi_y^+)
$$

One-sided derivatives:

$$
\phi_x^- = \frac{\phi_i - \phi_{i-1}}{\Delta x}, \quad \phi_x^+ = \frac{\phi_{i+1} - \phi_i}{\Delta x}
$$

Godunov flux for $H = V_n |\nabla\phi|$:

$$
H^{Godunov} = \begin{cases} V_n \sqrt{\max(\phi_x^-,0)^2 + \min(\phi_x^+,0)^2 + \max(\phi_y^-,0)^2 + \min(\phi_y^+,0)^2} & \text{if } V_n > 0 \\ V_n \sqrt{\min(\phi_x^-,0)^2 + \max(\phi_x^+,0)^2 + \min(\phi_y^-,0)^2 + \max(\phi_y^+,0)^2} & \text{if } V_n < 0 \end{cases}
$$

Reinitialization: maintain $|\nabla\phi| = 1$ using

$$
\frac{\partial \phi}{\partial \tau} = \text{sign}(\phi_0)(1 - |\nabla\phi|)
$$

Iterate in pseudo-time $\tau$ until convergence.

### 7.2 Monte Carlo Feature-Scale Simulation

Algorithm:

```
1. INITIALIZE surface mesh
2. FOR each time step:
   a. FOR i = 1 to N_particles:
      - Sample particle from IEDF, IADF
      - Launch from plasma boundary
      - TRACE trajectory until surface hit
      - APPLY reaction probability:
        * Etch (remove cell) with probability P_etch
        * Reflect with probability P_reflect
        * Deposit with probability P_deposit
   b. UPDATE surface mesh
   c. CHECK for convergence
3. OUTPUT final profile
```

Variance reduction techniques:

- Importance sampling: weight particles toward features of interest
- Particle splitting: increase statistics in critical regions
- Russian roulette: terminate low-weight particles probabilistically

### 7.3 Coupled Multi-Scale Modeling

| Scale | Domain | Method | Outputs |
|-------|--------|--------|---------|
| Reactor | m | Fluid/hybrid plasma | $n_e$, $T_e$, species densities |
| Sheath | mm | PIC or fluid | IEDF, IADF, fluxes |
| Feature | nm-μm | Level set / Monte Carlo | Profile evolution |
| Atomistic | Å | MD / DFT | Yields, sticking coefficients |

Coupling strategy:

$$
\text{Reactor} \xrightarrow{\Gamma_i, f(E), f(\theta)} \text{Feature} \xrightarrow{ER(\vec{r})} \text{Reactor}
$$

### 7.4 Plasma Solver Discretization

Finite element for Poisson's equation:

$$
\nabla \cdot (\epsilon \nabla V) = -\rho
$$

Weak form:

$$
\int_\Omega \epsilon \nabla V \cdot \nabla w \, d\Omega = \int_\Omega \rho \, w \, d\Omega
$$

Finite volume for transport:

$$
\frac{d(n_i V_j)}{dt} = -\sum_{faces} \Gamma_i \cdot \hat{n} \cdot A + S_i V_j
$$

## 8. Process Window and Optimization

### 8.1 Response Surface Modeling

Quadratic response surface:

$$
ER = \beta_0 + \sum_{i=1}^{k} \beta_i x_i + \sum_{i=1}^{k} \beta_{ii} x_i^2 + \sum_{i<j} \beta_{ij} x_i x_j
$$

### 8.2 Desirability-Based Optimization

Individual desirability functions map each response $y_i$ onto $[0, 1]$ (larger-is-better form with lower bound $L_i$ and target $T_i$):

$$
d_i(y_i) = \begin{cases} 0 & y_i \le L_i \\ \left(\dfrac{y_i - L_i}{T_i - L_i}\right)^{r_i} & L_i < y_i < T_i \\ 1 & y_i \ge T_i \end{cases}
$$

Optimization problem:

$$
\max_{\vec{x}} D(\vec{x})
$$

Subject to:

- $85° < \text{sidewall angle} < 90°$
- $\text{Selectivity}_{Al:resist} > 3:1$
- $\text{Selectivity}_{Al:TiN} > 10:1$
- $\text{Uniformity} < 3\%$ (1σ)

### 8.3 Virtual Metrology

Prediction model:

$$
\vec{y}_{etch} = f_{ML}\left(\vec{x}_{recipe}, \vec{x}_{OES}, \vec{x}_{chamber}\right)
$$

Input features:

- Recipe: power, pressure, flows, time
- OES: emission line intensities (e.g., Al 396 nm, Cl 837 nm)
- Chamber: impedance, temperature, previous wafer history

Machine learning approaches:

- Neural networks (for complex nonlinear relationships)
- Gaussian processes (with uncertainty quantification)
- Partial least squares (for high-dimensional, correlated inputs)

### 8.4 Run-to-Run Control

EWMA (Exponentially Weighted Moving Average) controller:

$$
\vec{x}_{k+1} = \vec{x}_k + \Lambda G^{-1}(\vec{y}_{target} - \vec{y}_k)
$$

Where:

- $\Lambda$ = diagonal weighting matrix ($0 < \lambda < 1$)
- $G$ = process gain matrix ($\partial y / \partial x$)

Drift compensation:

$$
\vec{x}_{k+1} = \vec{x}_k + \Lambda_1 G^{-1}(\vec{y}_{target} - \vec{y}_k) + \Lambda_2 (\vec{x}_{k} - \vec{x}_{k-1})
$$

## 9. Key Equations

| Physics | Governing Equation |
|---------|--------------------|
| Etch rate | $ER = k\Gamma_{Cl}\theta + Y\Gamma_{ion}\sqrt{E} + \beta\Gamma_{ion}\Gamma_{Cl}E^c$ |
| Surface coverage | $\theta = \dfrac{k_{ads}\Gamma}{k_{ads}\Gamma + k_{des}e^{-E_d/kT} + Y\Gamma_{ion}}$ |
| Profile evolution | $\dfrac{\partial\phi}{\partial t} + V_n|\nabla\phi| = 0$ |
| Ion flux (sheath) | $J_{ion} = \dfrac{4\epsilon_0}{9}\sqrt{\dfrac{2e}{M}} \cdot \dfrac{V^{3/2}}{s^2}$ |
| ARDE | $\dfrac{ER(AR)}{ER_0} = \dfrac{1}{1 + (AR/AR_c)^n}$ |
| View factor | $\Gamma(\vec{r}) = \displaystyle\int_{\Omega} \Gamma_0 \cos\theta \, \dfrac{d\Omega}{\pi}$ |
| Sputtering yield | $Y(\theta) = Y_0 \cos^{-f}\theta \cdot \exp\left[b\left(1 - \dfrac{1}{\cos\theta}\right)\right]$ |
| Species transport | $\dfrac{\partial n_i}{\partial t} + \nabla \cdot \vec{\Gamma}_i = S_i - L_i$ |

## 10. Modern Developments

### 10.1 Machine Learning Integration

Applications:

- Yield prediction: neural networks trained on MD simulation data
- Surrogate models: replace expensive PDE solvers for real-time optimization
- Process control: reinforcement learning for adaptive recipes

Example — Gaussian process for etch rate:

$$
ER(\vec{x}) \sim \mathcal{GP}\left(m(\vec{x}), k(\vec{x}, \vec{x}')\right)
$$

With squared exponential kernel:

$$
k(\vec{x}, \vec{x}') = \sigma_f^2 \exp\left(-\frac{|\vec{x} - \vec{x}'|^2}{2\ell^2}\right)
$$

### 10.2 Atomistic-Continuum Bridging

ReaxFF molecular dynamics:

- Reactive force fields for Al-Cl-O systems
- Calculate fundamental yields and sticking coefficients
- Feed into continuum models

DFT calculations:

- Adsorption energies: $E_{ads} = E_{surface+adsorbate} - E_{surface} - E_{adsorbate}$
- Activation barriers via NEB (Nudged Elastic Band)
- Electronic structure effects on reactivity

### 10.3 Digital Twins

Components:

- Real-time sensor data ingestion
- Physics-based + ML hybrid models
- Predictive maintenance algorithms
- Virtual process development

Update equation:

$$
\vec{\theta}_{model}^{(k+1)} = \vec{\theta}_{model}^{(k)} + K_k \left(\vec{y}_{measured} - \vec{y}_{predicted}\right)
$$

### 10.4 Uncertainty Quantification

Bayesian calibration:

$$
p(\vec{\theta}|\vec{y}) \propto p(\vec{y}|\vec{\theta}) \cdot p(\vec{\theta})
$$

Propagation through models:

$$
\text{Var}(y) \approx \sum_i \left(\frac{\partial y}{\partial \theta_i}\right)^2 \text{Var}(\theta_i)
$$

Monte Carlo uncertainty:

$$
\bar{y} \pm t_{\alpha/2} \cdot \frac{s}{\sqrt{N}}
$$

## Physical Constants

| Constant | Symbol | Value |
|----------|--------|-------|
| Boltzmann constant | $k_B$ | $1.381 \times 10^{-23}$ J/K |
| Electron charge | $e$ | $1.602 \times 10^{-19}$ C |
| Electron mass | $m_e$ | $9.109 \times 10^{-31}$ kg |
| Permittivity of vacuum | $\epsilon_0$ | $8.854 \times 10^{-12}$ F/m |
| Al atomic mass | $M_{Al}$ | 26.98 amu |
| Al surface binding energy | $U_s$ | 3.4 eV |

## Process Conditions

| Parameter | Typical Range |
|-----------|---------------|
| Pressure | 5-50 mTorr |
| Source power (ICP) | 200-1000 W |
| Bias power (RF) | 50-300 W |
| Cl₂ flow | 20-100 sccm |
| BCl₃ flow | 20-80 sccm |
| Temperature | 20-80°C |
| Etch rate | 300-800 nm/min |
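A worked numeric example of the steady-state coverage expression from Section 2.2, with lumped, illustrative rates (not measured data):

```python
# theta_ss = ads / (ads + des + ion), where ads = k_ads * Gamma_Cl,
# des = k_des * exp(-E_d / k_B T), ion = Y_react * Gamma_ion (all s^-1).
# The rate values below are made up to illustrate the regimes.
def theta_ss(ads, des, ion):
    return ads / (ads + des + ion)

balanced = theta_ss(ads=50.0, des=5.0, ion=45.0)   # mixed regime: 0.5
ion_rich = theta_ss(ads=50.0, des=5.0, ion=450.0)  # coverage collapses:
# the surface is cleared faster than Cl can adsorb (neutral-limited etch)
```

Raising ion flux tenfold drops the coverage by nearly an order of magnitude, which is why the ion-enhanced term in the etch-rate model saturates once the process becomes neutral-limited.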

always-on domain,design

Power domain that stays on.