a/b testing for models,mlops
Deploy multiple model versions and compare performance.
186 technical terms and definitions
ABC analysis categorizes inventory by value and usage, prioritizing management attention on the high-value items that contribute most to costs.
Use ablation to generate activation maps.
Diffusion where tokens gradually become mask tokens.
Refuse to answer when uncertain.
Sound over-approximation of network behavior.
Adaptive communication-computation tradeoff.
Acid gas scrubbing neutralizes acidic vapors by contacting with alkaline solution.
Acid neutralization treats acidic waste streams by adding bases, precipitating metals, and adjusting pH before discharge.
Acid recovery systems regenerate and concentrate spent acids for reuse in processes.
Acoustic microscopy uses ultrasonic waves to detect delamination, voids, and cracks in packages through impedance variations at interfaces.
Detect delamination and voids using sound.
Action space specifies all possible actions or tools available to agents.
Action-conditional video generation synthesizes sequences based on specified actions or controls.
Technique to compress long context into compact representations.
Various non-linear activation functions.
Find inputs maximizing activations.
Optimize input to maximize specific activation.
Replace activations to test causality.
Edit internal activations to understand causal role of specific neurons or layers.
Active shift learns optimal shift directions and magnitudes for each layer.
Adaptive learning-rate optimizer combining momentum with RMSprop-style per-parameter scaling.
Adam with decoupled weight decay regularization.
Learn activation shapes.
Attacks tailored to specific defense.
Prevent overfitting in GANs.
Adaptive inference adjusts model capacity or computation based on input difficulty.
Control style via normalization.
Normalize and modulate with style.
Skip or repeat layers based on input complexity.
Adaptive generation of synthetic samples.
Additive Hawkes processes decompose intensity into baseline plus excitation from each past event.
Additive noise models assume effects are deterministic functions of causes plus independent noise, enabling causal discovery.
Adjacency matrix encoding represents architecture graphs as matrices for graph neural network processing.
Predict absorption, distribution, metabolism, excretion, and toxicity (ADMET) properties.
Advanced composition provides tighter privacy bounds than basic composition.
Intel's chiplet interconnect.
Advanced oxidation processes generate hydroxyl radicals that degrade persistent organic pollutants.
# Semiconductor Manufacturing: Advanced Mathematics
## 1. Lithography & Optical Physics
This is arguably the most mathematically demanding area of semiconductor manufacturing.
### 1.1 Fourier Optics & Partial Coherence Theory
The foundation of photolithography treats optical imaging as a spatial frequency filtering problem.
- **Key Concept**: The mask pattern is decomposed into spatial frequency components
- **Optical System**: Acts as a low-pass filter on spatial frequencies
- **Hopkins Formulation**: Describes partially coherent imaging
The aerial image intensity $I(x,y)$ is given by:
$$
I(x,y) = \iint\iint TCC(f_1, g_1, f_2, g_2) \cdot M(f_1, g_1) \cdot M^*(f_2, g_2) \cdot e^{2\pi i[(f_1-f_2)x + (g_1-g_2)y]} \, df_1 \, dg_1 \, df_2 \, dg_2
$$
Where:
- $TCC$ = Transmission Cross-Coefficient
- $M(f,g)$ = Mask spectrum (Fourier transform of mask pattern)
- $M^*$ = Complex conjugate of mask spectrum
**SOCS Decomposition** (Sum of Coherent Systems):
$$
TCC(f_1, g_1, f_2, g_2) = \sum_{k=1}^{N} \lambda_k \phi_k(f_1, g_1) \phi_k^*(f_2, g_2)
$$
- Eigenvalue decomposition makes computation tractable
- $\lambda_k$ are eigenvalues (typically only 10-20 terms needed)
- $\phi_k$ are eigenfunctions
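The SOCS sum can be sketched in one dimension as a weighted sum of coherent images, each obtained by filtering the mask spectrum with one kernel and taking the squared magnitude. This is a minimal illustration only: the function name `socs_aerial_image`, the two low-pass kernels, and their weights are assumptions made up for the example, not eigenfunctions of a real TCC.

```python
import numpy as np

def socs_aerial_image(mask, kernels, eigenvalues):
    """Aerial image as sum_k lambda_k * |IFFT(phi_k * M)|^2 (1D sketch).

    mask        : 1D array, mask transmission in real space
    kernels     : array (K, N), hypothetical phi_k sampled on the FFT grid
    eigenvalues : array (K,), the weights lambda_k
    """
    mask_spectrum = np.fft.fft(mask)               # M(f)
    image = np.zeros(mask.shape, dtype=float)
    for lam, phi in zip(eigenvalues, kernels):
        field = np.fft.ifft(phi * mask_spectrum)   # coherent field for kernel k
        image += lam * np.abs(field) ** 2          # incoherent sum of intensities
    return image

# Toy inputs: two ideal low-pass kernels with made-up cutoffs and weights.
n = 256
freqs = np.fft.fftfreq(n)
kernels = np.array([(np.abs(freqs) <= 0.10).astype(float),
                    (np.abs(freqs) <= 0.05).astype(float)])
eigenvalues = np.array([0.8, 0.2])

mask = np.zeros(n)
mask[96:160] = 1.0                                 # a single clear line
image = socs_aerial_image(mask, kernels, eigenvalues)
```

Each term costs one FFT pair, which is why truncating the sum to a few kernels keeps the computation tractable.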
### 1.2 Inverse Lithography Technology (ILT)
Given a desired wafer pattern $T(x,y)$, find the optimal mask $M(x,y)$.
**Mathematical Framework**:
- **Objective Function**:
$$
\min_{M} \left\| I[M](x,y) - T(x,y) \right\|^2 + \alpha R[M]
$$
- **Key Methods**:
  - Variational calculus and gradient descent in function spaces
  - Level-set methods for topology optimization:
    $$
    \frac{\partial \phi}{\partial t} + v|\nabla\phi| = 0
    $$
  - Tikhonov regularization: $R[M] = \|\nabla M\|^2$
  - Total-variation regularization: $R[M] = \int |\nabla M| \, dx \, dy$
  - Adjoint methods for efficient gradient computation
### 1.3 EUV & Rigorous Electromagnetics
At $\lambda = 13.5$ nm, the mask absorber can no longer be treated as infinitely thin, so scalar (thin-mask) diffraction theory fails and the full vector Maxwell's equations must be solved.
**Maxwell's Equations** (time-harmonic form):
$$
\nabla \times \mathbf{E} = -i\omega\mu\mathbf{H}
$$
$$
\nabla \times \mathbf{H} = i\omega\varepsilon\mathbf{E}
$$
**Numerical Methods**:
- **RCWA** (Rigorous Coupled-Wave Analysis):
  - Eigenvalue problem for each diffraction order
  - Transfer matrix for multilayer stacks:
$$
\begin{pmatrix} E^+ \\ E^- \end{pmatrix}_{out} = \mathbf{T} \begin{pmatrix} E^+ \\ E^- \end{pmatrix}_{in}
$$
- **FDTD** (Finite-Difference Time-Domain):
  - Yee grid discretization
  - Leapfrog time integration:
$$
E^{n+1} = E^n + \frac{\Delta t}{\varepsilon} \nabla \times H^{n+1/2}
$$
- **Multilayer Thin-Film Optics**:
  - Fresnel coefficients at each interface
  - Transfer matrix method for $N$ layers
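As a rough illustration of the transfer matrix method for $N$ layers, the sketch below uses the characteristic-matrix form at normal incidence, which tracks tangential fields rather than the $E^+$/$E^-$ amplitudes but is an equivalent formulation. The function name `stack_reflectance` and all refractive indices and thicknesses are assumptions chosen for a simple check, not EUV material data.

```python
import numpy as np

def stack_reflectance(n_layers, d_layers, n_in, n_sub, wavelength):
    """Normal-incidence reflectance of a thin-film stack.

    Layers are listed from the incident side to the substrate.
    Thicknesses and wavelength must share the same length unit.
    """
    m_total = np.eye(2, dtype=complex)
    for n, d in zip(n_layers, d_layers):
        delta = 2.0 * np.pi * n * d / wavelength        # phase thickness
        m_layer = np.array([[np.cos(delta), 1j * np.sin(delta) / n],
                            [1j * n * np.sin(delta), np.cos(delta)]])
        m_total = m_total @ m_layer                     # chain the layers
    b, c = m_total @ np.array([1.0, n_sub], dtype=complex)
    r = (n_in * b - c) / (n_in * b + c)                 # amplitude reflection
    return float(np.abs(r) ** 2)

# Bare interface (no layers) reduces to the Fresnel result.
bare = stack_reflectance([], [], n_in=1.0, n_sub=1.5, wavelength=500.0)

# Quarter-wave layer with n1 = sqrt(n_in * n_sub) cancels the reflection.
n1 = np.sqrt(1.5)
coated = stack_reflectance([n1], [500.0 / (4.0 * n1)], 1.0, 1.5, 500.0)
```

With no layers the product of matrices is the identity, so the function returns the single-interface Fresnel reflectance, which gives a quick sanity check before adding layers.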
### 1.4 Aberration Theory
Optical aberrations are characterized using **Zernike polynomials**:
$$
W(\rho, \theta) = \sum_{n,m} Z_n^m R_n^m(\rho) \cdot
\begin{cases}
\cos(m\theta) & \text{(even)} \\
\sin(m\theta) & \text{(odd)}
\end{cases}
$$
Where $R_n^m(\rho)$ are radial polynomials:
$$
R_n^m(\rho) = \sum_{k=0}^{(n-m)/2} \frac{(-1)^k (n-k)!}{k! \left(\frac{n+m}{2}-k\right)! \left(\frac{n-m}{2}-k\right)!} \rho^{n-2k}
$$
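The radial polynomial sum translates directly into code. The sketch below is a plain evaluation of the formula above; the function name `zernike_radial` is an assumption, and the convention of returning zero when $n - m$ is odd is a common choice rather than something stated in the text.

```python
from math import factorial

def zernike_radial(n, m, rho):
    """Evaluate R_n^m(rho) from the finite sum over k."""
    m = abs(m)
    if m > n or (n - m) % 2 != 0:
        return 0.0                       # no radial term for these indices
    total = 0.0
    for k in range((n - m) // 2 + 1):
        coeff = ((-1) ** k * factorial(n - k)
                 / (factorial(k)
                    * factorial((n + m) // 2 - k)
                    * factorial((n - m) // 2 - k)))
        total += coeff * rho ** (n - 2 * k)
    return total

# R_2^0 = 2 rho^2 - 1 and R_4^0 = 6 rho^4 - 6 rho^2 + 1
defocus_at_half = zernike_radial(2, 0, 0.5)
spherical_at_half = zernike_radial(4, 0, 0.5)
```

Every valid radial polynomial equals 1 at the pupil edge $\rho = 1$, which is a convenient check on an implementation.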
**Common Aberrations**:
| Zernike Term | Name | Effect |
|--------------|------|--------|
| $Z_2^0$ | Defocus | Uniform blur |
| $Z_3^1$ | Coma | Asymmetric distortion |
| $Z_4^0$ | Spherical | Halo effect |
| $Z_2^2$ | Astigmatism | Directional blur |
## 2. Quantum Mechanics & Device Physics
As transistors reach sub-5nm dimensions, classical models break down.
### 2.1 Schrödinger Equation & Quantum Transport
**Time-Independent Schrödinger Equation**:
$$
\hat{H}\psi = E\psi
$$
$$
\left[-\frac{\hbar^2}{2m}\nabla^2 + V(\mathbf{r})\right]\psi(\mathbf{r}) = E\psi(\mathbf{r})
$$
**Non-Equilibrium Green's Function (NEGF) Formalism**:
- Retarded Green's function:
$$
G^R(E) = \left[(E + i\eta)I - H - \Sigma_L - \Sigma_R\right]^{-1}
$$
- Self-energy $\Sigma$ incorporates:
  - Contact coupling
  - Scattering mechanisms
  - Electron-phonon interaction
- Current calculation:
$$
I = \frac{2e}{h} \int T(E) [f_L(E) - f_R(E)] \, dE
$$
- Transmission function:
$$
T(E) = \text{Tr}\left[\Gamma_L G^R \Gamma_R G^A\right]
$$
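The NEGF expressions above can be exercised on a toy system. The sketch below assumes a 1D nearest-neighbor tight-binding chain with hopping $-t$, attached to two identical semi-infinite leads whose self-energy is known in closed form inside the band ($|E| < 2t$). The function names and the test barrier are illustrative assumptions; because the lead self-energies already provide broadening, $\eta$ is set to zero.

```python
import numpy as np

def lead_self_energy(energy, t):
    """Retarded self-energy of a semi-infinite 1D chain, valid for |E| < 2t."""
    return energy / 2.0 - 1j * np.sqrt(t**2 - energy**2 / 4.0)

def transmission(energy, onsite, t=1.0):
    """T(E) = Tr[Gamma_L G^R Gamma_R G^A] for a chain with given on-site energies."""
    n = len(onsite)
    h = np.diag(np.asarray(onsite, dtype=complex))
    h += np.diag(-t * np.ones(n - 1), 1) + np.diag(-t * np.ones(n - 1), -1)

    sigma_l = np.zeros((n, n), dtype=complex)
    sigma_r = np.zeros((n, n), dtype=complex)
    sigma_l[0, 0] = lead_self_energy(energy, t)      # left contact on first site
    sigma_r[-1, -1] = lead_self_energy(energy, t)    # right contact on last site

    g_r = np.linalg.inv(energy * np.eye(n) - h - sigma_l - sigma_r)
    g_a = g_r.conj().T                               # advanced Green's function
    gamma_l = 1j * (sigma_l - sigma_l.conj().T)      # broadening matrices
    gamma_r = 1j * (sigma_r - sigma_r.conj().T)
    return float(np.real(np.trace(gamma_l @ g_r @ gamma_r @ g_a)))

t_clean = transmission(0.5, [0.0] * 5)                   # uniform chain
t_barrier = transmission(0.5, [0.0, 0.0, 1.0, 0.0, 0.0]) # one raised site
```

A uniform chain is indistinguishable from its leads, so its transmission should be 1; a single raised site of height $U$ reduces it to $(4t^2 - E^2)/(4t^2 - E^2 + U^2)$.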
**Wigner Function** (bridging quantum and semiclassical):
$$
W(x,p) = \frac{1}{2\pi\hbar} \int \psi^*\left(x + \frac{y}{2}\right) \psi\left(x - \frac{y}{2}\right) e^{ipy/\hbar} \, dy
$$
### 2.2 Band Structure Theory
**k·p Perturbation Theory**:
$$
H_{k \cdot p} = \frac{p^2}{2m_0} + V(\mathbf{r}) + \frac{\hbar}{m_0}\mathbf{k} \cdot \mathbf{p} + \frac{\hbar^2 k^2}{2m_0}
$$
**Effective Mass Tensor**:
$$
\frac{1}{m^*_{ij}} = \frac{1}{\hbar^2} \frac{\partial^2 E}{\partial k_i \partial k_j}
$$
**Tight-Binding Hamiltonian**:
$$
H = \sum_i \varepsilon_i |i\rangle\langle i| + \sum_{\langle i,j \rangle} t_{ij} |i\rangle\langle j|
$$
- $\varepsilon_i$ = on-site energy
- $t_{ij}$ = hopping integral (Slater-Koster parameters)
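As a small check of the tight-binding form, the sketch below builds the Hamiltonian of a ring of identical sites with nearest-neighbor hopping and compares its eigenvalues with the analytic dispersion $E(k) = \varepsilon + 2t\cos k$ on the allowed $k$ points. The function name `ring_hamiltonian` and the parameter values are assumptions for illustration; the construction requires at least three sites.

```python
import numpy as np

def ring_hamiltonian(n_sites, onsite, hopping):
    """H = sum_i eps |i><i| + sum_<i,j> t |i><j| on a periodic ring (n_sites >= 3)."""
    h = onsite * np.eye(n_sites)
    for i in range(n_sites):
        j = (i + 1) % n_sites            # periodic nearest neighbor
        h[i, j] += hopping
        h[j, i] += hopping               # keep H Hermitian
    return h

n_sites, onsite, hopping = 8, 0.0, -1.0
numeric = np.sort(np.linalg.eigvalsh(ring_hamiltonian(n_sites, onsite, hopping)))

k = 2.0 * np.pi * np.arange(n_sites) / n_sites       # allowed wavevectors
analytic = np.sort(onsite + 2.0 * hopping * np.cos(k))
```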
### 2.3 Semiclassical Transport
**Boltzmann Transport Equation**:
$$
\frac{\partial f}{\partial t} + \mathbf{v} \cdot \nabla_r f + \frac{\mathbf{F}}{\hbar} \cdot \nabla_k f = \left(\frac{\partial f}{\partial t}\right)_{coll}
$$
- 6D phase space $(x, y, z, k_x, k_y, k_z)$
- Collision integral (scattering):
$$
\left(\frac{\partial f}{\partial t}\right)_{coll} = \sum_{k'} [S(k',k)f(k')(1-f(k)) - S(k,k')f(k)(1-f(k'))]
$$
**Drift-Diffusion Equations** (moment expansion):
$$
\mathbf{J}_n = q\mu_n n\mathbf{E} + qD_n\nabla n
$$
$$
\mathbf{J}_p = q\mu_p p\mathbf{E} - qD_p\nabla p
$$
## 3. Process Simulation PDEs
### 3.1 Dopant Diffusion
**Fick's Second Law** (concentration-dependent):
$$
\frac{\partial C}{\partial t} = \nabla \cdot (D(C,T) \nabla C) + G - R
$$
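A minimal 1D sketch of the concentration-dependent diffusion equation follows, using an explicit conservative finite-difference update with zero-flux boundaries. It omits the $G - R$ terms and temperature dependence, works in nondimensional units, and uses a made-up diffusivity law $D(C) = 1 + C$; the function name `diffuse_1d` is likewise an assumption.

```python
import numpy as np

def diffuse_1d(conc, diffusivity, dx, dt, steps):
    """Explicit update of dC/dt = d/dx( D(C) dC/dx ) with zero-flux walls."""
    c = np.array(conc, dtype=float)
    for _ in range(steps):
        c_face = 0.5 * (c[1:] + c[:-1])                  # concentration at cell faces
        flux = -diffusivity(c_face) * np.diff(c) / dx    # Fick's first law at faces
        flux = np.concatenate(([0.0], flux, [0.0]))      # no flux through the walls
        c -= dt / dx * np.diff(flux)                     # conservative divergence
    return c

# Initial spike of dopant in the middle of the domain.
c0 = np.zeros(50)
c0[25] = 1.0

# Stability of the explicit scheme needs dt <= dx^2 / (2 * max D).
c1 = diffuse_1d(c0, lambda c: 1.0 + c, dx=1.0, dt=0.1, steps=200)
```

Writing the update in flux form means the total dose is conserved to rounding error, which is a useful invariant to check when $D$ depends on $C$.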
**Coupled Point-Defect System**:
$$
\begin{aligned}
\frac{\partial C_A}{\partial t} &= \nabla \cdot (D_A \nabla C_A) + k_{AI}C_AC_I - k_{AV}C_AC_V \\
\frac{\partial C_I}{\partial t} &= \nabla \cdot (D_I \nabla C_I) + G_I - k_{IV}C_IC_V \\
\frac{\partial C_V}{\partial t} &= \nabla \cdot (D_V \nabla C_V) + G_V - k_{IV}C_IC_V
\end{aligned}
$$
Where:
- $C_A$ = dopant concentration
- $C_I$ = interstitial concentration
- $C_V$ = vacancy concentration
- $k_{ij}$ = reaction rate constants
### 3.2 Oxidation & Film Growth
**Deal-Grove Model**:
$$
x_{ox}^2 + Ax_{ox} = B(t + \tau)
$$
- $B/A$ = linear rate constant (surface reaction limited); $A$ itself has units of length
- $B$ = parabolic rate constant (diffusion limited)
- $\tau$ = time offset for initial oxide
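Solving the Deal-Grove quadratic for the positive root gives the oxide thickness explicitly. The sketch below assumes consistent units (for example µm, µm²/h, and hours); the function name `oxide_thickness` and the numerical values are illustrative placeholders, not measured rate constants.

```python
import math

def oxide_thickness(t, a, b, tau=0.0):
    """Positive root of x^2 + A x = B (t + tau)."""
    return 0.5 * (-a + math.sqrt(a * a + 4.0 * b * (t + tau)))

a, b = 0.1, 0.01                      # placeholder constants, not real data
x_short = oxide_thickness(0.01, a, b)  # thin-oxide regime
x_long = oxide_thickness(1.0e4, a, b)  # thick-oxide regime
```

In the thin-oxide limit the root approaches $(B/A)(t + \tau)$, and in the thick-oxide limit it approaches $\sqrt{B(t + \tau)}$, matching the linear and parabolic regimes.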
**Moving Boundary (Stefan) Problem**:
$$
D\frac{\partial C}{\partial x}\bigg|_{x=s(t)} = C^* \frac{ds}{dt}
$$
### 3.3 Ion Implantation
**Binary Collision Approximation** (Monte Carlo):
- Screened Coulomb potential:
$$
V(r) = \frac{Z_1 Z_2 e^2}{r} \phi\left(\frac{r}{a}\right)
$$
- Scattering angle from two-body collision integral
**As-Implanted Profile** (Pearson IV distribution):
$$
f(x) = f_0 \left[1 + \left(\frac{x-R_p}{b}\right)^2\right]^{-m} \exp\left[-r \tan^{-1}\left(\frac{x-R_p}{b}\right)\right]
$$
Parameters: $R_p$ (projected range), $\Delta R_p$ (straggle), skewness, kurtosis
### 3.4 Plasma Etching
**Electron Energy Distribution** (Boltzmann equation):
$$
\frac{\partial f}{\partial t} + \mathbf{v} \cdot \nabla f - \frac{e\mathbf{E}}{m} \cdot \nabla_v f = C[f]
$$
**Child-Langmuir Law** (space-charge-limited ion current density across the sheath):
$$
J = \frac{4\varepsilon_0}{9} \sqrt{\frac{2e}{M}} \frac{V^{3/2}}{d^2}
$$
### 3.5 Chemical-Mechanical Polishing (CMP)
**Preston Equation**:
$$
\frac{dh}{dt} = K_p \cdot P \cdot V
$$
- $K_p$ = Preston coefficient
- $P$ = local pressure
- $V$ = relative velocity
**Pattern-Density Dependent Model**:
$$
P_{local} = P_{avg} \cdot \frac{A_{total}}{A_{contact}(\rho)}
$$
## 4. Electromagnetic Simulation
### 4.1 Interconnect Modeling
**Capacitance Extraction** (Laplace equation):
$$
\nabla^2 \phi = 0 \quad \text{(dielectric regions)}
$$
$$
\nabla \cdot (\varepsilon \nabla \phi) = -\rho \quad \text{(with charges)}
$$
**Boundary Element Method**:
$$
c(\mathbf{r})\phi(\mathbf{r}) = \int_S \left[\phi(\mathbf{r}') \frac{\partial G}{\partial n'} - G(\mathbf{r}, \mathbf{r}') \frac{\partial \phi}{\partial n'}\right] dS'
$$
Where $G(\mathbf{r}, \mathbf{r}') = \frac{1}{4\pi|\mathbf{r} - \mathbf{r}'|}$ (free-space Green's function)
### 4.2 Partial Inductance
**PEEC Method** (Partial Element Equivalent Circuit):
$$
L_{p,ij} = \frac{\mu_0}{4\pi} \frac{1}{a_i a_j} \int_{V_i} \int_{V_j} \frac{d\mathbf{l}_i \cdot d\mathbf{l}_j}{|\mathbf{r}_i - \mathbf{r}_j|}
$$
## 5. Statistical & Stochastic Methods
### 5.1 Process Variability
**Multivariate Gaussian Model**:
$$
p(\mathbf{x}) = \frac{1}{(2\pi)^{n/2}|\Sigma|^{1/2}} \exp\left(-\frac{1}{2}(\mathbf{x}-\boldsymbol{\mu})^T \Sigma^{-1} (\mathbf{x}-\boldsymbol{\mu})\right)
$$
**Principal Component Analysis**:
$$
\mathbf{X} = \mathbf{U}\mathbf{S}\mathbf{V}^T
$$
- Transform to uncorrelated variables
- Dimensionality reduction: retain components with largest singular values
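The SVD route to PCA can be sketched with NumPy as below. The data are synthetic (three correlated variables generated from two latent factors plus small noise), and all names and values are assumptions chosen only to make the example self-contained.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: 500 samples of 3 correlated variables (made up).
latent = rng.normal(size=(500, 2))
mixing = np.array([[1.0, 0.5, 0.2],
                   [0.0, 1.0, 0.7]])
x = latent @ mixing + 0.05 * rng.normal(size=(500, 3))

x_centered = x - x.mean(axis=0)                    # PCA requires centered data
u, s, vt = np.linalg.svd(x_centered, full_matrices=False)

scores = x_centered @ vt.T                         # uncorrelated component scores
explained = s**2 / np.sum(s**2)                    # variance fraction per component
reduced = scores[:, :2]                            # keep the two largest components
```

The rows of `vt` are the principal directions, and `np.linalg.svd` returns singular values in descending order, so truncating the columns of `scores` keeps the highest-variance components.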
**Polynomial Chaos Expansion**:
$$
Y(\boldsymbol{\xi}) = \sum_{k=0}^{P} y_k \Psi_k(\boldsymbol{\xi})
$$
- $\Psi_k$ = orthogonal polynomial basis (Hermite for Gaussian inputs)
- Enables uncertainty quantification without Monte Carlo
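For a single standard-normal input, the expansion coefficients can be computed by Gauss-Hermite quadrature using the probabilists' Hermite polynomials $He_k$, for which $y_k = \mathbb{E}[Y\,He_k]/k!$. The sketch below is a one-dimensional illustration under that assumption; the function name `pce_coefficients` and the test model $Y = e^{\xi}$ are chosen because the exact coefficients $y_k = e^{1/2}/k!$ are known.

```python
import numpy as np
from math import factorial
from numpy.polynomial.hermite_e import hermegauss, hermeval

def pce_coefficients(model, order, n_quad=40):
    """Project model(xi), xi ~ N(0,1), onto He_0..He_order by quadrature."""
    nodes, weights = hermegauss(n_quad)
    weights = weights / np.sqrt(2.0 * np.pi)        # normalize to the N(0,1) density
    values = model(nodes)
    coeffs = []
    for k in range(order + 1):
        basis_k = hermeval(nodes, [0.0] * k + [1.0])   # He_k at the nodes
        coeffs.append(np.sum(weights * values * basis_k) / factorial(k))
    return np.array(coeffs)

coeffs = pce_coefficients(np.exp, order=8)

# Moments follow from the coefficients, using E[He_k^2] = k!.
mean = coeffs[0]
variance = sum(c**2 * factorial(k) for k, c in enumerate(coeffs) if k > 0)
```

Once the coefficients are known, the mean and variance come from simple sums rather than from sampling, which is the sense in which the expansion avoids Monte Carlo.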
### 5.2 Yield Modeling
**Poisson Defect Model**:
$$
Y = e^{-D \cdot A}
$$
- $D$ = defect density (defects/cm²)
- $A$ = critical area
**Negative Binomial** (clustered defects):
$$
Y = \left(1 + \frac{DA}{\alpha}\right)^{-\alpha}
$$
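The two yield models are easy to compare numerically. In the sketch below the function names and the sample values of $D$, $A$, and $\alpha$ are assumptions for illustration only.

```python
import math

def poisson_yield(defect_density, area):
    """Y = exp(-D * A): defects uniformly and independently distributed."""
    return math.exp(-defect_density * area)

def negative_binomial_yield(defect_density, area, alpha):
    """Y = (1 + D*A/alpha)^(-alpha): smaller alpha means stronger clustering."""
    return (1.0 + defect_density * area / alpha) ** (-alpha)

d, a = 0.5, 2.0                                   # placeholder values, D*A = 1
y_poisson = poisson_yield(d, a)
y_clustered = negative_binomial_yield(d, a, alpha=2.0)
y_unclustered = negative_binomial_yield(d, a, alpha=1.0e6)
```

For the same average defect count, clustering concentrates defects on fewer dies, so the negative binomial yield exceeds the Poisson yield and converges to it as $\alpha \to \infty$.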
### 5.3 Reliability Physics
**Weibull Distribution** (lifetime):
$$
F(t) = 1 - \exp\left[-\left(\frac{t}{\eta}\right)^\beta\right]
$$
- $\eta$ = scale parameter (characteristic life)
- $\beta$ = shape parameter (failure mode indicator)
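The Weibull CDF can be inverted in closed form to find the time at which a given fraction of parts has failed. The sketch below shows both directions; the function names and the parameter values are assumptions for illustration.

```python
import math

def weibull_cdf(t, eta, beta):
    """F(t) = 1 - exp(-(t/eta)^beta)."""
    return 1.0 - math.exp(-((t / eta) ** beta))

def weibull_time_to_fraction(p, eta, beta):
    """Solve F(t) = p for t, with 0 < p < 1."""
    return eta * (-math.log(1.0 - p)) ** (1.0 / beta)

eta, beta = 1000.0, 2.0                      # placeholder scale and shape
f_at_eta = weibull_cdf(eta, eta, beta)       # fraction failed at t = eta
t_1pct = weibull_time_to_fraction(0.01, eta, beta)
```

At $t = \eta$ the failed fraction is $1 - e^{-1} \approx 63.2\%$ regardless of $\beta$, which is why $\eta$ is called the characteristic life.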
**Black's Equation** (electromigration):
$$
MTTF = A \cdot J^{-n} \cdot \exp\left(\frac{E_a}{k_B T}\right)
$$
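Black's equation is typically used as a ratio, so the prefactor $A$ cancels. Dividing the MTTF at use conditions by the MTTF at stress conditions gives an acceleration factor, sketched below; the function names and the values of $n$, $E_a$, currents, and temperatures are placeholders, not recommended parameters.

```python
import math

K_B_EV = 8.617333262e-5   # Boltzmann constant in eV/K

def black_mttf(prefactor, j, temp_k, ea_ev, n):
    """MTTF = A * J^(-n) * exp(Ea / (kB * T))."""
    return prefactor * j ** (-n) * math.exp(ea_ev / (K_B_EV * temp_k))

def acceleration_factor(j_use, j_stress, t_use_k, t_stress_k, ea_ev, n):
    """MTTF(use) / MTTF(stress); the prefactor A cancels in the ratio."""
    current_term = (j_stress / j_use) ** n
    thermal_term = math.exp(ea_ev / K_B_EV * (1.0 / t_use_k - 1.0 / t_stress_k))
    return current_term * thermal_term

# Placeholder conditions: 2x current density and 378 K -> 523 K stress.
af = acceleration_factor(j_use=1.0, j_stress=2.0,
                         t_use_k=378.0, t_stress_k=523.0,
                         ea_ev=0.8, n=2.0)
```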
## 6. Optimization & Inverse Problems
### 6.1 Design of Experiments
**Response Surface Methodology**:
$$
y = \beta_0 + \sum_i \beta_i x_i + \sum_i \beta_{ii} x_i^2 + \sum_{i<j} \beta_{ij} x_i x_j
$$
Use adversarial examples to probe model understanding.
Inputs designed to fool the model into wrong predictions.
GAN-style discriminator loss.
Maximum allowed perturbation size.
Adversarial prompts attempt to elicit undesired behaviors, testing model robustness.
Test model robustness with adversarial inputs.
Measure resilience to adversarial attacks.
Append carefully crafted text to jailbreak model.
Adversarial training improves robustness by augmenting training with adversarial examples.
Train on adversarial examples to improve robustness.
Train on adversarial examples.