bayesian optimization,prior,efficient
Bayesian optimization uses prior knowledge. Efficient search.
9,967 technical terms and definitions
Bayesian deep learning maintains uncertainty through posteriors. Expensive but principled.
Hardest BIG-bench tasks.
Measure social biases in question answering.
Question-answering bias test.
Behavior cloning regularization in offline RL constrains the policy to stay close to the data-generating distribution while improving performance.
Behavioral Cloning learns policies through supervised learning on state-action pairs from expert demonstrations without requiring reward signals.
Batch-Constrained Q-learning prevents extrapolation error in offline RL by constraining actions to those similar to the behavior policy in the dataset.
Maintain top-K sequences.
Keep track of top-k sequences and expand them to find high-probability outputs.
Beam search keeps top-k hypotheses. Sampling adds randomness. Trade-off between quality and diversity.
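A minimal sketch of beam search, assuming a next-token model exposed as a function from a prefix to a token-probability dictionary. The names `beam_search`, `step_fn`, and `toy_model` are hypothetical, and the toy distribution is invented so that the greedy choice is not the best full sequence.

```python
import math
from typing import Callable, Dict, List, Tuple

Sequence = Tuple[str, ...]


def beam_search(
    step_fn: Callable[[Sequence], Dict[str, float]],
    beam_width: int,
    max_len: int,
    eos: str = "<eos>",
) -> List[Tuple[Sequence, float]]:
    """Return up to beam_width (sequence, log-probability) pairs, best first."""
    beams: List[Tuple[Sequence, float]] = [((), 0.0)]
    for _ in range(max_len):
        candidates: List[Tuple[Sequence, float]] = []
        for seq, score in beams:
            if seq and seq[-1] == eos:
                # Finished hypotheses are carried over unchanged.
                candidates.append((seq, score))
                continue
            # Expand each live hypothesis by every possible next token.
            for token, prob in step_fn(seq).items():
                candidates.append((seq + (token,), score + math.log(prob)))
        # Keep only the top-k hypotheses by cumulative log-probability.
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:beam_width]
        if all(seq and seq[-1] == eos for seq, _ in beams):
            break
    return beams


def toy_model(prefix: Sequence) -> Dict[str, float]:
    """Hypothetical fixed distribution; stands in for a real language model."""
    if not prefix:
        return {"a": 0.6, "b": 0.4}
    if prefix[-1] == "a":
        return {"x": 0.5, "y": 0.5}
    if prefix[-1] == "b":
        return {"z": 0.9, "w": 0.1}
    return {"<eos>": 1.0}


best_with_beam = beam_search(toy_model, beam_width=2, max_len=5)[0]
best_greedy = beam_search(toy_model, beam_width=1, max_len=5)[0]
```

With a beam width of 1 the search is greedy and commits to the locally best first token. A wider beam keeps the second-best prefix alive long enough for its higher-probability continuation to win.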
Beamforming combines multi-channel signals with spatial filtering to enhance target sound sources.
Bootstrapping Error Accumulation Reduction (BEAR) limits error accumulation in offline RL by constraining the learned policy to the support of the behavior policy.
Bed-of-nails fixtures use arrays of spring-loaded pins to contact PCB test points for in-circuit testing and fault isolation.
Before-after comparisons quantify improvement by measuring performance changes.
When you say explain like I am 5, I will use tiny steps, concrete examples, and avoid jargon, then we can gradually increase depth and formality.
Systematically test model capabilities.
Imitate expert demonstrations.
Test model behavior on synthetic data.
Masked image modeling for ViT.
Predict discrete visual tokens.
Standard datasets for comparing model performance (GLUE, SuperGLUE, MMLU).
MMLU tests knowledge, HumanEval tests coding, GSM8K tests math. Standard benchmarks to compare models.
Benchmarks provide standardized test suites for comparing model performance.
Benchmarking compares performance. Standard workloads. Identify regressions, compare systems.
I can design simple benchmarks to measure latency/throughput and interpret results to guide optimization decisions.
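A simple latency/throughput benchmark might look like the sketch below. It uses only the Python standard library; `benchmark` and `workload` are hypothetical names, and the workload is a stand-in for whatever code is actually under test.

```python
import statistics
import time
from typing import Callable, Dict


def benchmark(fn: Callable[[], object], warmup: int = 5, runs: int = 50) -> Dict[str, float]:
    """Time repeated calls to fn and summarize latency and throughput."""
    # Warm-up calls are discarded so one-time costs do not skew the numbers.
    for _ in range(warmup):
        fn()

    latencies = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        latencies.append(time.perf_counter() - start)

    latencies.sort()
    total = sum(latencies)
    return {
        "mean_s": statistics.mean(latencies),
        "p50_s": statistics.median(latencies),
        # Simple index-based estimate of the 95th percentile.
        "p95_s": latencies[min(runs - 1, int(0.95 * runs))],
        "throughput_per_s": runs / total if total > 0 else float("inf"),
    }


def workload() -> int:
    """Hypothetical CPU-bound workload used only for illustration."""
    return sum(i * i for i in range(10_000))


results = benchmark(workload, warmup=2, runs=20)
```

Reporting a tail percentile alongside the mean helps when interpreting results, since a regression often shows up in the tail before it moves the average.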
Compare technologies or designs.
Benefit realization confirms expected improvements materialize after implementation.
BentoML is framework-agnostic serving. Build, ship, scale.
Back-End-Of-Line metal stack consists of multiple metal layers and inter-layer dielectrics routing signals and distributing power.
# BEOL: Back End of Line in Semiconductor Manufacturing

## Overview

BEOL (Back End of Line) refers to the second major phase of semiconductor wafer fabrication, occurring after the Front-End-of-Line (FEOL) processes are complete. BEOL focuses on creating the **interconnect structure** that electrically connects all transistors and devices.

## 1. Key Components of BEOL

- **Metal Interconnect Layers**
  - Primary conductor: Copper (Cu)
  - Historical: Aluminum (Al)
  - Advanced nodes: Cobalt (Co), Ruthenium (Ru) for local interconnects
- **Interlayer Dielectrics (ILD)**
  - Purpose: Electrical isolation between metal layers
  - Materials: $\text{SiO}_2$, low-$\kappa$ dielectrics
  - Target: $\kappa < 3.0$ for advanced nodes
- **Vias**
  - Vertical electrical connections between metal layers
  - Filled with copper or alternative metals
- **Contacts**
  - Connect first metal layer (M1) to transistor terminals
  - Critical interface: Metal-to-silicide contact

## 2. BEOL Process Flow

### 2.1 Dual Damascene Process

```
┌─────────────────────────────────────────────────────────┐
│ 1. Dielectric Deposition                                │
│    └── CVD or Spin-on Low-κ dielectric                  │
│                                                         │
│ 2. Lithography                                          │
│    └── Pattern trenches and via holes                   │
│                                                         │
│ 3. Reactive Ion Etching (RIE)                           │
│    └── Etch dielectric to form trenches/vias            │
│                                                         │
│ 4. Barrier Layer Deposition                             │
│    └── TaN/Ta by PVD (~2-5 nm)                          │
│                                                         │
│ 5. Copper Seed Layer                                    │
│    └── PVD Cu (~50-100 nm)                              │
│                                                         │
│ 6. Electrochemical Plating (ECP)                        │
│    └── Fill trenches with Cu                            │
│                                                         │
│ 7. Chemical Mechanical Polishing (CMP)                  │
│    └── Planarize surface                                │
│                                                         │
│ 8. Repeat for each metal layer (M1 → Mn)                │
└─────────────────────────────────────────────────────────┘
```

### 2.2 Metal Layer Stack (Advanced Node Example)

| Layer | Pitch (nm) | Primary Metal | Purpose |
|-------|------------|---------------|---------|
| M1-M2 | 20-28 | Cu/Co/Ru | Local interconnect |
| M3-M5 | 32-40 | Cu | Intermediate routing |
| M6-M10 | 40-80 | Cu | Semi-global routing |
| M11-M15 | 80-160 | Cu | Global routing, power |

## 3. Critical Physics & Equations

### 3.1 RC Delay

The interconnect delay is governed by the RC time constant:

$$
\tau_{RC} = R \cdot C
$$

Where:

- $R$ = Line resistance ($\Omega$)
- $C$ = Line capacitance (F)

For a wire segment:

$$
\tau_{RC} = \rho \cdot \frac{L}{A} \cdot \varepsilon_0 \cdot \kappa \cdot \frac{A_{cap}}{d}
$$

Where:

- $\rho$ = Resistivity ($\Omega \cdot \text{cm}$)
- $L$ = Wire length
- $A$ = Wire cross-sectional area
- $\varepsilon_0$ = Vacuum permittivity
- $\kappa$ = Dielectric constant
- $A_{cap}$ = Effective capacitor plate area
- $d$ = Dielectric thickness

### 3.2 Copper Resistivity Scaling

At nanoscale dimensions, resistivity increases due to surface and grain boundary scattering. A simple estimate for a wire of width $w$ and height $h$ is:

$$
\rho_{eff} = \rho_{bulk} \left(1 + \frac{\lambda}{w} + \frac{\lambda}{h}\right)
$$

Fuchs-Sondheimer model for surface scattering:

$$
\frac{\rho}{\rho_0} = 1 + \frac{3}{8}(1-p)\frac{\lambda}{d}
$$

Where:

- $\rho_0$ = Bulk resistivity
- $\lambda$ = Electron mean free path ($\approx 39 \text{ nm}$ for Cu at 300 K)
- $p$ = Specularity parameter (0 = diffuse, 1 = specular)
- $d$ = Film thickness

Mayadas-Shatzkes model for grain boundary scattering:

$$
\frac{\rho}{\rho_0} = \left[1 - \frac{3}{2}\alpha + 3\alpha^2 - 3\alpha^3 \ln\left(1 + \frac{1}{\alpha}\right)\right]^{-1}
$$

Where:

$$
\alpha = \frac{\lambda}{g} \cdot \frac{R_g}{1 - R_g}
$$

- $g$ = Average grain size
- $R_g$ = Grain boundary reflection coefficient

### 3.3 Electromigration

Black's equation for Mean Time To Failure (MTTF):

$$
\text{MTTF} = A \cdot j^{-n} \cdot \exp\left(\frac{E_a}{k_B T}\right)
$$

Where:

- $A$ = Constant (material/process dependent)
- $j$ = Current density ($\text{A/cm}^2$)
- $n$ = Current exponent ($\approx 1-2$)
- $E_a$ = Activation energy ($\approx 0.7-0.9 \text{ eV}$ for Cu)
- $k_B$ = Boltzmann constant ($8.617 \times 10^{-5} \text{ eV/K}$)
- $T$ = Temperature (K)

### 3.4 Capacitance Components

Total line capacitance:

$$
C_{total} = C_{line-to-line} + C_{line-to-ground} + C_{fringe}
$$

Parallel plate approximation:

$$
C = \varepsilon_0 \cdot \kappa \cdot \frac{A}{d}
$$

Where:

- $\varepsilon_0 = 8.854 \times 10^{-12} \text{ F/m}$
- $\kappa$ = Relative permittivity of dielectric
- $A$ = Plate area (here, not the wire cross-section)
- $d$ = Dielectric thickness

## 4. Dielectric Materials

### 4.1 Low-κ Dielectric Evolution

| Generation | Material | $\kappa$ Value | Notes |
|------------|----------|----------------|-------|
| Traditional | $\text{SiO}_2$ | 3.9-4.2 | Baseline |
| FSG | $\text{SiOF}$ | 3.5-3.7 | Fluorine-doped |
| Low-κ | $\text{SiCOH}$ | 2.7-3.0 | Carbon-doped oxide |
| ULK | Porous $\text{SiCOH}$ | 2.2-2.5 | Porosity-induced |
| ELK | Porous $\text{SiCOH}$ | < 2.2 | Extreme low-κ |
| Air Gap | Air ($\kappa = 1$) | ~1.5 effective | Selective dielectric removal |

### 4.2 Porosity and Dielectric Constant

For porous dielectrics, the effective $\kappa$ follows a linear mixing rule:

$$
\kappa_{eff} = \kappa_{matrix}(1 - P) + \kappa_{air} \cdot P
$$

Where:

- $P$ = Porosity fraction
- $\kappa_{air} = 1.0$

## 5. Advanced BEOL Challenges

### 5.1 Resistance Crisis at Advanced Nodes

```
Wire Width vs. Resistivity Increase
──────────────────────────────────────────────
Resistivity │
(μΩ·cm)     │                              ●
            │                          ●
10.0 ───────┼──────────────────────●───────
            │                  ●
            │              ●
 5.0 ───────┼──────────●───────────────────
            │      ●
            │   ●
 2.0 ───────┼●─────────────────────────────  ← Bulk Cu
            │
     ───────┴──────────────────────────────
             100    50    30    20   15   10
                     Wire Width (nm)
```

### 5.2 Alternative Metals Comparison

| Metal | Bulk ρ (μΩ·cm) | λ (nm) | Advantage at < 15 nm |
|-------|----------------|--------|----------------------|
| Cu | 1.68 | 39 | No (high scattering) |
| Co | 6.24 | 11.8 | Moderate |
| Ru | 7.1 | 6.6 | Yes (short $\lambda$) |
| Mo | 5.3 | 11.2 | Yes (refractory) |
| W | 5.3 | 15.5 | Moderate |

Crossover width estimate, obtained by equating $\rho_{eff}$ from Section 3.2 for Cu and the alternative metal with $h = w$:

$$
w_{crossover} \approx \frac{2\left(\rho_{Cu} \lambda_{Cu} - \rho_{alt} \lambda_{alt}\right)}{\rho_{alt} - \rho_{Cu}}
$$

Here $\rho_{Cu}$ and $\rho_{alt}$ are bulk resistivities. The estimate ignores the barrier and liner thickness that Cu requires, which shifts the practical crossover to larger widths.

## 6. Emerging BEOL Technologies

### 6.1 Backside Power Delivery Network (BSPDN)

- **Concept**: Move power rails ($V_{DD}$, $V_{SS}$) to wafer backside
- **Benefit**: Free up front-side routing resources
- **Implementation**: Requires wafer thinning to $\approx 500 \text{ nm}$

```
Traditional BEOL vs. BSPDN
────────────────────────────────────────────────────
 Signal + Power          Signal Only
┌─────────────┐        ┌─────────────┐
│  Metal 15   │        │  Metal 10   │
│      :      │        │      :      │
│  Metal 1    │        │  Metal 1    │
├─────────────┤        ├─────────────┤
│ Transistors │        │ Transistors │
└─────────────┘        ├─────────────┤
                       │  Nano-TSVs  │
                       ├─────────────┤
                       │ Power Rails │
                       └─────────────┘
```

### 6.2 Hybrid Bonding for 3D Integration

Inter-die connection density scales with the inverse square of the pitch:

$$
\text{Density} \propto \frac{1}{(\text{Pitch})^2}
$$

| Technology | Pitch (μm) | Density (connections/mm²) |
|------------|------------|---------------------------|
| Micro-bump | 40-55 | ~400 |
| Hybrid bonding | 0.9-10 | $10^4 - 10^6$ |

## 7. Process Control & Metrology

### 7.1 Key Measurements

- **Sheet Resistance** ($R_s$), for a film of thickness $t$:

$$
R_s = \frac{\rho}{t} \quad [\Omega/\square]
$$

- **Line Resistance**, for a line of length $L$ and width $W$:

$$
R_{line} = R_s \cdot \frac{L}{W}
$$

- **Via Resistance**, for a cylindrical via of height $h$ and radius $r$:

$$
R_{via} = R_c + \rho \cdot \frac{h}{\pi r^2}
$$

Where $R_c$ = contact resistance

### 7.2 Reliability Testing

- **Electromigration (EM)**: Accelerated at high $j$ and $T$
- **Stress Migration (SM)**: Void formation under mechanical stress
- **Time-Dependent Dielectric Breakdown (TDDB)**: Time to breakdown $t_{BD}$ falls with increasing field and temperature (E-model):

$$
t_{BD} \propto \exp\left(-\gamma E + \frac{E_a}{k_B T}\right)
$$

Where $\gamma$ = field acceleration factor and $E$ = electric field across the dielectric

## 8. FEOL vs. BEOL

| Aspect | FEOL | BEOL |
|--------|------|------|
| **Focus** | Transistors, active devices | Interconnects, wiring |
| **Materials** | Si, high-$\kappa$ oxides, metal gates | Cu, low-$\kappa$ dielectrics, barriers |
| **Max Temperature** | > 1000°C | < 400°C (Cu compatible) |
| **Key Metric** | $I_{on}/I_{off}$, $V_{th}$ | RC delay, $\rho_{eff}$ |
| **Scaling Challenge** | Leakage, short-channel effects | Resistance, reliability |
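As a worked illustration of three formulas from the BEOL entry above (the simple size-effect resistivity estimate, the porosity mixing rule, and Black's equation), the sketch below evaluates them numerically. Function names and default parameter values are assumptions chosen for illustration; the material constants are the ones quoted in the entry.

```python
import math

K_B_EV = 8.617e-5  # Boltzmann constant in eV/K, as quoted in the entry


def effective_resistivity(rho_bulk: float, mean_free_path_nm: float,
                          width_nm: float, height_nm: float) -> float:
    """rho_eff = rho_bulk * (1 + lambda/w + lambda/h), in the units of rho_bulk."""
    return rho_bulk * (1.0 + mean_free_path_nm / width_nm + mean_free_path_nm / height_nm)


def effective_kappa(kappa_matrix: float, porosity: float, kappa_air: float = 1.0) -> float:
    """Linear mixing rule: kappa_eff = kappa_matrix * (1 - P) + kappa_air * P."""
    return kappa_matrix * (1.0 - porosity) + kappa_air * porosity


def em_acceleration_factor(j_use: float, j_stress: float,
                           t_use_k: float, t_stress_k: float,
                           n: float = 2.0, e_a_ev: float = 0.8) -> float:
    """MTTF(use) / MTTF(stress) from Black's equation; the prefactor A cancels.

    The defaults for n and E_a are assumed values inside the ranges given in the entry.
    """
    current_term = (j_stress / j_use) ** n
    thermal_term = math.exp(e_a_ev / K_B_EV * (1.0 / t_use_k - 1.0 / t_stress_k))
    return current_term * thermal_term


# Cu and Ru at a 10 nm x 10 nm cross-section, using the bulk values from the table.
rho_cu_10nm = effective_resistivity(1.68, 39.0, 10.0, 10.0)
rho_ru_10nm = effective_resistivity(7.1, 6.6, 10.0, 10.0)

# A dielectric with matrix kappa 3.0 and 25% porosity.
kappa_porous = effective_kappa(3.0, 0.25)
```

Because the size-effect term scales with the mean free path, the gap between Cu and Ru narrows sharply as the cross-section shrinks, even though Ru has the higher bulk resistivity.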
Masked language model for understanding tasks.
BERT4Rec applies bidirectional transformer to sequential recommendation through masked item prediction.
Embedding-based similarity.
BERTScore computes semantic similarity between generated and reference text.
Use BERT embeddings to measure semantic similarity.
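The greedy-matching idea behind embedding-based scores of this kind can be sketched as follows. This is not the reference BERTScore implementation: the token vectors are tiny hypothetical embeddings rather than BERT outputs, and refinements such as IDF weighting and baseline rescaling are omitted.

```python
import math
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def cosine(u: Vector, v: Vector) -> float:
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def greedy_match_score(candidate: List[Vector], reference: List[Vector]) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) from greedy token matching.

    Precision: each candidate token is matched to its most similar reference token.
    Recall: each reference token is matched to its most similar candidate token.
    """
    precision = sum(max(cosine(c, r) for r in reference) for c in candidate) / len(candidate)
    recall = sum(max(cosine(r, c) for c in candidate) for r in reference) / len(reference)
    total = precision + recall
    f1 = 2.0 * precision * recall / total if total > 0 else 0.0
    return precision, recall, f1


# Hypothetical 2-D "embeddings": the candidate covers only one of two reference tokens.
reference_tokens = [[1.0, 0.0], [0.0, 1.0]]
candidate_tokens = [[1.0, 0.0]]
p, r, f1 = greedy_match_score(candidate_tokens, reference_tokens)
```

Matching on embedding similarity rather than exact token overlap is what lets such a score credit paraphrases that n-gram metrics would miss.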
External pre-release testing.
VAE with adjustable disentanglement.
Design for the typical case, not the worst case.
Angled wafer edge.
Rounded wafer edge prevents chipping.
Beyond-accuracy objectives include diversity, novelty, serendipity, and coverage in recommendation systems.
Technologies after conventional transistors.
BF16 has the same 8-bit exponent as FP32 but only 7 mantissa bits, so it keeps FP32's range with less precision. Less prone to overflow than FP16. Often preferred for training.
16-bit floating point format.
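The relationship between bfloat16 and FP32 can be illustrated by keeping only the top 16 bits of an FP32 value. The sketch below truncates (rounds toward zero) for simplicity, which is an assumption of this example; hardware and libraries typically round to nearest even. The function names are hypothetical.

```python
import struct


def to_bf16_bits(x: float) -> int:
    """Encode x as FP32 and keep the top 16 bits (sign, 8 exponent, 7 mantissa)."""
    (bits32,) = struct.unpack("<I", struct.pack("<f", x))
    return bits32 >> 16


def from_bf16_bits(bits16: int) -> float:
    """Decode 16 bfloat16 bits by zero-filling the dropped FP32 mantissa bits."""
    (x,) = struct.unpack("<f", struct.pack("<I", (bits16 & 0xFFFF) << 16))
    return x


one_bits = to_bf16_bits(1.0)                                  # same sign/exponent layout as FP32
lost_precision = from_bf16_bits(to_bf16_bits(1.00390625))     # 1 + 2**-8 is below bf16 resolution
large_value = from_bf16_bits(to_bf16_bits(1e20))              # far beyond FP16's maximum of 65504
```

The large value survives the round trip with only a small relative error, while the small increment on 1.0 is lost. That is the range-for-precision trade the format makes.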
Size of solder balls.
Spacing between balls.
BGA X-ray inspection detects solder joint defects such as voids, bridges, and insufficient wetting in ball grid array packages.
Encode query and documents separately.
Bi-encoders separately encode queries and documents for efficient similarity search.
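A minimal sketch of the bi-encoder pattern: documents are encoded once into an index, the query is encoded independently, and ranking is a dot product between the two. The `encode` function here is a hypothetical bag-of-words stand-in for a neural encoder, and `VOCAB`, `build_index`, and `search` are illustrative names.

```python
from typing import List, Tuple

# Hypothetical toy vocabulary; a real bi-encoder would use a learned neural encoder.
VOCAB = ["beam", "search", "wafer", "edge", "solder", "ball"]


def encode(text: str) -> List[float]:
    """Map text to an L2-normalized count vector over VOCAB."""
    tokens = text.lower().split()
    counts = [float(tokens.count(word)) for word in VOCAB]
    norm = sum(c * c for c in counts) ** 0.5
    return [c / norm for c in counts] if norm else counts


def build_index(docs: List[str]) -> List[Tuple[str, List[float]]]:
    """Encode every document once, independently of any query."""
    return [(doc, encode(doc)) for doc in docs]


def search(query: str, index: List[Tuple[str, List[float]]], top_k: int = 1) -> List[Tuple[str, float]]:
    """Encode the query on its own and rank documents by dot product."""
    q = encode(query)
    scored = [(sum(a * b for a, b in zip(q, vec)), doc) for doc, vec in index]
    scored.sort(key=lambda s: s[0], reverse=True)
    return [(doc, score) for score, doc in scored[:top_k]]


index = build_index([
    "beam search keeps hypotheses",
    "rounded wafer edge",
    "solder ball pitch",
])
hits = search("wafer edge", index)
```

Because document vectors never depend on the query, they can be precomputed and stored, which is where the efficiency of the approach comes from.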