
AI Factory Glossary

107 technical terms and definitions


isotonic regression, ai safety

**Isotonic Regression** is a non-parametric calibration technique that fits a monotonically non-decreasing step function to map a model's raw prediction scores to calibrated probabilities, without assuming any specific functional form for the calibration mapping. The method partitions the score range into bins where the calibrated probability within each bin equals the empirical accuracy, subject to the constraint that the mapping is monotonically non-decreasing.

**Why Isotonic Regression Matters in AI/ML:** Isotonic regression provides **flexible, assumption-free calibration** that can correct arbitrary distortions in a model's probability estimates — including non-linear miscalibration patterns that parametric methods like Platt scaling cannot capture.

- **Non-parametric flexibility** — Unlike Platt scaling (which assumes a sigmoid calibration curve), isotonic regression makes no assumptions about the shape of the miscalibration; it can correct S-shaped, concave, step-wise, or arbitrarily distorted probability mappings.
- **Monotonicity constraint** — The only assumption is that higher model scores should correspond to higher true probabilities; this minimal constraint preserves the model's ranking while adjusting the probability magnitudes.
- **Pool Adjacent Violators (PAV) algorithm** — Isotonic regression is solved efficiently by the PAV algorithm: scores are sorted, and whenever the monotonicity constraint is violated (a higher score has lower observed accuracy), the violating groups are merged and their probabilities averaged.
- **Calibration quality** — With sufficient data, isotonic regression achieves better calibration than Platt scaling because it can model complex miscalibration patterns; however, it requires more calibration data (roughly 5,000-10,000 examples) to avoid overfitting.
- **Step function output** — The calibrated mapping is a step function with as many steps as distinct score-accuracy groups; for smooth probabilities, the output can be further smoothed with interpolation.

| Property | Isotonic Regression | Platt Scaling |
|----------|---------------------|---------------|
| Parametric | No (non-parametric) | Yes (2 parameters) |
| Flexibility | Arbitrary monotone mapping | Sigmoid only |
| Data requirements | 5,000-10,000 examples | 1,000-5,000 examples |
| Overfitting risk | Higher (with small data) | Lower (constrained) |
| Calibration quality | Better (with enough data) | Good (if sigmoid appropriate) |
| Output shape | Step function | Smooth sigmoid |
| Multiclass extension | One-vs-rest | Temperature scaling |

**Isotonic regression is the most flexible post-hoc calibration technique available, providing non-parametric, assumption-free correction of arbitrary probability miscalibration patterns while preserving the model's ranking, making it the preferred calibration method when sufficient validation data is available and the miscalibration pattern is complex or unknown.**
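The PAV procedure described above can be sketched in a few lines of pure Python. This is a minimal illustration (labels assumed already sorted by model score, unit weights); in practice `sklearn.isotonic.IsotonicRegression` provides a tested implementation of the same idea:

```python
def pav(labels):
    """Pool Adjacent Violators: fit a non-decreasing sequence to `labels`,
    which must already be sorted by the model's raw score."""
    # Each block holds (mean value, total weight, count of original points).
    blocks = []
    for y in labels:
        blocks.append((float(y), 1.0, 1))
        # Merge backwards while a block violates monotonicity.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            v2, w2, c2 = blocks.pop()
            v1, w1, c1 = blocks.pop()
            w = w1 + w2
            blocks.append(((v1 * w1 + v2 * w2) / w, w, c1 + c2))
    # Expand blocks back to one calibrated value per input point.
    out = []
    for v, _, c in blocks:
        out.extend([v] * c)
    return out

# Binary labels sorted by increasing model score; the fitted values are
# the calibrated probabilities for those score groups.
calibrated = pav([0, 0, 1, 0, 1, 1])  # -> [0.0, 0.0, 0.5, 0.5, 1.0, 1.0]
```

Note how the violating pair (label 1 followed by label 0) is pooled into a single block with the averaged probability 0.5, which is exactly the merge step described in the PAV bullet above.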

issue triaging, code ai

**Issue Triaging** is the **code AI task of automatically classifying, prioritizing, assigning, and de-duplicating bug reports and feature requests in software issue trackers** — enabling development teams to process incoming GitHub Issues, Jira tickets, and Bugzilla reports at scale without the triaging bottleneck that delays critical bug fixes, causes duplicate work, and leaves important user feedback unaddressed.

**What Is Issue Triaging?**
- **Input**: Issue title, description body, labels, reporter information, linked code references, and similar existing issues.
- **Triage Actions**:
  - **Classification**: Bug vs. feature request vs. documentation vs. question vs. enhancement.
  - **Priority Assignment**: Critical / High / Medium / Low based on impact and urgency.
  - **Component Assignment**: Which team, repository, or subsystem owns this issue.
  - **Duplicate Detection**: Does this issue already exist under a different title?
  - **Assignee Recommendation**: Which developer has the relevant expertise and capacity?
  - **Label Application**: Apply standardized labels from the project taxonomy.
  - **Status Routing**: Close as "won't fix," "needs more info," or move to sprint planning.
- **Key Benchmarks**: GHTorrent (GitHub archive), Bugzilla DBs (Mozilla, Eclipse, NetBeans), GitHub Issues corpora, DeepTriage (Microsoft).

**The Triaging Scale Problem**

At scale, issue triaging is a significant operational burden:
- VS Code: ~5,000 new GitHub issues/month; 180,000+ total open/closed issues.
- Linux Kernel: ~15,000 bug reports/year across multiple subsystems.
- Android AOSP: ~50,000+ issues tracked across hundreds of components.

Manual triaging requires a dedicated team of engineers who could otherwise be writing code. Microsoft has reported that automated triage for VS Code reduces manual triaging effort by 60%.

**Technical Tasks in Detail**

**Bug Report Classification**:
- Fine-tuned BERT/RoBERTa on labeled issue datasets.
- Accuracy ~88-92% for binary bug/not-bug classification.
- Harder: 7-class granular classification (performance, crash, security, UI, documentation, etc.) achieves ~72-80%.

**Duplicate Issue Detection**:
- Semantic similarity between a new issue and all existing open issues.
- Siamese-network or bi-encoder models comparing issue titles and bodies.
- Challenge: "App crashes when clicking back button" and "SegFault on navigation back gesture" are duplicates despite zero lexical overlap.
- Best models achieve ~85% precision@5 for duplicate retrieval.

**Priority Prediction**:
- Regress or classify priority from issue text features + reporter history + code component affected.
- Imbalanced task: most issues are medium priority; critical bugs are rare.
- Microsoft DeepTriage: 85% accuracy on 3-class priority with bug-specific features.

**Assignee Recommendation**:
- Predict which developer on the team should fix a given bug based on code ownership, expertise profile, and recent contribution history.
- Hybrid: text similarity to past issues + code file ownership graph + developer workload.
- Accuracy: ~70-78% for top-3 assignee recommendation on established projects.

**Why Issue Triaging Matters**
- **Developer Productivity**: Developers interrupted by triage duties repeatedly lose flow state. Automated first-pass triage lets human reviewers focus only on edge cases requiring judgment.
- **SLA Compliance**: Enterprise software support contracts define response-time SLAs by severity. Automated severity classification ensures SLA routing happens immediately on ticket creation.
- **Community Health**: Open source projects with slow issue response (weeks to triage) lose contributor trust. Automated triage plus quick acknowledgment improves community satisfaction.
- **Security Vulnerability Identification**: Automatically detecting security-related issues (crash reports that may indicate exploitable bugs, authentication-related failures) enables faster escalation to security teams.
- **Product Roadmap Signal**: Aggregating and classifying thousands of feature requests enables data-driven prioritization of development roadmap items based on frequency and user impact.

Issue Triaging is **the intelligent inbox for software development** — automatically classifying, prioritizing, routing, and deduplicating the continuous stream of user-reported bugs and feature requests that would otherwise overwhelm development teams, ensuring that critical issues reach the right engineers immediately while noise and duplicates are filtered efficiently.
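A first-pass duplicate detector can be sketched with bag-of-words cosine similarity in pure Python (function names here are illustrative, not from any triage tool). This lexical baseline misses exactly the hard case noted above, zero-overlap duplicates, which is why production systems use learned bi-encoder embeddings instead:

```python
import math
from collections import Counter

def tokenize(text):
    return text.lower().split()

def cosine(a, b):
    """Cosine similarity between two token lists, via term counts."""
    ca, cb = Counter(a), Counter(b)
    dot = sum(ca[t] * cb[t] for t in ca)
    norm = (math.sqrt(sum(v * v for v in ca.values()))
            * math.sqrt(sum(v * v for v in cb.values())))
    return dot / norm if norm else 0.0

def top_duplicates(new_issue, existing_issues, k=5):
    """Return up to k existing issues ranked by similarity to the new one,
    dropping issues with no overlap at all."""
    q = tokenize(new_issue)
    scored = sorted(
        ((cosine(q, tokenize(e)), e) for e in existing_issues),
        reverse=True,
    )
    return [issue for score, issue in scored[:k] if score > 0]

existing = [
    "App crashes when clicking back button",
    "Feature request: dark mode for editor",
]
print(top_duplicates("Crash after clicking the back button", existing))
```

A bi-encoder setup keeps the same retrieve-and-rank structure; only `cosine` over token counts is replaced by cosine over learned sentence embeddings, precomputed for the existing issues.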

iterated amplification, ai safety

**Iterated Amplification** is an **AI alignment technique that bootstraps human oversight by iteratively using AI assistance to solve increasingly complex evaluation tasks** — starting with problems humans can evaluate directly, then using AI-assisted humans to evaluate slightly harder problems, and continuing to expand the frontier of evaluable tasks.

**Amplification Process**
- **Base Case**: Human evaluates simple AI outputs directly — standard RLHF.
- **Amplification Step**: For harder tasks, decompose into sub-problems that a human-with-AI-assistant can evaluate.
- **Iteration**: The AI assistant itself was trained using the previous round's amplified evaluator.
- **Distillation**: Train a new model to mimic the amplified evaluator — producing a standalone, efficient model.

**Why It Matters**
- **Scalable Oversight**: Enables evaluation of AI outputs that are too complex for unaided human judgment.
- **Alignment Path**: Provides a concrete path to aligning superhuman AI — evaluation capability grows with AI capability.
- **Decomposition**: Complex tasks are decomposed into human-manageable sub-problems — divide and conquer for alignment.

**Iterated Amplification** is **growing the evaluator alongside the AI** — bootstrapping human oversight to keep pace with increasingly capable AI systems.
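The decompose-evaluate-combine loop can be illustrated with a deliberately simple toy in Python: judging a "hard" task (here, just summing a long list) by recursive decomposition down to a base case small enough for direct human judgment. All names are illustrative stand-ins, not from any alignment library:

```python
def human_judge(task):
    # Base case: a subproblem small enough for a human to evaluate directly.
    return sum(task)

def amplified_evaluate(task, base_size=2):
    """Amplification step: split a task too hard to judge directly into
    subproblems, evaluate each recursively (standing in for the
    human-with-AI-assistant), then combine the sub-judgments."""
    if len(task) <= base_size:
        return human_judge(task)
    mid = len(task) // 2
    return amplified_evaluate(task[:mid]) + amplified_evaluate(task[mid:])

print(amplified_evaluate(list(range(100))))  # -> 4950
```

The distillation step would then train a model to reproduce `amplified_evaluate`'s judgments in a single forward pass, so the expensive recursive tree is not needed at inference time.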

iterated amplification, ai safety

**Iterated Amplification** is **an alignment approach where hard tasks are recursively decomposed into easier subproblems humans can supervise** - a core method in scalable-oversight research.

**What Is Iterated Amplification?**
- **Definition**: An alignment approach where hard tasks are recursively decomposed into easier subproblems humans can supervise.
- **Core Mechanism**: Human-model collaboration expands effective oversight by chaining simpler, individually evaluable steps.
- **Operational Scope**: Applied in AI safety engineering and alignment research to supervise tasks whose final outputs humans cannot judge directly.
- **Failure Modes**: Poor decomposition quality can propagate early mistakes into final judgments.

**Why Iterated Amplification Matters**
- **Scalable Oversight**: Oversight capacity grows with model capability instead of being capped by unaided human judgment.
- **Error Control**: Decomposition makes each step individually checkable, reducing the chance that subtle errors hide inside one monolithic answer.
- **Efficiency**: Distilling the amplified evaluator into a standalone model keeps inference cheap after training.
- **Generality**: The decompose-evaluate-distill loop transfers across task domains.

**How It Is Used in Practice**
- **Method Selection**: Choose amplification when direct human evaluation of final outputs is infeasible but subproblem evaluation is not.
- **Calibration**: Validate decomposition trees and include cross-check mechanisms between branches.
- **Validation**: Track agreement between amplified judgments and ground truth on tasks where ground truth is available.

Iterated Amplification is **a path toward supervising complex reasoning beyond direct human capacity** - growing the overseer together with the system it oversees.

iteration / step, model training

An iteration or step is one update of model weights after processing one batch, the atomic unit of training.

- **Definition**: Forward pass on a batch, compute loss, backward pass, optimizer step = one iteration.
- **Relationship to epochs**: steps_per_epoch = dataset_size / batch_size. Total steps = epochs × steps_per_epoch.
- **LLM training**: Progress is often measured in steps rather than epochs; large models train for up to millions of steps.
- **What happens each step**: Load batch, forward pass, compute loss, backward pass (gradients), optimizer update, optional logging.
- **With gradient accumulation**: A logical step may span multiple forward-backward passes before the optimizer update.
- **Logging frequency**: Log every N steps (e.g., 100). Too frequent is expensive; too infrequent misses issues.
- **Checkpointing**: Save the model every N steps or epochs, balancing safety against storage.
- **Learning rate per step**: Most schedulers update the LR per step, not per epoch, for smoother adaptation.
- **Steps vs. samples**: Progress is sometimes reported in samples (steps × batch size) for comparisons across batch sizes.
- **Progress tracking**: Steps measure progress independently of dataset size; epochs depend on dataset size.
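The step accounting and the forward-loss-backward-update cycle can be sketched with a toy one-parameter model in pure Python (the quadratic loss, learning rate, and dataset sizes are all illustrative):

```python
def train_step(w, lr=0.1):
    """One iteration for the toy loss(w) = (w - 3)^2: the 'backward pass'
    computes the gradient, the 'optimizer step' applies plain SGD."""
    grad = 2.0 * (w - 3.0)   # backward pass: dLoss/dw
    return w - lr * grad     # optimizer update

dataset_size, batch_size, epochs = 10_000, 32, 3
steps_per_epoch = dataset_size // batch_size   # 312
total_steps = epochs * steps_per_epoch         # 936

w = 0.0
for step in range(total_steps):
    w = train_step(w)        # one batch -> one weight update
# After enough steps, w converges to the loss minimum at 3.0.
```

With gradient accumulation over `k` micro-batches, the loop body would instead sum `k` gradients before a single `w` update, so the logical step count drops by a factor of `k` while the number of forward-backward passes stays the same.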

iterative magnitude pruning, model optimization

**Iterative Magnitude Pruning (IMP)** is the **standard algorithm for finding Lottery Tickets** — repeatedly cycling through training, pruning the smallest weights, and rewinding to the original initialization until the desired sparsity is reached.

**What Is IMP?**
- **Algorithm**:
  1. Initialize the network with $\theta_0$.
  2. Train to convergence -> $\theta_T$.
  3. Prune the bottom $p\%$ of weights by magnitude.
  4. Reset surviving weights to $\theta_0$ (or $\theta_k$ for Late Rewinding).
  5. Repeat from step 2 until target sparsity.
- **Cost**: Very expensive. Requires full training $N$ times for $N$ pruning rounds.

**Why It Matters**
- **Gold Standard**: The definitive method for finding winning tickets (used to benchmark other methods).
- **Trade-off**: Achieves the best accuracy at high sparsity, but at extreme computational cost.
- **Research Driver**: The high cost of IMP motivates research into cheaper ticket-finding methods.

**Iterative Magnitude Pruning** is **the brute-force search for the essential network** — expensive but proven to find the sparsest accurate sub-networks.
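The loop above can be sketched in pure Python over a flat weight vector. `train` here is a stand-in that just assigns random trained magnitudes; in a real setting it would be full training to convergence, which is exactly why IMP is expensive:

```python
import random

def train(theta0, mask):
    # Stand-in for "train to convergence": random magnitudes for
    # surviving weights, zeros for pruned positions.
    return [random.uniform(-1.0, 1.0) if m else 0.0 for m in mask]

def iterative_magnitude_prune(theta0, rounds=3, prune_frac=0.2):
    mask = [True] * len(theta0)
    for _ in range(rounds):
        trained = train(theta0, mask)                       # step 2: train
        alive = sorted((abs(w), i) for i, w in enumerate(trained) if mask[i])
        for _, i in alive[:int(len(alive) * prune_frac)]:   # step 3: prune p%
            mask[i] = False
        # Step 4 (the rewind): survivors reset to theta0, which happens
        # here by reusing theta0 on the next train() call.
    winning_ticket = [w if m else 0.0 for w, m in zip(theta0, mask)]
    return mask, winning_ticket

mask, ticket = iterative_magnitude_prune([0.5] * 100)
print(sum(mask))  # 100 -> 80 -> 64 -> 52 surviving weights
```

Note the compounding: pruning 20% of the *remaining* weights per round gives 52% density after three rounds, not 40%, which is why reaching very high sparsity takes many rounds.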

iterative pruning, model optimization

**Iterative Pruning** is **a staged pruning process that alternates parameter removal and recovery training** - it preserves performance better than aggressive one-pass sparsification.

**What Is Iterative Pruning?**
- **Definition**: A staged pruning process that alternates parameter removal and recovery training.
- **Core Mechanism**: Small pruning increments are applied over multiple cycles, with fine-tuning between steps so the network can recover from each removal.
- **Operational Scope**: Applied in model-optimization workflows where high sparsity is needed without the accuracy collapse of one-shot pruning.
- **Failure Modes**: Too many cycles increase training cost with diminishing accuracy gains.

**Why Iterative Pruning Matters**
- **Accuracy Retention**: Gradual removal with recovery training reaches sparsity levels at which one-shot pruning would severely degrade accuracy.
- **Risk Management**: Small increments make regressions visible early, so a failing schedule can be stopped before the model is damaged.
- **Operational Efficiency**: The resulting sparse model cuts inference latency, memory, and energy use.
- **Scalable Deployment**: The cycle count and prune ratio can be tuned per hardware target and accuracy budget.

**How It Is Used in Practice**
- **Method Selection**: Prefer iterative over one-shot pruning when latency targets, memory budgets, and acceptable accuracy tradeoffs demand high sparsity.
- **Calibration**: Set cycle count and prune ratio per cycle based on accuracy recovery curves.
- **Validation**: Track accuracy, latency, memory, and energy metrics through recurring controlled evaluations.

Iterative Pruning is **a robust strategy for high-sparsity targets with controlled risk** - gradual sparsification that gives the model time to recover.
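One common calibration question, how much to prune per cycle to hit a target sparsity, follows directly from the compounding arithmetic: if each cycle removes a fraction p of the *remaining* weights, the final density is (1 - p)^cycles. A small Python helper (the function name is illustrative):

```python
def per_cycle_prune_ratio(target_sparsity, cycles):
    """Fraction of remaining weights to prune each cycle so that
    `cycles` rounds reach `target_sparsity` overall."""
    final_density = 1.0 - target_sparsity
    return 1.0 - final_density ** (1.0 / cycles)

# Reaching 90% sparsity in 5 cycles needs ~36.9% pruned per cycle,
# well below the 90% a one-shot prune would remove at once.
p = per_cycle_prune_ratio(0.90, 5)
```

In practice the schedule derived this way is a starting point; the per-cycle ratio is then adjusted against the accuracy recovery curves mentioned above.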