model artifact management, mlops
Store and version model checkpoints.
288 technical terms and definitions
Model artifacts store trained models: weights, config, and metadata.
Average predictions from multiple models.
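A minimal sketch of prediction averaging, assuming each model is a callable returning a class-probability vector (the three lambda "models" are hypothetical stand-ins):

```python
# Sketch of ensemble averaging: each "model" maps an input to a list of
# class probabilities, and the ensemble output is their elementwise mean.
def ensemble_average(models, x):
    """Average the predicted probability vectors of several models."""
    preds = [m(x) for m in models]
    n = len(preds)
    return [sum(p[i] for p in preds) / n for i in range(len(preds[0]))]

# Usage with three hypothetical two-class classifiers:
m1 = lambda x: [0.9, 0.1]
m2 = lambda x: [0.6, 0.4]
m3 = lambda x: [0.3, 0.7]
result = ensemble_average([m1, m2, m3], None)  # roughly [0.6, 0.4]
```

Averaging reduces variance: individual models that err in different directions partially cancel out.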
Model cards document model details, performance, and limitations for transparency.
Documentation of model capabilities, limitations, and biases.
Model cards document model details: training, limitations, intended use. Responsible AI practice.
Model cards document model capabilities, limitations, intended use, bias. Standard for responsible release.
Standardized model documentation.
Exhaustively verify system properties.
Reduce model size for edge devices.
Reduce model size for phones and IoT devices.
Model compression reduces the size and computation of neural networks while maintaining performance.
Techniques to reduce model size (pruning, quantization, distillation).
Model conversion translates trained models between frameworks and formats for deployment.
Distinguish between competing models.
Distill into an interpretable student model.
Directly update model weights to fix specific factual errors or behaviors.
Model ensemble reinforcement learning trains multiple dynamics models to quantify uncertainty and improve decision-making robustness.
Model evaluation measures performance across metrics and test sets.
Steal model by querying it repeatedly.
Model extraction attacks replicate model functionality through black-box querying.
Model extraction steals model via queries. Build substitute model. Protect with rate limiting.
Model fingerprints identify models from behavior. Unique responses to probe inputs.
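A minimal sketch of behavioral fingerprinting, assuming a model is a callable mapping an input to a discrete output; the probe strings and toy models here are hypothetical:

```python
import hashlib

# Sketch of model fingerprinting: hash a model's responses to a fixed
# probe set, so behaviorally different models get different fingerprints.
def fingerprint(model, probes):
    """Return a short hash of the model's responses to the probes."""
    responses = "|".join(str(model(p)) for p in probes)
    return hashlib.sha256(responses.encode()).hexdigest()[:16]

probes = ["alpha", "beta", "gamma"]
model_a = lambda s: s.upper()     # toy stand-in models
model_b = lambda s: s[::-1]
fp_a = fingerprint(model_a, probes)
fp_b = fingerprint(model_b, probes)
```

The same model always yields the same fingerprint on a fixed probe set, which is what makes identification from behavior possible.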
Effective compute utilization.
Hugging Face Hub hosts open models and datasets. Download weights, run locally, fine-tune, share.
Reconstruct training data from model parameters or outputs.
Reconstruct training data from model.
Prevent reconstruction of training data.
Model inversion attacks reconstruct training data from model parameters or outputs.
Combine weights from multiple fine-tuned models to get benefits of both.
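A sketch of weight-space merging ("model soup"-style averaging), assuming the models share an architecture; weights are represented here as plain dicts of name to float list:

```python
# Sketch of model merging: elementwise weighted average of matching
# parameters across fine-tuned checkpoints with identical architecture.
def merge_weights(state_dicts, coeffs=None):
    """Average parameter tensors across several state dicts."""
    n = len(state_dicts)
    coeffs = coeffs or [1.0 / n] * n
    merged = {}
    for name in state_dicts[0]:
        merged[name] = [
            sum(c * sd[name][i] for c, sd in zip(coeffs, state_dicts))
            for i in range(len(state_dicts[0][name]))
        ]
    return merged

a = {"layer.w": [1.0, 2.0]}   # hypothetical fine-tuned checkpoints
b = {"layer.w": [3.0, 4.0]}
merged = merge_weights([a, b])  # {'layer.w': [2.0, 3.0]}
```

Unequal coefficients let you weight one fine-tune more heavily than the other.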
Track model performance metrics and detect degradation.
Techniques to split models across GPUs (tensor, pipeline, expert).
Split model layers across devices; each device holds a subset of the parameters.
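A toy sketch of the layer-partitioning idea: layers are plain callables and "devices" are just lists holding their share of layers; real pipeline parallelism adds microbatching and inter-device transfers:

```python
# Sketch of pipeline-style layer partitioning: split layers into
# contiguous chunks, one per device, and run them in sequence.
def partition(layers, n_devices):
    """Split layers into n_devices contiguous chunks."""
    k, r = divmod(len(layers), n_devices)
    chunks, start = [], 0
    for d in range(n_devices):
        size = k + (1 if d < r else 0)
        chunks.append(layers[start:start + size])
        start += size
    return chunks

def forward(chunks, x):
    # Activations flow from device to device, like a pipeline.
    for device_layers in chunks:
        for layer in device_layers:
            x = layer(x)
    return x

layers = [lambda x: x + 1, lambda x: x * 2, lambda x: x - 3, lambda x: x * x]
chunks = partition(layers, 2)   # two "devices", two layers each
```

Each device stores only its chunk's parameters, which is what makes models larger than one GPU's memory feasible.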
Use predictive models for process control.
Model predictive control optimizes future actions using process models and constraints.
Optimize actions using predictive model.
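A toy MPC sketch under strong assumptions: known 1-D integrator dynamics, a small discrete action set, and brute-force search over short action sequences; real MPC solves a constrained optimization and replans every step:

```python
import itertools

# Sketch of model predictive control: simulate every short action
# sequence with the dynamics model, score it against a target, and
# execute only the first action of the best sequence.
def mpc_action(state, horizon=3, actions=(-1.0, 0.0, 1.0), target=0.0):
    """Pick the first action of the lowest-cost sequence over the horizon."""
    def dynamics(s, a):
        return s + a  # assumed toy integrator dynamics

    best_cost, best_first = float("inf"), 0.0
    for seq in itertools.product(actions, repeat=horizon):
        s, cost = state, 0.0
        for a in seq:
            s = dynamics(s, a)
            cost += (s - target) ** 2  # quadratic tracking cost
        if cost < best_cost:
            best_cost, best_first = cost, seq[0]
    return best_first

first = mpc_action(2.0)  # -1.0: steer toward the target
```

After executing the chosen action, the controller observes the new state and re-solves, which is what gives MPC its feedback behavior.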
Central repository for storing and versioning trained models.
Model registry versions and stages models. MLflow, W&B, SageMaker.
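A minimal in-memory sketch of the registry idea: versions per model name plus stage transitions. The model name, URI, and stage labels are illustrative; real registries (MLflow, W&B, SageMaker) persist this durably and track lineage:

```python
# Sketch of a model registry: auto-incremented versions per model name,
# each carrying a storage URI and a lifecycle stage.
class ModelRegistry:
    def __init__(self):
        self.models = {}  # name -> list of {"version", "uri", "stage"}

    def register(self, name, uri):
        """Add a new version of a model; versions start at 1."""
        versions = self.models.setdefault(name, [])
        entry = {"version": len(versions) + 1, "uri": uri, "stage": "None"}
        versions.append(entry)
        return entry["version"]

    def transition(self, name, version, stage):
        """Move a version to a lifecycle stage (e.g. Staging, Production)."""
        self.models[name][version - 1]["stage"] = stage

    def latest(self, name, stage="Production"):
        """Return the newest version in the given stage, or None."""
        hits = [m for m in self.models[name] if m["stage"] == stage]
        return hits[-1] if hits else None

reg = ModelRegistry()
v1 = reg.register("fraud-detector", "s3://bucket/model-v1")  # hypothetical
reg.transition("fraud-detector", v1, "Production")
```

Serving code then asks the registry for the Production version instead of hard-coding a path, so promotions and rollbacks are config changes, not deploys.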
Periodically retrain model on fresh data to maintain performance.
Model routing directs requests to appropriate models based on query characteristics.
Model servers (vLLM, TGI, Triton) host models for inference. Handle batching, scaling, API.
Infrastructure for deploying models (Seldon, KServe, BentoML).
Infrastructure to deploy models and handle inference requests.
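A toy sketch of the dynamic batching that model servers perform: queued requests are drained and answered in one batched model call. The `double` "model" is a stand-in for a real batched forward pass:

```python
# Sketch of server-side dynamic batching: collect up to max_batch queued
# requests and run a single batched inference call for all of them.
def batched_infer(model, queue, max_batch=8):
    """Drain up to max_batch requests and answer them in one model call."""
    batch = queue[:max_batch]
    del queue[:max_batch]
    outputs = model(batch)  # one forward pass for the whole batch
    return list(zip(batch, outputs))

double = lambda xs: [x * 2 for x in xs]  # hypothetical batched model
queue = [1, 2, 3]
answers = batched_infer(double, queue)  # [(1, 2), (2, 4), (3, 6)]
```

Batching amortizes per-call overhead and keeps the accelerator busy, which is why servers like vLLM and Triton do it continuously.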
Disk space required to store model weights.
Average fine-tuned models.
Replicate model by querying.
Connect different model parts.
Model extraction attacks steal model via API queries. Protect with rate limits, output perturbation, watermarks.
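One of those defenses, rate limiting, can be sketched as a per-client token bucket; the rate and burst parameters here are illustrative:

```python
import time

# Sketch of a token-bucket rate limiter: tokens refill continuously at
# `rate` per second up to `capacity`; each allowed query spends one.
class TokenBucket:
    def __init__(self, rate, capacity):
        self.rate, self.capacity = rate, capacity
        self.tokens, self.last = capacity, time.monotonic()

    def allow(self):
        """Return True if a query may proceed, spending one token."""
        now = time.monotonic()
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

bucket = TokenBucket(rate=5, capacity=2)   # 5 queries/sec, burst of 2
allowed = [bucket.allow() for _ in range(4)]  # burst passes, rest throttled
```

Capping query throughput raises the cost of building a substitute model, though determined attackers spread queries across clients, so it pairs with output perturbation and watermarking.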
Verify model hasn't been tampered with.