Magnitude Pruning

Magnitude pruning removes the weights with the smallest absolute values, on the assumption that low-magnitude weights contribute least to the model's output. Saliency-based methods additionally use gradient information to make better-informed pruning decisions.

Magnitude pruning: rank weights by |w| and remove the lowest percentile; simple and surprisingly effective.
Intuition: small weights have a small effect on the output, so removing them causes minimal accuracy loss.
Iteration: alternate pruning and retraining (remove weights, fine-tune the remaining ones, repeat); gradual pruning generally outperforms one-shot pruning to the same final sparsity.
Saliency metrics: combine magnitude with gradient information, for example the first-order Taylor score |w × ∂L/∂w|, Fisher pruning (which scores weights using squared gradients as an approximation of the Fisher information), or second-order, Hessian-based methods.
Movement pruning: during fine-tuning, remove weights that are moving toward zero rather than weights that are merely small; this captures training dynamics.
Structured versus unstructured: the magnitude criterion applies to individual weights (unstructured) or, through a norm over the group, to entire filters or attention heads (structured); structured pruning shrinks the dense computation, so it gives an actual speedup on standard hardware.
Lottery ticket hypothesis: dense networks contain sparse subnetworks that, trained from their original initialization, can reach the accuracy of the full network; iterative magnitude pruning is the usual way to identify these winning tickets.
Sparsity targets: 80-95% sparsity is often achievable with minimal accuracy loss, but the achievable level depends on the model and task.
Hardware support: sparse tensor cores (NVIDIA Ampere and later) accelerate 2:4 fine-grained structured sparsity; unstructured sparsity needs high sparsity levels and sparse kernels before it pays off.
Global versus local: prune globally (all layers compete under one threshold) or locally (per-layer quotas); global pruning is typically better but may empty some layers.
Retraining: post-pruning fine-tuning is essential for recovering accuracy.

Magnitude pruning is a foundational technique for model compression.
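The ranking step, the global-versus-local choice, and the difference between a magnitude score and a gradient-aware score can be illustrated with a small sketch. This is a minimal pure-Python illustration, not a reference implementation: the function names (`prune_mask`, `local_prune`, `global_prune`, `taylor_scores`), the toy layer names, and the weight and gradient values are all hypothetical, and a real model would use tensors and a framework's pruning utilities.

```python
from typing import Dict, List


def prune_mask(scores: List[float], sparsity: float) -> List[bool]:
    """Keep-mask that drops the `sparsity` fraction of entries with the lowest scores."""
    n_prune = int(len(scores) * sparsity)
    # Indices in ascending score order; the first n_prune are removed.
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    drop = set(order[:n_prune])
    return [i not in drop for i in range(len(scores))]


def magnitude_scores(weights: List[float]) -> List[float]:
    """Plain magnitude criterion: |w|."""
    return [abs(w) for w in weights]


def taylor_scores(weights: List[float], grads: List[float]) -> List[float]:
    """First-order Taylor saliency: |w * dL/dw| (gradients are assumed given)."""
    return [abs(w * g) for w, g in zip(weights, grads)]


def local_prune(layers: Dict[str, List[float]], sparsity: float) -> Dict[str, List[bool]]:
    """Local pruning: every layer loses the same fraction of its own weights."""
    return {name: prune_mask(magnitude_scores(w), sparsity) for name, w in layers.items()}


def global_prune(layers: Dict[str, List[float]], sparsity: float) -> Dict[str, List[bool]]:
    """Global pruning: all weights compete in a single ranking."""
    names = list(layers)
    flat = [abs(w) for name in names for w in layers[name]]
    flat_mask = prune_mask(flat, sparsity)
    masks, start = {}, 0
    for name in names:
        size = len(layers[name])
        masks[name] = flat_mask[start:start + size]
        start += size
    return masks


# Toy example (hypothetical values): fc2 has much smaller weights than fc1.
layers = {
    "fc1": [0.9, -0.8, 0.7, -0.6],
    "fc2": [0.05, -0.04, 0.03, 0.5],
}
local_masks = local_prune(layers, 0.5)
global_masks = global_prune(layers, 0.5)

# A small weight with a large gradient versus a large weight with a tiny gradient.
w, g = [0.1, 2.0], [5.0, 0.01]
by_magnitude = prune_mask(magnitude_scores(w), 0.5)
by_taylor = prune_mask(taylor_scores(w, g), 0.5)
```

In this toy setup, local pruning keeps half of each layer, while global pruning at the same overall sparsity removes every weight in `fc2`, which is the "may empty some layers" risk. The last two masks disagree about which weight to drop, showing how a gradient-aware score can overturn a purely magnitude-based ranking.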
