Mechanism

Greedy decoding selects the highest-probability token at each step, which makes its output deterministic.

Mechanism: At each position, take the argmax over the vocabulary, feed the selected token back in as the next input, and repeat until an end token is produced or the maximum length is reached.

Advantages: Fast (one forward pass per generated token, with no extra candidates to score), deterministic and reproducible, simple to implement, and no sampling hyperparameters to tune.

Limitations: It cannot recover from early mistakes because there is no backtracking. It can miss sequences with higher overall probability, since the locally best token does not always lead to the best full sequence. It often falls into repetitive loops (the "the the the" trap), and its output lacks diversity.

When appropriate: Factual QA where diversity is harmful, code completion where correctness is critical, structured outputs with clear answers, and benchmarking or evaluation that needs reproducibility.

When to avoid: Creative writing, open-ended chat, and other tasks that need variety.

Repetition problem: Greedy decoding often gets stuck in loops. Common mitigations are a repetition penalty or n-gram blocking.

Comparison: Beam search explores multiple paths and sampling adds randomness. Both generally produce better text for open-ended generative tasks, while greedy decoding remains useful for applications that specifically need deterministic output.
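The mechanism described above can be sketched in a few lines of Python. This is a minimal illustration, not a production implementation: `greedy_decode`, `toy_model`, `looping_model`, and the score tables are hypothetical names and made-up values, standing in for a real language model that returns one score per vocabulary token.

```python
from typing import Callable, List, Sequence


def greedy_decode(
    next_token_scores: Callable[[Sequence[int]], List[float]],
    prompt: Sequence[int],
    eos_id: int,
    max_new_tokens: int = 20,
) -> List[int]:
    """Repeatedly append the highest-scoring token until EOS or the length cap."""
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        scores = next_token_scores(tokens)
        # argmax over the vocabulary: index of the largest score
        next_id = max(range(len(scores)), key=scores.__getitem__)
        tokens.append(next_id)
        if next_id == eos_id:
            break
    return tokens


# Hypothetical toy "model": scores depend only on the last token.
# Vocabulary ids are 0..3, and id 3 plays the role of the end token.
TOY_SCORES = {
    0: [0.1, 0.6, 0.2, 0.1],
    1: [0.1, 0.1, 0.7, 0.1],
    2: [0.1, 0.1, 0.1, 0.7],
    3: [0.25, 0.25, 0.25, 0.25],
}

# A second toy table in which tokens 1 and 2 keep pointing at each other,
# so the end token is never the top choice.
LOOP_SCORES = {
    0: [0.1, 0.6, 0.2, 0.1],
    1: [0.1, 0.1, 0.7, 0.1],
    2: [0.1, 0.7, 0.1, 0.1],
}


def toy_model(tokens: Sequence[int]) -> List[float]:
    return TOY_SCORES[tokens[-1]]


def looping_model(tokens: Sequence[int]) -> List[float]:
    return LOOP_SCORES[tokens[-1]]


result = greedy_decode(toy_model, prompt=[0], eos_id=3)
looped = greedy_decode(looping_model, prompt=[0], eos_id=3, max_new_tokens=6)
print(result)  # [0, 1, 2, 3]
print(looped)  # [0, 1, 2, 1, 2, 1, 2]
```

The first call stops as soon as the end token is selected. The second shows the repetition problem in miniature: with no backtracking and no penalty, the decoder alternates between two tokens until the length cap cuts it off.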
