
Perplexity and Cross-Entropy Loss

Cross-Entropy Loss Explained

Cross-entropy loss is the primary training objective for language models. It measures how well the model's predicted probability distribution matches the actual next token.

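Mathematical Definition

For a sequence of N tokens with predicted probabilities P and true labels y:

$$ Loss = -\frac{1}{N} \sum_{i=1}^{N} \log P(y_i \mid x_{<i}) $$

Here $x_{<i}$ denotes the tokens before position i, and log is the natural logarithm.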

Lower loss means the model assigns higher probability to correct tokens.
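As a minimal sketch of the definition above, the loss can be computed directly from the probabilities the model assigned to each correct token. The function name `cross_entropy` and the probability values are illustrative assumptions, not part of any particular library.

```python
import math

def cross_entropy(correct_token_probs):
    """Mean negative natural-log probability of the correct tokens."""
    n = len(correct_token_probs)
    return -sum(math.log(p) for p in correct_token_probs) / n

# Hypothetical probabilities assigned to the correct token at each position.
confident = cross_entropy([0.9, 0.8, 0.95])
unsure = cross_entropy([0.2, 0.1, 0.3])

# Higher probability on the correct tokens gives the lower loss.
print(confident < unsure)
```

The sketch assumes every probability is strictly positive; a probability of exactly zero on a correct token would make the loss infinite.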

Example

Suppose the model predicts a probability distribution over the next word, and the correct next word is "the". If "the" was predicted with 90% probability, the loss for that token is -log(0.9) ≈ 0.105.
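The per-token arithmetic can be checked with a short sketch. The two distributions below are hypothetical values chosen for illustration; they contrast a confident prediction of "the" with an unconfident one.

```python
import math

# Hypothetical next-token distributions (illustrative values only).
confident_prediction = {"the": 0.90, "a": 0.06, "cat": 0.04}
unsure_prediction = {"the": 0.10, "a": 0.50, "cat": 0.40}
correct = "the"

# Per-token loss is the negative natural log of the correct token's probability.
loss_confident = -math.log(confident_prediction[correct])
loss_unsure = -math.log(unsure_prediction[correct])

print(round(loss_confident, 3))  # 0.105
print(round(loss_unsure, 3))     # 2.303
```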

Perplexity

What is Perplexity?

Perplexity (PPL) is the exponentiated cross-entropy loss. It represents "how confused" the model is about its predictions.

Formula

$$ PPL = \exp(Loss) = \exp\left(-\frac{1}{N} \sum_{i=1}^{N} \log P(y_i \mid x_{<i})\right) $$
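A minimal sketch of the formula, assuming the probabilities assigned to the correct tokens are already available as a list. The function name `perplexity` is an illustrative choice.

```python
import math

def perplexity(correct_token_probs):
    """Exponentiated mean negative log-probability of the correct tokens."""
    n = len(correct_token_probs)
    loss = -sum(math.log(p) for p in correct_token_probs) / n
    return math.exp(loss)

# Two tokens, each predicted with probability 0.5, give a perplexity of about 2.
ppl = perplexity([0.5, 0.5])
print(abs(ppl - 2.0) < 1e-9)
```

Because the loss is an average of logs, perplexity equals the inverse of the geometric mean of the correct-token probabilities, so a single badly predicted token can raise it noticeably.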

Interpreting Perplexity

| Perplexity | Interpretation |
|---|---|
| 1 | Perfect prediction (impossible in practice) |
| 10-20 | Excellent for domain-specific models |
| 20-50 | Good general language model |
| 50-100 | Average quality |
| >100 | Poor, model is "confused" |

Intuitive Meaning

A perplexity of 50 means the model is as uncertain as if it were choosing uniformly among 50 possible tokens.
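This reading can be verified numerically: a model that spreads probability uniformly over 50 tokens assigns 1/50 to the correct token at every position, and the resulting perplexity is 50. The variable names below are illustrative.

```python
import math

num_choices = 50
uniform_prob = 1.0 / num_choices

# Every position contributes the same loss, so the mean equals the per-token value.
loss = -math.log(uniform_prob)
ppl = math.exp(loss)

print(abs(ppl - num_choices) < 1e-9)
```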

Practical Use
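In practice a model outputs raw scores (logits) rather than probabilities, so loss and perplexity are usually computed through a log-softmax. The following is a minimal pure-Python sketch under that assumption; the helper names `log_softmax` and `loss_and_perplexity`, the logits, and the target ids are all hypothetical, and a real training setup would typically rely on a framework's built-in loss function instead.

```python
import math

def log_softmax(logits):
    """Numerically stable log-softmax over one position's logits."""
    m = max(logits)  # subtract the max so exp() cannot overflow
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return [z - log_sum for z in logits]

def loss_and_perplexity(logits_per_position, target_ids):
    """Mean cross-entropy (natural log) and its exponential."""
    nll = 0.0
    for logits, target in zip(logits_per_position, target_ids):
        nll -= log_softmax(logits)[target]
    loss = nll / len(target_ids)
    return loss, math.exp(loss)

# Hypothetical logits for two positions over a three-token vocabulary.
logits = [[2.0, 0.5, -1.0], [0.1, 0.2, 3.0]]
targets = [0, 2]  # index of the correct token at each position
loss, ppl = loss_and_perplexity(logits, targets)
print(loss, ppl)
```

Averaging the per-token losses before exponentiating matters: exponentiating each token's loss and then averaging gives a different, generally larger number.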
