Token dropping (LLM architecture)
Token dropping discards excess tokens when expert capacity is exceeded.
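A minimal sketch of capacity-based token dropping, as used in Mixture-of-Experts routing. The function name and first-come-first-served policy are illustrative assumptions, not a specific framework's API.

```python
def drop_excess_tokens(token_ids, expert_assignments, capacity):
    """Keep at most `capacity` tokens per expert; drop the rest.

    Tokens are processed in sequence order, so earlier tokens win
    slots (a simple, common tie-breaking policy).
    """
    load = {}  # expert id -> number of capacity slots used
    kept, dropped = [], []
    for tok, expert in zip(token_ids, expert_assignments):
        if load.get(expert, 0) < capacity:
            load[expert] = load.get(expert, 0) + 1
            kept.append(tok)
        else:
            dropped.append(tok)  # expert is full: token is dropped
    return kept, dropped
```

For example, with tokens 0-5 routed to experts [0, 0, 0, 1, 1, 1] and capacity 2, tokens 2 and 5 exceed their experts' capacity and are dropped.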
Skip processing some tokens to save compute.
Token forcing mandates specific tokens at designated positions.
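One common way to force a token is to mask the logits so only the mandated token can be sampled at that position; this sketch assumes plain Python lists of logits.

```python
import math

def force_token(logits, forced_id):
    """Mask logits so only `forced_id` is sampleable at this position:
    the forced token keeps a finite score, all others go to -inf."""
    return [0.0 if i == forced_id else -math.inf
            for i in range(len(logits))]
```

After masking, greedy decoding, sampling, and beam search all necessarily emit the forced token.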
Fix tokenization artifacts.
Assign importance scores to determine computation allocation.
Assign labels to patches.
Assign pseudo-labels to tokens.
Maximum prompt length.
How to combine tokens.
Combine similar tokens to reduce sequence length.
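A toy sketch of similarity-based token merging: greedily fold each token embedding into the previous one when their cosine similarity exceeds a threshold. The greedy adjacent-merge policy and element-wise mean are simplifying assumptions; real methods (e.g. bipartite matching) are more elaborate.

```python
def cosine(u, v):
    """Cosine similarity between two vectors given as lists."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sum(a * a for a in u) ** 0.5
    norm_v = sum(b * b for b in v) ** 0.5
    return dot / (norm_u * norm_v)

def merge_tokens(embeddings, threshold=0.9):
    """Merge each token into its predecessor when cosine similarity
    is at least `threshold`; the merged embedding is their mean."""
    merged = [list(embeddings[0])]
    for emb in embeddings[1:]:
        if cosine(merged[-1], emb) >= threshold:
            merged[-1] = [(a + b) / 2 for a, b in zip(merged[-1], emb)]
        else:
            merged.append(list(emb))
    return merged
```

Two near-duplicate embeddings collapse into one slot, shortening the sequence the model must attend over.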
Remove unimportant tokens.
Token streaming sends individual tokens immediately rather than waiting for completion.
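The idea can be sketched as a generator that yields each token the moment it is produced, rather than buffering the whole completion; `generate_next` here is a hypothetical stand-in for one decoding step of a model.

```python
def stream_tokens(generate_next, max_tokens=16):
    """Yield each token as soon as it is produced instead of
    waiting for the full completion."""
    for _ in range(max_tokens):
        tok = generate_next()
        if tok is None:  # end-of-sequence sentinel
            return
        yield tok

# Usage sketch: a fake one-step "model" that emits a fixed reply.
reply = iter(["Hello", ",", " world", None])
tokens = list(stream_tokens(lambda: next(reply)))
```

A consumer iterating over the generator can display "Hello", then ",", then " world" with no wait for the final token.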
Search over possible continuations.
Training tokens per parameter.
Token: Subword unit. GPT uses BPE tokenization. ~1 token ≈ 4 characters of English text. Models predict the probability of the next token.
Problems from subword tokenization.
Ensure consistent tokenization.
Preprocessing before tokenization.
Protect tokens from injection or manipulation.
Tokenization splits text into tokens. BPE, SentencePiece, tiktoken. Vocabulary size trade-off (32K-100K typical).
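One training step of BPE can be illustrated in a few lines: count adjacent symbol pairs, then merge the most frequent pair everywhere. This is a toy on a three-word string; real BPE repeats this over a large corpus until the vocabulary budget is reached.

```python
from collections import Counter

def most_frequent_pair(tokens):
    """Count adjacent symbol pairs, as in one BPE training step."""
    pairs = Counter(zip(tokens, tokens[1:]))
    return max(pairs, key=pairs.get)

def merge_pair(tokens, pair):
    """Replace every occurrence of `pair` with one merged symbol."""
    out, i = [], 0
    while i < len(tokens):
        if i + 1 < len(tokens) and (tokens[i], tokens[i + 1]) == pair:
            out.append(tokens[i] + tokens[i + 1])
            i += 2
        else:
            out.append(tokens[i])
            i += 1
    return out

chars = list("low lower lowest")
pair = most_frequent_pair(chars)  # ('l', 'o') appears three times
chars = merge_pair(chars, pair)   # 'l','o' pairs become 'lo' symbols
```

Repeated merges are how frequent substrings ("low", "er", "est") end up as single vocabulary entries.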
Split text into tokens (subwords or characters) for model input.
Create tokenizer vocabulary.
Tokenizer splits text into tokens. BPE/WordPiece/SentencePiece trade off vocabulary size vs. sequence length and handle multilingual text differently.
Hugging Face's Tokenizers library provides fast tokenization. Rust implementation.
Language model training speed.
Tolerance design balances specification tightness with cost and capability.
Allowed variation from target.
Component standing on end.
Percentage of time tool is ready.
Tool calling agents invoke external functions, APIs, or resources to accomplish tasks.
Validate tool call arguments before execution.
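A minimal sketch of argument validation before a tool call executes. The `spec` format (argument name mapped to a type and a required flag) is a hypothetical, stripped-down stand-in for JSON Schema validation as used by real function-calling frameworks.

```python
def validate_args(args, spec):
    """Return a list of validation errors for a tool call's arguments.

    `spec` maps argument name -> (expected_type, required).
    An empty list means the call is safe to dispatch.
    """
    errors = []
    for name, (typ, required) in spec.items():
        if name not in args:
            if required:
                errors.append(f"missing required argument: {name}")
        elif not isinstance(args[name], typ):
            errors.append(f"{name} must be {typ.__name__}")
    for name in args:
        if name not in spec:
            errors.append(f"unexpected argument: {name}")
    return errors

# Usage sketch: the model passed "3" (a string) where an int is expected.
spec = {"city": (str, True), "days": (int, False)}
errors = validate_args({"city": "Oslo", "days": "3"}, spec)
```

Catching a mistyped or missing argument here lets the agent re-prompt the model instead of crashing inside the tool.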
Ensure sufficient equipment.
Contamination from equipment.
Tool discovery enables agents to find and learn about available functions dynamically.
Tool documentation describes function capabilities, parameters, and expected outputs for agent understanding.
Tool idle management powers down unused equipment components reducing standby energy consumption.
Keep tools similar.
Detailed capability requirements.
Verify tool performance post-maintenance.
Validate equipment meets requirements.
Tool result parsing extracts relevant information from function outputs for agent reasoning.
Tool selection chooses appropriate functions from the available repertoire for the current need.
Choose appropriate tool.
Chain multiple tools.
Group of tools performing same process step.
LLM decides when and how to call external APIs, tools, or functions.
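The dispatch side of this loop can be sketched as a registry mapping tool names to functions, plus a handler for the JSON tool call the model emits. The registry contents and the `{"name": ..., "arguments": {...}}` shape are illustrative assumptions, though real function-calling APIs use a very similar structure.

```python
import json

# Hypothetical registry of tools the agent may call.
TOOLS = {
    "add": lambda a, b: a + b,
    "upper": lambda text: text.upper(),
}

def dispatch_tool_call(call_json):
    """Parse a model-emitted tool call and execute the matching function.

    Expects JSON of the form {"name": ..., "arguments": {...}}.
    """
    call = json.loads(call_json)
    fn = TOOLS.get(call["name"])
    if fn is None:
        raise ValueError(f"unknown tool: {call['name']}")
    return fn(**call["arguments"])

# Usage sketch: the model decided to call "add" with two numbers.
result = dispatch_tool_call('{"name": "add", "arguments": {"a": 2, "b": 3}}')
```

The returned result is then serialized back into the conversation so the model can reason over it.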
Assess tool-using capabilities.
Teach models to use external tools.
Tool use enables language models to invoke external functions, APIs, or search engines.
Maximize productive time.