Attention sink

An attention sink is the phenomenon in which certain tokens attract a disproportionate share of attention mass, leaving less for the other tokens in the context. If it is not managed in prompt and model design, it can degrade long-context quality.

What Is Attention sink?

Why Attention sink Matters

How It Is Used in Practice

Attention sink is a critical diagnostic concept for long-context reliability: monitoring and mitigating sink behavior improves evidence utilization in RAG workloads.
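Monitoring sink behavior can be sketched as measuring how much attention mass lands on a candidate sink position. The example below is a minimal illustration, not a reference implementation. It assumes the attention weights of a single head are available as a (T, T) matrix whose rows sum to 1. The helper names `causal_softmax` and `sink_mass` are hypothetical, and treating position 0 as the sink candidate is an assumption made for the example.

```python
import numpy as np


def causal_softmax(scores):
    """Row-wise softmax of a (T, T) score matrix under a causal mask."""
    scores = np.asarray(scores, dtype=float)
    t = scores.shape[0]
    mask = np.tril(np.ones((t, t), dtype=bool))      # query i sees keys 0..i
    masked = np.where(mask, scores, -np.inf)
    masked = masked - masked.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(masked)
    return weights / weights.sum(axis=1, keepdims=True)


def sink_mass(attn, sink_positions=(0,)):
    """Mean share of attention that later queries give to sink_positions.

    Queries at or before the last sink position are skipped, because under
    a causal mask they can attend to little else (row 0 is trivially 1.0).
    """
    attn = np.asarray(attn, dtype=float)
    cols = list(sink_positions)
    first_query = max(cols) + 1
    rows = attn[first_query:]
    if rows.shape[0] == 0:
        raise ValueError("no query rows after the sink positions")
    return float(rows[:, cols].sum(axis=1).mean())


# Toy comparison (assumed data): flat scores vs. a boosted first key.
flat = causal_softmax(np.zeros((4, 4)))
boosted_scores = np.zeros((4, 4))
boosted_scores[:, 0] = 5.0                           # first token scores high
boosted = causal_softmax(boosted_scores)

print("flat sink mass:   ", sink_mass(flat))
print("boosted sink mass:", sink_mass(boosted))
```

A value close to 1 means that later queries spend almost all of their attention on the sink candidate, leaving little for retrieved evidence. What threshold counts as problematic depends on the model and workload and is not specified here.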

attention sink, architecture
