Dense retrieval uses learned embedding vectors to find semantically relevant documents. A bi-encoder model encodes queries and documents into dense vectors, and retrieval returns the documents whose vectors are nearest to the query in embedding space. This enables semantic search that matches on meaning rather than on exact keywords.
How Dense Retrieval Works
- Bi-Encoder: Separate encoders for queries and documents produce independent embeddings.
- Indexing: Pre-compute document embeddings, store in vector database.
- Search: Encode the query, then find the nearest document vectors via approximate nearest neighbor (ANN) search.
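- Speed: With an ANN index, search over millions of documents typically completes in milliseconds or less; actual latency depends on the index type, hardware, and recall target.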
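The steps above can be sketched in a few lines of Python. This is a minimal illustration, not a production setup: `TOY_VECTORS` is an invented table of word vectors standing in for a trained bi-encoder, `encode`, `build_index`, and `search` are hypothetical helper names, and the search is an exact brute-force scan where a real system would use an ANN index.

```python
import math

# Hypothetical 3-dimensional word vectors standing in for a trained encoder.
# A real bi-encoder learns its representations; these numbers are invented.
TOY_VECTORS = {
    "car": [0.9, 0.1, 0.0],
    "automobile": [0.85, 0.15, 0.0],
    "vehicle": [0.8, 0.2, 0.1],
    "repair": [0.1, 0.9, 0.0],
    "maintenance": [0.15, 0.85, 0.05],
    "recipe": [0.0, 0.1, 0.9],
    "pasta": [0.05, 0.0, 0.95],
}
DIMS = 3


def encode(text):
    """Toy encoder: average the known word vectors, then L2-normalize."""
    words = [w for w in text.lower().split() if w in TOY_VECTORS]
    if not words:
        return [0.0] * DIMS
    mean = [sum(TOY_VECTORS[w][i] for w in words) / len(words) for i in range(DIMS)]
    norm = math.sqrt(sum(x * x for x in mean))
    return [x / norm for x in mean]


def build_index(documents):
    """Indexing: pre-compute one embedding per document."""
    return [(doc, encode(doc)) for doc in documents]


def search(index, query, k=2):
    """Search: encode the query and rank documents by cosine similarity.

    Vectors are unit length, so the dot product equals cosine similarity.
    This scan is exact; an ANN index would approximate the same ranking.
    """
    q = encode(query)
    scored = [(sum(a * b for a, b in zip(q, emb)), doc) for doc, emb in index]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:k]]


documents = ["automobile maintenance", "pasta recipe", "vehicle repair"]
index = build_index(documents)
results = search(index, "car repair", k=2)
print(results)
```

With these invented vectors, the query "car repair" ranks "automobile maintenance" first even though the two share no words, because the document embeddings are computed once at indexing time and compared by direction rather than by token overlap.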
Advantages Over Sparse Retrieval (BM25)
- Semantic Understanding: "car" matches "automobile" and "vehicle."
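- Zero-Shot: Retrieves relevant documents for unseen queries that share no keywords with them, although accuracy can drop on domains far from the encoder's training data.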
- Multilingual: Cross-language retrieval with multilingual encoders.
Limitations: Dense retrieval may miss exact keyword matches, such as rare terms, identifiers, or product codes; hybrid (dense + sparse) retrieval often works best.
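One common way to combine dense and sparse results is reciprocal rank fusion (RRF), which merges ranked lists using only rank positions. The sketch below is an assumption about how a hybrid setup might be wired, not a method described in this article: the function name, the document IDs, and the two input rankings are hypothetical, and `k=60` is a conventional default rather than a tuned value.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranked list.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with ranks starting at 1.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by document ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical outputs of a dense retriever and a BM25 retriever for one query.
dense_ranking = ["doc_a", "doc_b", "doc_c"]
sparse_ranking = ["doc_d", "doc_a", "doc_b"]

fused = reciprocal_rank_fusion([dense_ranking, sparse_ranking])
print(fused)  # ['doc_a', 'doc_b', 'doc_d', 'doc_c']
```

Documents that both retrievers return rise to the top, while a document found only by the sparse retriever (here `doc_d`, for example an exact keyword hit) still appears in the fused list.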
Dense retrieval powers modern retrieval-augmented generation (RAG) pipelines, enabling LLMs to find relevant context through semantic understanding rather than keyword matching.