Vector databases are specialized storage systems optimized for storing, indexing, and searching high-dimensional embedding vectors. They enable fast similarity search across millions to billions of vectors, which makes them essential infrastructure for RAG systems, semantic search, recommendation engines, and any application that needs to find "similar" items in embedding space.
What Are Vector Databases?
- Definition: Databases designed to store and query vector embeddings.
- Core Operation: Find K nearest neighbors to a query vector.
- Scale: Handle millions to billions of vectors efficiently.
- Beyond Search: Support filtering, metadata, hybrid search.
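The core operation can be illustrated with a minimal brute-force sketch in plain Python. It is not how any particular database implements search; the function names (`l2_distance`, `knn_brute_force`) and the toy `store` are hypothetical, and Euclidean distance is assumed as the metric.

```python
import heapq
import math


def l2_distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_brute_force(query, vectors, k):
    """Return the k (id, distance) pairs closest to query, nearest first.

    `vectors` maps an item id to its embedding. Every stored vector is
    compared with the query, so the cost grows linearly with the collection.
    """
    scored = ((l2_distance(query, vec), item_id) for item_id, vec in vectors.items())
    return [(item_id, dist) for dist, item_id in heapq.nsmallest(k, scored)]


# Toy 2-dimensional "embeddings" (real ones have hundreds of dimensions).
store = {
    "a": [0.0, 0.0],
    "b": [1.0, 0.0],
    "c": [0.0, 3.0],
    "d": [5.0, 5.0],
}

# The two nearest neighbors of the query are "b" and then "a".
print(knn_brute_force([0.9, 0.1], store, k=2))
```

Because every vector is scanned on every query, this approach is exact but only practical for small collections.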
Why Vector Databases Matter
- RAG Foundation: Enable retrieval-augmented generation for LLMs.
- Semantic Search: Find meaning, not just keywords.
- Scale: Brute-force O(n) search doesn't scale; need efficient indexes.
- Production Features: CRUD, filtering, replication, backups.
- Speed: Sub-100ms queries across millions of vectors.
- Accuracy: Approximate indexes trade recall for speed, and the balance is configurable.
Core Concepts
Embedding Vectors:
- Dense numerical representations of data (text, images, etc.).
- Typical dimensions: 384, 768, 1024, 1536, 3072.
- Similar items = similar vectors (close in space).
Distance Metrics:
Metric           | Formula             | Use Case
-----------------|---------------------|-------------------
Cosine Distance  | 1 - (A·B)/(|A||B|)  | Text embeddings
Euclidean (L2)   | sqrt(Σ(ai-bi)²)     | Image features
Dot Product (IP) | A·B                 | Normalized vectors
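As a rough illustration, the three metrics can be written in a few lines of plain Python. Cosine similarity is the normalized dot product (A·B)/(|A||B|), and subtracting it from 1 gives the cosine distance. The helper names below are illustrative and do not come from any specific library.

```python
import math


def dot(a, b):
    """Inner product A·B."""
    return sum(x * y for x, y in zip(a, b))


def norm(a):
    """Vector length |A|."""
    return math.sqrt(dot(a, a))


def cosine_similarity(a, b):
    """(A·B)/(|A||B|): 1.0 means same direction, 0.0 means orthogonal."""
    return dot(a, b) / (norm(a) * norm(b))


def cosine_distance(a, b):
    """1 - cosine similarity: smaller means more similar."""
    return 1.0 - cosine_similarity(a, b)


def euclidean_distance(a, b):
    """sqrt(sum((ai - bi)^2)): straight-line distance."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def normalize(a):
    """Scale a vector to unit length."""
    n = norm(a)
    return [x / n for x in a]


a = [3.0, 4.0]
b = [4.0, 3.0]
print(cosine_similarity(a, b))
print(euclidean_distance(a, b))
# After normalization the dot product matches the cosine similarity
# (up to floating-point rounding).
print(dot(normalize(a), normalize(b)))
```

This is why the dot product is a natural choice for normalized vectors: on unit-length embeddings it ranks results the same way cosine similarity does, without the extra division.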
Index Types:
- Flat/Brute-force: Exact, O(n), for small datasets.
- IVF (Inverted File): Cluster-based approximate search.
- HNSW: Graph-based, high recall, more memory.
- PQ (Product Quantization): Compressed vectors, low memory.
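To make the cluster-based idea behind IVF concrete, here is a simplified sketch. It assumes the centroids have already been computed (for example by k-means, which is not shown), and the names `build_ivf`, `ivf_search`, and `n_probe` are hypothetical. Production indexes add many optimizations that this sketch omits.

```python
import math
from collections import defaultdict


def l2(a, b):
    """Euclidean distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_centroids(vec, centroids, n):
    """Indices of the n centroids closest to vec, nearest first."""
    order = sorted(range(len(centroids)), key=lambda i: l2(vec, centroids[i]))
    return order[:n]


def build_ivf(vectors, centroids):
    """Group vector ids into inverted lists keyed by their nearest centroid."""
    lists = defaultdict(list)
    for item_id, vec in vectors.items():
        lists[nearest_centroids(vec, centroids, 1)[0]].append(item_id)
    return lists


def ivf_search(query, vectors, centroids, lists, k, n_probe):
    """Scan only the n_probe clusters nearest the query, then rank candidates."""
    candidates = []
    for c in nearest_centroids(query, centroids, n_probe):
        candidates.extend(lists[c])
    candidates.sort(key=lambda item_id: l2(query, vectors[item_id]))
    return candidates[:k]


centroids = [[0.0, 0.0], [10.0, 10.0]]  # assumed to be pre-trained
vectors = {
    "a": [0.0, 1.0],
    "b": [1.0, 0.0],
    "c": [4.0, 4.0],
    "d": [6.0, 6.0],
    "e": [10.0, 9.0],
}
lists = build_ivf(vectors, centroids)
query = [5.0, 4.6]

# Probing one cluster misses "d", which sits just across the cluster boundary.
print(ivf_search(query, vectors, centroids, lists, k=2, n_probe=1))
# Probing both clusters scans everything and recovers the exact top 2.
print(ivf_search(query, vectors, centroids, lists, k=2, n_probe=2))
```

The number of probed clusters is the speed and accuracy dial: fewer probes mean fewer distance computations but a higher chance of missing a true neighbor near a cluster boundary.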
Major Vector Databases
Dedicated Vector DBs:
Database | Highlights | Best For
-----------|-----------------------------------|------------------
FAISS | Meta, library, CPU/GPU | Research, embedded
Milvus | Distributed, scalable, open source| Large-scale prod
Qdrant | Rust, filtering, rich features | Production RAG
Pinecone | Managed, serverless, easy | Quick start, scale
Weaviate | Hybrid search, GraphQL | Complex queries
ChromaDB | Simple, embedded, dev-friendly | Prototyping, local
Database Extensions:
- pgvector: PostgreSQL extension for vectors.
- Elasticsearch: Supports dense vector fields and kNN search.
- Redis: Vector similarity search module.
Performance Comparison
Database | Vectors | QPS (K=10) | Recall@10
------------|-----------|------------|----------
Milvus | 1B | 2,000+ | 95%+
Qdrant | 100M | 5,000+ | 98%+
Pinecone | 1B | ~1,000 | 95%+
pgvector | 10M | ~500 | 99%+
ChromaDB | 1M | ~1,000 | 99%+
Note: These figures are indicative only. Throughput and recall vary significantly with hardware, index configuration, and vector dimension.
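The Recall@10 column measures how many of the true nearest neighbors an approximate index returns. A small sketch of the metric follows; the function names and the sample id lists are made up for illustration and are not benchmark data.

```python
def recall_at_k(approx_ids, exact_ids, k):
    """Fraction of the true top-k neighbors that appear in the approximate top-k."""
    truth = set(exact_ids[:k])
    if not truth:
        return 0.0
    found = truth.intersection(approx_ids[:k])
    return len(found) / len(truth)


def mean_recall_at_k(approx_results, exact_results, k):
    """Average recall@k over a batch of queries."""
    scores = [recall_at_k(a, e, k) for a, e in zip(approx_results, exact_results)]
    return sum(scores) / len(scores)


# Hypothetical result lists for two queries (ids only, best match first).
exact = [["c", "d", "a"], ["x", "y", "z"]]
approx = [["c", "b", "a"], ["x", "y", "z"]]

print(recall_at_k(approx[0], exact[0], k=3))  # 2 of the 3 true neighbors found
print(mean_recall_at_k(approx, exact, k=3))
```

In practice the exact results come from a brute-force scan over the same data, so the comparison isolates the error introduced by the approximate index.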
RAG Architecture with Vector DB
User Query: "How does photosynthesis work?"
↓
┌─────────────────────────────────────────┐
│ Embed query → [0.23, -0.45, ...] │
├─────────────────────────────────────────┤
│ Vector DB similarity search │
│ → Find top 5 most similar chunks │
├─────────────────────────────────────────┤
│ Retrieved context + original query │
├─────────────────────────────────────────┤
│ LLM generates response with context │
└─────────────────────────────────────────┘
↓
Response: "Photosynthesis is the process by which..."
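The flow in the diagram can be sketched end to end in plain Python. Everything here is a stand-in: `toy_embed` is a hypothetical word-count embedding rather than a real embedding model, an in-memory list plays the role of the vector database, and the final LLM call is left out, so the sketch stops at the assembled prompt.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def toy_embed(text, vocab):
    """Hypothetical stand-in for an embedding model: word counts over vocab."""
    counts = Counter(tokenize(text))
    return [float(counts[w]) for w in vocab]


def cosine(a, b):
    """Cosine similarity, returning 0.0 when either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0


def retrieve(query, chunks, vocab, top_k):
    """Embed the query and return the top_k most similar chunks."""
    q = toy_embed(query, vocab)
    ranked = sorted(chunks, key=lambda c: cosine(q, toy_embed(c, vocab)), reverse=True)
    return ranked[:top_k]


def build_prompt(query, context_chunks):
    """Combine retrieved context with the original query for the LLM."""
    context = "\n".join(f"- {c}" for c in context_chunks)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"


chunks = [
    "Photosynthesis converts light energy into chemical energy in plants.",
    "HNSW is a graph-based index for approximate nearest neighbor search.",
    "Chlorophyll absorbs light during photosynthesis.",
]
vocab = sorted({w for c in chunks for w in tokenize(c)})

query = "How does photosynthesis work?"
top = retrieve(query, chunks, vocab, top_k=2)
prompt = build_prompt(query, top)
print(prompt)  # this string would be sent to the LLM
```

A real system would embed the chunks once at ingestion time and store the vectors, rather than re-embedding every chunk per query as this sketch does for brevity.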
Key Features to Consider
- Hybrid Search: Combine vector + keyword (BM25) search.
- Filtering: Query vectors with metadata constraints.
- Multi-Tenancy: Isolate data between customers.
- Replication: High availability and disaster recovery.
- Updates: Efficient insert/update/delete operations.
- Cost: Managed vs. self-hosted economics.
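Filtering and multi-tenancy both come down to restricting which vectors are eligible before ranking. The sketch below assumes a simple pre-filtering strategy with exact-match metadata; the record layout, the `where` parameter, and the tenant names are hypothetical, and real databases offer richer filter languages and different execution strategies.

```python
import math


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)


def filtered_search(query, records, where, k):
    """Rank only the records whose metadata matches every key in `where`."""
    matches = [
        r for r in records
        if all(r["meta"].get(key) == value for key, value in where.items())
    ]
    matches.sort(key=lambda r: cosine(query, r["vector"]), reverse=True)
    return [r["id"] for r in matches[:k]]


records = [
    {"id": "doc1", "vector": [1.0, 0.0], "meta": {"tenant": "acme", "lang": "en"}},
    {"id": "doc2", "vector": [0.6, 0.8], "meta": {"tenant": "acme", "lang": "de"}},
    {"id": "doc3", "vector": [0.9, 0.1], "meta": {"tenant": "globex", "lang": "en"}},
]
query = [1.0, 0.0]

# Without a filter, the closest vectors win regardless of owner.
print(filtered_search(query, records, where={}, k=2))
# With a tenant filter, another customer's document is never returned.
print(filtered_search(query, records, where={"tenant": "acme"}, k=2))
```

Applying the filter before ranking guarantees that up to k eligible results come back, whereas filtering after an approximate search can return fewer than k when most of the nearest neighbors fail the filter.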
Selection Criteria
- Scale: How many vectors? (Millions → Milvus/Pinecone).
- Simplicity: Quick start? (ChromaDB, Pinecone).
- Self-Hosted: Control needed? (Milvus, Qdrant, FAISS).
- Features: Hybrid search? Filtering? (Weaviate, Qdrant).
- Existing Stack: Use Postgres? (pgvector).
Vector databases are the infrastructure foundation for semantic AI applications. As more applications need to find "similar" rather than "exact" matches, vector databases provide the scalable, fast retrieval that makes RAG, recommendation systems, and semantic search practical at production scale.