r-gcn, r-gcn, graph neural networks
Relational Graph Convolutional Networks extend GCNs to multi-relational data by learning separate transformation matrices for each edge type.
145 technical terms and definitions
Relational Graph Convolutional Networks extend GCNs to multi-relational data by learning separate transformation matrices for each edge type.
Rademacher complexity bounds generalization error by measuring how well a function class can fit random noise.
Describe medical images in text.
# RAG, Retrieval, and Knowledge Bases A comprehensive technical guide with mathematical foundations ## 1. Overview **RAG (Retrieval-Augmented Generation)** is an architecture that enhances Large Language Models (LLMs) by grounding their responses in external knowledge sources. ### Core Components - **Generator**: The LLM that produces the final response - **Retriever**: The system that finds relevant documents - **Knowledge Base**: The corpus of documents being searched ## 2. Mathematical Foundations ### 2.1 Vector Embeddings Documents and queries are converted to dense vectors in $\mathbb{R}^d$ where $d$ is the embedding dimension (typically 384, 768, or 1536). **Embedding Function:** $$ E: \text{Text} \rightarrow \mathbb{R}^d $$ For a document $D$ and query $Q$: $$ \vec{d} = E(D) \in \mathbb{R}^d $$ $$ \vec{q} = E(Q) \in \mathbb{R}^d $$ ### 2.2 Similarity Metrics #### Cosine Similarity $$ \text{sim}_{\cos}(\vec{q}, \vec{d}) = \frac{\vec{q} \cdot \vec{d}}{||\vec{q}|| \cdot ||\vec{d}||} = \frac{\sum_{i=1}^{d} q_i \cdot d_i}{\sqrt{\sum_{i=1}^{d} q_i^2} \cdot \sqrt{\sum_{i=1}^{d} d_i^2}} $$ #### Euclidean Distance (L2) $$ \text{dist}_{L2}(\vec{q}, \vec{d}) = ||\vec{q} - \vec{d}||_2 = \sqrt{\sum_{i=1}^{d} (q_i - d_i)^2} $$ #### Dot Product $$ \text{sim}_{\text{dot}}(\vec{q}, \vec{d}) = \vec{q} \cdot \vec{d} = \sum_{i=1}^{d} q_i \cdot d_i $$ ### 2.3 BM25 (Sparse Retrieval) $$ \text{BM25}(Q, D) = \sum_{i=1}^{n} \text{IDF}(q_i) \cdot \frac{f(q_i, D) \cdot (k_1 + 1)}{f(q_i, D) + k_1 \cdot \left(1 - b + b \cdot \frac{|D|}{\text{avgdl}}\right)} $$ Where: - $f(q_i, D)$ = frequency of term $q_i$ in document $D$ - $|D|$ = document length - $\text{avgdl}$ = average document length in corpus - $k_1$ = term frequency saturation parameter (typically 1.2–2.0) - $b$ = length normalization parameter (typically 0.75) **Inverse Document Frequency (IDF):** $$ \text{IDF}(q_i) = \ln\left(\frac{N - n(q_i) + 0.5}{n(q_i) + 0.5} + 1\right) $$ Where: - $N$ = total number of documents - $n(q_i)$ = number of documents containing $q_i$ ## 3. RAG Pipeline Architecture ### 3.1 Pipeline Stages 1. **Indexing Phase** - Document ingestion - Chunking strategy selection - Embedding generation - Vector storage 2. **Query Phase** - Query embedding: $\vec{q} = E(Q)$ - Top-$k$ retrieval: $\mathcal{D}_k = \text{argmax}_{D \in \mathcal{C}}^k \text{sim}(\vec{q}, \vec{d})$ - Context assembly - LLM generation ### 3.2 Retrieval Formula Given a query $Q$ and corpus $\mathcal{C}$, retrieve top-$k$ documents: $$ \mathcal{D}_k = \{D_1, D_2, ..., D_k\} \quad \text{where} \quad \text{sim}(Q, D_1) \geq \text{sim}(Q, D_2) \geq ... \geq \text{sim}(Q, D_k) $$ ### 3.3 Generation with Context $$ P(\text{Response} | Q, \mathcal{D}_k) = \text{LLM}(Q \oplus \mathcal{D}_k) $$ Where $\oplus$ denotes context concatenation. ## 4. Chunking Strategies ### 4.1 Fixed-Size Chunking - **Chunk size**: $c$ tokens (typically 256–1024) - **Overlap**: $o$ tokens (typically 10–20% of $c$) $$ \text{Number of chunks} = \left\lceil \frac{|D| - o}{c - o} \right\rceil $$ ### 4.2 Semantic Chunking - Split by semantic boundaries (paragraphs, sections) - Use sentence embeddings to detect topic shifts - Threshold: $\theta$ for similarity drop detection $$ \text{Split at } i \quad \text{if} \quad \text{sim}(s_i, s_{i+1}) < \theta $$ ### 4.3 Recursive Chunking - Hierarchical splitting: Document → Sections → Paragraphs → Sentences - Maintains context hierarchy ## 5. Knowledge Base Design ### 5.1 Metadata Schema ```json { "chunk_id": "string", "document_id": "string", "content": "string", "embedding": "vector[d]", "metadata": { "source": "string", "title": "string", "author": "string", "date_created": "ISO8601", "date_modified": "ISO8601", "section": "string", "page_number": "integer", "chunk_index": "integer", "total_chunks": "integer", "tags": ["string"], "confidence_score": "float" } } ``` ### 5.2 Index Types - **Flat Index**: Exact search, $O(n)$ complexity - **IVF (Inverted File)**: Approximate, $O(\sqrt{n})$ complexity - **HNSW (Hierarchical Navigable Small World)**: Graph-based, $O(\log n)$ complexity **HNSW Search Complexity:** $$ O(d \cdot \log n) $$ Where $d$ is embedding dimension and $n$ is corpus size. ## 6. Evaluation Metrics ### 6.1 Retrieval Metrics #### Recall@k $$ \text{Recall@}k = \frac{|\text{Relevant} \cap \text{Retrieved@}k|}{|\text{Relevant}|} $$ #### Precision@k $$ \text{Precision@}k = \frac{|\text{Relevant} \cap \text{Retrieved@}k|}{k} $$ #### Mean Reciprocal Rank (MRR) $$ \text{MRR} = \frac{1}{|Q|} \sum_{i=1}^{|Q|} \frac{1}{\text{rank}_i} $$ #### Normalized Discounted Cumulative Gain (NDCG) $$ \text{DCG@}k = \sum_{i=1}^{k} \frac{2^{\text{rel}_i} - 1}{\log_2(i + 1)} $$ $$ \text{NDCG@}k = \frac{\text{DCG@}k}{\text{IDCG@}k} $$ ### 6.2 Generation Metrics - **Faithfulness**: Is response grounded in retrieved context? - **Relevance**: Does response answer the query? - **Groundedness Score**: $$ G = \frac{|\text{Claims supported by context}|}{|\text{Total claims}|} $$ ## 7. Advanced Techniques ### 7.1 Hybrid Search Combine dense and sparse retrieval: $$ \text{score}_{\text{hybrid}} = \alpha \cdot \text{score}_{\text{dense}} + (1 - \alpha) \cdot \text{score}_{\text{sparse}} $$ Where $\alpha \in [0, 1]$ is the weighting parameter. ### 7.2 Reranking Apply cross-encoder reranking to top-$k$ results: $$ \text{score}_{\text{rerank}}(Q, D) = \text{CrossEncoder}(Q, D) $$ Cross-encoder complexity: $O(k \cdot |Q| \cdot |D|)$ ### 7.3 Query Expansion - **HyDE (Hypothetical Document Embeddings)**: $$ \vec{q}_{\text{HyDE}} = E(\text{LLM}(Q)) $$ - **Multi-Query Retrieval**: $$ \mathcal{D}_{\text{merged}} = \bigcup_{i=1}^{m} \text{Retrieve}(Q_i) $$ ### 7.4 Contextual Compression Reduce retrieved context before generation: $$ C_{\text{compressed}} = \text{Compress}(\mathcal{D}_k, Q) $$ ## 8. Vector Database Options | Database | Index Types | Hosting | Scalability | |----------|-------------|---------|-------------| | Pinecone | HNSW, IVF | Cloud | High | | Weaviate | HNSW | Self/Cloud | High | | Qdrant | HNSW | Self/Cloud | High | | Milvus | IVF, HNSW | Self/Cloud | Very High | | FAISS | Flat, IVF, HNSW | Self | Medium | | Chroma | HNSW | Self | Low-Medium | | pgvector | IVFFlat, HNSW | Self | Medium | ## 9. Best Practices Checklist - Choose appropriate chunk size based on content type - Implement chunk overlap to preserve context - Store rich metadata for filtering - Use hybrid search for better recall - Implement reranking for precision - Monitor retrieval metrics continuously - Evaluate groundedness of generated responses - Handle edge cases (no results, low confidence) - Implement caching for common queries - Version control your knowledge base ## 10. Code Examples ### 10.1 Cosine Similarity (Python) ```python import numpy as np def cosine_similarity(vec_q: np.ndarray, vec_d: np.ndarray) -> float: """ Calculate cosine similarity between two vectors. $$\text{sim}_{\cos}(\vec{q}, \vec{d}) = \frac{\vec{q} \cdot \vec{d}}{||\vec{q}|| \cdot ||\vec{d}||}$$ """ dot_product = np.dot(vec_q, vec_d) norm_q = np.linalg.norm(vec_q) norm_d = np.linalg.norm(vec_d) return dot_product / (norm_q * norm_d) ``` ### 10.2 BM25 Implementation ```python import math from collections import Counter def bm25_score( query_terms: list[str], document: list[str], corpus: list[list[str]], k1: float = 1.5, b: float = 0.75 ) -> float: """ Calculate BM25 score for a query-document pair. """ doc_len = len(document) avg_doc_len = sum(len(d) for d in corpus) / len(corpus) doc_freq = Counter(document) N = len(corpus) score = 0.0 for term in query_terms: # Document frequency n_q = sum(1 for d in corpus if term in d) # IDF calculation idf = math.log((N - n_q + 0.5) / (n_q + 0.5) + 1) # Term frequency in document f_q = doc_freq.get(term, 0) # BM25 term score numerator = f_q * (k1 + 1) denominator = f_q + k1 * (1 - b + b * (doc_len / avg_doc_len)) score += idf * (numerator / denominator) return score ```
Combine multiple DQN improvements.
Elevated floor creating space underneath for utilities cabling and air return.
Raised source-drain structures elevate junctions above channel reducing parasitic resistance and capacitance.
Random failures occur unpredictably at constant rate during useful life period.
Approximate attention with random features.
Non-special boundary.
Random routing assigns tokens to experts stochastically.
Sample random hyperparameter combinations.
Use random attention patterns.
Provable robustness via smoothing.
Rare earth recovery extracts valuable lanthanides from electronic waste for reuse.
Rate limiting restricts request frequency preventing abuse and ensuring fair access.
Ray marching samples points along rays through scene accumulating color and opacity.
Responsible Business Alliance establishes labor rights and environmental standards for electronics supply chains.
Adjust sampling to handle imbalance.
Compute reachable output set.
Agent pattern where model alternates between reasoning steps and taking actions.
Suggest optimal reaction conditions.
Extract chemical reactions from text.
Predict products of chemical reactions.
Readout functions aggregate node features into graph-level representations for classification or regression tasks.
Choose appropriate reagents for synthesis.
Real-ESRGAN uses adversarial training for practical blind super-resolution.
End-to-end training of retriever and generator.
Recent examples influence more.
Combine attention with external memory.
Combine recurrence with attention.
Model dynamics with recurrent and stochastic components.
RNN/LSTM for temporal modeling.
Recursive forecasting uses one-step-ahead model iteratively feeding predictions as inputs for multi-step forecasts.
Supervise reward model with another model.
Recursive reward modeling trains reward models using other reward models hierarchically.
Adversarial testing by human experts.
Adversarial testing to find model vulnerabilities weaknesses or harmful behaviors.
Red-teaming probes models for harmful behaviors informing safety training.
Use reference image to guide.
Reference images guide generation by transferring style or content through adapter networks.
Find objects from descriptions.
Generate descriptions of objects.
Reflection enables agents to critique their own plans and outputs identifying improvements.
Agent learns from feedback and mistakes by generating reflections and improving.
Reformer uses locality-sensitive hashing for approximate attention matching.
Use LSH attention to reduce complexity from quadratic to linear.
Model declining to answer.
Balance safety and helpfulness.
Refusal training explicitly teaches models when to decline requests.