R-GCN, graph neural networks
Relational Graph Convolutional Networks extend GCNs to multi-relational data by learning separate transformation matrices for each edge type.
R-squared measures proportion of response variation explained by model.
Fab dedicated to research and development.
Adjust process recipe between runs based on previous results.
Educational exam questions.
ReAding Comprehension from Examinations tests understanding of passages.
Racial bias shows as disparate treatment or stereotypes based on race or ethnicity.
RAD-TTS uses normalizing flows with learned prior for robust alignment-free text-to-speech.
Adam with variance warmup.
Rademacher complexity bounds generalization error by measuring how well a function class can fit random noise.
Variation from center to edge.
Radial patterns on wafer maps emanate from center suggesting spin-coating or thermal gradient issues.
Radiation heat transfer becomes significant at high temperatures, contributing to cooling in open environments.
Recombination emitting photons.
Describe medical images in text.
Iterative refinement for flow.
Systematic RAG assessment.
Single retrieval for entire sequence.
Retrieve for each generated token.
# RAG, Retrieval, and Knowledge Bases

A comprehensive technical guide with mathematical foundations

## 1. Overview

**RAG (Retrieval-Augmented Generation)** is an architecture that enhances Large Language Models (LLMs) by grounding their responses in external knowledge sources.

### Core Components

- **Generator**: The LLM that produces the final response
- **Retriever**: The system that finds relevant documents
- **Knowledge Base**: The corpus of documents being searched

## 2. Mathematical Foundations

### 2.1 Vector Embeddings

Documents and queries are converted to dense vectors in $\mathbb{R}^d$ where $d$ is the embedding dimension (typically 384, 768, or 1536).

**Embedding Function:**

$$
E: \text{Text} \rightarrow \mathbb{R}^d
$$

For a document $D$ and query $Q$:

$$
\vec{d} = E(D) \in \mathbb{R}^d
$$

$$
\vec{q} = E(Q) \in \mathbb{R}^d
$$

### 2.2 Similarity Metrics

#### Cosine Similarity

$$
\text{sim}_{\cos}(\vec{q}, \vec{d}) = \frac{\vec{q} \cdot \vec{d}}{||\vec{q}|| \cdot ||\vec{d}||} = \frac{\sum_{i=1}^{d} q_i \cdot d_i}{\sqrt{\sum_{i=1}^{d} q_i^2} \cdot \sqrt{\sum_{i=1}^{d} d_i^2}}
$$

#### Euclidean Distance (L2)

$$
\text{dist}_{L2}(\vec{q}, \vec{d}) = ||\vec{q} - \vec{d}||_2 = \sqrt{\sum_{i=1}^{d} (q_i - d_i)^2}
$$

#### Dot Product

$$
\text{sim}_{\text{dot}}(\vec{q}, \vec{d}) = \vec{q} \cdot \vec{d} = \sum_{i=1}^{d} q_i \cdot d_i
$$

### 2.3 BM25 (Sparse Retrieval)

$$
\text{BM25}(Q, D) = \sum_{i=1}^{n} \text{IDF}(q_i) \cdot \frac{f(q_i, D) \cdot (k_1 + 1)}{f(q_i, D) + k_1 \cdot \left(1 - b + b \cdot \frac{|D|}{\text{avgdl}}\right)}
$$

Where:

- $f(q_i, D)$ = frequency of term $q_i$ in document $D$
- $|D|$ = document length
- $\text{avgdl}$ = average document length in corpus
- $k_1$ = term frequency saturation parameter (typically 1.2–2.0)
- $b$ = length normalization parameter (typically 0.75)

**Inverse Document Frequency (IDF):**

$$
\text{IDF}(q_i) = \ln\left(\frac{N - n(q_i) + 0.5}{n(q_i) + 0.5} + 1\right)
$$

Where:

- $N$ = total number of documents
- $n(q_i)$ = number of documents containing $q_i$

## 3. RAG Pipeline Architecture

### 3.1 Pipeline Stages

1. **Indexing Phase**
   - Document ingestion
   - Chunking strategy selection
   - Embedding generation
   - Vector storage
2. **Query Phase**
   - Query embedding: $\vec{q} = E(Q)$
   - Top-$k$ retrieval: $\mathcal{D}_k = \text{argmax}_{D \in \mathcal{C}}^k \text{sim}(\vec{q}, \vec{d})$
   - Context assembly
   - LLM generation

### 3.2 Retrieval Formula

Given a query $Q$ and corpus $\mathcal{C}$, retrieve top-$k$ documents:

$$
\mathcal{D}_k = \{D_1, D_2, ..., D_k\} \quad \text{where} \quad \text{sim}(Q, D_1) \geq \text{sim}(Q, D_2) \geq ... \geq \text{sim}(Q, D_k)
$$

### 3.3 Generation with Context

$$
P(\text{Response} | Q, \mathcal{D}_k) = \text{LLM}(Q \oplus \mathcal{D}_k)
$$

Where $\oplus$ denotes context concatenation.

## 4. Chunking Strategies

### 4.1 Fixed-Size Chunking

- **Chunk size**: $c$ tokens (typically 256–1024)
- **Overlap**: $o$ tokens (typically 10–20% of $c$)

$$
\text{Number of chunks} = \left\lceil \frac{|D| - o}{c - o} \right\rceil
$$

### 4.2 Semantic Chunking

- Split by semantic boundaries (paragraphs, sections)
- Use sentence embeddings to detect topic shifts
- Threshold: $\theta$ for similarity drop detection

$$
\text{Split at } i \quad \text{if} \quad \text{sim}(s_i, s_{i+1}) < \theta
$$

### 4.3 Recursive Chunking

- Hierarchical splitting: Document → Sections → Paragraphs → Sentences
- Maintains context hierarchy
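A minimal sketch of the fixed-size strategy in §4.1, assuming whitespace-split tokens as a stand-in for a real tokenizer; the function name `chunk_tokens` is illustrative, not a specific library API:

```python
def chunk_tokens(tokens: list[str], chunk_size: int = 512, overlap: int = 64) -> list[list[str]]:
    """Split a token list into fixed-size chunks with overlap (c = chunk_size, o = overlap)."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    chunks = []
    step = chunk_size - overlap  # stride between consecutive chunk starts
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # the last chunk already reached the end of the document
    return chunks

# Example: 10 tokens, c=4, o=1 -> ceil((10 - 1) / (4 - 1)) = 3 chunks
print(len(chunk_tokens([f"t{i}" for i in range(10)], chunk_size=4, overlap=1)))
```

In practice the token count usually comes from the embedding model's own tokenizer rather than whitespace splitting, but the sliding-window logic is the same.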
## 5. Knowledge Base Design

### 5.1 Metadata Schema

```json
{
  "chunk_id": "string",
  "document_id": "string",
  "content": "string",
  "embedding": "vector[d]",
  "metadata": {
    "source": "string",
    "title": "string",
    "author": "string",
    "date_created": "ISO8601",
    "date_modified": "ISO8601",
    "section": "string",
    "page_number": "integer",
    "chunk_index": "integer",
    "total_chunks": "integer",
    "tags": ["string"],
    "confidence_score": "float"
  }
}
```

### 5.2 Index Types

- **Flat Index**: Exact search, $O(n)$ complexity
- **IVF (Inverted File)**: Approximate, $O(\sqrt{n})$ complexity
- **HNSW (Hierarchical Navigable Small World)**: Graph-based, $O(\log n)$ complexity

**HNSW Search Complexity:**

$$
O(d \cdot \log n)
$$

Where $d$ is embedding dimension and $n$ is corpus size.

## 6. Evaluation Metrics

### 6.1 Retrieval Metrics

#### Recall@k

$$
\text{Recall@}k = \frac{|\text{Relevant} \cap \text{Retrieved@}k|}{|\text{Relevant}|}
$$

#### Precision@k

$$
\text{Precision@}k = \frac{|\text{Relevant} \cap \text{Retrieved@}k|}{k}
$$

#### Mean Reciprocal Rank (MRR)

$$
\text{MRR} = \frac{1}{|Q|} \sum_{i=1}^{|Q|} \frac{1}{\text{rank}_i}
$$

#### Normalized Discounted Cumulative Gain (NDCG)

$$
\text{DCG@}k = \sum_{i=1}^{k} \frac{2^{\text{rel}_i} - 1}{\log_2(i + 1)}
$$

$$
\text{NDCG@}k = \frac{\text{DCG@}k}{\text{IDCG@}k}
$$

### 6.2 Generation Metrics

- **Faithfulness**: Is response grounded in retrieved context?
- **Relevance**: Does response answer the query?
- **Groundedness Score**:

$$
G = \frac{|\text{Claims supported by context}|}{|\text{Total claims}|}
$$

## 7. Advanced Techniques

### 7.1 Hybrid Search

Combine dense and sparse retrieval (a short code sketch follows after §7.4):

$$
\text{score}_{\text{hybrid}} = \alpha \cdot \text{score}_{\text{dense}} + (1 - \alpha) \cdot \text{score}_{\text{sparse}}
$$

Where $\alpha \in [0, 1]$ is the weighting parameter.

### 7.2 Reranking

Apply cross-encoder reranking to top-$k$ results:

$$
\text{score}_{\text{rerank}}(Q, D) = \text{CrossEncoder}(Q, D)
$$

Cross-encoder complexity: $O(k \cdot |Q| \cdot |D|)$

### 7.3 Query Expansion

- **HyDE (Hypothetical Document Embeddings)**:

$$
\vec{q}_{\text{HyDE}} = E(\text{LLM}(Q))
$$

- **Multi-Query Retrieval**:

$$
\mathcal{D}_{\text{merged}} = \bigcup_{i=1}^{m} \text{Retrieve}(Q_i)
$$

### 7.4 Contextual Compression

Reduce retrieved context before generation:

$$
C_{\text{compressed}} = \text{Compress}(\mathcal{D}_k, Q)
$$
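A minimal sketch of the hybrid scoring in §7.1, assuming the dense and sparse scores have already been normalized to comparable ranges; the function name `hybrid_scores` and the score-dictionary inputs are an illustrative convention, not a specific library interface:

```python
def hybrid_scores(
    dense: dict[str, float],
    sparse: dict[str, float],
    alpha: float = 0.5,
) -> dict[str, float]:
    """Blend per-document scores: alpha * dense + (1 - alpha) * sparse.

    Assumes both score sets are already normalized to a comparable scale.
    """
    doc_ids = set(dense) | set(sparse)  # union of both candidate sets
    return {
        doc_id: alpha * dense.get(doc_id, 0.0) + (1 - alpha) * sparse.get(doc_id, 0.0)
        for doc_id in doc_ids
    }

# Example: rank the union of both candidate sets by the blended score
ranked = sorted(
    hybrid_scores({"d1": 0.9, "d2": 0.4}, {"d2": 0.8, "d3": 0.6}, alpha=0.7).items(),
    key=lambda kv: kv[1],
    reverse=True,
)
print(ranked)  # d1 leads on its dense score, then d2, which appears in both lists
```

Raw BM25 and cosine scores live on different scales, so in practice each list is often min-max normalized (or fused by reciprocal rank) before blending.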
## 8. Vector Database Options

| Database | Index Types | Hosting | Scalability |
|----------|-------------|---------|-------------|
| Pinecone | HNSW, IVF | Cloud | High |
| Weaviate | HNSW | Self/Cloud | High |
| Qdrant | HNSW | Self/Cloud | High |
| Milvus | IVF, HNSW | Self/Cloud | Very High |
| FAISS | Flat, IVF, HNSW | Self | Medium |
| Chroma | HNSW | Self | Low-Medium |
| pgvector | IVFFlat, HNSW | Self | Medium |

## 9. Best Practices Checklist

- Choose appropriate chunk size based on content type
- Implement chunk overlap to preserve context
- Store rich metadata for filtering
- Use hybrid search for better recall
- Implement reranking for precision
- Monitor retrieval metrics continuously
- Evaluate groundedness of generated responses
- Handle edge cases (no results, low confidence)
- Implement caching for common queries
- Version control your knowledge base

## 10. Code Examples

### 10.1 Cosine Similarity (Python)

```python
import numpy as np

def cosine_similarity(vec_q: np.ndarray, vec_d: np.ndarray) -> float:
    """
    Calculate cosine similarity between two vectors.

    $$\text{sim}_{\cos}(\vec{q}, \vec{d}) = \frac{\vec{q} \cdot \vec{d}}{||\vec{q}|| \cdot ||\vec{d}||}$$
    """
    dot_product = np.dot(vec_q, vec_d)
    norm_q = np.linalg.norm(vec_q)
    norm_d = np.linalg.norm(vec_d)
    return dot_product / (norm_q * norm_d)
```

### 10.2 BM25 Implementation

```python
import math
from collections import Counter

def bm25_score(
    query_terms: list[str],
    document: list[str],
    corpus: list[list[str]],
    k1: float = 1.5,
    b: float = 0.75,
) -> float:
    """
    Calculate BM25 score for a query-document pair.
    """
    doc_len = len(document)
    avg_doc_len = sum(len(d) for d in corpus) / len(corpus)
    doc_freq = Counter(document)
    N = len(corpus)

    score = 0.0
    for term in query_terms:
        # Document frequency
        n_q = sum(1 for d in corpus if term in d)
        # IDF calculation
        idf = math.log((N - n_q + 0.5) / (n_q + 0.5) + 1)
        # Term frequency in document
        f_q = doc_freq.get(term, 0)
        # BM25 term score
        numerator = f_q * (k1 + 1)
        denominator = f_q + k1 * (1 - b + b * (doc_len / avg_doc_len))
        score += idf * (numerator / denominator)

    return score
```
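### 10.3 Recall@k and MRR (Python)

A minimal sketch of the retrieval metrics from §6.1, computed from ranked document IDs; the function names and list-based inputs are illustrative, not a specific library interface.

```python
def recall_at_k(retrieved: list[str], relevant: set[str], k: int) -> float:
    """Recall@k = |relevant ∩ retrieved@k| / |relevant|."""
    if not relevant:
        return 0.0
    hits = len(set(retrieved[:k]) & relevant)
    return hits / len(relevant)

def mean_reciprocal_rank(ranked_lists: list[list[str]], relevant_sets: list[set[str]]) -> float:
    """MRR = mean over queries of 1 / rank of the first relevant document (0 if none retrieved)."""
    reciprocal_ranks = []
    for retrieved, relevant in zip(ranked_lists, relevant_sets):
        rr = 0.0
        for rank, doc_id in enumerate(retrieved, start=1):
            if doc_id in relevant:
                rr = 1.0 / rank
                break
        reciprocal_ranks.append(rr)
    return sum(reciprocal_ranks) / len(reciprocal_ranks)

# Example: one query where the first relevant document appears at rank 2
print(recall_at_k(["d3", "d1", "d7"], {"d1", "d9"}, k=3))          # 0.5
print(mean_reciprocal_rank([["d3", "d1", "d7"]], [{"d1", "d9"}]))  # 0.5
```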
Framework for evaluating RAG.
RAGAS provides an evaluation framework for retrieval-augmented generation systems.
RAGAS evaluates RAG systems. Faithfulness, relevancy, context metrics.
Combine multiple DQN improvements.
Elevated floor creating space underneath for utilities, cabling, and air return.
Raised source-drain structures elevate junctions above channel reducing parasitic resistance and capacitance.
Spatial stress or composition mapping.
Analyze molecular vibrations and stress.
Heating/cooling speed.
Increase production from pilot to full scale.
Increase production volume of new process or product.
Random augmentation policies.
Random augmentation policy.
RandAugment is a simplified automated augmentation policy with two hyperparameters.
Random defect distributions show no spatial correlation indicating particle contamination.
Unpredictable particle-caused defects.
Statistical variation in number and position of dopant atoms.
Statistical variation in dopant count.
Mask random patches.
Similar to CutOut for training.
Random failures occur unpredictably at constant rate during useful life period.
Approximate attention with random features.
Ensemble method for predicting yield.
Non-special boundary.
Random jitter follows a Gaussian distribution arising from thermal noise and is unbounded in amplitude.
Analyze neural networks using RMT.
Exploration bonus from prediction error.
Random routing assigns tokens to experts stochastically.
Add noise to routing decisions.
Random sampling selects units with equal probability ensuring unbiased representation.