
AI Factory Glossary

536 technical terms and definitions


r-gcn, r-gcn, graph neural networks

Relational Graph Convolutional Networks extend GCNs to multi-relational data by learning separate transformation matrices for each edge type.

r-squared, quality & reliability

R-squared measures proportion of response variation explained by model.
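As a minimal sketch of that definition, R-squared can be computed as one minus the ratio of the residual sum of squares to the total sum of squares. The function name `r_squared` and the plain-list inputs are illustrative assumptions, not part of the glossary.

```python
def r_squared(y_true: list[float], y_pred: list[float]) -> float:
    """Proportion of the variation in y_true explained by y_pred."""
    mean_y = sum(y_true) / len(y_true)
    # Residual sum of squares: variation the model fails to explain.
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    # Total sum of squares: variation of the response around its mean.
    ss_tot = sum((y - mean_y) ** 2 for y in y_true)
    if ss_tot == 0:
        raise ValueError("R-squared is undefined when the response is constant")
    return 1.0 - ss_res / ss_tot
```

A model that reproduces the response exactly scores 1.0, and a model that always predicts the mean scores 0.0.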

r&d fab,production

Fab dedicated to research and development.

r2r (run-to-run control),r2r,run-to-run control,process

Adjust process recipe between runs based on previous results.
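One common way to do this is an EWMA (exponentially weighted moving average) controller. The sketch below assumes a linear process model `y = offset + gain * recipe` with a known gain; the function name, parameter names, and the default weight of 0.3 are all illustrative assumptions.

```python
def ewma_r2r_step(
    target: float,
    gain: float,
    offset_est: float,
    recipe: float,
    measured: float,
    weight: float = 0.3,
) -> tuple[float, float]:
    """One run-to-run update for an assumed model y = offset + gain * recipe."""
    # Offset implied by the run that just finished.
    observed_offset = measured - gain * recipe
    # Blend it with the previous estimate so single-run noise is damped.
    new_offset = weight * observed_offset + (1.0 - weight) * offset_est
    # Choose the next recipe so the model predicts the target.
    next_recipe = (target - new_offset) / gain
    return next_recipe, new_offset
```

A larger `weight` reacts faster to drift but passes more measurement noise into the recipe.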

race, race, evaluation

Reading comprehension benchmark built from English exam questions.

race, race, evaluation

ReAding Comprehension from Examinations tests understanding of passages.

racial bias, evaluation

Racial bias shows as disparate treatment or stereotypes based on race or ethnicity.

rad-tts, rad-tts, audio & speech

RAD-TTS uses normalizing flows with learned prior for robust alignment-free text-to-speech.

radam, optimization

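Rectified Adam: Adam with a rectification term that corrects the high variance of the adaptive learning rate early in training, acting as an automatic warmup.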

rademacher complexity, advanced training

Rademacher complexity bounds generalization error by measuring how well a function class can fit random noise.

radial effects, manufacturing

Variation from center to edge.

radial pattern, yield enhancement

Radial patterns on wafer maps emanate from the center, suggesting spin-coating or thermal gradient issues.

radiation heat transfer, thermal management

Radiation heat transfer becomes significant at high temperatures contributing to cooling in open environments.

radiative recombination, device physics

Recombination emitting photons.

radiology report generation,healthcare ai

Describe medical images in text.

raft, raft, video understanding

Recurrent All-Pairs Field Transforms: optical flow estimation by iterative refinement.

rag evaluation frameworks, rag, evaluation

Systematic RAG assessment.

rag-sequence,rag

Single retrieval for entire sequence.

rag-token,rag

Retrieve for each generated token.

rag, retrieval, knowledge base, retrieval augmented generation, vector search, embeddings, semantic search, llm

# RAG, Retrieval, and Knowledge Bases

A comprehensive technical guide with mathematical foundations

## 1. Overview

**RAG (Retrieval-Augmented Generation)** is an architecture that enhances Large Language Models (LLMs) by grounding their responses in external knowledge sources.

### Core Components

- **Generator**: The LLM that produces the final response
- **Retriever**: The system that finds relevant documents
- **Knowledge Base**: The corpus of documents being searched

## 2. Mathematical Foundations

### 2.1 Vector Embeddings

Documents and queries are converted to dense vectors in $\mathbb{R}^d$ where $d$ is the embedding dimension (typically 384, 768, or 1536).

**Embedding Function:**

$$ E: \text{Text} \rightarrow \mathbb{R}^d $$

For a document $D$ and query $Q$:

$$ \vec{d} = E(D) \in \mathbb{R}^d $$

$$ \vec{q} = E(Q) \in \mathbb{R}^d $$

### 2.2 Similarity Metrics

#### Cosine Similarity

$$ \text{sim}_{\cos}(\vec{q}, \vec{d}) = \frac{\vec{q} \cdot \vec{d}}{||\vec{q}|| \cdot ||\vec{d}||} = \frac{\sum_{i=1}^{d} q_i \cdot d_i}{\sqrt{\sum_{i=1}^{d} q_i^2} \cdot \sqrt{\sum_{i=1}^{d} d_i^2}} $$

#### Euclidean Distance (L2)

$$ \text{dist}_{L2}(\vec{q}, \vec{d}) = ||\vec{q} - \vec{d}||_2 = \sqrt{\sum_{i=1}^{d} (q_i - d_i)^2} $$

#### Dot Product

$$ \text{sim}_{\text{dot}}(\vec{q}, \vec{d}) = \vec{q} \cdot \vec{d} = \sum_{i=1}^{d} q_i \cdot d_i $$

### 2.3 BM25 (Sparse Retrieval)

$$ \text{BM25}(Q, D) = \sum_{i=1}^{n} \text{IDF}(q_i) \cdot \frac{f(q_i, D) \cdot (k_1 + 1)}{f(q_i, D) + k_1 \cdot \left(1 - b + b \cdot \frac{|D|}{\text{avgdl}}\right)} $$

Where:

- $f(q_i, D)$ = frequency of term $q_i$ in document $D$
- $|D|$ = document length
- $\text{avgdl}$ = average document length in corpus
- $k_1$ = term frequency saturation parameter (typically 1.2–2.0)
- $b$ = length normalization parameter (typically 0.75)

**Inverse Document Frequency (IDF):**

$$ \text{IDF}(q_i) = \ln\left(\frac{N - n(q_i) + 0.5}{n(q_i) + 0.5} + 1\right) $$

Where:

- $N$ = total number of documents
- $n(q_i)$ = number of documents containing $q_i$

## 3. RAG Pipeline Architecture

### 3.1 Pipeline Stages

1. **Indexing Phase**
   - Document ingestion
   - Chunking strategy selection
   - Embedding generation
   - Vector storage
2. **Query Phase**
   - Query embedding: $\vec{q} = E(Q)$
   - Top-$k$ retrieval: $\mathcal{D}_k = \text{argmax}_{D \in \mathcal{C}}^k \text{sim}(\vec{q}, \vec{d})$
   - Context assembly
   - LLM generation

### 3.2 Retrieval Formula

Given a query $Q$ and corpus $\mathcal{C}$, retrieve top-$k$ documents:

$$ \mathcal{D}_k = \{D_1, D_2, ..., D_k\} \quad \text{where} \quad \text{sim}(Q, D_1) \geq \text{sim}(Q, D_2) \geq ... \geq \text{sim}(Q, D_k) $$

### 3.3 Generation with Context

$$ P(\text{Response} | Q, \mathcal{D}_k) = \text{LLM}(Q \oplus \mathcal{D}_k) $$

Where $\oplus$ denotes context concatenation.

## 4. Chunking Strategies

### 4.1 Fixed-Size Chunking

- **Chunk size**: $c$ tokens (typically 256–1024)
- **Overlap**: $o$ tokens (typically 10–20% of $c$)

$$ \text{Number of chunks} = \left\lceil \frac{|D| - o}{c - o} \right\rceil $$

### 4.2 Semantic Chunking

- Split by semantic boundaries (paragraphs, sections)
- Use sentence embeddings to detect topic shifts
- Threshold: $\theta$ for similarity drop detection

$$ \text{Split at } i \quad \text{if} \quad \text{sim}(s_i, s_{i+1}) < \theta $$

### 4.3 Recursive Chunking

- Hierarchical splitting: Document → Sections → Paragraphs → Sentences
- Maintains context hierarchy

## 5. Knowledge Base Design

### 5.1 Metadata Schema

```json
{
  "chunk_id": "string",
  "document_id": "string",
  "content": "string",
  "embedding": "vector[d]",
  "metadata": {
    "source": "string",
    "title": "string",
    "author": "string",
    "date_created": "ISO8601",
    "date_modified": "ISO8601",
    "section": "string",
    "page_number": "integer",
    "chunk_index": "integer",
    "total_chunks": "integer",
    "tags": ["string"],
    "confidence_score": "float"
  }
}
```

### 5.2 Index Types

- **Flat Index**: Exact search, $O(n)$ complexity
- **IVF (Inverted File)**: Approximate, $O(\sqrt{n})$ complexity
- **HNSW (Hierarchical Navigable Small World)**: Graph-based, $O(\log n)$ complexity

**HNSW Search Complexity:**

$$ O(d \cdot \log n) $$

Where $d$ is embedding dimension and $n$ is corpus size.

## 6. Evaluation Metrics

### 6.1 Retrieval Metrics

#### Recall@k

$$ \text{Recall@}k = \frac{|\text{Relevant} \cap \text{Retrieved@}k|}{|\text{Relevant}|} $$

#### Precision@k

$$ \text{Precision@}k = \frac{|\text{Relevant} \cap \text{Retrieved@}k|}{k} $$

#### Mean Reciprocal Rank (MRR)

$$ \text{MRR} = \frac{1}{|Q|} \sum_{i=1}^{|Q|} \frac{1}{\text{rank}_i} $$

#### Normalized Discounted Cumulative Gain (NDCG)

$$ \text{DCG@}k = \sum_{i=1}^{k} \frac{2^{\text{rel}_i} - 1}{\log_2(i + 1)} $$

$$ \text{NDCG@}k = \frac{\text{DCG@}k}{\text{IDCG@}k} $$

### 6.2 Generation Metrics

- **Faithfulness**: Is response grounded in retrieved context?
- **Relevance**: Does response answer the query?
- **Groundedness Score**:

$$ G = \frac{|\text{Claims supported by context}|}{|\text{Total claims}|} $$

## 7. Advanced Techniques

### 7.1 Hybrid Search

Combine dense and sparse retrieval:

$$ \text{score}_{\text{hybrid}} = \alpha \cdot \text{score}_{\text{dense}} + (1 - \alpha) \cdot \text{score}_{\text{sparse}} $$

Where $\alpha \in [0, 1]$ is the weighting parameter.

### 7.2 Reranking

Apply cross-encoder reranking to top-$k$ results:

$$ \text{score}_{\text{rerank}}(Q, D) = \text{CrossEncoder}(Q, D) $$

Cross-encoder complexity: $O(k \cdot |Q| \cdot |D|)$

### 7.3 Query Expansion

- **HyDE (Hypothetical Document Embeddings)**:

$$ \vec{q}_{\text{HyDE}} = E(\text{LLM}(Q)) $$

- **Multi-Query Retrieval**:

$$ \mathcal{D}_{\text{merged}} = \bigcup_{i=1}^{m} \text{Retrieve}(Q_i) $$

### 7.4 Contextual Compression

Reduce retrieved context before generation:

$$ C_{\text{compressed}} = \text{Compress}(\mathcal{D}_k, Q) $$

## 8. Vector Database Options

| Database | Index Types | Hosting | Scalability |
|----------|-------------|---------|-------------|
| Pinecone | HNSW, IVF | Cloud | High |
| Weaviate | HNSW | Self/Cloud | High |
| Qdrant | HNSW | Self/Cloud | High |
| Milvus | IVF, HNSW | Self/Cloud | Very High |
| FAISS | Flat, IVF, HNSW | Self | Medium |
| Chroma | HNSW | Self | Low-Medium |
| pgvector | IVFFlat, HNSW | Self | Medium |

## 9. Best Practices Checklist

- Choose appropriate chunk size based on content type
- Implement chunk overlap to preserve context
- Store rich metadata for filtering
- Use hybrid search for better recall
- Implement reranking for precision
- Monitor retrieval metrics continuously
- Evaluate groundedness of generated responses
- Handle edge cases (no results, low confidence)
- Implement caching for common queries
- Version control your knowledge base

## 10. Code Examples

### 10.1 Cosine Similarity (Python)

```python
import numpy as np

def cosine_similarity(vec_q: np.ndarray, vec_d: np.ndarray) -> float:
    r"""
    Calculate cosine similarity between two vectors.

    $$\text{sim}_{\cos}(\vec{q}, \vec{d}) = \frac{\vec{q} \cdot \vec{d}}{||\vec{q}|| \cdot ||\vec{d}||}$$
    """
    # The raw docstring (r"...") keeps the LaTeX backslashes from being read as escapes.
    dot_product = np.dot(vec_q, vec_d)
    norm_q = np.linalg.norm(vec_q)
    norm_d = np.linalg.norm(vec_d)
    return dot_product / (norm_q * norm_d)
```

### 10.2 BM25 Implementation

```python
import math
from collections import Counter

def bm25_score(
    query_terms: list[str],
    document: list[str],
    corpus: list[list[str]],
    k1: float = 1.5,
    b: float = 0.75
) -> float:
    """
    Calculate BM25 score for a query-document pair.
    """
    doc_len = len(document)
    avg_doc_len = sum(len(d) for d in corpus) / len(corpus)
    doc_freq = Counter(document)
    N = len(corpus)

    score = 0.0
    for term in query_terms:
        # Document frequency
        n_q = sum(1 for d in corpus if term in d)
        # IDF calculation
        idf = math.log((N - n_q + 0.5) / (n_q + 0.5) + 1)
        # Term frequency in document
        f_q = doc_freq.get(term, 0)
        # BM25 term score
        numerator = f_q * (k1 + 1)
        denominator = f_q + k1 * (1 - b + b * (doc_len / avg_doc_len))
        score += idf * (numerator / denominator)
    return score
```
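
The retrieval metrics from section 6.1 can be sketched in the same style. This is an illustrative sketch, not a reference implementation: the function names and the use of string document IDs are assumptions, and `ndcg_at_k` takes the ideal ordering over the graded labels of the ranked list it is given, which is a simplification.

```python
import math

def recall_at_k(relevant: set[str], retrieved: list[str], k: int) -> float:
    """Share of the relevant documents that appear in the top-k results."""
    if not relevant:
        return 0.0
    return len(relevant & set(retrieved[:k])) / len(relevant)

def precision_at_k(relevant: set[str], retrieved: list[str], k: int) -> float:
    """Share of the top-k results that are relevant."""
    if k <= 0:
        return 0.0
    return len(relevant & set(retrieved[:k])) / k

def mean_reciprocal_rank(
    relevant_per_query: list[set[str]],
    retrieved_per_query: list[list[str]],
) -> float:
    """Average of 1/rank of the first relevant result (0 if none is found)."""
    total = 0.0
    for relevant, retrieved in zip(relevant_per_query, retrieved_per_query):
        for rank, doc_id in enumerate(retrieved, start=1):
            if doc_id in relevant:
                total += 1.0 / rank
                break
    return total / len(relevant_per_query)

def ndcg_at_k(relevances: list[float], k: int) -> float:
    """NDCG@k for graded relevance labels given in ranked order."""
    def dcg(rels: list[float]) -> float:
        return sum(
            (2 ** rel - 1) / math.log2(i + 1)
            for i, rel in enumerate(rels[:k], start=1)
        )
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0
```

For example, with relevant documents `{"a", "b", "c"}` and a ranked list `["a", "x", "b", "y"]`, two of the top three results are relevant, so both Recall@3 and Precision@3 equal 2/3.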

ragas, ragas, evaluation

Framework for evaluating RAG.

ragas, ragas, rag

RAGAS provides evaluation framework for retrieval augmented generation systems.

ragas,rag,evaluation

RAGAS evaluates RAG systems with metrics for faithfulness, answer relevancy, and context quality.

rainbow dqn, reinforcement learning

Combine multiple DQN improvements.

raised floor,facility

Elevated floor creating space underneath for utilities, cabling, and air return.

raised source-drain, process integration

Raised source-drain structures elevate junctions above channel reducing parasitic resistance and capacitance.

raman mapping, metrology

Spatial stress or composition mapping.

raman spectroscopy,metrology

Analyze molecular vibrations and stress.

ramp rate, packaging

Heating/cooling speed.

ramp to volume, production

Increase production from pilot to full scale.

ramp,production

Increase production volume of new process or product.

randaugment for vit, computer vision

Random augmentation policies.

randaugment, data augmentation

Random augmentation policy.

randaugment,simple,augment

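RandAugment is a simple augmentation policy that needs no learned search. It has two hyperparameters: the number of operations applied per image and a shared magnitude.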
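A minimal sketch of the idea, assuming the two hyperparameters are the number of operations per image (`n_ops`) and a shared magnitude (`magnitude`, normalized here to [0, 1]). The toy image type and the four operations are hypothetical stand-ins for real image transforms.

```python
import random
from typing import Callable, List

Image = List[float]  # toy stand-in for an image: pixel intensities in [0, 1]

def _clamp(value: float) -> float:
    return max(0.0, min(1.0, value))

def identity(img: Image, magnitude: float) -> Image:
    return list(img)

def brighten(img: Image, magnitude: float) -> Image:
    return [_clamp(p + 0.5 * magnitude) for p in img]

def contrast(img: Image, magnitude: float) -> Image:
    mean = sum(img) / len(img)
    return [_clamp(mean + (1.0 + magnitude) * (p - mean)) for p in img]

def invert(img: Image, magnitude: float) -> Image:
    return [1.0 - p for p in img]  # ignores magnitude, like some real ops

OPS: List[Callable[[Image, float], Image]] = [identity, brighten, contrast, invert]

def rand_augment(img: Image, n_ops: int, magnitude: float, rng: random.Random) -> Image:
    """Apply n_ops uniformly sampled operations, all at the same magnitude."""
    for op in rng.choices(OPS, k=n_ops):
        img = op(img, magnitude)
    return list(img)
```

Because every operation shares one magnitude, tuning reduces to a small grid over `n_ops` and `magnitude`.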

random defect distribution, manufacturing operations

Random defect distributions show no spatial correlation indicating particle contamination.

random defects,metrology

Unpredictable particle-caused defects.

random dopant fluctuation (rdf),random dopant fluctuation,rdf,manufacturing

Statistical variation in number and position of dopant atoms.

random dopant fluctuations, device physics

Statistical variation in dopant count.

random erasing in vit, computer vision

Mask random patches.

random erasing, data augmentation

Masks a randomly chosen rectangle of the input image during training, similar to Cutout.

random failure, business & standards

Random failures occur unpredictably at constant rate during useful life period.

random feature attention,llm architecture

Approximate attention with random features.

random forest for yield prediction, data analysis

Ensemble method for predicting yield.

random grain boundary, defects

Non-special boundary.

random jitter, signal & power integrity

Random jitter, caused by thermal noise, follows a Gaussian distribution and is therefore unbounded.

random matrix theory, theory

Analyze neural networks using RMT.

random network distillation, rnd, reinforcement learning

Exploration bonus from prediction error.
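The mechanism can be sketched with a toy example: a fixed, randomly initialized target network and a predictor trained to imitate it on visited states, with the squared prediction error used as the bonus. The class name `LinearRND`, the linear "networks", and the learning rate are illustrative assumptions; practical implementations use neural networks.

```python
import random

class LinearRND:
    """Toy random network distillation with linear maps (illustrative only)."""

    def __init__(self, obs_dim: int, feat_dim: int = 8, lr: float = 0.05, seed: int = 0):
        rng = random.Random(seed)
        # Fixed random target: never trained.
        self.target = [[rng.gauss(0, 1) for _ in range(obs_dim)] for _ in range(feat_dim)]
        # Predictor: trained to match the target on states the agent visits.
        self.predictor = [[rng.gauss(0, 1) for _ in range(obs_dim)] for _ in range(feat_dim)]
        self.lr = lr

    def _errors(self, obs: list[float]) -> list[float]:
        return [
            sum((p - t) * x for p, t, x in zip(p_row, t_row, obs))
            for p_row, t_row in zip(self.predictor, self.target)
        ]

    def bonus(self, obs: list[float]) -> float:
        """Exploration bonus: mean squared prediction error on this state."""
        errors = self._errors(obs)
        return sum(e * e for e in errors) / len(errors)

    def update(self, obs: list[float]) -> None:
        """One gradient step on the mean squared error for a visited state."""
        errors = self._errors(obs)
        scale = 2.0 * self.lr / len(errors)
        for row, err in zip(self.predictor, errors):
            for i, x in enumerate(obs):
                row[i] -= scale * err * x
```

States visited often drive the predictor error toward zero, so their bonus fades while unfamiliar states keep a high bonus.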

random routing, llm architecture

Random routing assigns tokens to experts stochastically.

random routing, moe

Add noise to routing decisions.
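A minimal sketch of noisy routing, assuming Gaussian noise with a fixed standard deviation is added to the router logits before the top-k experts are selected. The function name and parameters are hypothetical; some mixture-of-experts designs learn the noise scale instead.

```python
import random

def noisy_top_k(
    logits: list[float],
    k: int,
    noise_std: float,
    rng: random.Random,
) -> list[int]:
    """Return the indices of the k experts with the highest noisy logits."""
    # Perturb each expert's score so lower-scoring experts are sometimes chosen.
    noisy = [logit + rng.gauss(0.0, noise_std) for logit in logits]
    ranked = sorted(range(len(noisy)), key=lambda i: noisy[i], reverse=True)
    return ranked[:k]
```

With `noise_std=0.0` the routing is deterministic; increasing it spreads tokens across more experts.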

random sampling, quality & reliability

Random sampling selects units with equal probability ensuring unbiased representation.