grouped convolution, model optimization
9,967 technical terms and definitions
Grouped convolutions partition the input channels into groups and process each group independently, which reduces the parameter count.
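To make the parameter saving concrete, here is a minimal sketch that counts weights for a 2D convolution with and without groups. The function name `conv2d_params` and the layer sizes (256 channels, 3x3 kernel, 8 groups) are illustrative assumptions, not values from this glossary.

```python
def conv2d_params(c_in, c_out, kernel_size, groups=1, bias=False):
    """Weight count for a 2D convolution with a square kernel."""
    if c_in % groups or c_out % groups:
        raise ValueError("channel counts must be divisible by groups")
    # Each group maps c_in/groups input channels to c_out/groups output channels.
    weights = groups * (c_in // groups) * (c_out // groups) * kernel_size ** 2
    return weights + (c_out if bias else 0)


# Illustrative layer: 256 -> 256 channels, 3x3 kernel.
standard = conv2d_params(256, 256, 3)           # groups=1
grouped = conv2d_params(256, 256, 3, groups=8)  # 8 independent groups
print(standard, grouped, standard // grouped)
```

With the channel counts fixed, the weight count falls by a factor equal to the number of groups.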
GQA (grouped-query attention) shares each key-value head across a group of query heads, which reduces KV cache size. Llama 2 70B uses GQA.
Middle ground between MQA and multi-head attention.
Share K/V within groups.
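The effect of sharing K/V within groups on cache size can be sketched as follows. The helper names and the model shape (80 layers, 64 query heads, head dimension 128, 2 bytes per value) are assumptions chosen for illustration; treat them as placeholders rather than a specification of any particular model.

```python
def kv_cache_bytes_per_token(n_layers, n_kv_heads, head_dim, bytes_per_value=2):
    # Factor 2: one key vector and one value vector per KV head per layer.
    return 2 * n_layers * n_kv_heads * head_dim * bytes_per_value


def kv_head_for_query_head(q_head, n_q_heads, n_kv_heads):
    """Index of the shared KV head that a given query head reads from."""
    group_size = n_q_heads // n_kv_heads
    return q_head // group_size


# Assumed shape: 80 layers, 64 query heads, head_dim 128, 16-bit values.
mha = kv_cache_bytes_per_token(80, n_kv_heads=64, head_dim=128)  # one KV head per query head
gqa = kv_cache_bytes_per_token(80, n_kv_heads=8, head_dim=128)   # 8 query heads share each KV head
mqa = kv_cache_bytes_per_token(80, n_kv_heads=1, head_dim=128)   # all query heads share one KV head
print(mha, gqa, mqa)
```

Under these assumptions the cache shrinks in proportion to the number of KV heads, which is why GQA sits between multi-head attention and MQA.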
Normalize within groups of channels.
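A minimal pure-Python sketch of normalizing within groups of channels, for a single sample. The function name `group_norm` and the list-of-lists layout are assumptions for illustration; the learnable per-channel scale and shift used in practice are omitted.

```python
import math


def group_norm(channels, num_groups, eps=1e-5):
    """channels: list of per-channel lists of floats for one sample."""
    if len(channels) % num_groups:
        raise ValueError("channel count must be divisible by num_groups")
    per_group = len(channels) // num_groups
    out = []
    for g in range(num_groups):
        group = channels[g * per_group:(g + 1) * per_group]
        # Statistics are computed over every value in the group's channels.
        values = [v for ch in group for v in ch]
        mean = sum(values) / len(values)
        var = sum((v - mean) ** 2 for v in values) / len(values)
        scale = 1.0 / math.sqrt(var + eps)
        out.extend([[(v - mean) * scale for v in ch] for ch in group])
    return out


x = [[1.0, 2.0], [3.0, 4.0], [10.0, 20.0], [30.0, 40.0]]
y = group_norm(x, num_groups=2)  # channels 0-1 and 2-3 are normalized separately
```

Because the statistics come from one sample only, the result does not depend on batch size.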
Quantum search algorithm.
Drive AI adoption: onboarding, education, quick wins. Measure engagement and retention. Iterate on UX.
gRPC is a high-performance RPC framework built on HTTP/2 and Protocol Buffers, with streaming support.
GRU4Rec applies gated recurrent units to session-based recommendation by modeling sequential user behavior within sessions.
Grounded compositional generalization.
Math word problems.
Grade School Math 8K (GSM8K) tests mathematical reasoning on word problems.
Grade school math word problems.
Graph Transformer Networks learn new graph structures through soft edge selection for heterogeneous graphs.
Guanaco is a QLoRA fine-tuned model. Efficient training on a single GPU.
Safety margin in specifications.
Measure latchup protection.
Structures to prevent latch-up.
Guardbanding sets test limits tighter than the specification limits to reduce test escapes and improve shipped quality.
Add margin to account for variation.
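As a sketch of how a guardband margin translates into test limits, the example below moves each limit inward from the specification window. The function names, the millivolt units, and the numeric values are all hypothetical.

```python
def guardbanded_limits(spec_low, spec_high, guardband):
    """Move each test limit inward from the spec limit by `guardband`."""
    test_low = spec_low + guardband
    test_high = spec_high - guardband
    if test_low >= test_high:
        raise ValueError("guardband consumes the whole specification window")
    return test_low, test_high


def passes(measurement, limits):
    low, high = limits
    return low <= measurement <= high


# Hypothetical spec: 1100 mV to 1300 mV, with a 20 mV guardband for variation.
limits = guardbanded_limits(spec_low=1100, spec_high=1300, guardband=20)
in_spec_but_rejected = passes(1290, limits)  # inside the spec, outside the test limits
comfortably_inside = passes(1200, limits)
```

A part measured at 1290 mV meets the specification but fails the tighter test limit. That is the trade guardbanding makes: some yield loss in exchange for fewer escapes.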
Framework for adding structure validation and safety to LLM outputs.
Guardrails constrain model behavior preventing specific undesired outputs.
Guardrails prevent unwanted model behavior. Topic restrictions, format requirements, safety filters.
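The three kinds of check mentioned above (topic restrictions, format requirements, safety filters) can be sketched as a post-generation validator. Everything here is hypothetical: the blocked-topic list, the `check_output` name, and the required JSON key are invented for illustration and do not correspond to the API of any guardrails library.

```python
import json

# Hypothetical topic restrictions; a real system would use something richer
# than substring matching.
BLOCKED_TOPICS = ("medical diagnosis", "legal advice")


def check_output(text, required_keys=("answer",)):
    """Return a list of guardrail violations; an empty list means pass."""
    violations = []
    lowered = text.lower()
    for topic in BLOCKED_TOPICS:
        if topic in lowered:
            violations.append(f"blocked topic: {topic}")
    # Format requirement: the output must be a JSON object with certain keys.
    try:
        payload = json.loads(text)
    except json.JSONDecodeError:
        violations.append("output is not valid JSON")
    else:
        if not isinstance(payload, dict):
            violations.append("output is not a JSON object")
        else:
            for key in required_keys:
                if key not in payload:
                    violations.append(f"missing key: {key}")
    return violations


print(check_output('{"answer": "42"}'))
print(check_output("plain text reply"))
```

A caller would typically retry, repair, or refuse when the returned list is non-empty.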
Strength of conditioning guidance.
Guidance scale controls the trade-off between prompt adherence and sample diversity in guided generation.
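One common formulation, classifier-free guidance, combines a conditional and an unconditional prediction; the sketch below assumes that formulation, which this entry does not name. The function name and the plain-list representation of the predictions are illustrative, and some implementations parameterize the scale differently.

```python
def apply_guidance(uncond, cond, guidance_scale):
    """Blend unconditional and conditional predictions element-wise.

    scale = 0 -> unconditional prediction (prompt ignored)
    scale = 1 -> conditional prediction
    scale > 1 -> extrapolates past the conditional prediction
    """
    return [u + guidance_scale * (c - u) for u, c in zip(uncond, cond)]


uncond = [0.0, 1.0]
cond = [1.0, 3.0]
for scale in (0.0, 1.0, 2.0):
    print(scale, apply_guidance(uncond, cond, scale))
```

Larger scales push samples harder toward the prompt, typically at the cost of diversity.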
Language for controlling generation from language models.
Guidance is a Microsoft library for controlling LLM output structure through constrained generation.
Guided backpropagation modifies gradient flow to visualize features detected by neural networks.
Modified backprop for visualization.
Leads bent outward and down.
Body contact on both sides.
H-tree topology creates balanced clock distribution through recursive H-shaped branching.
Symmetric tree structure for clock.
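The balancing property of the recursive H-shaped branching can be checked with a small geometric sketch. The function name, coordinates, and segment lengths are illustrative assumptions; real clock trees also have to account for buffers, loads, and wire parasitics.

```python
def h_tree_leaves(x, y, half_span, levels, horizontal=True, path=0.0):
    """Return (x, y, wire_length_from_root) for every leaf of an H-tree.

    Each level branches in two directions, alternating between horizontal
    and vertical segments. Two levels together draw one 'H'.
    """
    if levels == 0:
        return [(x, y, path)]
    leaves = []
    for sign in (-1, 1):
        nx = x + sign * half_span if horizontal else x
        ny = y if horizontal else y + sign * half_span
        # Halve the segment length after each vertical segment so that
        # successive H shapes shrink and do not overlap.
        next_span = half_span if horizontal else half_span / 2
        leaves += h_tree_leaves(nx, ny, next_span, levels - 1,
                                not horizontal, path + half_span)
    return leaves


leaves = h_tree_leaves(0.0, 0.0, half_span=4.0, levels=4)
lengths = {length for _, _, length in leaves}
print(len(leaves), lengths)
```

Every leaf is reached through the same sequence of segment lengths, so all root-to-leaf wire lengths are equal by construction.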
A100 and H100 are NVIDIA data center GPUs for AI. Large memory capacity, high-bandwidth HBM, and Tensor Cores make them well suited to large LLM training and high-throughput inference.
Keep most important KV pairs.
Hybrid SSM+attention architecture.
Half the pitch used as resolution metric.
Halide separates algorithm from schedule enabling portable high-performance image processing.
Determine carrier concentration and mobility.
Identify false statements.
Generating false information.
Hallucination is the generation of plausible-sounding but false or unsupported information.
Hallucination: the model states false information confidently. Mitigate with RAG, grounding, and citations; a lower sampling temperature reduces randomness but does not by itself prevent false output.
Halo implants are angled pocket implants placed near the source and drain edges of the channel to suppress drain-induced barrier lowering.
Angled implant to reduce short-channel effects.
Software complexity measures.
Extreme stress test to find failure modes.
Discovery vs screening.
Highly Accelerated Life Test (HALT) discovers design weaknesses through extreme combined stresses.
Hierarchical RL framework.