Guidance

Keywords: guidance,framework

Guidance is a constraint-based language model programming framework from Microsoft that gives developers precise control over LLM output structure by interleaving generation with templating. Developers define exact output formats with variables, conditionals, loops, and regex constraints that the model must follow during generation. This largely removes the need for format-repair post-processing and rules out structurally malformed output, although the content of each generated field can still be wrong.

What Is Guidance?

- Definition: A Python library that combines templating with constrained generation, letting developers interleave fixed text, LLM generation, and programmatic logic in a single program.
- Core Innovation: Generation happens within structural constraints — the model can only produce tokens that satisfy the specified format.
- Key Difference: Unlike prompt engineering (hoping for the right format), Guidance enforces format through constrained decoding.
- Creator: Microsoft Research, led by Scott Lundberg.
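
The core idea above (the model may only emit tokens that keep the output valid) can be sketched in a few lines of plain Python. This is a toy illustration, not Guidance's implementation or API: the vocabulary, the `toy_score` function, and the helper names are all assumptions made up for the example, and a real decoder would mask logits over the model's actual tokenizer.

```python
from typing import Callable, List


def allowed_next_tokens(prefix: str, choices: List[str], vocab: List[str]) -> List[str]:
    """Return the tokens that keep prefix + token a prefix of at least one choice."""
    return [
        tok for tok in vocab
        if any(choice.startswith(prefix + tok) for choice in choices)
    ]


def constrained_select(score: Callable[[str, str], float],
                       choices: List[str], vocab: List[str]) -> str:
    """Greedy decoding in which disallowed tokens are masked out at every step.

    Simplification: decoding stops as soon as the output equals a choice, so a
    choice that is a strict prefix of another choice always wins.
    """
    out = ""
    while out not in choices:
        candidates = allowed_next_tokens(out, choices, vocab)
        # Pick the highest-scoring token among those the constraint allows.
        out += max(candidates, key=lambda tok: score(out, tok))
    return out


# Hypothetical toy vocabulary and preferences standing in for a real model.
VOCAB = ["pos", "neg", "itive", "ative", "neu", "tral", "maybe", " "]
PREFERENCES = {"maybe": 3.0, "neu": 2.0, "tral": 2.0, "pos": 1.0}


def toy_score(prefix: str, tok: str) -> float:
    """The toy model likes 'maybe' best, which is not a valid label."""
    return PREFERENCES.get(tok, 0.0)


label = constrained_select(toy_score, ["positive", "negative", "neutral"], VOCAB)
print(label)  # prints: neutral
```

Unconstrained, the toy model would pick `maybe`; under the mask that token is never a candidate, so the result is always one of the three labels.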

Why Guidance Matters

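- Guaranteed Structure: When constraints are enforced during decoding, output matches the specified format, so parsing failures and format errors are ruled out.
- Reduced Hallucination: Structural constraints shrink the model's generation space, which removes format-level errors such as invented fields or invalid labels. They do not guarantee that the generated content is factually correct.
- Efficiency: Structured output comes from a single generation pass, with no retry loops or format-repair post-processing.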
- Interleaved Logic: Mix generation with Python code execution, conditionals, and loops within a single program.
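- Token Efficiency: The model generates only the variable content. Fixed template text is inserted directly into the context rather than decoded token by token, so it costs no generation steps (it still counts toward context length).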

Core Features

| Feature | Description | Benefit |
|---------|-------------|---------|
| Templates | Handlebars-style templates with generation blocks | Structured output |
| Select | Constrain output to specific choices | Guaranteed valid enum values |
| Regex | Match generation against regex patterns | Format enforcement |
| Gen | Free-form generation within constraints | Controlled creativity |
| If/For | Programmatic control flow | Dynamic output structure |
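
As a rough illustration of the Regex row, the sketch below enforces a fixed-length format one character at a time. It is a simplification under stated assumptions: the pattern is hand-expanded into one allowed character set per position (equivalent to `\d{4}-\d{2}-\d{2}`), and `toy_propose` is a hypothetical stand-in for a model's ranked next-character preferences. A real constrained decoder would track a full regex automaton over tokens instead.

```python
import re
import string
from typing import Callable, List, Set

DIGITS: Set[str] = set(string.digits)

# Hand-expanded equivalent of \d{4}-\d{2}-\d{2}: one allowed set per position.
DATE_PATTERN: List[Set[str]] = (
    [DIGITS] * 4 + [{"-"}] + [DIGITS] * 2 + [{"-"}] + [DIGITS] * 2
)

ALPHABET = string.digits + "-/ " + string.ascii_letters


def toy_propose(prefix: str) -> List[str]:
    """Hypothetical model that would like to write the date with slashes."""
    wish = "2024/03/05"
    first = wish[len(prefix)] if len(prefix) < len(wish) else "0"
    # Ranked list: the preferred character first, then everything else.
    return [first] + [ch for ch in ALPHABET if ch != first]


def generate_with_pattern(propose: Callable[[str], List[str]],
                          pattern: List[Set[str]]) -> str:
    """Build the output one character at a time, skipping disallowed proposals."""
    out = ""
    for allowed in pattern:
        ranked = propose(out)
        # Take the model's most preferred character that the pattern permits.
        out += next(ch for ch in ranked if ch in allowed)
    return out


date = generate_with_pattern(toy_propose, DATE_PATTERN)
print(date)  # prints: 2024-03-05
print(re.fullmatch(r"\d{4}-\d{2}-\d{2}", date) is not None)  # prints: True
```

The toy model prefers `/` as a separator, but that character is never allowed at positions 4 and 7, so the result still matches the pattern.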

How Guidance Works

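Programs are written as templates in which `{{gen}}` blocks mark where the model generates text, `{{select}}` blocks constrain choices, and Python logic controls flow. The model generates only tokens that satisfy all active constraints, producing correctly structured output in a single generation pass. The `{{...}}` block syntax comes from early releases; later releases express the same blocks as Python functions such as `gen()` and `select()` combined with ordinary strings.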
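
The interleaving described above can be sketched as a small interpreter over a list of segments. Everything here is an assumption for illustration: `Gen`, `Select`, `StubModel`, and `run_program` are hypothetical names, not Guidance classes, and the stub returns canned text where a real LLM call would go.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple, Union


@dataclass
class Gen:
    name: str   # variable name under which the generated text is captured
    stop: str   # generation ends before this string


@dataclass
class Select:
    name: str
    options: List[str]


class StubModel:
    """Hypothetical stand-in for an LLM with canned behaviour."""

    def complete(self, prompt: str) -> str:
        return "Battery failed quickly.\nIt also shipped late."

    def score(self, prompt: str, option: str) -> float:
        return 1.0 if option == "negative" else 0.0


def run_program(segments: List[Union[str, Gen, Select]],
                model: StubModel) -> Tuple[str, Dict[str, str]]:
    """Walk the segments in order, calling the model only for Gen and Select."""
    text, captured = "", {}
    for seg in segments:
        if isinstance(seg, str):
            text += seg  # fixed text is appended verbatim, never generated
            continue
        if isinstance(seg, Gen):
            # Cut the raw continuation at the stop string.
            value = model.complete(text).split(seg.stop, 1)[0]
        else:
            # Only the listed options can be chosen.
            value = max(seg.options, key=lambda opt: model.score(text, opt))
        captured[seg.name] = value
        text += value
    return text, captured


program = [
    "Review: The battery died after two days.\n",
    "Summary: ", Gen("summary", stop="\n"),
    "\nSentiment: ", Select("sentiment", ["positive", "negative", "neutral"]),
    "\n",
]

text, captured = run_program(program, StubModel())
print(captured)
```

The captured variables are available to ordinary Python code afterwards, which is what makes conditionals and loops over earlier generations possible.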

Example Patterns

- Structured Extraction: Force output into JSON with specific field types.
- Classification: Constrain output to valid class labels using `select`.
- Chain-of-Thought: Alternate between reasoning generation and structured answer extraction.
- Multi-Step: Use loops to generate lists of items with consistent formatting.
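
The structured-extraction pattern can be sketched as a fixed JSON skeleton whose values are the only generated parts. This is a toy under loud assumptions: `RAW` holds canned continuations in place of real model output, the `constrain_*` helpers are hypothetical, and truncating a finished string only mimics the effect of the token-level constraints a real constrained decoder would apply during generation.

```python
import json
from typing import Dict, List

# Hypothetical raw continuations, keyed by field (stand-in for real generation).
RAW: Dict[str, str] = {
    "name": 'Ada Lovelace", "extra": "oops',
    "age": "36 years old",
    "role": "mathematician and writer",
}
ROLES = ["engineer", "mathematician", "writer"]


def constrain_string(raw: str) -> str:
    """Stop at the first double quote so the value cannot close the string early.

    Simplification: backslashes and control characters are not handled.
    """
    return raw.split('"', 1)[0]


def constrain_int(raw: str) -> str:
    """Keep only the leading ASCII digits, normalised to a valid JSON integer."""
    digits = ""
    for ch in raw:
        if ch not in "0123456789":
            break
        digits += ch
    return str(int(digits)) if digits else "0"


def constrain_choice(raw: str, options: List[str]) -> str:
    """Return the option the continuation starts with, else the first option."""
    return next((opt for opt in options if raw.startswith(opt)), options[0])


def extract_person(raw: Dict[str, str]) -> str:
    # The skeleton (braces, keys, quotes) is written by code, not by the model.
    out = '{"name": "' + constrain_string(raw["name"])
    out += '", "age": ' + constrain_int(raw["age"])
    out += ', "role": "' + constrain_choice(raw["role"], ROLES) + '"}'
    return out


record = json.loads(extract_person(RAW))
print(record)  # prints: {'name': 'Ada Lovelace', 'age': 36, 'role': 'mathematician'}
```

Because the keys and punctuation never pass through the model, the attempt to smuggle in an `extra` field is cut off and the result parses with `json.loads`.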

Guidance offers precise control over LLM output structure: it replaces the unreliability of prompt-based formatting with structural compliance enforced through constrained decoding, which makes it well suited to applications where output format correctness is non-negotiable.
