SAT solving

SAT solving is the problem of determining whether a boolean formula can be satisfied — finding an assignment of true/false values to variables that makes the formula true, or proving that no such assignment exists, serving as the foundation for many automated reasoning and verification tasks.

What Is SAT?

- Boolean Formula: Logical expression with variables, AND (∧), OR (∨), NOT (¬).
- Example: (x ∨ y) ∧ (¬x ∨ z) ∧ (¬y ∨ ¬z)

- Satisfiability: Can we assign true/false to variables to make the formula true?

- SAT: Formula is satisfiable — there exists a satisfying assignment.

- UNSAT: Formula is unsatisfiable — no assignment makes it true.

Why SAT Solving?

- Fundamental Problem: SAT is the first problem proven NP-complete — many problems reduce to SAT.
- Practical Importance: Despite NP-completeness, modern SAT solvers are remarkably efficient on real-world instances.
- Versatility: SAT solving is used in verification, testing, planning, scheduling, and more.

CNF (Conjunctive Normal Form)

- Standard Form: Formula is AND of clauses, each clause is OR of literals.
- Clause: (x ∨ ¬y ∨ z)
- CNF: (x ∨ y) ∧ (¬x ∨ z) ∧ (¬y ∨ ¬z)

- Conversion: Any boolean formula can be converted to CNF.

- Why CNF?: SAT solvers work on CNF formulas — standard input format.

Example: SAT Problem

``Formula: (x ∨ y) ∧ (¬x ∨ z) ∧ (¬y ∨ ¬z)

Try x=true: Clause 1: (true ∨ y) = true ✓ Clause 2: (¬true ∨ z) = (false ∨ z) = z → Must have z=true Clause 3: (¬y ∨ ¬true) = (¬y ∨ false) = ¬y → Must have y=false

Check: (true ∨ false) ∧ (false ∨ true) ∧ (true ∨ false) = true ∧ true ∧ true = true ✓

Solution: x=true, y=false, z=true (SAT)`

DPLL Algorithm

- Classic SAT Algorithm: Backtracking search with optimizations.

- Steps: 1. Unit Propagation: If clause has only one unassigned literal, assign it to satisfy the clause. 2. Pure Literal Elimination: If variable appears only positive (or only negative), assign it to satisfy all clauses. 3. Branching: Pick unassigned variable, try both true and false. 4. Backtrack: If conflict, undo assignments and try alternative.

CDCL (Conflict-Driven Clause Learning)

- Modern SAT Solvers: Extend DPLL with learning.

- Key Idea: When conflict found, analyze to learn new clause preventing same conflict.

- Process: 1. Make decisions and propagate. 2. If conflict: Analyze conflict, learn clause, backtrack. 3. Learned clause prevents repeating same mistake. 4. Continue until SAT or UNSAT proven.

Example: CDCL Learning

`Formula: (x ∨ y) ∧ (¬x ∨ z) ∧ (¬y ∨ ¬z) ∧ (¬z ∨ w) ∧ (¬w)

Decisions: x=true, y=true Propagation: From (¬x ∨ z): z=true From (¬y ∨ ¬z): conflict! (y=true and z=true violate this)

Conflict Analysis: Why conflict? Because x=true → z=true and y=true → ¬z Learn clause: (¬x ∨ ¬y) # x and y can't both be true

Add learned clause to formula, backtrack, continue.`

Applications

- Hardware Verification: Verify chip designs — equivalence checking, property verification. - Software Verification: Bounded model checking, symbolic execution. - Planning: AI planning problems encoded as SAT. - Scheduling: Resource allocation, timetabling. - Cryptanalysis: Breaking cryptographic systems. - Bioinformatics: Haplotype inference, phylogeny.

SAT Solvers

- MiniSat: Small, efficient, widely used as baseline. - Glucose: Focuses on learned clause management. - CryptoMiniSat: Specialized for cryptographic problems. - Lingeling: Competition-winning solver. - CaDiCaL: Modern, efficient solver.

Example: Encoding Graph Coloring as SAT

`Problem: Color graph with 3 colors such that adjacent nodes have different colors.

Variables: x_i_c = "node i has color c" For 3 nodes, 3 colors: x_1_1, x_1_2, x_1_3, x_2_1, x_2_2, x_2_3, x_3_1, x_3_2, x_3_3

Constraints: 1. Each node has exactly one color: (x_1_1 ∨ x_1_2 ∨ x_1_3) ∧ (¬x_1_1 ∨ ¬x_1_2) ∧ (¬x_1_1 ∨ ¬x_1_3) ∧ (¬x_1_2 ∨ ¬x_1_3) ... (similar for nodes 2 and 3)

2. Adjacent nodes have different colors: If nodes 1 and 2 are adjacent: (¬x_1_1 ∨ ¬x_2_1) ∧ (¬x_1_2 ∨ ¬x_2_2) ∧ (¬x_1_3 ∨ ¬x_2_3)

SAT solver finds satisfying assignment → valid coloring.``

MaxSAT

- Optimization Variant: Maximize number of satisfied clauses.
- Partial MaxSAT: Some clauses are hard (must be satisfied), others are soft (prefer to satisfy).
- Applications: Optimization problems where not all constraints can be satisfied.

Incremental SAT

- Idea: Solve sequence of related SAT problems efficiently.
- Technique: Reuse learned clauses and solver state across problems.
- Applications: Bounded model checking, iterative refinement.

Challenges

- NP-Completeness: Worst-case exponential time.
- Hard Instances: Some formulas are extremely difficult for all known solvers.
- Encoding Quality: Efficiency depends on how problem is encoded as SAT.

SAT Solver Heuristics

- Variable Selection: Which variable to branch on? (VSIDS, EVSIDS)
- Phase Selection: Try true or false first? (Phase saving)
- Restart Strategy: When to restart search? (Luby, geometric)
- Clause Deletion: Which learned clauses to keep? (LBD, activity)

LLMs and SAT Solving

- Problem Encoding: LLMs can help translate problems into SAT formulas.
- Result Interpretation: LLMs can explain SAT solver results.
- Debugging UNSAT: LLMs can help identify conflicting constraints.
- Heuristic Tuning: LLMs can suggest solver configurations for specific problem types.

Benefits

- Automation: Automatically finds solutions or proves unsatisfiability.
- Efficiency: Modern solvers handle millions of variables and clauses.
- Versatility: Applicable to diverse problems via encoding.
- Mature Technology: Decades of research and engineering.

Limitations

- Exponential Worst Case: Some instances are intractable.
- Encoding Overhead: Translating problems to SAT can be complex.
- Black Box: Solvers don't explain why formula is UNSAT (though some provide UNSAT cores).

SAT solving is a cornerstone of automated reasoning — despite being NP-complete, modern SAT solvers are remarkably effective on real-world problems, making SAT solving essential for verification, testing, planning, and many other applications.

Want to learn more?