Understanding the Power of Clause Learning
Ashish Sabharwal, Paul Beame, Henry Kautz
University of Washington, Seattle
IJCAI Conference, Aug 14, 2003

SAT: The Satisfiability Problem
Given a CNF formula F [e.g. F = (a ∨ b) ∧ (¬a ∨ c) ∧ ¬a], determine whether F has a satisfying assignment.
Your favorite problem → SAT Encoder → CNF formula → SAT Solver → YES + a solution, or NO

SAT Solver: DPLL Algorithm
The best complete CNF (un)satisfiability algorithms in practice are extensions of the DPLL algorithm [Davis-Putnam 60] [Davis-Logemann-Loveland 62]
• Recursive backtrack search
• Search space pruning based on falsified clauses

DPLL Algorithm
DPLL(F)
    // Perform unit propagation
    while there exists a unit clause (y) ∈ F
        F = F|y
    if F is empty, report satisfiable and halt
    if F contains the empty clause Λ, return
    else choose a literal x
        DPLL(F|x)
        DPLL(F|¬x)
F|y: remove all clauses containing y; shrink all clauses containing ¬y
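The procedure on this slide can be sketched in Python as follows. This is a minimal illustration, not the implementation of any particular solver. The integer encoding of literals (+v / -v), the helper name `restrict`, and the naive choice of branching literal are all assumptions made for the example. The first test formula is the slide-2 example read as (a ∨ b) ∧ (¬a ∨ c) ∧ ¬a, where the polarities are themselves an assumption because the symbols did not survive extraction.

```python
from typing import Dict, FrozenSet, List, Optional

Clause = FrozenSet[int]  # a literal is +v or -v for variable v (assumed encoding)


def restrict(clauses: List[Clause], lit: int) -> List[Clause]:
    """Return F|lit: drop clauses containing lit, shrink clauses containing -lit."""
    return [c - {-lit} for c in clauses if lit not in c]


def dpll(clauses: List[Clause],
         assignment: Optional[Dict[int, bool]] = None) -> Optional[Dict[int, bool]]:
    """Return a (possibly partial) satisfying assignment, or None if none exists."""
    assignment = dict(assignment or {})
    # Perform unit propagation.
    while True:
        unit = next((c for c in clauses if len(c) == 1), None)
        if unit is None:
            break
        (lit,) = unit
        assignment[abs(lit)] = lit > 0
        clauses = restrict(clauses, lit)
    if not clauses:                       # F is empty: satisfiable
        return assignment
    if any(len(c) == 0 for c in clauses):  # F contains the empty clause
        return None
    lit = next(iter(clauses[0]))          # naive branching choice (assumption)
    for choice in (lit, -lit):
        result = dpll(restrict(clauses, choice),
                      {**assignment, abs(choice): choice > 0})
        if result is not None:
            return result
    return None


# a=1, b=2, c=3
sat_formula = [frozenset(c) for c in ([1, 2], [-1, 3], [-1])]
unsat_formula = [frozenset(c) for c in ([1, 2], [-1, 3], [-1], [-2, 3], [1, -3])]
sat_result = dpll(sat_formula)
unsat_result = dpll(unsat_formula)
```

Unlike the slide's version, which halts on the first satisfying assignment, this sketch returns the assignment to the caller; variables that were never touched are left unassigned.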

Extending DPLL: Clause Learning
When backtracking in DPLL, add new clauses corresponding to the causes of failure of the search
• EBL [de Kleer-Williams 87, Stallman-Sussman 77, Genesereth 84, Davis 84]
• CL [Bayardo-Schrag 97, Marques-Silva-Sakallah 96, Zhang 97, Moskewicz et al. 01, Zhang et al. 01]
Added conflict clauses
– Capture reasons for conflicts
– Are obtained via unit propagation from known clauses
– Reduce future search by producing conflicts sooner

Conflict Graphs
Known clauses: (p ∨ q ∨ a), (¬a ∨ ¬b ∨ ¬t), (t ∨ ¬x1), (t ∨ ¬x2), (t ∨ ¬x3), (x1 ∨ x2 ∨ x3 ∨ y), (x2 ∨ ¬y)
Current decisions: p = false, q = false, b = true
[Figure: conflict graph over p, q, b, a, t, x1, x2, x3, y ending in false, with one cut per learning scheme]
• Decision scheme: (p ∨ q ∨ ¬b)
• 1-UIP scheme: t
• Our new scheme, FirstNewCut: (x1 ∨ x2 ∨ x3)
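One way to check that each of these clauses is a legitimate conflict clause is to assert the negation of the clause and see whether unit propagation alone reaches a falsified clause. The sketch below does that. The negation signs on this slide did not survive extraction, so the polarities used here, (p ∨ q ∨ a), (¬a ∨ ¬b ∨ ¬t), (t ∨ ¬x1), (t ∨ ¬x2), (t ∨ ¬x3), (x1 ∨ x2 ∨ x3 ∨ y), (x2 ∨ ¬y), are an assumption chosen so that the stated decisions lead to a conflict. The function name `propagates_to_conflict` and the integer encoding are likewise illustrative.

```python
from typing import Iterable, List, Set


def propagates_to_conflict(clauses: List[List[int]], assumed: Iterable[int]) -> bool:
    """Unit-propagate from the assumed literals; True if some clause is falsified."""
    true_lits: Set[int] = set(assumed)
    changed = True
    while changed:
        changed = False
        for clause in clauses:
            if any(lit in true_lits for lit in clause):
                continue  # clause already satisfied
            free = [lit for lit in clause if -lit not in true_lits]
            if not free:
                return True  # every literal is false: conflict
            if len(free) == 1:
                true_lits.add(free[0])  # unit clause: its literal is implied
                changed = True
    return False


# p=1, q=2, a=3, b=4, t=5, x1=6, x2=7, x3=8, y=9 (assumed polarities, see above)
known = [[1, 2, 3], [-3, -4, -5], [5, -6], [5, -7], [5, -8], [6, 7, 8, 9], [7, -9]]

decision_conflict = propagates_to_conflict(known, [-1, -2, 4])      # negation of (p ∨ q ∨ ¬b)
uip_conflict = propagates_to_conflict(known, [-5])                  # negation of (t)
first_new_cut_conflict = propagates_to_conflict(known, [-6, -7, -8])  # negation of (x1 ∨ x2 ∨ x3)
no_assumption_conflict = propagates_to_conflict(known, [])
```

Under these assumed polarities all three learned clauses pass the check, while the known clauses on their own do not propagate to a conflict.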

Restarts
Clause Learning (CL) algorithms can be restarted at any point [Baptista-Silva 00]
– Unset all variables and start over
– But retain all clauses learned so far
• Avoids getting stuck in one part of the search space
• Evidently adds power to CL

CL Critical to Performance
The best current SAT algorithms rely heavily on CL for good behavior on real-world problems
• GRASP [Marques-Silva-Sakallah 96], SATO [H. Zhang 97]
• zChaff [Moskewicz et al. 01], BerkMin [Goldberg-Novikov 02]
However,
– No good understanding of its strengths and weaknesses
– Not much insight into why it works when it does

Our Contribution
• Mathematical framework for analyzing clause learning
• Characterization of its power in relation to well-studied topics in proof complexity theory
• Ways to improve solver performance based on formal analysis

Proofs of Unsatisfiability
When F is unsatisfiable,
• A DPLL refutation of F is a proof of its unsatisfiability
• A size lower bound on proofs of F gives a time lower bound on executions of DPLL(F) (with or w/o learning, depending on the class of proofs)
• A size upper bound on proofs of F gives potential for quick executions of such algorithms (with the best possible branching heuristic, the best learning scheme, etc.)

What about Satisfiable Formulas?
Bounds on unsatisfiable formulas imply bounds on satisfiable ones! [Achlioptas-Beame-Molloy 01]
[Figure: DPLL tree, with regions labeled "Unsatisfiable sub-formula" and "Satisfying assignment"]

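Proof System: Resolution
Unsatisfiable CNF formula: F = (a ∨ b) ∧ (¬a ∨ c) ∧ ¬a ∧ (¬b ∨ c) ∧ (a ∨ ¬c)
Resolution proof (proof size = 9):
• Input clauses: (a ∨ b), (¬a ∨ c), ¬a, (¬b ∨ c), (a ∨ ¬c)
• (a ∨ b), (¬a ∨ c) give (b ∨ c)
• (b ∨ c), (¬b ∨ c) give c
• ¬a, (a ∨ ¬c) give ¬c
• c, ¬c give Λ, the empty clause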
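The resolution rule and the size-9 refutation can be replayed mechanically. The sketch below assumes the formula reads (a ∨ b) ∧ (¬a ∨ c) ∧ ¬a ∧ (¬b ∨ c) ∧ (a ∨ ¬c); the negation signs were lost in extraction, and this reading is the one under which the derived clauses (b ∨ c), c, ¬c shown on the slide follow. The function name `resolve` and the integer literal encoding are illustrative choices.

```python
from typing import FrozenSet

Clause = FrozenSet[int]  # a literal is +v or -v for variable v (assumed encoding)


def resolve(pos: Clause, neg: Clause, var: int) -> Clause:
    """Resolve pos (which contains var) with neg (which contains -var) on var."""
    if var not in pos or -var not in neg:
        raise ValueError("clauses do not clash on the given variable")
    return (pos - {var}) | (neg - {-var})


# a=1, b=2, c=3
a_or_b = frozenset({1, 2})
nota_or_c = frozenset({-1, 3})
not_a = frozenset({-1})
notb_or_c = frozenset({-2, 3})
a_or_notc = frozenset({1, -3})
inputs = [a_or_b, nota_or_c, not_a, notb_or_c, a_or_notc]

b_or_c = resolve(a_or_b, nota_or_c, 1)   # (b ∨ c)
c_unit = resolve(b_or_c, notb_or_c, 2)   # c
notc_unit = resolve(a_or_notc, not_a, 1)  # ¬c
empty = resolve(c_unit, notc_unit, 3)    # Λ
derived = [b_or_c, c_unit, notc_unit, empty]

proof_size = len(inputs) + len(derived)  # counts every clause in the proof
```

Counting the five input clauses and the four derived ones reproduces the proof size stated on the slide.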

Special Cases of Resolution
Tree-like resolution
– Graph of inferences forms a tree
– Corresponds to DPLL
Regular resolution
– A variable can be resolved on only once on any path from an input clause to the empty clause
– Directed acyclic graph analog of the DPLL tree
– Natural not to branch on a variable once it has been eliminated
– Used in original DP [Davis-Putnam 60]

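Proof System Hierarchy
Each system corresponds to a space of polynomial time solvable formulas; from strongest to weakest:
• …
• Frege systems
• General RES (separated from Frege systems by the pigeonhole principle [Haken 85])
• Regular RES (separated from General RES [Alekhnovich et al. 02])
• DPLL = Tree-like RES (separated from Regular RES [Bonet et al. 00])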

Our Results
[Diagram: proof system hierarchy with regions labeled General RES, Regular RES, CL w/o restarts, DPLL = Tree-like, and Trivial RES = Learned clauses]

Thm 1. CL can beat Regular RES
Formula f
• Has a poly-size RES proof π
• Regular proofs require exponential size
Formula PT(f, π)
• Has a poly-size CL proof
• Regular proofs require exponential size
Such formulas f exist [Alekhnovich et al. 02]
• GT_n (ordering principle)
• Pebbling formulas
[Diagram: f lies in General RES but outside Regular RES; PT(f, π) lies in CL w/o restarts but outside Regular RES]

PT(f, π): Proof Trace Extension
Start with
• An unsatisfiable formula f with a poly-size RES proof π
PT(f, π) contains
• All clauses of f
• For each derived clause Q = (a ∨ b ∨ c) in π,
  – A trace variable t_Q
  – New clauses (t_Q ∨ ¬a), (t_Q ∨ ¬b), (t_Q ∨ ¬c)
The CL proof of PT(f, π) works by branching negatively on the t_Q's in bottom-up order of the clauses of π
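The construction can be sketched as a small function. Two points are assumptions rather than statements from the slide: the new clauses are taken to be (t_Q ∨ ¬l) for each literal l of Q (the negation signs were lost in extraction, and this is the reading under which setting every trace variable to true gives back f), and the empty clause is skipped because it contributes no new clauses. The names `proof_trace_extension` and `first_trace_var` are hypothetical.

```python
from typing import Dict, FrozenSet, List, Tuple

Clause = FrozenSet[int]  # a literal is +v or -v for variable v (assumed encoding)


def proof_trace_extension(f: List[Clause], derived: List[Clause],
                          first_trace_var: int) -> Tuple[List[Clause], Dict[Clause, int]]:
    """Build PT(f, pi) from f and the derived clauses of pi, in derivation order."""
    clauses = list(f)                 # all clauses of f
    trace: Dict[Clause, int] = {}
    t = first_trace_var               # must not clash with variables of f
    for q in derived:
        if not q:
            continue                  # assumption: no trace variable for the empty clause
        trace[q] = t
        for lit in sorted(q, key=abs):
            clauses.append(frozenset({t, -lit}))  # (t_Q ∨ ¬lit)
        t += 1
    return clauses, trace


# a=1, b=2, c=3; f and pi as in the resolution example, under assumed polarities
f = [frozenset(c) for c in ([1, 2], [-1, 3], [-1], [-2, 3], [1, -3])]
derived = [frozenset({2, 3}), frozenset({3}), frozenset({-3}), frozenset()]
pt, trace = proof_trace_extension(f, derived, first_trace_var=4)

# Setting every trace variable to true satisfies all new clauses and leaves f.
trace_vars = set(trace.values())
restricted = [c for c in pt if not (c & trace_vars)]
```

Branching negatively on a trace variable turns each of its new clauses into a unit clause, which forces every literal of Q to false; that is the behavior the CL proof relies on.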

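PT(f, π): Proof Trace Extension
[Figure: formula f with its RES proof π, alongside PT(f, π)]
• In π, Q = (a ∨ b ∨ c) is derived from (a ∨ b ∨ x) and (c ∨ ¬x)
• Trace variable: t_Q
• New clauses: (t_Q ∨ ¬a), (t_Q ∨ ¬b), (t_Q ∨ ¬c)
• Conflict graph: t_Q = false implies ¬a, ¬b, ¬c, which imply both x and ¬x, giving false
• FirstNewCut learns (a ∨ b ∨ c)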

How hard is PT(f, π)?
Easy for CL: by construction
• CL branches exactly once on each trace variable
• # branches = size(π)
Hard for Regular RES: reduction argument
• Fact 1: PT(f, π)|TraceVars=true = f
• Fact 2: Restricting a Regular RES proof of g by x yields a Regular RES proof of g|x
• Fact 3: f does not have small Regular proofs!

Implications?
• DPLL algorithms w/o clause learning are hopeless for some structured formulas
• CL algorithms have potential for small proofs
• Can we use such analysis to harness this potential?

Branching Sequence
B = (x1, x4, x3, x1, x8, x2, x4, x7, x1)
• DPLL picks branching literals from B
• Repetitions allowed
• Different from “branching order”
How “good” is B?
– Depends on backtracking process, learning scheme, etc.

From Analysis to Practice
• The maximum size of grid pebbling formulas solvable by zChaff (in 1 day) can be substantially increased!
• Results extend to general randomized pebbling formulas [Sabharwal-Beame-Kautz SAT 2003]

Trivial RES
Known clauses: (a ∨ b ∨ c), (¬a ∨ x), (¬b ∨ c), (¬c ∨ y)
Derived clause: (x ∨ y)
Derivation:
• (a ∨ b ∨ c), (¬a ∨ x) give (b ∨ c ∨ x)
• (b ∨ c ∨ x), (¬b ∨ c) give (c ∨ x)
• (c ∨ x), (¬c ∨ y) give (x ∨ y)
Structure
• Simple ladder structure
• Distinct variables resolved upon
• Properties:
  – Tree-like
  – Regular
  – Linear
Clauses learned by CL are those that can be derived by trivial RES
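The ladder can be replayed as a fold over the known clauses, with a check that no variable is resolved on twice. This is a sketch under the assumption that the known clauses read (a ∨ b ∨ c), (¬a ∨ x), (¬b ∨ c), (¬c ∨ y); the negation signs were lost in extraction, and this is the reading under which (x ∨ y) is derivable. The function name `trivial_derivation` and the literal encoding are illustrative.

```python
from typing import FrozenSet, List, Set, Tuple

Clause = FrozenSet[int]  # a literal is +v or -v for variable v (assumed encoding)


def trivial_derivation(start: Clause, steps: List[Tuple[int, Clause]]) -> List[Clause]:
    """Resolve the running clause with one known clause per step (linear ladder).

    Each step is (variable, known_clause); variables must be distinct.
    Returns the running clause after every step, starting with `start`.
    """
    current = start
    used: Set[int] = set()
    ladder = [current]
    for var, side in steps:
        if var in used:
            raise ValueError("variable resolved on twice: not a trivial derivation")
        used.add(var)
        if var in current and -var in side:
            current = (current - {var}) | (side - {-var})
        elif -var in current and var in side:
            current = (current - {-var}) | (side - {var})
        else:
            raise ValueError("clauses do not clash on the given variable")
        ladder.append(current)
    return ladder


# a=1, b=2, c=3, x=4, y=5
ladder = trivial_derivation(
    frozenset({1, 2, 3}),          # (a ∨ b ∨ c)
    [(1, frozenset({-1, 4})),      # resolve on a with (¬a ∨ x)
     (2, frozenset({-2, 3})),      # resolve on b with (¬b ∨ c)
     (3, frozenset({-3, 5}))],     # resolve on c with (¬c ∨ y)
)
learned = ladder[-1]
```

Because each step combines the previous result with a known clause, the derivation is linear and tree-like; the distinct-variable check is what makes it regular.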

Our Results
[Diagram: proof system hierarchy with regions labeled Regular RES, CL w/o restarts, DPLL = Tree-like, and Trivial RES = Learned clauses]

CL--: A Variant of CL
Allow branching on variables whose value is already implied by unit propagation
– Equivalently, allow branching on variables not appearing at all in the residual formula
How can this possibly help?
– Can learn a conflict clause that will reduce search later!

Error in Paper
• Incorrect Theorem 1 in the paper: “CL + restarts is equivalent to RES”
• Correct version: “CL-- + restarts is equivalent to RES”

Thm 2. CL-- + restarts = General RES
• General RES can simulate CL + restarts
  – CL learns by unit propagation
  – Infer learned clauses using a RES derivation
• CL-- + restarts can simulate General RES
  – If the RES proof resolves (A ∨ x) and (B ∨ ¬x) to obtain C, branch and set all literals of A and B to false, learn C, restart
  – Eventually learn the empty clause

Our Results
[Diagram: proof system hierarchy with regions labeled General RES = CL-- + restarts, Regular RES, CL w/o restarts, DPLL = Tree-like, and Trivial RES = Learned clauses]

Summary
• Formal framework
• CL can be stronger than Regular RES
  – FirstNewCut learning scheme
• CL-- + restarts is equivalent to RES

Open Problems
Fill in complexity hierarchy gaps
1. Can CL efficiently simulate Regular RES?
2. Can CL + restarts efficiently simulate RES?
From Analysis to Practice
Can we use this framework to improve SAT solvers for other classes of structured formulas? E.g.
• Planning as satisfiability
• Bounded model checking