Heuristics for Efficient SAT Solving As implemented in GRASP, Chaff and GSAT.

Why SAT?
• Fundamental problem from a theoretical point of view
  – Cook's theorem, 1971: the first NP-complete problem.
• Numerous applications:
  – Solving any NP problem...
  – Verification: model checking, theorem proving, ...
  – AI: planning, automated deduction, ...
  – Design and analysis: CAD, VLSI
  – Physics: statistical mechanics (models for spin-glass material)

…SAT made some progress

The SAT competitions

Agenda
• Modeling problems in Propositional Logic
• SAT basics
• Decision heuristics
• Non-chronological Backtracking
• Learning with Conflict Clauses
• SAT and resolution
• More techniques: decision heuristics, deduction
• Stochastic SAT solvers: the GSAT approach

CNF-SAT
• Conjunctive Normal Form: a conjunction of disjunctions of literals. Example: (¬x1 ∨ ¬x2) ∧ (x2 ∨ x4 ∨ ¬x1) ∧ ...
• Experience shows that CNF-SAT solving is faster than solving a general propositional formula.
• There exists a polynomial transformation due to Tseitin (1970) of a general propositional formula φ to CNF, with the addition of |φ| variables.

(CNF) SAT basic definitions: literals
• A literal is a variable or its negation.
• var(l) is the variable associated with a literal l.
• A literal is called negative if it is a negated variable, and positive otherwise.

SAT basic definitions: literals
• If var(l) is unassigned, then l is unresolved.
• Otherwise, l is satisfied by an assignment α if α(var(l)) = 1 and l is positive, or α(var(l)) = 0 and l is negative, and unsatisfied otherwise.

SAT basic definitions: clauses
The state of an n-long clause C under a partial assignment is:
• Satisfied if at least one of C's literals is satisfied,
• Conflicting if all of C's literals are unsatisfied,
• Unit if n-1 literals in C are unsatisfied and 1 literal is unresolved, and
• Unresolved otherwise.
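The four clause states can be sketched in a few lines of Python. This is only an illustration: the encoding of a literal as a signed integer (`3` for x3, `-3` for ¬x3), the dictionary-based partial assignment, and the function names are assumptions made for the example, not part of the slides.

```python
from typing import Dict, List, Optional

def literal_value(lit: int, assignment: Dict[int, bool]) -> Optional[bool]:
    """True if the literal is satisfied, False if unsatisfied, None if unresolved."""
    var = abs(lit)
    if var not in assignment:
        return None
    return assignment[var] if lit > 0 else not assignment[var]

def clause_state(clause: List[int], assignment: Dict[int, bool]) -> str:
    """Classify a clause under a partial assignment, following the definitions above."""
    values = [literal_value(lit, assignment) for lit in clause]
    if any(v is True for v in values):
        return "satisfied"
    unresolved = sum(1 for v in values if v is None)
    if unresolved == 0:
        return "conflicting"   # every literal is unsatisfied
    if unresolved == 1:
        return "unit"          # n-1 unsatisfied, one unresolved
    return "unresolved"

# (x1 ∨ ¬x2 ∨ x3) under growing partial assignments
clause = [1, -2, 3]
print(clause_state(clause, {}))                             # unresolved
print(clause_state(clause, {1: False, 2: True}))            # unit
print(clause_state(clause, {1: False, 2: True, 3: False}))  # conflicting
print(clause_state(clause, {2: False}))                     # satisfied
```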

SAT basic definitions: clauses
• Example

SAT basic definitions: the unit clause rule
• The unit clause rule: in a unit clause, the unresolved literal must be satisfied.

Basic Backtracking Search
Organize the search in the form of a decision tree:
• Each node corresponds to a decision.
• The depth of the node in the decision tree is called the decision level.
• Notation: x=v@d means x is assigned v ∈ {0, 1} at decision level d.

Backtracking Search in Action
ω1 = (x2 ∨ x3)
ω2 = (¬x1 ∨ ¬x4)
ω3 = (¬x2 ∨ x4)
• Decision x1 = 0@1, then decision x2 = 0@2, which implies x3 = 1@2. Result: {(x1, 0), (x2, 0), (x3, 1)}
• Decision x1 = 1@1, which implies x4 = 0@1, x2 = 0@1 and x3 = 1@1. Result: {(x1, 1), (x2, 0), (x3, 1), (x4, 0)}
No backtrack in this example, regardless of the decision!

Backtracking Search in Action
Add a clause:
ω1 = (x2 ∨ x3)
ω2 = (¬x1 ∨ ¬x4)
ω3 = (¬x2 ∨ x4)
ω4 = (¬x1 ∨ x2 ∨ ¬x3)
• Decision x1 = 1@1, which implies x4 = 0@1, x2 = 0@1 and x3 = 1@1: conflict.
• After backtracking: x1 = 0@1, then decision x2 = 0@2, which implies x3 = 1@2. Result: {(x1, 0), (x2, 0), (x3, 1)}

A Basic SAT algorithm (DPLL-based)
    While (true) {
        if (!Decide()) return (SAT);
        while (!Deduce())
            if (!Resolve_Conflict()) return (UNSAT);
    }
• Decide(): choose the next variable and value. Return False if all variables are assigned.
• Deduce(): apply the unit clause rule. Return False if a conflict is reached.
• Resolve_Conflict(): backtrack until no conflict. Return False if impossible.
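To make the Decide / Deduce / Resolve_Conflict cycle concrete, here is a minimal runnable sketch. It is a recursive variant rather than the iterative loop of the slide: backtracking is handled by the call stack, and the decision rule (lowest-numbered free variable, value 0 first) is an arbitrary assumption. Literals are signed integers, which is also an assumption of this example.

```python
from typing import Dict, List, Optional

def unit_propagate(clauses: List[List[int]], assignment: Dict[int, bool]) -> bool:
    """Deduce: apply the unit clause rule to a fixpoint. Returns False on a conflict."""
    changed = True
    while changed:
        changed = False
        for clause in clauses:
            unresolved, satisfied = [], False
            for lit in clause:
                var = abs(lit)
                if var not in assignment:
                    unresolved.append(lit)
                elif assignment[var] == (lit > 0):
                    satisfied = True
                    break
            if satisfied:
                continue
            if not unresolved:
                return False                      # conflicting clause
            if len(unresolved) == 1:              # unit clause: satisfy its literal
                lit = unresolved[0]
                assignment[abs(lit)] = lit > 0
                changed = True
    return True

def dpll(clauses: List[List[int]],
         assignment: Optional[Dict[int, bool]] = None) -> Optional[Dict[int, bool]]:
    """Returns a satisfying assignment, or None if the formula is unsatisfiable."""
    assignment = dict(assignment or {})
    if not unit_propagate(clauses, assignment):
        return None
    variables = sorted({abs(lit) for clause in clauses for lit in clause})
    free = [v for v in variables if v not in assignment]
    if not free:
        return assignment                         # all variables assigned: SAT
    var = free[0]                                 # Decide (naive rule, an assumption)
    for value in (False, True):
        result = dpll(clauses, {**assignment, var: value})
        if result is not None:
            return result
    return None                                   # both values failed: backtrack

# ω1 = (x2 ∨ x3), ω2 = (¬x1 ∨ ¬x4), ω3 = (¬x2 ∨ x4)
print(dpll([[2, 3], [-1, -4], [-2, 4]]))
print(dpll([[1, 2], [1, -2], [-1, 2], [-1, -2]]))  # None (unsatisfiable)
```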

Decision heuristics
DLIS (Dynamic Largest Individual Sum)
• Maintain a counter for each literal: in how many unresolved clauses does it appear?
• Decide on the literal with the largest counter.
• Requires O(#literals) queries for each decision.
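A small sketch of the DLIS choice follows. It recomputes the counters from scratch for clarity instead of maintaining them, it counts every clause that is not yet satisfied as unresolved, and the deterministic tie-break (smallest variable, positive literal first) is an assumption added only to make the example reproducible.

```python
from collections import Counter
from typing import Dict, List, Optional

def dlis_pick(clauses: List[List[int]], assignment: Dict[int, bool]) -> Optional[int]:
    """Return the free literal that occurs in the most not-yet-satisfied clauses."""
    counts: Counter = Counter()
    for clause in clauses:
        if any(abs(l) in assignment and assignment[abs(l)] == (l > 0) for l in clause):
            continue                      # clause already satisfied: ignore it
        for lit in clause:
            if abs(lit) not in assignment:
                counts[lit] += 1
    if not counts:
        return None
    # Tie-break (assumption): smaller variable first, then the positive literal.
    return max(counts, key=lambda l: (counts[l], -abs(l), l))

clauses = [[1, 2], [1, -3], [-1, 2], [2, 3]]
print(dlis_pick(clauses, {}))          # 2 (appears in three unresolved clauses)
print(dlis_pick(clauses, {2: True}))   # 1 (only (x1 ∨ ¬x3) is left unresolved)
```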

Decision heuristics
Jeroslow-Wang method
• Compute for every clause ω and every literal l: $J(l) := \sum_{\omega \in \varphi,\ l \in \omega} 2^{-|\omega|}$
• Choose a literal l that maximizes J(l).
• This gives an exponentially higher weight to literals in shorter clauses.
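As an illustration of how the weight 2^(-|ω|) favours short clauses, the score can be computed as below. The function name and the signed-integer literal encoding are assumptions of this sketch.

```python
from collections import defaultdict
from typing import Dict, List

def jeroslow_wang(clauses: List[List[int]]) -> Dict[int, float]:
    """J(l) = sum over the clauses containing l of 2^(-clause length)."""
    scores: Dict[int, float] = defaultdict(float)
    for clause in clauses:
        weight = 2.0 ** -len(clause)
        for lit in clause:
            scores[lit] += weight
    return dict(scores)

# (x1 ∨ x2) ∧ (x1 ∨ ¬x2 ∨ x3) ∧ (¬x1)
scores = jeroslow_wang([[1, 2], [1, -2, 3], [-1]])
print(scores[1])                     # 0.375 = 1/4 + 1/8
print(scores[-1])                    # 0.5, from the single unit clause
print(max(scores, key=scores.get))   # -1
```

Although x1 occurs in two clauses and ¬x1 in only one, the unit clause outweighs both longer clauses, so ¬x1 is chosen.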

Decision heuristics
MOM (Maximum Occurrence of clauses of Minimum size)
• Let f*(x) be the # of unresolved smallest clauses containing x.
• Choose x that maximizes: $(f^*(x) + f^*(\neg x)) \cdot 2^k + f^*(x) \cdot f^*(\neg x)$
• k is chosen heuristically.
• The idea:
  – Give preference to satisfying small clauses.
  – Among those, give preference to balanced variables (e.g. f*(x) = 3, f*(¬x) = 3 is better than f*(x) = 1, f*(¬x) = 5).
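The balance effect of the product term can be checked with a short sketch. It assumes the caller passes only the currently unresolved clauses (non-empty list), and the function name is hypothetical.

```python
from typing import List

def mom_score(unresolved_clauses: List[List[int]], var: int, k: int) -> int:
    """(f*(x) + f*(¬x)) * 2^k + f*(x) * f*(¬x), counted over the smallest clauses."""
    min_len = min(len(c) for c in unresolved_clauses)
    smallest = [c for c in unresolved_clauses if len(c) == min_len]
    f_pos = sum(1 for c in smallest if var in c)
    f_neg = sum(1 for c in smallest if -var in c)
    return (f_pos + f_neg) * 2 ** k + f_pos * f_neg

# x1 is balanced (3 positive, 3 negative); x5 is not (1 positive, 5 negative).
clauses = [[1, 2], [1, 3], [1, 4], [-1, 2], [-1, 3], [-1, 4],
           [5, 2], [-5, 2], [-5, 3], [-5, 4], [-5, 6], [-5, 7]]
print(mom_score(clauses, 1, k=1))  # 21 = 6*2 + 3*3
print(mom_score(clauses, 5, k=1))  # 17 = 6*2 + 1*5
```

Both variables occur six times in the smallest clauses, so only the product term separates them.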

Pause...
• We will see other (more advanced) decision heuristics soon.
• These heuristics are integrated with a mechanism called Learning with Conflict Clauses, which we will learn next.

Implication graphs and learning
Current truth assignment: {x9=0@1, x10=0@3, x11=0@3, x12=1@2, x13=1@2}
Current decision assignment: {x1=1@6}
ω1 = (¬x1 ∨ x2)
ω2 = (¬x1 ∨ x3 ∨ x9)
ω3 = (¬x2 ∨ ¬x3 ∨ x4)
ω4 = (¬x4 ∨ x5 ∨ x10)
ω5 = (¬x4 ∨ x6 ∨ x11)
ω6 = (¬x5 ∨ ¬x6)
ω7 = (x1 ∨ x7 ∨ ¬x12)
ω8 = (x1 ∨ x8)
ω9 = (¬x7 ∨ ¬x8 ∨ ¬x13)
[Figure: implication graph. x1=1@6 implies x2=1@6 (via ω1); x1=1@6 and x9=0@1 imply x3=1@6 (via ω2); x2 and x3 imply x4=1@6 (via ω3); x4 and x10=0@3 imply x5=1@6 (via ω4); x4 and x11=0@3 imply x6=1@6 (via ω5); x5 and x6 conflict on ω6.]
We learn the conflict clause ω10: (¬x1 ∨ x9 ∨ x11 ∨ x10) and backtrack to the highest (deepest) decision level in this clause (6).

Implication graph, flipped assignment
Clauses ω1 to ω9 as before, plus the conflict clause ω10: (¬x1 ∨ x9 ∨ x11 ∨ x10).
[Figure: implication graph. x9=0@1, x10=0@3 and x11=0@3 imply x1=0@6 (due to the conflict clause ω10); x1=0@6 and x12=1@2 imply x7=1@6 (via ω7); x1=0@6 implies x8=1@6 (via ω8); x7, x8 and x13=1@2 conflict on ω9.]
We learn the conflict clause ω11: (¬x13 ∨ x9 ∨ x10 ∨ x11 ∨ ¬x12) and backtrack to the highest (deepest) decision level in this clause (3).

Non-chronological backtracking
Which assignments caused the conflicts?
x9 = 0@1, x10 = 0@3, x11 = 0@3, x12 = 1@2, x13 = 1@2
These assignments are sufficient for causing a conflict.
[Figure: decision levels 3 to 6, with x1 decided at level 6; the solver backtracks to decision level 3 (non-chronological backtracking).]

Non-chronological backtracking (option #1)
• So the rule is: backtrack to the largest decision level in the conflict clause.
• Q: What if the flipped assignment works?
  A: Continue to the next decision level, leaving the current one without a decision variable.
  – Backtracking back to this level will lead to another conflict and further backtracking.

Non-chronological Backtracking
[Figure: decision tree with branches labeled x1 = 0, x2 = 0, x3 = 1, x3 = 0, x4 = 0, x5 = 0, x6 = 0, ..., x5 = 1, x7 = 1, x9 = 0, x9 = 1.]

More Conflict Clauses
• Def: A Conflict Clause is any clause implied by the formula.
• Let L be a set of literals labeling nodes that form a cut in the implication graph, separating the conflict node from the roots.
• Claim: $\bigvee_{l \in L} \neg l$ is a Conflict Clause.
[Figure: the implication graph of the earlier example with three cuts marked, giving:]
1. (x10 ∨ ¬x1 ∨ x9 ∨ x11)
2. (x10 ∨ ¬x4 ∨ x11)
3. (x10 ∨ ¬x2 ∨ ¬x3 ∨ x11)

Conflict clauses
• How many clauses should we add?
• If not all, then which ones?
  – Shorter ones?
  – Check their influence on the backtracking level?
  – The most "influential"?
• The answer requires two definitions:
  – Asserting clauses
  – Unique Implication Points (UIPs)

Asserting clauses
• Def: An Asserting Clause is a Conflict Clause with a single literal from the current decision level. Backtracking (to the right level) makes it a Unit clause.
• Modern solvers only consider Asserting Clauses.

Unique Implication Points (UIPs)
• Def: A Unique Implication Point (UIP) is an internal node in the Implication Graph through which all paths from the decision to the conflict node pass.
• The First-UIP is the UIP closest to the conflict.
• The method of choice: an asserting clause that includes the first UIP. In this case (x10 ∨ ¬x4 ∨ x11).
[Figure: the implication graph of the earlier example with two UIPs marked; the one at x4=1@6 is closest to the conflict.]

Conflict-driven backtracking (option #2)
• Previous method: backtrack to the highest decision level in the conflict clause (and erase it).
• A better method (empirically): backtrack to the second highest decision level in the clause, without erasing it. The asserted literal is implied at that level.
• In our example: (x10 ∨ ¬x4 ∨ x11), whose literals have decision levels 3, 6 and 3 respectively.
  – Previously we backtracked to decision level 6.
  – Now we backtrack to decision level 3. x4 = 0@3 is implied.

Conflict-driven Backtracking
[Figure: decision tree with branches labeled x1 = 0, x2 = 0, x5 = 1, x3 = 1, x7 = 1, x9 = 1, x4 = 0, x3 = 1, x6 = 0, x5 = 0, x9 = 0, ...]

Conflict-Driven Backtracking
• So the rule is: backtrack to the second highest decision level dl, but do not erase it.
• If the conflict clause has a single literal, backtrack to decision level 0.
• Q: It seems to waste work, since it erases assignments in decision levels higher than dl that are unrelated to the conflict.
  A: Indeed. But it allows the SAT solver to redirect itself with the new information.
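The backtracking rule can be written down directly. In this sketch the map `level_of` from variables to decision levels and the function name are assumptions; the clause is expected to be asserting, so its highest level occurs on exactly one literal.

```python
from typing import Dict, List

def backtrack_level(conflict_clause: List[int], level_of: Dict[int, int]) -> int:
    """Second highest decision level in the clause, or 0 for a single-literal clause."""
    levels = sorted({level_of[abs(lit)] for lit in conflict_clause}, reverse=True)
    return levels[1] if len(levels) > 1 else 0

# (x10 ∨ ¬x4 ∨ x11) with x10 = 0@3, x4 = 1@6, x11 = 0@3
level_of = {10: 3, 4: 6, 11: 3}
print(backtrack_level([10, -4, 11], level_of))  # 3
print(backtrack_level([-4], level_of))          # 0
```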

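Progress of a SAT solver
[Figure: decision level plotted against time, marking decisions, BCP steps and conflicts. The learned conflict clauses (labels C2 to C5 are legible) appear along the refutation of x=1. Caption: work invested in refuting x=1 (some of it seems wasted).]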

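Conflict clauses and Resolution
• Binary resolution is a sound inference rule: from $(a_1 \lor \dots \lor a_n \lor \beta)$ and $(b_1 \lor \dots \lor b_m \lor \neg\beta)$ infer $(a_1 \lor \dots \lor a_n \lor b_1 \lor \dots \lor b_m)$.
• Example: from (x1 ∨ x2) and (¬x1 ∨ x3 ∨ x4), resolving on x1 gives (x2 ∨ x3 ∨ x4).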

Conflict clauses and resolution
• Consider the following example:
• Conflict clause: c5: (x2 ∨ ¬x4 ∨ x10)
• Resolution order: x4, x5, x6, x7
  – T1 = Res(c4, c3, x7) = (¬x5 ∨ ¬x6)
  – T2 = Res(T1, c2, x6) = (¬x4 ∨ ¬x5 ∨ x10)
  – T3 = Res(T2, c1, x5) = (x2 ∨ ¬x4 ∨ x10)

Finding the conflict clause
[Figure: the clause-derivation procedure, annotated "cl is asserting the first UIP", applied to our example.]
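One common way to reach a first-UIP asserting clause is to start from the conflicting clause and repeatedly resolve it with the antecedent of its most recently implied variable until a single literal of the current decision level remains. The sketch below follows that idea; it is not the procedure shown on the slide, whose text did not survive. The names `trail`, `antecedent` and `level_of` are hypothetical, and the clause data assumes the implication-graph example reads ω1 = (¬x1 ∨ x2), ω2 = (¬x1 ∨ x3 ∨ x9), ω3 = (¬x2 ∨ ¬x3 ∨ x4), ω4 = (¬x4 ∨ x5 ∨ x10), ω5 = (¬x4 ∨ x6 ∨ x11), ω6 = (¬x5 ∨ ¬x6).

```python
from typing import Dict, FrozenSet, List

def resolve(c1: FrozenSet[int], c2: FrozenSet[int], var: int) -> FrozenSet[int]:
    """Binary resolution of c1 and c2 on var (signed-integer literals)."""
    assert (var in c1 and -var in c2) or (-var in c1 and var in c2)
    return frozenset(lit for lit in c1 | c2 if abs(lit) != var)

def first_uip_clause(conflicting: List[int], trail: List[int],
                     antecedent: Dict[int, List[int]],
                     level_of: Dict[int, int], current_level: int) -> FrozenSet[int]:
    """Resolve backwards along the trail until the clause is asserting."""
    cl = frozenset(conflicting)
    for var in reversed(trail):                   # most recently assigned first
        at_level = [l for l in cl if level_of[abs(l)] == current_level]
        if len(at_level) == 1:
            return cl                             # asserting: stop at the first UIP
        if var in {abs(l) for l in cl} and var in antecedent:
            cl = resolve(cl, frozenset(antecedent[var]), var)
    return cl

w1, w2, w3 = [-1, 2], [-1, 3, 9], [-2, -3, 4]
w4, w5, w6 = [-4, 5, 10], [-4, 6, 11], [-5, -6]
antecedent = {2: w1, 3: w2, 4: w3, 5: w4, 6: w5}  # x1 is the decision: no antecedent
level_of = {1: 6, 2: 6, 3: 6, 4: 6, 5: 6, 6: 6, 9: 1, 10: 3, 11: 3}
trail = [1, 2, 3, 4, 5, 6]                        # assignment order at level 6

learned = first_uip_clause(w6, trail, antecedent, level_of, current_level=6)
print(sorted(learned))  # [-4, 10, 11], i.e. (x10 ∨ ¬x4 ∨ x11)
```

Under these assumptions two resolution steps (on x6, then on x5) are enough, and the result matches the first-UIP clause quoted in the UIP slide.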

The Resolution-Graph keeps track of the "inference relation"
[Figure: the two implication graphs of the earlier example (conflicts on clauses 6 and 9) next to the resolution graph, whose nodes are the clauses 1 to 11.]

The resolution graph
What is it good for?
Example: for computing an Unsatisfiable core
[Picture borrowed from Zhang, Malik SAT'03]

Resolution graph: example
[Figure labels: empty clause; inferred clauses (learning); original clauses; unsatisfiable core.]

Decision heuristics
VSIDS (Variable State Independent Decaying Sum)
1. Each variable in each polarity has a counter initialized to 0.
2. When a clause is added, the counters are updated.
3. The unassigned variable with the highest counter is chosen.
4. Periodically, all the counters are divided by a constant.
(Implemented in Chaff)

Decision heuristics
VSIDS (cont'd)
• Chaff holds a list of unassigned variables sorted by the counter value.
• Updates are needed only when adding conflict clauses.
• Thus, a decision is made in constant time.

Decision heuristics
VSIDS (cont'd)
VSIDS is a 'quasi-static' strategy:
– static because it doesn't depend on the current assignment
– dynamic because it gradually changes. Variables that appear in recent conflicts have higher priority.
This strategy is a conflict-driven decision strategy.
"...employing this strategy dramatically (i.e. an order of magnitude) improved performance..."
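The four VSIDS steps can be sketched as a small class. This is an illustration only: the increment of 1 per literal, the default divisor of 2 and the linear scan in `pick` are assumptions, and the scan in particular is not how Chaff does it (Chaff keeps a sorted list so that a decision takes constant time).

```python
from typing import Dict, List, Optional

class Vsids:
    def __init__(self, num_vars: int, decay_divisor: float = 2.0):
        # Step 1: one counter per variable and polarity, initialised to 0.
        self.score: Dict[int, float] = {
            lit: 0.0 for v in range(1, num_vars + 1) for lit in (v, -v)}
        self.decay_divisor = decay_divisor

    def on_clause_added(self, clause: List[int]) -> None:
        # Step 2: bump the counter of every literal in the added clause.
        for lit in clause:
            self.score[lit] += 1.0

    def decay(self) -> None:
        # Step 4: periodically divide all counters by a constant.
        for lit in self.score:
            self.score[lit] /= self.decay_divisor

    def pick(self, assignment: Dict[int, bool]) -> Optional[int]:
        # Step 3: highest counter among unassigned variables (first one wins ties).
        free = [lit for lit in self.score if abs(lit) not in assignment]
        return max(free, key=lambda lit: self.score[lit]) if free else None

h = Vsids(num_vars=3)
h.on_clause_added([1, -2])
h.on_clause_added([-2, 3])
print(h.pick({}))         # -2 (score 2.0)
h.decay()                 # older activity now counts half
h.on_clause_added([3, 1])
print(h.pick({}))         # 1 (score 1.5, ahead of ¬x2 at 1.0)
print(h.pick({1: True}))  # 3
```

After the decay, literals from the most recent clause overtake ¬x2, which is the "recent conflicts have higher priority" effect described above.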

Decision Heuristics - Berkmin
• Keep conflict clauses in a stack.
• Choose the first unresolved clause in the stack.
  – If there is no such clause, use VSIDS.
• Choose from this clause a variable + value according to some scoring (e.g. VSIDS).
• This gives absolute priority to conflicts.

Berkmin heuristic
[Figure: the stack of conflict clauses, with the labels "tail" and "first conflict clause".]

More engineering aspects of SAT solvers
Observation: More than 90% of the time, SAT solvers perform Deduction().
Deduction() allocates new implied variables and conflicts.
How can this be done efficiently?

Grasp implements Deduction() with counters
Hold 2 counters for each clause ω:
• val1(ω): # of negative literals assigned 0 in ω + # of positive literals assigned 1 in ω.
• val0(ω): # of negative literals assigned 1 in ω + # of positive literals assigned 0 in ω.

Grasp implements Deduction() with counters
• ω is satisfied iff val1(ω) > 0
• ω is unsatisfied iff val0(ω) = |ω|
• ω is unit iff val1(ω) = 0 ∧ val0(ω) = |ω| - 1
• ω is unresolved iff val1(ω) = 0 ∧ val0(ω) < |ω| - 1
Every assignment to a variable x results in updating the counters for all the clauses that contain x.
Backtracking: same complexity.
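The counter scheme can be sketched as follows. The class and method names are hypothetical and this is not GRASP's actual code; it only shows that both assigning and unassigning a variable touch every clause containing it.

```python
from typing import Dict, List, Tuple

class CounterClauses:
    def __init__(self, clauses: List[List[int]]):
        self.clauses = clauses
        self.val1 = [0] * len(clauses)   # literals currently evaluating to 1
        self.val0 = [0] * len(clauses)   # literals currently evaluating to 0
        self.occurs: Dict[int, List[Tuple[int, int]]] = {}
        for i, clause in enumerate(clauses):
            for lit in clause:
                self.occurs.setdefault(abs(lit), []).append((i, lit))

    def _update(self, var: int, value: bool, delta: int) -> None:
        # Visits every clause that contains var, as described above.
        for i, lit in self.occurs.get(var, []):
            if value == (lit > 0):
                self.val1[i] += delta
            else:
                self.val0[i] += delta

    def assign(self, var: int, value: bool) -> None:
        self._update(var, value, +1)

    def unassign(self, var: int, value: bool) -> None:
        self._update(var, value, -1)     # backtracking: same cost as assigning

    def state(self, i: int) -> str:
        size = len(self.clauses[i])
        if self.val1[i] > 0:
            return "satisfied"
        if self.val0[i] == size:
            return "unsatisfied"
        if self.val0[i] == size - 1:
            return "unit"
        return "unresolved"

db = CounterClauses([[-1, 2], [-1, 3, 9]])
db.assign(1, True)
print(db.state(0), db.state(1))   # unit unresolved
db.assign(9, False)
print(db.state(1))                # unit
db.assign(2, True)
print(db.state(0))                # satisfied
db.unassign(2, True)
print(db.state(0))                # unit
```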

Chaff implements Deduction() with a pair of observers
• Observation: during Deduction(), we are only interested in newly implied variables and conflicts.
• These occur only when the number of literals in ω with value 'false' is greater than |ω| - 2.
• Conclusion: no need to visit a clause unless val0(ω) > |ω| - 2.
• How can this be implemented?

Chaff implements Deduction() with a pair of observers
• Define two 'observers': O1(ω), O2(ω).
• O1(ω) and O2(ω) point to two distinct literals of ω which are not 'false'.
• ω becomes unit if updating one observer leads to O1(ω) = O2(ω).
• Visit a clause only if O1(ω) or O2(ω) becomes 'false'.

Chaff implements Deduction() with a pair of observers
[Figure: a clause with observers O1 and O2 after the successive assignments V[2]=0; V[1]=0; V[5]=0, V[4]=0 (unit clause); then, after a backtrack, V[4] = V[5] = X and V[1] = 1.]
• Both observers of an implied clause are on the highest decision level present in the clause. Therefore, backtracking will un-assign them first.
• Conclusion: when backtracking, observers stay in place.
• Backtracking: no updating. Complexity = constant.

Chaff implements Deduction() with a pair of observers
The choice of observing literals is important. The best strategy is to observe the least frequently updated variables.
The observers method has a learning curve in this respect:
1. The initial observers are chosen arbitrarily.
2. The process shifts the observers away from variables that were recently updated (these variables will most probably be reassigned in a short time).
In our example: the next time v[5] is updated, it will point to a significantly smaller set of clauses.
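A compact sketch of the observer scheme is given below. It keeps the two observed literals in positions 0 and 1 of each clause and visits a clause only when one of them becomes false. The class and method names are hypothetical, every clause is assumed to have at least two literals, and the replay mirrors the assignment sequence of the example figure on an all-positive five-literal clause.

```python
from typing import Dict, List, Optional, Tuple

class WatchedClauses:
    def __init__(self, clauses: List[List[int]]):
        self.clauses = [list(c) for c in clauses]    # positions 0 and 1 are observed
        self.watchers: Dict[int, List[int]] = {}     # literal -> indices of observing clauses
        for i, clause in enumerate(self.clauses):
            for lit in clause[:2]:
                self.watchers.setdefault(lit, []).append(i)

    @staticmethod
    def _value(lit: int, assignment: Dict[int, bool]) -> Optional[bool]:
        var = abs(lit)
        if var not in assignment:
            return None
        return assignment[var] == (lit > 0)

    def on_assign(self, var: int, value: bool,
                  assignment: Dict[int, bool]) -> Tuple[List[Tuple[int, int]], List[int]]:
        """Call after assignment[var] = value. Returns (implied (literal, clause), conflicts)."""
        false_lit = -var if value else var
        implied, conflicts = [], []
        for i in list(self.watchers.get(false_lit, [])):
            c = self.clauses[i]
            if c[0] == false_lit:
                c[0], c[1] = c[1], c[0]              # keep the false observer at position 1
            other = c[0]
            if self._value(other, assignment) is True:
                continue                             # clause already satisfied
            for k in range(2, len(c)):
                if self._value(c[k], assignment) is not False:
                    c[1], c[k] = c[k], c[1]          # move the observer to a non-false literal
                    self.watchers[false_lit].remove(i)
                    self.watchers.setdefault(c[1], []).append(i)
                    break
            else:                                    # no replacement found
                if self._value(other, assignment) is None:
                    implied.append((other, i))       # unit clause: 'other' is implied
                else:
                    conflicts.append(i)
        return implied, conflicts

db = WatchedClauses([[1, 2, 3, 4, 5]])
a: Dict[int, bool] = {}
results = []
for var in (2, 1, 5, 4):
    a[var] = False
    results.append(db.on_assign(var, False, a))
print(results)  # the clause becomes unit only on the last assignment: x3 is implied

# Backtracking just removes assignments; the observers are not touched.
del a[4], a[5]
print(sorted(lit for lit, idx in db.watchers.items() if 0 in idx))  # [3, 4]
```

Note that the assignment to x5 does not visit the clause at all, because x5 is not observed at that point.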

GSAT: stochastic SAT solving
Given a CNF formula φ, choose max_tries and max_flips
    for i = 1 to max_tries {
        T := randomly generated truth assignment
        for j = 1 to max_flips {
            if T satisfies φ return TRUE
            choose v s.t. flipping v's value gives largest increase in the # of satisfied clauses (break ties randomly).
            T := T with v's assignment flipped.
        }
    }
For the "choose v" step, many alternative heuristics exist.
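The loop above translates almost line by line into Python. In this sketch the function returns the satisfying assignment instead of TRUE, and `None` when no try succeeds; the `rng` parameter and the helper `num_satisfied` are additions made for the example. Scores are recomputed from scratch at every flip, which is simple but slow.

```python
import random
from typing import Dict, List, Optional

def num_satisfied(clauses: List[List[int]], T: Dict[int, bool]) -> int:
    return sum(any(T[abs(l)] == (l > 0) for l in c) for c in clauses)

def gsat(clauses: List[List[int]], num_vars: int, max_tries: int, max_flips: int,
         rng: Optional[random.Random] = None) -> Optional[Dict[int, bool]]:
    rng = rng or random.Random()
    for _ in range(max_tries):
        # T := randomly generated truth assignment
        T = {v: rng.random() < 0.5 for v in range(1, num_vars + 1)}
        for _ in range(max_flips):
            if num_satisfied(clauses, T) == len(clauses):
                return T
            # Score every single-variable flip by the resulting # of satisfied clauses.
            score = {}
            for v in T:
                T[v] = not T[v]
                score[v] = num_satisfied(clauses, T)
                T[v] = not T[v]
            best = max(score.values())
            v = rng.choice([u for u, s in score.items() if s == best])  # random tie-break
            T[v] = not T[v]
    return None   # no satisfying assignment found (the formula may still be satisfiable)

formula = [[2, 3], [-1, -4], [-2, 4]]
model = gsat(formula, num_vars=4, max_tries=10, max_flips=20, rng=random.Random(0))
print(model)
```

A `None` result is inconclusive: GSAT is incomplete and cannot prove unsatisfiability.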

Numerous progressing heuristics
• Hill-climbing
• Tabu-list
• Simulated-annealing
• Random-Walk
• Min-conflicts
• ...

Improvement #1: clause weights
• Initial weight of each clause: 1
• Increase by k the weight of unsatisfied clauses.
• Choose v according to the max increase in weight.
Clause weights are another example of a conflict-driven decision strategy.

Improvement #2: Averaging-in
Q: Can we reuse information gathered in previous tries in order to speed up the search?
A: Yes! Rather than choosing T randomly each time, repeat 'good assignments' and choose the rest randomly.

Improvement #2: Averaging-in (cont'd)
Let X1, X2 and X3 be equally wide bit vectors.
Define a function bit_average : X1 × X2 → X3 as follows:
$b_3^i := \begin{cases} b_1^i & \text{if } b_1^i = b_2^i \\ \text{random} & \text{otherwise} \end{cases}$
(where $b_j^i$ is the i-th bit in $X_j$, j ∈ {1, 2, 3})
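A direct transcription of bit_average, with bit vectors represented as Python lists of 0/1 and an optional `rng` argument (both are assumptions of this sketch):

```python
import random
from typing import List, Optional

def bit_average(x1: List[int], x2: List[int],
                rng: Optional[random.Random] = None) -> List[int]:
    """Keep the bits on which x1 and x2 agree; draw the others at random."""
    assert len(x1) == len(x2), "bit vectors must be equally wide"
    rng = rng or random.Random()
    return [b1 if b1 == b2 else rng.randint(0, 1) for b1, b2 in zip(x1, x2)]

x3 = bit_average([1, 0, 1, 1], [1, 0, 0, 1])
print(x3)  # bits 0, 1 and 3 are 1, 0, 1; bit 2 is random
```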

Improvement #2: Averaging-in (cont'd)
Let $T_i^{init}$ be the initial assignment (T) in cycle i.
Let $T_i^{best}$ be the assignment with the highest # of satisfied clauses in cycle i.
• $T_1^{init}$ := random assignment.
• $T_2^{init}$ := random assignment.
• For all i > 2: $T_i^{init} := \text{bit\_average}(T_{i-1}^{best}, T_{i-2}^{best})$