Solving Random Satisfiable 3CNF Formulas in Expected Polynomial Time
M. Krivelevich, D. Vilenchik
SODA 2006

Lecture Outline
- What is expected polynomial time, and some motivation
- The planted SAT distribution and related work
- Description of our algorithm
- Outline of the analysis
- Open problems

Why Consider Probabilistic Models?
- Many interesting problems are known to be NP-hard
- Hardness results only show that there exist hard instances
- This should not discourage us from trying to design heuristics that work well for "almost all" instances
- For rigorous analysis, define "almost all" in a meaningful way
- One possibility: use probabilistic models such as $G_{n,p}$

Expected Polynomial Time
- $D$: a distribution on the inputs
- An algorithm works whp (with high probability) over $D$ if it succeeds whp when the instance is sampled according to $D$
- Such an algorithm may fail completely on some instances
- E.g. the Greedy Coloring Algorithm:
  - Fix the vertices in some arbitrary order
  - For every vertex, assign the minimal possible color

Expected Polynomial Time
- Greedy uses whp at most $n/\log n$ colors for $G_{n,1/2}$ [GM 75]
- $\chi(G_{n,1/2}) \sim n/(2\log n)$ whp
Therefore,
- Greedy yields whp a 2-approximation of $\chi(G)$ for $G \in G_{n,1/2}$
However,
- Let $G = K_{n/2,n/2}$ minus some perfect matching
- Greedy uses $n/2$ colors if the vertices are ordered according to the matching
- $\chi(G) = 2$, so greedy fails completely

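The counterexample above can be checked on a small instance. The following is a minimal sketch, not code from the talk; the vertex labels `("a", i)` and `("b", i)` and the helper names are assumptions made for illustration.

```python
def greedy_coloring(order, adj):
    """Color vertices in the given order, each with the smallest free color."""
    color = {}
    for v in order:
        used = {color[u] for u in adj[v] if u in color}
        c = 0
        while c in used:
            c += 1
        color[v] = c
    return color


def bipartite_minus_matching(half):
    """K_{half,half} minus the perfect matching {(a_i, b_i)}.

    Vertices are ("a", i) and ("b", i); a_i is adjacent to b_j iff i != j.
    """
    adj = {}
    for i in range(half):
        adj[("a", i)] = {("b", j) for j in range(half) if j != i}
        adj[("b", i)] = {("a", j) for j in range(half) if j != i}
    return adj


half = 5
adj = bipartite_minus_matching(half)

# Bad order: matched pairs appear consecutively (a_0, b_0, a_1, b_1, ...).
bad_order = [(side, i) for i in range(half) for side in ("a", "b")]
bad = greedy_coloring(bad_order, adj)

# Good order: all of side a first, then all of side b.
good_order = [("a", i) for i in range(half)] + [("b", i) for i in range(half)]
good = greedy_coloring(good_order, adj)

colors_bad = len(set(bad.values()))
colors_good = len(set(good.values()))
```

With the matching order, each pair $a_i, b_i$ sees all earlier colors and needs a fresh one, so `colors_bad` equals `half`, while the side-by-side order needs only two colors.
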
Expected Polynomial Time Cont.
Alternatively, demand success for all instances while keeping an overall average polynomial time. Formally:
Def. An algorithm $A$ with running time $t_A(I)$ on instance $I$ runs in expected polynomial time over a distribution $D$ if $\sum_I \Pr_D[I] \cdot t_A(I)$ is polynomial in $n$, where the sum ranges over all instances $I$ of size $n$.

Expected Polynomial Time Cont.
- To achieve this, separate "easy" instances (which can be handled in polynomial time) from "hard" ones (rare, but possibly requiring super-polynomial time)
- Requires a better understanding of the probability space
- Encourages efficient, natural and more robust algorithms

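A small worked bound may help show why the easy/hard split is enough. The numbers here are hypothetical and chosen only for illustration: assume easy instances are handled in time at most $n^c$, hard instances have total probability at most $2^{-n}$, and each hard instance takes time at most $2^n n^c$.

```latex
\begin{align*}
\mathbb{E}_D[t_A]
  &= \sum_{I \text{ easy}} \Pr_D[I]\, t_A(I) + \sum_{I \text{ hard}} \Pr_D[I]\, t_A(I) \\
  &\le 1 \cdot n^{c} + 2^{-n} \cdot 2^{n} n^{c} = 2 n^{c}.
\end{align*}
```

The exponential cost of the hard instances is cancelled by their exponentially small probability, so the expectation stays polynomial even though the worst case is not.
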
What's Next?
- What is expected polynomial time, and some motivation
- The planted SAT distribution and related work
- Description of our algorithm
- Outline of the analysis
- Open problems

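3SAT: Definition
- 3CNF form: $(x_1 \vee x_2 \vee \neg x_5) \wedge (x_3 \vee \neg x_4 \vee \neg x_1) \wedge (x_1 \vee x_2 \vee x_6) \wedge \dots$
  - Each $x_i$ or $\neg x_i$ is a literal; each parenthesized disjunction of three literals is a clause
- Partial truth assignment: every variable among $x_1, \dots, x_6$ is set to T, F, or * (unassigned)
- 3SAT = {all satisfiable 3CNF formulas}
- 3SAT is NP-complete [Cook 71]
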
Different SAT Distributions
- (Arguably) the most natural distribution is $P_{n,p}$, the analog of $G_{n,p}$
  - Include every possible clause w.p. $p = p(n)$
- Consider the clauses/variables ratio: expected number of clauses / $n$
- Satisfiability shows sharp threshold behavior [Fri 99]
  - Ratio < 3.42: almost all instances are satisfiable [KKL 02]
  - Ratio > 4.5: almost all instances are unsatisfiable [KKS+01]
- Our focus is ratio $= d$, with $d$ a sufficiently large constant

Different SAT Distributions
- $P_{n,p}$ is not interesting at such ratios (for satisfiability algorithms)
Alternatively:
- Consider distributions over satisfiable instances
- One possibility is $PSAT_{n,p}$, where $PSAT_{n,p}(I) = P_{n,p}(I \mid I \text{ is sat.})$
- $PSAT_{n,p}$ is hard to sample (experimentally)
- $PSAT_{n,p}$ seems hard to tackle rigorously (no efficient algorithm is known for clauses/variables ratio $o(\log n)$)

Different SAT Distributions
- Planted SAT can serve as an intermediate step towards $PSAT_{n,p}$
- It is interesting and well studied in its own right
  - It is the analog of Planted k-Coloring [BS 95], [AK 97] and Planted Clique [AKS 98], [FK 00]
  - It is a random distribution over satisfiable 3CNF formulas with arbitrarily large clauses/variables ratio
  - It can be efficiently sampled

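The Planted 3SAT Distribution
- Generating an instance:
  - Randomly pick a truth assignment $\varphi$
  - Include every clause satisfied by $\varphi$ w.p. $p = d/n^2$
- E.g. $\varphi$: $x_1 = T$, $x_2 = F$, $x_3 = T$, $x_4 = F$, $x_5 = T$, $x_6 = F$
  $(x_1 \vee x_2 \vee \neg x_5) \wedge (x_3 \vee \neg x_4 \vee x_1) \wedge (\neg x_1 \vee x_2 \vee x_6) \wedge \dots$
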
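The generation procedure above can be sketched directly. This is an illustrative sampler, not the authors' code: the DIMACS-style integer literals (`+i` for $x_i$, `-i` for $\neg x_i$) and the function names are assumptions, and the cubic enumeration of candidate clauses is only practical for small $n$.

```python
import itertools
import random


def is_satisfied(clause, assignment):
    """True if at least one literal of the clause is true under the assignment."""
    return any(assignment[abs(lit)] == (lit > 0) for lit in clause)


def sample_planted_3sat(n, d, rng):
    """Sample (phi, clauses) from the planted 3SAT distribution with p = d / n**2.

    Literals are integers: +i means x_i, -i means NOT x_i
    (a representation chosen for this sketch, not taken from the slides).
    """
    p = d / n**2
    # Planted assignment phi: variable index -> bool, chosen uniformly at random.
    phi = {i: rng.random() < 0.5 for i in range(1, n + 1)}
    clauses = []
    for trio in itertools.combinations(range(1, n + 1), 3):
        for signs in itertools.product((1, -1), repeat=3):
            clause = tuple(s * v for s, v in zip(signs, trio))
            # Keep only clauses that phi satisfies, each independently w.p. p.
            if is_satisfied(clause, phi) and rng.random() < p:
                clauses.append(clause)
    return phi, clauses


rng = random.Random(0)
phi, clauses = sample_planted_3sat(n=20, d=10, rng=rng)
```

Every returned clause is satisfied by `phi` by construction, which is what makes the distribution easy to sample, in contrast to $PSAT_{n,p}$.
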
Planted Distributions: Related Work
- [KP 92]: greedy variable assignment, $p \ge d/n$; (implicitly) works in expected polynomial time
- [AK 97]: spectral technique for coloring sparse planted 3-colorable graphs ($np = d$)
- [BSBG 02]: majority vote suffices for $p \ge d \cdot \log n / n^2$
- [Fla 03]: techniques similar to [AK 97], solves whp planted 3SAT, $p \ge d/n^2$

Related Work Cont.
- [CO 04]: SDP-based expected polynomial time algorithm for (semi-random) planted k-colorable graphs, $np \ge d \cdot k \cdot \log n$
- [Böt 05]: SDP-based expected polynomial time algorithm for planted k-colorable graphs, $np \ge d \cdot k^2$

What's Next?
- What is expected polynomial time, and some motivation
- The planted SAT distribution and related work
- Description of our algorithm
- Outline of the analysis
- Open problems

Our Results
- An algorithm that decides 3SAT
- Expected polynomial running time over planted 3SAT, $p = d/n^2$
- The result extends to any constant $k$ (in which case $d$ depends on $k$)
- First work to address the issue of expected polynomial time algorithms for satisfiable SAT distributions

Algorithm: General Outline
The algorithm proceeds in 2 steps:
1. Find a partial correct solution containing a large fraction of the variables (always poly. time)
   - "Correct" means it coincides with the planted solution
   - Typically, all but a small constant fraction, $e^{-\Omega(d)}$, of the variables
2. a. Try to complete the partial solution to a satisfying assignment
   b. If not possible, gradually fix the partial solution until step 2.a ends up successfully
   (steps a+b run in expected poly. time)
By contrast, most expected poly. time heuristics discard the solution and exhaustively search for a correct one.

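Algorithm: Basic Ingredients
The Majority Vote: assign every variable according to the polarity in which it appears most often.
E.g. $(x_1 \vee x_2 \vee \neg x_3) \wedge (x_4 \vee x_2 \vee \neg x_1) \wedge (\neg x_1 \vee x_2 \vee x_4) \wedge (x_3 \vee \neg x_2 \vee x_4)$
Majority Vote: $x_1 = F$, $x_2 = T$, $x_3 = T$, $x_4 = T$
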
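The Majority Vote example above can be reproduced in a few lines. This is a minimal sketch under assumed conventions: literals are integers (`+i` for $x_i$, `-i` for $\neg x_i$), and ties are broken toward True, which is an assumption that happens to agree with $x_3$ in the example.

```python
from collections import Counter


def majority_vote(clauses, variables):
    """Assign each variable the polarity in which it appears most often.

    Ties are broken toward True (an assumption of this sketch).
    """
    balance = Counter()
    for clause in clauses:
        for lit in clause:
            # +1 for a positive occurrence, -1 for a negated one.
            balance[abs(lit)] += 1 if lit > 0 else -1
    return {v: balance[v] >= 0 for v in variables}


# The formula from the Majority Vote slide.
formula = [(1, 2, -3), (4, 2, -1), (-1, 2, 4), (3, -2, 4)]
maj = majority_vote(formula, variables=[1, 2, 3, 4])
```

Here $x_1$ appears once positively and twice negated, so it is set to F; $x_3$ is tied and falls to the tie-breaking rule.
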
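Basic Ingredients Cont.
The Unassignment Procedure:
- If $C = (x \vee \neg y \vee z)$ evaluates under an assignment $\psi$ to $(T \vee F \vee F)$, then $x$ supports $C$ w.r.t. $\psi$: the literal of $x$ is the only true literal in $C$
- Note: all three variables are assigned by $\psi$
- Repeatedly unassign every variable that supports fewer than $t$ clauses
- E.g. unassignment with threshold $t = 1$ on
  $(x_1 \vee x_2 \vee \neg x_3) \wedge (x_4 \vee x_2 \vee \neg x_1) \wedge (\neg x_1 \vee x_2 \vee \neg x_4) \wedge (x_3 \vee \neg x_1 \vee \neg x_4)$
  (on the slide, values are replaced by * as their variables become unassigned)
- Unassignment stops when all remaining assigned variables support at least $t$ clauses
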
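The unassignment loop can be sketched as follows. This is an illustration rather than the authors' implementation: the integer literal encoding and function names are assumptions, and the starting assignment $x_1 = T$, $x_2 = F$, $x_3 = T$, $x_4 = T$ used in the demo is chosen for illustration, since the slide's own starting values did not survive extraction.

```python
def supports(var, clause, assignment):
    """True if var's literal is the only true literal in the clause.

    All three variables must be assigned (present in `assignment`).
    """
    if any(abs(lit) not in assignment for lit in clause):
        return False
    true_lits = [lit for lit in clause if assignment[abs(lit)] == (lit > 0)]
    return len(true_lits) == 1 and abs(true_lits[0]) == var


def unassign(clauses, assignment, threshold):
    """Repeatedly unassign variables supporting fewer than `threshold` clauses."""
    partial = dict(assignment)
    changed = True
    while changed:
        changed = False
        for var in list(partial):
            support = sum(supports(var, c, partial) for c in clauses)
            if support < threshold:
                del partial[var]
                changed = True
    return partial


# Formula from the unassignment slide; the starting assignment is assumed.
formula = [(1, 2, -3), (4, 2, -1), (-1, 2, -4), (3, -1, -4)]
start = {1: True, 2: False, 3: True, 4: True}
result = unassign(formula, start, threshold=1)
```

With this starting assignment, $x_2$ supports no clause and is unassigned first; that removes the support of $x_1$ and $x_4$, and finally of $x_3$, so `result` ends up empty. Unassigning one variable can cascade, which is why the loop runs until nothing changes.
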
Basic Ingredients Cont.
The Exhaustive Search:
- Given a 3CNF formula $I$, define its induced graph $G_I = (V, E)$:
  - $V = \{x_1, x_2, \dots, x_n\}$, the set of variables
  - $(x_i, x_j) \in E$ if there exists a clause $C$ containing both (polarity disregarded)
- Given $I$, find the connected components of $G_I$
- Search every component separately for a satisfying assignment
- If every component is of size $O(\log n)$, the procedure is polynomial

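The component-wise search above can be sketched as below. This is a simplified illustration with assumed names and the same integer literal encoding; it runs on a whole formula, whereas in the algorithm the search is applied to the part of the formula induced by the unassigned variables.

```python
import itertools


def components(variables, clauses):
    """Connected components of the induced graph, via union-find."""
    parent = {v: v for v in variables}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    for clause in clauses:
        first = find(abs(clause[0]))
        for lit in clause[1:]:
            parent[find(abs(lit))] = first
    groups = {}
    for v in variables:
        groups.setdefault(find(v), []).append(v)
    return list(groups.values())


def solve_by_components(variables, clauses):
    """Brute-force each component separately; return an assignment or None."""
    assignment = {}
    for comp in components(variables, clauses):
        comp_set = set(comp)
        # All variables of a clause lie in one component, so one check suffices.
        local = [c for c in clauses if abs(c[0]) in comp_set]
        for values in itertools.product((False, True), repeat=len(comp)):
            trial = dict(zip(comp, values))
            if all(any(trial[abs(l)] == (l > 0) for l in c) for c in local):
                assignment.update(trial)
                break
        else:
            return None  # this component is unsatisfiable
    return assignment


variables = list(range(1, 8))
formula = [(1, 2, -3), (-1, -2, 3), (4, 5, 6), (-4, -5, -6)]
solution = solve_by_components(variables, formula)
```

A component of size $s$ costs at most $2^s$ trials, so components of size $O(\log n)$ give a polynomial total, whereas a single brute force over all variables would cost $2^n$.
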
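Basic Ingredients: Motivation
- Assume the input is sampled according to planted 3SAT
- Suppose $\varphi(x) = T$
- In every clause, $x$ appears w.p. $4/7 \cdot 3/n$, and $\neg x$ w.p. $3/7 \cdot 3/n$
Therefore,
- Majority Vote approximates $\varphi$ closely whp
- But we also expect the majority to wrongly assign some variables whp (a small fraction)
- We call a variable wrongly assigned by Majority a wrong variable
- Suppose a wrongly assigned variable $x$ survives unassignment
  - It supports a clause that evaluates to $(T \vee F \vee F)$ under the current assignment, while the true value of its literal is F
  - Since $\varphi$ satisfies the clause, there must be another wrong variable in it, and that variable also survives the unassignment
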
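The $4/7$ versus $3/7$ bias can be made concrete by counting candidate clauses. The following derivation is a sketch added for illustration; it assumes $\varphi(x) = T$, $p = d/n^2$, and uses $\binom{n-1}{2} \approx n^2/2$.

```latex
\begin{align*}
\mathbb{E}[\#\{C \ni x\}]      &= 4\binom{n-1}{2}\frac{d}{n^2} \approx 2d, \\
\mathbb{E}[\#\{C \ni \neg x\}] &= 3\binom{n-1}{2}\frac{d}{n^2} \approx \tfrac{3}{2}d, \\
\mathbb{E}[\#\{C : x \text{ supports } C \text{ w.r.t. } \varphi\}] &= \binom{n-1}{2}\frac{d}{n^2} \approx \tfrac{d}{2}.
\end{align*}
```

A clause containing the literal $x$ is satisfied for all 4 sign patterns of the other two variables, one containing $\neg x$ only for the 3 patterns in which the other two literals are not both false, and $x$ is the unique true literal in exactly 1 pattern. The positive occurrences therefore lead by about $d/2$ in expectation, which is what Majority Vote exploits.
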
Motivation Cont.
- $W$: the set of wrong variables surviving unassignment
- There exist at least $t \cdot |W|$ clauses, each containing at least 2 variables from $W$ (each clause is counted once, as the support is unique)
- We call such $W$ dense
- If $|W|$ is small, this is analogous to a small subgraph with atypically high average degree
- This happens with small probability in random graphs $G_{n,p}$

Algorithm: General Outline
The algorithm basically proceeds in 2 steps:
1. Find a partial correct solution containing a large fraction of the variables (Majority Vote + Unassignment)
2. a. Try to complete the partial solution to a satisfying assignment (Exhaustive Search)
   b. If not possible, gradually fix the partial solution until step 2.a ends up successfully (make sure the algorithm always succeeds)

Putting Everything Together
Algorithm SAT(I):
1. MAJ $\leftarrow$ Majority Vote of $I$.
2. Carry out unassignment with threshold $0.999\,d/2$ w.r.t. MAJ ($d/2$ is the expected support).
3. Let $\pi$ be the partial assignment.
4. Let $U$ be the set of unassigned variables.
5. Construct $G = (U, E)$.
6. For all subsets $Y \subseteq V \setminus U$, $|Y| = 0, \dots, |V \setminus U|$, and for all possible assignments $\psi_Y$ of $Y$ ($Y$ is the fixing set of variables):
   1. Fix $\pi$ according to $\psi_Y$.
   2. Using exhaustive search on $G = (U, E)$, try to complete $\pi$ to a satisfying assignment.
   3. If success, return the assignment.
Labels shown on the slide: completeness, soundness.

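The six steps above can be assembled into one small program. This is a hedged sketch and not the authors' implementation: the names are assumptions, literals are integers (`+i` for $x_i$, `-i` for $\neg x_i$), ties in the vote go to True, and the completion step brute-forces all unassigned variables at once instead of searching each component of $G$ separately.

```python
import itertools
from collections import Counter


def lit_true(lit, assignment):
    return assignment[abs(lit)] == (lit > 0)


def majority_vote(clauses, variables):
    balance = Counter()
    for clause in clauses:
        for lit in clause:
            balance[abs(lit)] += 1 if lit > 0 else -1
    return {v: balance[v] >= 0 for v in variables}


def unassign(clauses, assignment, threshold):
    partial = dict(assignment)
    changed = True
    while changed:
        changed = False
        for var in list(partial):
            support = 0
            for c in clauses:
                if all(abs(l) in partial for l in c):
                    true_lits = [l for l in c if lit_true(l, partial)]
                    support += len(true_lits) == 1 and abs(true_lits[0]) == var
            if support < threshold:
                del partial[var]
                changed = True
    return partial


def complete(clauses, partial, free):
    """Brute force over the free variables (the slides search per component)."""
    for values in itertools.product((False, True), repeat=len(free)):
        full = {**partial, **dict(zip(free, values))}
        if all(any(lit_true(l, full) for l in c) for c in clauses):
            return full
    return None


def sat(clauses, variables, threshold):
    maj = majority_vote(clauses, variables)            # step 1
    pi = unassign(clauses, maj, threshold)             # steps 2-3
    free = [v for v in variables if v not in pi]       # step 4: the set U
    assigned = [v for v in variables if v in pi]
    for size in range(len(assigned) + 1):              # step 6: fixing sets Y
        for fixing_set in itertools.combinations(assigned, size):
            for values in itertools.product((False, True), repeat=size):
                fixed = {**pi, **dict(zip(fixing_set, values))}
                result = complete(clauses, fixed, free)
                if result is not None:
                    return result
    return None  # reached only if the formula is unsatisfiable


# All clauses on x1, x2, x3 except (x1 v x2 v x3): only all-False satisfies them.
planted = [c for c in itertools.product((1, -1), (2, -2), (3, -3)) if c != (1, 2, 3)]
answer = sat(planted, [1, 2, 3], threshold=1)
```

Because the fixing sets grow up to all assigned variables, the loop eventually tries every full assignment, so the procedure always finds a solution when one exists; the analysis argues that on planted instances it typically stops at the empty fixing set.
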
What's Next?
- What is expected polynomial time, and some motivation
- The planted SAT distribution and related work
- Description of our algorithm
- Outline of the analysis
- Open problems

Analyzing the Running Time
Algorithm SAT(I):
1. MAJ $\leftarrow$ Majority Vote of $I$.
2. Carry out unassignment with threshold $0.999\,d/2$ w.r.t. MAJ.
3. Let $\pi$ be the partial assignment.
4. Let $U$ be the set of unassigned variables.
5. Construct $G = (U, E)$.
   (Steps 1-5: always polynomial; in fact expected linear time.)
6. For all subsets $Y \subseteq V \setminus U$, and for all possible assignments $\psi_Y$ of $Y$ (expected to perform $O(1)$ times):
   1. Fix $\pi$ according to $\psi_Y$.
   2. Using exhaustive search on $G = (U, E)$, try to complete $\pi$ to a satisfying assignment (expected running time $O(n^{1+\dots})$).
   3. If success, return the assignment.

Analysis Outline
Typically (for Planted 3SAT), the following happens:
- The distance between MAJ and the planted assignment is $e^{-\Omega(d)} n$
- Almost all correct variables, $(1 - e^{-\Omega(d)})\,n$, survive unassignment
- Only correct variables survive the unassignment ("density" arguments)
- $G = (U, E)$ breaks down into $O(\log n)$-size connected components (arguments similar to $G_{n,p}$, $np < 1$)
Therefore,
- Exhaustive search is successful and polynomial

Analysis Outline
- What can go wrong, preventing successful execution?
- Wrong variables survived the unassignment:
  - The partial assignment induces a $(F \vee F \vee F)$ clause
  - The formula induced by the unassigned variables is not satisfiable
- $Y_0$: the set of fixing variables with which the algorithm ends
- Typically, $Y_0 = \emptyset$

Analysis Outline Cont.
Key observation: if $Y_0 \ne \emptyset$ then:
1. The Majority Vote is wrong for at least $|Y_0|$ variables (otherwise, the algorithm would have ended with a smaller set $Y' \subset Y_0$)
2. $Y_0$ is a dense set of variables
   - Suppose $x \in Y_0$; then $x$ survives the unassignment, so $x$ supports ~$d/2$ clauses
   - In such a clause, which evaluates to $(T \vee F \vee F)$ while the true value of $x$'s literal is F, another variable $y$ must be wrong
   - Then $y \in Y_0$; otherwise, the algorithm cannot end
Consequently:
- For "large" $|Y_0|$, (1) happens with small probability
- For "small" $|Y_0|$, (2) happens with small probability
- It remains to carry out the exact calculations

A Taste of Rigorous Analysis
The following properties hold whp for Planted 3SAT:
- Let $\varepsilon_0 = e^{-d/C_0}$
- $F_{MAJ}$: the set of variables on which MAJ and $\varphi$ disagree
- Claim: for $y \ge \varepsilon_0 n$, $\Pr[|F_{MAJ}| \ge y] \le e^{-yd/C_1}$
- For $J \subseteq V$, $F(J)$ is the set of clauses in $I$ containing at least 2 variables from $J$
- Claim: $\Pr[\exists J,\ |J| \le \varepsilon_0 n,\ |F(J)| \ge |J| d/3] \le e^{-|J| \log(n/|J|)\, d/12}$
- The properties are proved using standard probabilistic techniques (union bound, Chernoff)

											A Taste of Rigorous Analysis The expected number of fixing iterations is at most:
Open Problems
- [FV 04] show a k-opt based heuristic solving whp Planted 3SAT, $p = d/n^2$
  - Change the k-opt version to run in expected polynomial time
  - Challenge: no explicit distinction between wrong and correct variables
- Simplify [Böt 05], e.g. by replacing the SDP approximation with a simpler and stronger procedure (similar to Majority Vote)
- Design an efficient algorithm for random (not planted) satisfiable formulas, $p = d/n^2$