Giorgi Japaridze Theory of Computability NP-completeness Section 7.4

7.4.a Importance

NP-complete problems form an important subclass of NP. The phenomenon of NP-completeness was discovered in the early 1970s by Stephen Cook and Leonid Levin.

• If a polynomial time algorithm exists for any of the NP-complete problems, then all problems in NP are polynomial time solvable.
• To prove that P=NP, it would be sufficient to take any particular NP-complete problem A and show that A ∈ P.
• To prove that P≠NP, it would be sufficient to take any particular NP-complete problem A and show that A ∉ P.
• On the practical side, finding that a given problem A is NP-complete may prevent wasting time looking for a polynomial time algorithm for A (which probably does not exist, or is unlikely to be found even if it does).

7.4.b Boolean formulas

Boolean variables x, y, … take one of the two values 0 (false) or 1 (true).

Boolean operations: ¬ (NOT), ∧ (AND), ∨ (OR). A bar over A is an alternative way of writing ¬A.

Boolean formulas are constructed from variables and operations in the standard way. Once a truth assignment for the variables is given, the value of a compound formula is calculated as follows:

¬0=1   ¬1=0
0∧0=0   0∧1=0   1∧0=0   1∧1=1
0∨0=0   0∨1=1   1∨0=1   1∨1=1

If x=0 and y=1, what is the value of a given formula? Substitute the values for the variables, then evaluate from the innermost parentheses outward.

[Figure: an example formula over x and y with its step-by-step evaluation; the connectives were lost in extraction.]
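The truth tables above can be sketched in a few lines of Python. The function names and the sample formula (¬x ∧ y) ∨ (x ∧ ¬y) are illustrative assumptions and are not taken from the slide.

```python
from itertools import product

def NOT(a: int) -> int:
    return 1 - a

def AND(a: int, b: int) -> int:
    return a & b

def OR(a: int, b: int) -> int:
    return a | b

def truth_table(op):
    """Map every pair of inputs (a, b) to op(a, b)."""
    return {(a, b): op(a, b) for a, b in product((0, 1), repeat=2)}

def sample_formula(x: int, y: int) -> int:
    # Hypothetical example formula: (NOT x AND y) OR (x AND NOT y)
    return OR(AND(NOT(x), y), AND(x, NOT(y)))

if __name__ == "__main__":
    print(truth_table(AND))
    print(truth_table(OR))
    print(sample_formula(0, 1))
```

Evaluating sample_formula(0, 1) follows the same innermost-first order a hand calculation would use.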

7.4.c The SAT problem

We say that a Boolean formula is satisfiable iff there is an assignment of 0s and 1s to its variables that makes the formula evaluate to 1.

Are the following formulas satisfiable?

[Figure: two example formulas over x and y; their connectives were lost in extraction.]

• One of them: Yes. A (in fact, the only) satisfying assignment is x=0, y=1.
• The other: No.

SAT = {<φ> | φ is a satisfiable Boolean formula}

SAT ∈ P? Nobody knows!
SAT ∈ NP? Yes. A satisfying assignment can serve as a membership certificate.
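A minimal sketch of the two notions on this slide: searching for satisfying assignments (exponential brute force) and checking a certificate (one evaluation). The formulas phi = ¬x ∧ (x ∨ y) and psi = x ∧ ¬x are assumptions chosen for illustration; phi happens to have x=0, y=1 as its only satisfying assignment.

```python
from itertools import product

def satisfying_assignments(formula, variables):
    """Try all 2^n assignments; return those that make the formula true."""
    found = []
    for values in product((0, 1), repeat=len(variables)):
        assignment = dict(zip(variables, values))
        if formula(assignment):
            found.append(assignment)
    return found

def verify(formula, certificate):
    """Certificate check: a single evaluation of the formula."""
    return bool(formula(certificate))

# Hypothetical example formulas, encoded as Python functions of an assignment.
phi = lambda a: (not a["x"]) and (a["x"] or a["y"])   # NOT x AND (x OR y)
psi = lambda a: a["x"] and not a["x"]                 # x AND NOT x

if __name__ == "__main__":
    print(satisfying_assignments(phi, ["x", "y"]))
    print(satisfying_assignments(psi, ["x"]))
    print(verify(phi, {"x": 0, "y": 1}))
```

The search loop is what nobody knows how to make polynomial; verify is the cheap part that puts SAT in NP.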

7.4.d Polynomial time reducibility

Definition 7.28 A polynomial time computable function is a function computed by some polynomial time TMO.

Definition 7.29 Let A and B be languages over an alphabet Σ. A polynomial time reduction of A to B is a polynomial time computable function f: Σ* → Σ* such that, for every string w ∈ Σ*, w ∈ A iff f(w) ∈ B. When such an f exists, we say that A is polynomial time reducible to B, written A ≤_P B.

Theorem 7.31 If A ≤_P B and B ∈ P, then A ∈ P.

Proof. Assume M is a polynomial time decider for B, and f is a polynomial time reduction from A to B. The following is a polynomial time algorithm deciding A:

N = “On input w:
1. Compute f(w).
2. Run M on input f(w) and do (accept or reject) whatever M does.”

N is correct because w ∈ A iff f(w) ∈ B. N runs in polynomial time because step 1 is polynomial in |w|, hence |f(w)| is polynomial in |w|, and step 2 is polynomial in |f(w)|; a polynomial of a polynomial is again a polynomial.
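The shape of the algorithm N can be sketched as follows. The languages here are toy assumptions made up for illustration: A is the set of strings over {a, b} with an even number of a's, B is the set of binary strings with an even number of 1s, and f renames a to 1 and b to 0.

```python
def f(w: str) -> str:
    """Hypothetical reduction from A to B: rename a -> 1, b -> 0 (linear time)."""
    return w.translate(str.maketrans("ab", "10"))

def M(u: str) -> bool:
    """Hypothetical decider for B: accept iff u has an even number of 1s."""
    return u.count("1") % 2 == 0

def N(w: str) -> bool:
    """Decider for A as in the proof: compute f(w), then do whatever M does."""
    return M(f(w))

if __name__ == "__main__":
    for w in ["", "ab", "abba", "aaab"]:
        print(repr(w), N(w))
```

N never inspects w directly; everything it knows about A comes through f and M.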

7.4.e The 3SAT problem

• A literal is a Boolean variable x or a negated Boolean variable ¬x.
• A clause is several literals connected with ∨s, as in (x ∨ y ∨ z ∨ t).
• A Boolean formula is in conjunctive normal form, called a cnf-formula, if it comprises several clauses connected with ∧s, as in (x ∨ y ∨ z ∨ t) ∧ (x ∨ z) ∧ (x ∨ y ∨ t).
• A cnf-formula is a 3cnf-formula if all the clauses have 3 literals, as in (x ∨ y ∨ z) ∧ (x ∨ z ∨ t) ∧ (x ∨ y ∨ t) ∧ (z ∨ y ∨ t).
• 3SAT = {<φ> | φ is a satisfiable 3cnf-formula}

(Negation bars over some of the literals in these examples may have been lost in extraction; the examples are valid clauses either way.)

3SAT ∈ P? Nobody knows!
3SAT ∈ NP? Yes.

7.4.f Reducing 3SAT to CLIQUE (a)

Theorem 7.32 3SAT is polynomial time reducible to CLIQUE.

Proof. Let φ be a 3cnf-formula with k clauses, such as

φ = (a_1 ∨ b_1 ∨ c_1) ∧ (a_2 ∨ b_2 ∨ c_2) ∧ … ∧ (a_k ∨ b_k ∨ c_k)

Our reduction f is going to generate the string <G,k>, where G is an undirected graph defined as follows.

The nodes of G are organized into k groups of three nodes each, called the triples t_1, …, t_k. Each triple corresponds to one of the clauses in φ, and each node in a triple corresponds to a literal in the associated clause. Label each node of G with its corresponding literal in φ.

The edges of G connect all but two types of pairs of nodes:
(1) no edge is present between two nodes in the same triple, and
(2) no edge is present between two nodes with contradictory labels, as in x and ¬x.
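The construction of G can be sketched in Python. The encoding is an assumption made here for illustration: a literal is a nonzero integer (v for a variable, -v for its negation), a node is a pair (triple index, position), and has_clique is an exponential brute-force check meant only for tiny examples.

```python
from itertools import combinations

def reduce_3sat_to_clique(clauses):
    """clauses: list of 3-tuples of nonzero ints. Returns (nodes, label, edges, k)."""
    k = len(clauses)
    nodes = [(i, p) for i in range(k) for p in range(3)]   # (triple, position)
    label = {(i, p): clauses[i][p] for (i, p) in nodes}
    edges = set()
    for u, v in combinations(nodes, 2):
        same_triple = u[0] == v[0]                 # rule (1)
        contradictory = label[u] == -label[v]      # rule (2)
        if not same_triple and not contradictory:
            edges.add(frozenset((u, v)))
    return nodes, label, edges, k

def has_clique(nodes, edges, k):
    """Brute force over all k-subsets of nodes; exponential, for small checks only."""
    return any(
        all(frozenset(pair) in edges for pair in combinations(group, 2))
        for group in combinations(nodes, k)
    )

if __name__ == "__main__":
    # Hypothetical example: (x1 v x1 v x2) ^ (~x1 v ~x2 v ~x2) ^ (~x1 v x2 v x2)
    phi = [(1, 1, 2), (-1, -2, -2), (-1, 2, 2)]
    nodes, label, edges, k = reduce_3sat_to_clique(phi)
    print(len(nodes), len(edges), has_clique(nodes, edges, k))
```

The reduction itself is the double loop over node pairs, which is quadratic in the number of nodes; only the clique check is expensive.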

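7.4.g Reducing 3SAT to CLIQUE (b)

For instance, for the example formula φ shown on the slide, G is the graph in the figure.

[Figure: an example 3cnf-formula φ over the variables x and z, and the corresponding graph G with nine nodes labeled by the literals of φ. The negation bars of the formula and the drawing of the edges are not recoverable.]

Obviously, transforming φ into G takes polynomial time. Next we argue that (slide 7.4.h) if φ ∈ 3SAT, then <G,k> ∈ CLIQUE, and that (slide 7.4.i) if <G,k> ∈ CLIQUE, then φ ∈ 3SAT. So, we indeed have a polynomial time reduction.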

7.4.h Reducing 3SAT to CLIQUE (c)

[Figure: the example formula φ and graph G from slide 7.4.g.]

Suppose φ has a satisfying assignment. Then at least one literal must be true in each clause. Select one such literal in each clause, and select the corresponding nodes in the graph. Those nodes form a k-clique: there are k of them, and each pair is connected by an edge, because the two nodes are in different triples and their labels are non-contradictory (both are true under the same assignment).

7.4.i Reducing 3SAT to CLIQUE (d)

[Figure: the example formula φ and graph G from slide 7.4.g, with the assignment x=0, z=1.]

Now suppose G has a k-clique. Its nodes must lie in pairwise different triples, as there are no edges within triples. Since there are k nodes and k triples, each triple has exactly one node of the clique. Select the corresponding literals in φ, and select an assignment that makes each such literal true. This is possible (why?). Then the same assignment makes φ true, because every clause contains a true literal.
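The step from a clique back to an assignment can be sketched as below. The encoding is an assumption for illustration: literals are nonzero integers (v for a variable, -v for its negation), and the clique is given simply as the list of its node labels, one per clause. Variables not mentioned in the clique are set to False, an arbitrary choice.

```python
def assignment_from_clique(clique_labels, variables):
    """Make every literal in the clique true; other variables are set arbitrarily."""
    assignment = {v: False for v in variables}
    for lit in clique_labels:
        # Rule (2) of the construction: a clique never holds both x and NOT x,
        # so the assignments below never conflict.
        assert -lit not in clique_labels
        assignment[abs(lit)] = lit > 0
    return assignment

def satisfies(clauses, assignment):
    """True iff every clause contains at least one literal that is true."""
    return all(
        any(assignment[abs(lit)] == (lit > 0) for lit in clause)
        for clause in clauses
    )

if __name__ == "__main__":
    # Hypothetical example: (x1 v x1 v x2) ^ (~x1 v ~x2 v ~x2) ^ (~x1 v x2 v x2)
    phi = [(1, 1, 2), (-1, -2, -2), (-1, 2, 2)]
    clique = [2, -1, 2]          # one chosen literal per clause
    a = assignment_from_clique(clique, variables=[1, 2])
    print(a, satisfies(phi, a))
```

The assertion inside the loop is the answer to the slide's "why?": contradictory labels are never adjacent, so they cannot both appear in a clique.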

7.4.j Definition of NP-completeness

Definition 7.34 A language B is NP-complete if it satisfies two conditions:
1. B is in NP, and
2. every language in NP is polynomial time reducible to B.

Theorem 7.35 If a language B is NP-complete and B ∈ P, then P=NP.

Proof. Immediately from the above clause 2 and Theorem 7.31.

Theorem 7.36 If C ∈ NP, B is NP-complete and B ≤_P C, then C is NP-complete.

Proof. We already know that C ∈ NP, so we only need to show that A ≤_P C for every A ∈ NP. Consider any A ∈ NP. We must have A ≤_P B, because B is NP-complete. Let f be a polynomial time reduction from A to B. Next, we know that B ≤_P C, so let g be a polynomial time reduction from B to C. Let now h be the composition of f and g, that is, h(w) = g(f(w)). Then h is a polynomial time reduction from A to C: for every w, we have w ∈ A iff f(w) ∈ B iff g(f(w)) ∈ C, and h is polynomial time computable because |f(w)| is polynomial in |w| and a polynomial of a polynomial is again a polynomial. So, indeed, A ≤_P C.

7.4.k Cook-Levin theorem: Getting started

Theorem 7.37 (Cook-Levin Theorem) SAT is NP-complete.

Proof. That SAT ∈ NP is obvious (why?). So we only need to show that every language A from NP is polynomial time reducible to SAT.

Pick an arbitrary A ∈ NP, and let N be an NTM deciding A. We assume that the running time of N is n^k (to be more accurate, n^k − 3).

We are going to show how to turn a string w into a Boolean formula φ that “simulates” N on input w, in the sense that φ is satisfiable iff N accepts w.

7.4.l Cook-Levin theorem: Tableaus

A tableau for N on w is an n^k × n^k table whose rows are the configurations of a (any) computation branch of N on input w = w_1…w_n. The first and last columns contain #s.

[Figure: an n^k × n^k tableau. The first row is the start configuration # q_0 w_1 w_2 … w_n - … - #; below it come the 2nd, 3rd, …, n^k-th configurations, each delimited by #s. A window is marked inside the table.]

We say that a tableau is accepting if any of its rows is an accepting configuration. Every accepting tableau for N on w corresponds to an accepting computation branch of N on input w. Thus the problem of determining whether N accepts w is equivalent to determining whether there is an accepting tableau for N on w.

7.4.m Cook-Levin theorem

[Figure: the first three rows of an example tableau: # q_0 w_1 w_2 w_3 … w_n - … - #, then # w_1 q_7 w_2 w_3 … w_n - … - #, then # q_5 w_1 $ w_3 … w_n - … - #.]

We denote by C the set Q ∪ Γ ∪ {#}, where Q is the set of states of N and Γ is the tape alphabet. C is thus the set of all possible contents of the cells of the tableau.

The cell in row i and column j is called cell[i,j]. For each such cell and each s ∈ C, we create a Boolean variable x_{i,j,s}. Its meaning is going to be “cell[i,j] contains symbol s”.

Our formula φ is going to be built from those variables, and we have

φ = φ_cell ∧ φ_start ∧ φ_move ∧ φ_accept

• φ_cell asserts that each cell contains exactly one symbol
• φ_start asserts that the first row is the start configuration on input w
• φ_move asserts that rows are related to each other in accordance with the transition function
• φ_accept asserts that one of the cells contains the accept state

φ thus asserts that the tableau is for an accepting computation branch, i.e. that w ∈ A.

7.4.n Cook-Levin theorem: φ_cell

φ_cell asserts that each cell contains exactly one symbol.

“cell[i,j] contains symbol s” = x_{i,j,s}

“cell[i,j] contains at least one symbol” = ⋁_{s∈C} x_{i,j,s}

“cell[i,j] does not contain both s and t” = (¬x_{i,j,s} ∨ ¬x_{i,j,t})

“cell[i,j] contains at most one symbol” = ⋀_{s,t∈C, s≠t} (¬x_{i,j,s} ∨ ¬x_{i,j,t})

“cell[i,j] contains exactly one symbol” = [ (⋁_{s∈C} x_{i,j,s}) ∧ (⋀_{s,t∈C, s≠t} (¬x_{i,j,s} ∨ ¬x_{i,j,t})) ]

φ_cell = ⋀_{1≤i,j≤n^k} [ (⋁_{s∈C} x_{i,j,s}) ∧ (⋀_{s,t∈C, s≠t} (¬x_{i,j,s} ∨ ¬x_{i,j,t})) ]
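A sketch of how φ_cell could be generated as a list of clauses. The representation is an assumption for illustration: a variable is a triple (i, j, s), a literal is a pair (variable, polarity), and a clause is a list of literals. Unordered pairs {s, t} are used for the "at most one" part, which is logically equivalent to ranging over all ordered pairs with s≠t.

```python
from itertools import combinations, product

def cell_clauses(i, j, symbols):
    """Clauses stating that cell[i, j] contains exactly one symbol."""
    at_least_one = [((i, j, s), True) for s in symbols]
    at_most_one = [
        [((i, j, s), False), ((i, j, t), False)]
        for s, t in combinations(symbols, 2)
    ]
    return [at_least_one] + at_most_one

def phi_cell(side, symbols):
    """Conjunction (as a clause list) over all cells of a side x side tableau."""
    clauses = []
    for i, j in product(range(1, side + 1), repeat=2):
        clauses.extend(cell_clauses(i, j, symbols))
    return clauses

def satisfied(clauses, true_vars):
    """true_vars: the set of variables (i, j, s) that are assigned true."""
    return all(
        any((var in true_vars) == polarity for var, polarity in clause)
        for clause in clauses
    )

if __name__ == "__main__":
    symbols = ["a", "b", "#"]            # hypothetical tiny C
    one_cell = cell_clauses(1, 1, symbols)
    print(satisfied(one_cell, {(1, 1, "a")}))
    print(satisfied(one_cell, {(1, 1, "a"), (1, 1, "b")}))
    print(len(phi_cell(2, symbols)))
```

For a single cell, the only satisfying assignments are those that set exactly one of its variables to true, which is what lets a satisfying assignment be read back as a tableau.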

7.4.o Cook-Levin theorem: φ_start

φ_start asserts that the first row is the start configuration on input w:

# q_0 w_1 w_2 … w_n - - … - #

“cell[1,1] contains #” = x_{1,1,#}
“cell[1,2] contains q_0” = x_{1,2,q_0}
…

φ_start = x_{1,1,#} ∧ x_{1,2,q_0} ∧ x_{1,3,w_1} ∧ x_{1,4,w_2} ∧ … ∧ x_{1,n+2,w_n} ∧ x_{1,n+3,-} ∧ … ∧ x_{1,n^k−1,-} ∧ x_{1,n^k,#}
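A small sketch that lists the variables conjoined in φ_start for a concrete input. The function name, the parameter width (standing for n^k) and the symbol names "q0" and "-" are assumptions for illustration; the sketch assumes width ≥ n+3 so that the row fits.

```python
def phi_start(w, width, start_state="q0", blank="-"):
    """Return the variables (1, j, s) whose conjunction is phi_start."""
    # Row 1: '#', start state, the input symbols, blanks, closing '#'.
    row = ["#", start_state, *w] + [blank] * (width - len(w) - 3) + ["#"]
    return [(1, j, s) for j, s in enumerate(row, start=1)]

if __name__ == "__main__":
    for var in phi_start("ab", width=8):
        print(var)
```

The result has exactly width entries, one per cell of the first row.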

7.4.p Cook-Levin theorem: φ_accept

φ_accept asserts that one of the cells contains the accept state.

“cell[i,j] contains q_accept” = x_{i,j,q_accept}

φ_accept = ⋁_{1≤i,j≤n^k} x_{i,j,q_accept}

7.4.q Cook-Levin theorem: Windows

[Figure: a tableau grid with columns 1, 2, 3, …, j, … and rows 1, 2, 3, …, i, …. The (i,j) window is the 2×3 block of cells containing a_1 a_2 a_3 in row i and a_4 a_5 a_6 in row i+1, in columns j−1, j, j+1.]

We say that (the content of) a window is legal if it could appear in some (legal) tableau for N.

7.4.r Cook-Levin theorem: Examples of legal and illegal windows

Where δ is the transition function of N, assume we have δ(q_1, a) = {(q_1, b, R)} and δ(q_1, b) = {(q_2, c, L), (q_2, a, R)}. Are the following windows legal or illegal?

[Figure: several example 2×3 windows over the symbols a, b, c, #, q_1, q_2, each answered Yes (legal) or No (illegal); the arrangement of the window contents is not recoverable.]

For example, the top row a q_1 b (state q_1, head reading b) may legally be followed by the bottom row q_2 a c (write c, move left) or by a a q_2 (write a, move right), in accordance with δ(q_1, b).

7.4.s Cook-Levin theorem: Claim about windows

Claim 7.41 If the top row of the tableau is the start configuration and every window in the tableau is legal, then each row in the tableau is a configuration that legally follows the preceding one.

Proof. Consider any two adjacent rows (configurations). In the upper configuration, every cell that is not adjacent to a state symbol and does not contain the boundary symbol # is the center top cell of a window whose top row contains no states. Therefore that symbol, in a legal window, must appear unchanged in the center bottom of the window. Hence it appears (as it should) in the same position in the bottom configuration.

The window containing the state symbol in the center top cell guarantees that the corresponding three positions are updated consistently with the transition function. Therefore, if the upper configuration is a legal configuration, so is the lower configuration, and the lower one follows the upper one according to N’s rules.

[Figure: a window with top row x b y and bottom row ? b ?; a state symbol q_1 is also shown.]

7.4.t Cook-Levin theorem: φ_move

φ_move asserts that rows are related to each other in accordance with the transition function.

Let us say that a 6-tuple (a_1, a_2, a_3, a_4, a_5, a_6) of symbols from C is legal if the window with top row a_1 a_2 a_3 and bottom row a_4 a_5 a_6 is legal. Notice that the number of legal 6-tuples is fixed and does not depend on w.

BTW, at most how many legal 6-tuples could exist? |C|^6

“the content of the (i,j) window is (a_1, …, a_6)” = (x_{i,j−1,a_1} ∧ x_{i,j,a_2} ∧ x_{i,j+1,a_3} ∧ x_{i+1,j−1,a_4} ∧ x_{i+1,j,a_5} ∧ x_{i+1,j+1,a_6})

“the (i,j) window is legal” = ⋁_{(a_1,…,a_6) is legal} (x_{i,j−1,a_1} ∧ x_{i,j,a_2} ∧ x_{i,j+1,a_3} ∧ x_{i+1,j−1,a_4} ∧ x_{i+1,j,a_5} ∧ x_{i+1,j+1,a_6})

φ_move = ⋀_{1≤i≤n^k−1, 2≤j≤n^k−1} (the (i,j) window is legal)

7.4.u Cook-Levin theorem: The complexity of the reduction

Our reduction does nothing but build φ, so its time complexity is asymptotically the same as the size of φ. We want to see that this size is polynomial in n. For this, in turn, it is sufficient to verify the polynomiality (in n) of the four conjuncts of φ.

• What is the size of φ_start? O(n^k): one variable for each of the n^k cells of the first row.
• What is the size of φ_accept? O(n^{2k}): one variable for each of the n^{2k} cells of the tableau.
• What is the size of φ_cell? O(n^{2k}): for each of the n^{2k} cells, a subformula whose size depends only on |C|, not on n.
• What is the size of φ_move? O(n^{2k}): for each window, a subformula with at most |C|^6 disjuncts of 6 variables each, again independent of n.

The complexity of our reduction is thus O(n^{2k}), i.e. polynomial.
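The four sizes can be checked by counting variable occurrences for concrete parameters. This is a rough sketch under stated assumptions: the function name and parameters are hypothetical, num_symbols stands for |C|, num_legal_tuples for the number of legal 6-tuples, and the "at most one" part of φ_cell is counted over ordered pairs s≠t.

```python
def formula_sizes(n, k, num_symbols, num_legal_tuples):
    """Count variable occurrences in each conjunct of phi."""
    side = n ** k                 # the tableau is side x side
    cells = side * side
    start = side                  # one variable per cell of row 1
    accept = cells                # one variable per cell
    # Per cell: |C| variables in the disjunction, plus 2 per ordered pair s != t.
    cell = cells * (num_symbols + 2 * num_symbols * (num_symbols - 1))
    # Windows: rows 1..side-1, columns 2..side-1; 6 variables per legal tuple.
    move = (side - 1) * (side - 2) * 6 * num_legal_tuples
    total = start + accept + cell + move
    return {"start": start, "accept": accept, "cell": cell, "move": move, "total": total}

if __name__ == "__main__":
    for n in (2, 4, 8):
        sizes = formula_sizes(n, k=2, num_symbols=3, num_legal_tuples=5)
        print(n, sizes, sizes["total"] / n ** 4)
```

With k, |C| and the number of legal tuples held fixed, the printed ratio total / n^{2k} stays bounded as n grows, which is what O(n^{2k}) means here.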

7.4.v Cook-Levin theorem: Wrapping it up

It remains to understand why our reduction is indeed a reduction, i.e., why it is the case that w ∈ A iff φ is satisfiable.

(⇒) Assume w ∈ A. Then N accepts w, i.e. there is an accepting computation branch of N on input w. Construct a tableau that describes such a branch. Then declare each variable x_{i,j,s} to be true iff, in that tableau, cell[i,j] contains symbol s. Obviously this truth assignment satisfies φ.

(⇐) Assume φ is satisfiable, i.e. there is an assignment that makes φ true. Fix it. Construct a tableau by putting symbol s in cell[i,j] iff x_{i,j,s} is true (the truth of φ_cell guarantees that such a construction is possible and unique). By the truth of φ_start, φ_move (together with Claim 7.41) and φ_accept, such a tableau describes an accepting computation branch of N on input w. Thus, N accepts w, meaning that w ∈ A.

7.4.w The NP-completeness of 3SAT

Corollary 7.42 3SAT is NP-complete.

Proof. From logic, a polynomial-time-computable function

f: {<φ> | φ is a (any) Boolean formula} → {<φ> | φ is a 3cnf-formula}

is known such that, for any Boolean formula φ, we have: φ is satisfiable iff f(φ) is satisfiable. Thus, f is a polynomial time reduction from SAT to 3SAT. Hence, in view of the already known NP-completeness of SAT and the fact that 3SAT ∈ NP, Theorem 7.36 yields that 3SAT is NP-complete.

The book gives a slightly different and full proof of this result.
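One ingredient of such a conversion can be sketched for formulas that are already in cnf: short clauses are padded by repeating a literal, and long clauses are split using fresh variables. This is a standard technique shown under stated assumptions, not the function f from the slide (which accepts arbitrary Boolean formulas). Literals are encoded as nonzero integers, clauses are assumed non-empty, and satisfiable is a brute-force check for small examples only.

```python
from itertools import product

def to_3cnf(clauses, num_vars):
    """Turn cnf clauses (lists of nonzero ints) into clauses of exactly 3 literals."""
    out = []
    for clause in clauses:
        lits = list(clause)
        if len(lits) <= 3:
            # Pad by repeating the last literal; the meaning is unchanged.
            out.append(tuple(lits + [lits[-1]] * (3 - len(lits))))
            continue
        # (l1 v l2 v z1) ^ (~z1 v l3 v z2) ^ ... ^ (~z_last v l_{m-1} v l_m)
        num_vars += 1
        out.append((lits[0], lits[1], num_vars))
        for lit in lits[2:-2]:
            out.append((-num_vars, lit, num_vars + 1))
            num_vars += 1
        out.append((-num_vars, lits[-2], lits[-1]))
    return out, num_vars

def satisfiable(clauses, num_vars):
    """Brute force over all assignments; exponential, for small checks only."""
    for bits in product((False, True), repeat=num_vars):
        if all(any(bits[abs(l) - 1] == (l > 0) for l in c) for c in clauses):
            return True
    return False

if __name__ == "__main__":
    cnf = [[1, 2, 3, 4, 5], [-1], [-2]]      # hypothetical example
    three, n = to_3cnf(cnf, 5)
    print(three, n)
    print(satisfiable(cnf, 5), satisfiable(three, n))
```

The output formula is not equivalent to the input (it has extra variables), but it is satisfiable exactly when the input is, which is all a reduction needs.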