Output Sensitive Enumeration
2. Basic Algorithms
• Divide and Conquer
• Backtracking for Maximals
• Binary Partition (st-path, matching)
• Seeing Difficulty on Binary Partition
2-1 Branch and Bound
Brute-force Algorithm
• Brute force is acceptable if the problem is easy (small). How do we do "brute force"?
  + enumerate all candidates and output the solutions among them
  + enlarge the solutions one by one and remove isomorphic ones
  + scan the candidates from the smaller ones
• We should have "simple algorithms" for solving easy problems
• … and also simple ways of implementing them, for example, ways in which only a subroutine (oracle) for computing a function has to be written
Divide and Conquer
• Determine the values of the variables one by one
• Make a recursive call for each value
• At the bottom of the recursion, an assignment is determined
[Figure: recursion tree that branches on v1 / ¬v1 and then on v2 / ¬v2]
Divide and Conquer
• Prune a branch when we confirm that no solution exists among its descendants; accuracy and speed are the key
• If the pruning is exact and is done in polynomial time, the algorithm has polynomial time delay
[Figure: recursion tree on v1 and v2 with a pruned branch]
Extension Problem
• Formally, to prune a branch (recursive call) we have to solve the following extension problem
Extension Problem: For an enumeration problem and sets S (to be included) and X (to be excluded), determine whether there is a solution Z to the enumeration problem s.t. S ⊆ Z and Z ∩ X = ∅.
• If this problem can be solved efficiently, the branch and bound algorithm works well
(1) Enumerate Combinations
• Enumerate all combinations: determine the variable values recursively
Enum1(X: set of all determined values, i: index)
1. if no solution includes X then return   (extension problem)
2. if i > maximum index then
3.   if X is a solution then output X
4. else
5.   for each e in the values which xi can take
6.     call Enum1(X∪(xi = e), i+1)
7.   end for
• Only "3. check of being a solution" is needed; 1. is not necessary
• Fast if the check in 1. is of high accuracy
(2) Enumerate Patterns
• To avoid isomorphic solutions, use incremental generation (for graphs, matrices, sequences, …)
Global variable: database D := ∅
Enum2(X: pattern)
1. insert X to D
2. if no solution includes X then return
3. if X is a solution then output X
4. for each X' obtained by adding an element to X
5.   if none of D is isomorphic to X' then call Enum2(X')
• Only the designs of 3. and 4. are necessary
• Efficient if the check in 2. is fast and of high accuracy
Basic Enumeration Algorithms
• Since they are fundamental, their construction schemes are also simple
• On the other hand, there are not so many variations
  + backtracking: depth-first search with lexicographic ordering
  + binary partition: a branch & bound like recursive partition algorithm
  + reverse search: search on a traversal tree defined by a parent-child relation
2-2 Backtracking
Backtracking
• Consider a monotone (independent) set system, for example, cliques
  Clique: a subgraph in which every vertex pair is connected by an edge
• A clique can be found by iteratively adding vertices, passing through only cliques
  Ex. adding 2, 6, 4, 8 in this order
• So, starting from the empty set and iteratively adding vertices, we can always reach any clique
[Figure: example graph on vertices 1 to 12]
Duplication
• By naively adding vertices, we generate the same clique several times
  Ex. 2 6 4 8,  2 6 8 4,  4 6 2 8, …
• Some idea to avoid duplications is necessary
• Define a rule: add only vertices whose indices are larger than those of all vertices in the clique
• Then, duplication never happens (the clique above is generated only as 2 4 6 8)
[Figure: example graph on vertices 1 to 12]
Why OK?
• Rule: add only vertices whose indices are larger than those of all vertices in the clique
• Suppose that a clique is generated in two ways (ex. 2 4 6 8 and 2 6 4 8)
• Then, at least one of them has a pair of vertex additions such that the earlier vertex has a larger index than the later one
• This contradicts the addition rule
General Backtracking
• Mainly used for independent (monotone) set systems (and their maximals)
  Independent set system F: X ∈ F ⇒ X' ∈ F for any X' ⊆ X (if X ∈ F, any subset of X is a member of F)
• Ex.) cliques of a graph, matchings, combinations of numbers whose sum is less than b, frequent itemsets, …
• Not (×): trees of a graph, paths, cycles, …
[Figure: subset lattice from 000…0 to 111…1]
Framework of Backtracking
• Start from the empty set, and recursively add elements
• In each iteration, add only elements larger than the current maximum element (an iteration does not include those in its recursive calls)
• Make a recursive call with the result of the addition, if it is a solution
• Go back after all examinations
[Figure: the cliques inside the subset lattice from 000…0 to 111…1, and the enumeration tree on the subsets of {1, 2, 3, 4} rooted at ∅]
Pseudo Code for Backtracking
• Start from the empty set, and recursively add elements; add only elements larger than the current maximum element
Backtrack(S)
1. output S
2. for each e > tail of S (the max. element in S)
3.   if S∪{e} is a solution then call Backtrack(S∪{e})
4. end for
• simple, and polynomial space
• polynomial delay (output polynomial time)
[Figure: enumeration tree on the subsets of {1, 2, 3, 4} rooted at ∅]
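As an illustration, Backtrack(S) can be specialized to cliques roughly as follows. This is only a sketch under assumptions that are not in the slides: the graph is given as an adjacency matrix adj, vertices are numbered 0 to n-1, and the names count_cliques, backtrack and extends_clique are hypothetical. It returns the number of cliques (including the empty set) and prints each one when print is nonzero.

```c
#include <assert.h>
#include <stdio.h>

#define MAXN 32   /* assumed upper bound on the number of vertices */

/* Return 1 iff vertex e is adjacent to every vertex of S[0..size-1]. */
static int extends_clique(int adj[MAXN][MAXN], const int *S, int size, int e) {
    for (int k = 0; k < size; k++)
        if (!adj[S[k]][e]) return 0;
    return 1;
}

/* Backtrack(S): output S, then try every e larger than the tail of S. */
static long backtrack(int n, int adj[MAXN][MAXN], int *S, int size, int print) {
    long count = 1;                                /* step 1: output S */
    if (print) {
        printf("{");
        for (int k = 0; k < size; k++) printf("%s%d", k ? ", " : "", S[k]);
        printf("}\n");
    }
    int tail = (size > 0) ? S[size - 1] : -1;      /* max. element in S */
    for (int e = tail + 1; e < n; e++) {           /* step 2 */
        if (extends_clique(adj, S, size, e)) {     /* step 3: S + e is a clique */
            S[size] = e;
            count += backtrack(n, adj, S, size + 1, print);
        }
    }
    return count;
}

/* Hypothetical entry point: number of cliques of the graph, empty set included. */
long count_cliques(int n, int adj[MAXN][MAXN], int print) {
    int S[MAXN];
    assert(0 <= n && n <= MAXN);
    return backtrack(n, adj, S, 0, print);
}
```

Because S is kept in increasing order, the tail is simply its last entry, and each clique is reached through exactly one sequence of additions.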
Feasible Solutions to Knapsack Problem …folklore
Problem: enumerate all subsets of a1, …, an whose sum is less than b
FeasibleKnapsack(S)
1. output S
2. for each i > tail of S (the maximum index in S)
3.   if ∑S + ai < b then call FeasibleKnapsack(S∪{ai})
4. end for
• Computation time: each iteration outputs a solution and takes O(n) time, so the time per solution is O(n)
• If a1, …, an are sorted, each recursive call can be generated in O(1) time; an iteration then takes O(#recursive calls) time, which gives O(1) time per solution
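Code for Knapsack
• Print all combinations of a[0], …, a[n-1] whose sum is less than b (initial call: sub(-1, 0))

    #include <stdio.h>

    #define MAXN 64
    int n, b;                   /* number of items, and the bound */
    int a[MAXN], flag[MAXN];    /* flag[j] == 1 iff a[j] is in the current solution */

    /* i: index of the last added item (-1 if none), s: sum of the current solution */
    void sub(int i, int s) {
        int j;
        for (j = 0; j < n; j++)
            if (flag[j] == 1) printf("%d ", a[j]);   /* print a solution */
        printf("\n");
        for (j = i + 1; j < n; j++) {
            if (s + a[j] < b) {                      /* check the feasibility */
                flag[j] = 1;
                sub(j, s + a[j]);                    /* j becomes the new tail */
                flag[j] = 0;
            }
        }
    }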
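The remark that sorting a1, …, an gives O(1) time per solution can be sketched as below. The assumptions here are mine, not the slides': all values are positive, b > 0, the items are sorted in increasing order, and the names count_feasible, rec and cmp_int are hypothetical. After sorting, the loop can stop at the first item that does not fit, so every loop step except the last one produces a recursive call. The function counts the solutions instead of printing them.

```c
#include <assert.h>
#include <stdlib.h>

/* Comparison function for qsort: increasing order of int values. */
static int cmp_int(const void *x, const void *y) {
    int a = *(const int *)x, b = *(const int *)y;
    return (a > b) - (a < b);
}

/* a[] is sorted increasingly; tail: index of the last chosen item (-1 if none);
   sum: sum of the current subset, which is known to be less than b. */
static long rec(const int *a, int n, int b, int tail, long sum) {
    long count = 1;                        /* "output" the current subset */
    for (int i = tail + 1; i < n; i++) {
        if (sum + a[i] >= b) break;        /* a[i], a[i+1], ... are all too large */
        count += rec(a, n, b, i, sum + a[i]);
    }
    return count;
}

/* Hypothetical entry point: sorts a[] in place and returns the number of
   subsets whose sum is less than b (the empty set included). */
long count_feasible(int *a, int n, int b) {
    assert(n >= 0 && b > 0);
    qsort(a, (size_t)n, sizeof a[0], cmp_int);
    return rec(a, n, b, -1, 0);
}
```

For example, with the values 3, 1, 2 and b = 4 the feasible subsets are ∅, {1}, {2}, {3} and {1, 2}.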
Simple Case
• There are several problems in which we don't need to take care of duplications
Problem: for a given graph and a vertex s, enumerate all paths starting from s
• In each iteration, we generate a recursive call for each neighboring vertex
[Figure: example graph on vertices 1 to 12 with examples of paths starting from vertex 1]
Algorithm for Paths Starting at s …folklore
Problem: enumerate all paths of G starting at s
PathStartingS(P, s)
1. output P
2. A := set of vertices adjacent to s
3. delete s from the graph
4. for each v in A, call PathStartingS(P∪{v}, v)
5. recover s to the graph
• Computation time: O(n) time per iteration, so O(n) time per solution
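The five steps of the path enumeration can be sketched in C as follows. The representation is an assumption made for this example: the graph is an adjacency matrix adj, "delete s" is modeled by a removed[] flag rather than by changing the graph, and the names count_paths_from and path_rec are hypothetical. The function returns the number of paths (the single-vertex path included) and prints each path when print is nonzero.

```c
#include <assert.h>
#include <stdio.h>

#define MAXN 32              /* assumed upper bound on the number of vertices */

static int removed[MAXN];    /* removed[v] = 1 while v is deleted from the graph */

/* P[0..len-1] is the current path and s = P[len-1] is its last vertex. */
static long path_rec(int n, int adj[MAXN][MAXN], int *P, int len, int s, int print) {
    long count = 1;                                   /* step 1: output P */
    if (print) {
        for (int k = 0; k < len; k++) printf("%s%d", k ? " " : "", P[k]);
        printf("\n");
    }
    removed[s] = 1;                                   /* step 3: delete s */
    for (int v = 0; v < n; v++) {                     /* steps 2 and 4 */
        if (adj[s][v] && !removed[v]) {
            P[len] = v;
            count += path_rec(n, adj, P, len + 1, v, print);
        }
    }
    removed[s] = 0;                                   /* step 5: recover s */
    return count;
}

/* Hypothetical entry point: number of simple paths that start at s. */
long count_paths_from(int n, int adj[MAXN][MAXN], int s, int print) {
    int P[MAXN];
    assert(n <= MAXN && 0 <= s && s < n);
    for (int v = 0; v < n; v++) removed[v] = 0;
    P[0] = s;
    return path_rec(n, adj, P, 1, s, print);
}
```

Since a vertex is flagged as removed for the whole duration of its own recursive calls, no path can pass through it twice, and the flag is cleared before the caller continues.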
2-3 Maximal by Backtracking
Maximal Solutions
• The number of solutions increases exponentially when n or the sizes of solutions are large
• If the number of solutions is large, post-processing is also hard → enumerate only maximal solutions, so that the solution set is irredundant
  X ∈ F is maximal in F ⇔ X' ∈ F holds for no X' with X ⊊ X'
• Maximal solutions are not neighboring to each other, so efficient search is hopeless (exceptions: spanning trees, matroid bases)
[Figure: subset lattice from 000…0 to 111…1]
Straightforward Method
• Backtracking enumerates all solutions, so we can enumerate all maximal solutions by outputting a solution only when it is maximal … but this is inefficient
• If we can prune the recursive calls that never output any maximal solution, the algorithm will be output polynomial time
• In general, we want to solve the following problem
Maximal Extension: For a partial solution S and a set X of deleted elements, is there a maximal solution Z s.t. S ⊆ Z and Z ∩ X = ∅?
… this problem is hard in general
Knapsack Maximals …folklore
• Fortunately, maximal extension can be solved for the knapsack problem
Problem: enumerate all maximal subsets of a1, …, an whose sum is no greater than b
• Put indices on a1, …, an in decreasing order of their values
MaximalKnapsack(S)
1. if S is maximal then output S
2. for each i > tail of S with ∑S + ai + … + an > b − ai-1
     (this test prunes the branches that contain only non-maximal solutions; it applies when ai-1 is skipped, i.e., i > tail + 1)
3.   if ∑S + ai ≤ b then call MaximalKnapsack(S∪{ai})
4. end for
• Computation time: an iteration takes O(n) time, so O(n^2) time per solution
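One way to read the pruning rule is sketched below; it is my reconstruction under stated assumptions, not the slide's exact procedure. Assumptions: all values are positive, b ≥ 0, the array is already sorted in decreasing order, and the names count_maximal_knapsack and rec are hypothetical. The extra parameter skip remembers the largest skipped index, so that maximality of S can be tested against the smallest excluded item. The function counts the maximal solutions; the line marked "output S" is where a solution would be reported.

```c
#include <assert.h>

#define MAXN 30                  /* assumed upper bound on the number of items */

static int  a_[MAXN];            /* items in decreasing order */
static long suffix_[MAXN + 1];   /* suffix_[i] = a_[i] + ... + a_[n-1] */
static int  n_, b_;

/* tail: largest index in S (-1 if S is empty); sum: sum of S;
   skip: largest index <= tail that is not in S (-1 if there is none). */
static long rec(int tail, long sum, int skip) {
    long count = 0;
    /* the smallest item outside S is the last item if it lies beyond the tail,
       otherwise the most recently skipped item */
    int smallest = (tail < n_ - 1) ? n_ - 1 : skip;
    if (smallest < 0 || sum + a_[smallest] > b_) count++;   /* S is maximal: output S */
    for (int i = tail + 1; i < n_; i++) {
        int newskip = skip;
        if (i > tail + 1) {
            newskip = i - 1;     /* item i-1 is skipped in this branch */
            /* even with all of a_[i..n-1] added, a_[i-1] would still fit, so
               this branch, and every later one, has no maximal solution */
            if (sum + suffix_[i] <= b_ - a_[i - 1]) break;
        }
        if (sum + a_[i] <= b_) count += rec(i, sum + a_[i], newskip);
    }
    return count;
}

/* Hypothetical entry point; a_desc[] must be positive and sorted decreasingly. */
long count_maximal_knapsack(const int *a_desc, int n, int b) {
    assert(0 <= n && n <= MAXN && b >= 0);
    n_ = n; b_ = b;
    suffix_[n] = 0;
    for (int i = n - 1; i >= 0; i--) {
        assert(a_desc[i] > 0 && (i == n - 1 || a_desc[i] >= a_desc[i + 1]));
        a_[i] = a_desc[i];
        suffix_[i] = suffix_[i + 1] + a_[i];
    }
    return rec(-1, 0, -1);
}
```

For the values 5, 4, 3, 2 with b = 7, the maximal subsets are {5, 2}, {4, 3}, {4, 2} and {3, 2}.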
Partial Pruning
• Exact pruning is usually difficult, so we often use partial pruning; here "partial" means that we can prune the recursive calls only in some cases
• The pruning methods are usually based on some structure or property of the target problem, so few methods work in general
• One of these uses element reordering in each iteration
[Figure: enumeration tree on the subsets of {1, 2, 3, 4} rooted at ∅]
Maximal: Shift a Solution to the End …folklore
• Maximal enumeration admits a simple pruning algorithm
  (1) prune if we meet a non-member
  (2) no branch is needed if the addition of all remaining elements is a member
• Even if (1) is complete, exhaustive search for all members is inefficient
• Find a maximal solution and shift all its elements to the end (bottom) of the element ordering; then no recursive calls are needed for the shifted elements, because (2) works for them!
• For small maximal solution sizes (up to 30), practically efficient
[Figure: element ordering]
Pseudo Code
• Describe the algorithm in pseudo code
FeedTailMaximal(P: current solution, I: undetermined elements)
1. find a maximal set S among those including P and included in P∪I
2. if S is a maximal solution of the problem then output S
3. for each e ∈ I\S
4.   I := I\{e}
5.   call FeedTailMaximal(P∪{e}, I)
[Figure: element ordering split into P and I]
Specialized to Clique   Tomita et al. 2006
• For cliques, we can have stronger pruning
• Observation: after the recursive call with respect to P∪e, a clique composed of P and neighbors of e is never maximal (e can be added to it)
• By feeding the neighbors of e to the end of the ordering, recursive calls for the fed vertices become unnecessary
• Practically very good (constant time for each maximal clique)
[Figure: example graph on vertices 1 to 12 with P, e, and the reordered area of the neighbors of e]
2-4 Binary Partition for st-path
Binary Partition
• Let F be the set of candidates and X ⊆ F the set of solutions, i.e., the candidates (subsets, subsequences, etc.) satisfying a property P
• Binary partition outputs the solution if the solution in F is unique
• Otherwise, it partitions F into two (or several) sets F1, F2 so that X is partitioned into non-empty sets X1, X2
• Do this recursively, until the solution is unique
Ex.)
+ paths of a graph connecting vertex s and vertex t (st-paths)
+ perfect matchings of a bipartite graph
+ spanning trees of a graph
+ connected components of a graph
Time Complexity
• Binary partition always either partitions a problem or outputs a solution, so #iterations is bounded by 2N (N: the number of solutions)
• If the partition process (determining how to partition, and checking emptiness) takes polynomial time, the algorithm is output polynomial time
Time Complexity
• If the height of the recursion tree is polynomial in n, the algorithm is polynomial delay (to go back from a leaf to the root, O(height) time is needed)
• If the partition process needs only polynomial space, the algorithm is polynomial space
Binary Partition of st-paths   Read & Tarjan '75 …modified by U
Problem: enumerate all st-paths in G = (V, E)
Partition: choose an edge e incident to s, and partition into
+ enumeration of st-paths including e
+ enumeration of st-paths not including e
so that both problems are non-empty (this is also called pivoting, and e is called the pivot, or pivoting edge)
Child Problems:
  st-paths including e: remove all edges incident to s except e
  st-paths not including e: remove e
Child Problems on st-paths
Child Problems:
  st-paths including e: remove all edges incident to s (and move s to the next vertex); denote the result by G-s
  st-paths not including e: remove e; denote the result by G-e
• Computation time: one iteration = O(|E|)
[Figure: the graphs G-s and G-e with s and t]
Choosing Valid Edge
• If we choose a bad edge, one of the subproblems will be empty:
+ "including e" is empty if t is not reachable via e → remove the component including e
+ "not including e" is empty if e is the only edge through which t is reachable → move s to the next vertex, and remove e
• After at most |E| repetitions, we can always find a valid edge
Time Complexity
• Testing the validity of an edge takes O(|E|) time, and there are at most O(|E|) repetitions
• So an iteration takes O(|E|^2) time
• Since #iterations < 2N, the time per solution is O(|E|^2)
• Since the height of the recursion tree is O(|V|), the delay is O(|V||E|^2)
Pseudo Code for st-paths
Enum_st-path(G, s, t, S)   (S: the edges chosen so far, from the original s to the current s)
1. if s = t then output S, return
2. choose an edge e = (s, v)
3. if there is no vt-path in G-s then remove e, go to 1.
4. if there is no st-path in G-e then S := S+e, G := G-s, s := v, go to 1.
5. call Enum_st-path(G-s, v, t, S+e)
6. call Enum_st-path(G-e, s, t, S)
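A compact C sketch of this binary partition is given below, with several simplifying assumptions of my own: the graph is undirected and stored as an adjacency matrix, validity of an edge is tested by a plain depth-first search, the two "go to 1." cases are realized as a call with a single child, and the names count_st_paths, enum_rec and reachable are hypothetical. It counts the st-paths; the line marked "output" is where path[0..len-1] would be reported.

```c
#include <assert.h>
#include <string.h>

#define MAXN 16                  /* assumed upper bound on the number of vertices */

static int  n_, t_;
static int  adj_[MAXN][MAXN];    /* working copy; an entry is cleared while its edge is removed */
static int  gone_[MAXN];         /* gone_[v] = 1 while vertex v is removed (G - v) */
static long found_;

/* Depth-first search: is t reachable from u in the current graph? */
static int reach(int u, int *seen) {
    if (u == t_) return 1;
    seen[u] = 1;
    for (int w = 0; w < n_; w++)
        if (adj_[u][w] && !gone_[w] && !seen[w] && reach(w, seen)) return 1;
    return 0;
}

static int reachable(int u) {
    int seen[MAXN] = {0};
    return !gone_[u] && reach(u, seen);
}

/* path[0..len-1] is the path built so far; s is its last vertex. */
static void enum_rec(int s, int *path, int len) {
    if (s == t_) { found_++; return; }        /* output path[0..len-1] */
    int v = -1;
    for (int w = 0; w < n_; w++)              /* choose an edge e = (s, v) */
        if (adj_[s][w] && !gone_[w]) { v = w; break; }
    if (v < 0) return;                        /* no edge is left at s */
    adj_[s][v] = adj_[v][s] = 0;              /* G - e */
    int exc = reachable(s);                   /* is there an st-path without e? */
    gone_[s] = 1;                             /* G - s */
    int inc = reachable(v);                   /* is there a vt-path avoiding s? */
    if (inc) { path[len] = v; enum_rec(v, path, len + 1); }
    gone_[s] = 0;                             /* recover s */
    if (exc) enum_rec(s, path, len);          /* e is still removed here */
    adj_[s][v] = adj_[v][s] = 1;              /* recover e */
}

/* Hypothetical entry point: number of simple paths from s to t. */
long count_st_paths(int n, int adj[MAXN][MAXN], int s, int t) {
    int path[MAXN];
    assert(0 < n && n <= MAXN);
    n_ = n; t_ = t; found_ = 0;
    memcpy(adj_, adj, sizeof adj_);
    memset(gone_, 0, sizeof gone_);
    path[0] = s;
    enum_rec(s, path, 1);
    return found_;
}
```

A child is called only after its reachability test succeeded, so every call that is made leads to at least one output.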
Better Algorithm
• How long does it take to find a valid edge (graph reform)?
• Find a path P from s to t
• Choose an edge e = (s, v) incident to s and not in P
+ t is not reachable via e → delete the visited edges: O(#deleted edges)
+ only one edge (in P) is incident to s → move s to v, and remove e: O(1)
• Computation time is O(#deleted edges) until we find a valid edge, i.e., O(|E|)
Pseudo Program Code
• mark[] := 0 in initialization; path[] is the current solution

    int mark[m], path[n];
    enum_path (int s, int i){
      if (s == t){ output path[0], …, path[i-1]; return; }
      • find an st-path; f (= (s, v)) := the edge of the path incident to s
      • mark[f] := 1 (put mark)
      while (1){
        • choose an edge e = (s, v) s.t. mark[e] = 0
        if (no such edge e exists){
          /* only f is left at s: move s along f (v: the other end of f) */
          path[i] := v; i++; s := v;
          if (s == t){ output path[0], …, path[i-1]; return; }
        } else {
          • mark[e] := 1
          if (t is reachable from v via only unmarked edges and not through s) break;
        }
      }
      call enum_path (s, i);
      path[i] := v; call enum_path (v, i+1);
      • set mark[e] := 0 for the edges e marked in this iteration
    }
2-5 Binary Partition General Scheme
A Simple Description
• Binary partition divides the problem into two
• We often partition by a variable/vertex/edge x:
  + enumerate all solutions including x, and
  + all solutions not including x
• In this setting, we can generalize the algorithm as follows
BinaryPartition(E, S, X)
1. while E ≠ S∪X
2.   choose e ∈ E\(S∪X)
3.   solve the extension problem for S∪e, X
4.   solve the extension problem for S, X∪e
5.   if yes for both problems then
       call BinaryPartition(E, S∪e, X)
       call BinaryPartition(E, S, X∪e)
       return
6.   else if yes for 3 then S := S∪e
7.   else X := X∪e
8. end while
9. if E = S∪X then output S, return
• O(|E| T(Ext)) time for each iteration
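The general scheme only needs an oracle for the extension problem, which the following C sketch tries to make concrete. Everything about the encoding is an assumption for illustration: E = {0, …, m-1} with m < 32, the sets S and X are bitmasks, the oracle is passed as a function pointer, and the names enumerate_all, partition_rec, ext_fn and the toy oracle ext_k_subsets (solutions are the subsets of size exactly example_k) are hypothetical. The function counts the solutions; "output S" is marked in a comment.

```c
#include <assert.h>

/* Extension oracle: 1 iff some solution Z has S inside Z and Z disjoint from X. */
typedef int (*ext_fn)(unsigned S, unsigned X);

static long partition_rec(unsigned all, unsigned S, unsigned X, ext_fn ext) {
    while ((S | X) != all) {                       /* 1. while E != S + X */
        unsigned undet = all & ~(S | X);
        unsigned e = undet & (~undet + 1u);        /* 2. lowest undetermined element */
        int with_e = ext(S | e, X);                /* 3. */
        int without_e = ext(S, X | e);             /* 4. */
        if (with_e && without_e)                   /* 5. both subproblems are non-empty */
            return partition_rec(all, S | e, X, ext)
                 + partition_rec(all, S, X | e, ext);
        if (with_e) S |= e;                        /* 6. */
        else X |= e;                               /* 7. */
    }
    return 1;                                      /* 9. output S */
}

/* Hypothetical entry point: number of solutions over E = {0, ..., m-1}. */
long enumerate_all(int m, ext_fn ext) {
    assert(m > 0 && m < 32);
    if (!ext(0u, 0u)) return 0;                    /* no solution at all */
    return partition_rec((1u << m) - 1u, 0u, 0u, ext);
}

/* Toy oracle (an assumption for the example): the solutions are the subsets
   of {0, ..., example_m - 1} that have exactly example_k elements. */
int example_m = 5, example_k = 2;

static int popcount_u(unsigned x) {
    int c = 0;
    while (x) { x &= x - 1u; c++; }
    return c;
}

int ext_k_subsets(unsigned S, unsigned X) {
    return popcount_u(S) <= example_k && example_m - popcount_u(X) >= example_k;
}
```

With example_m = 5 and example_k = 2, enumerate_all(5, ext_k_subsets) visits one leaf per 2-element subset. The initial oracle call guards the case with no solution, which the pseudo code takes as a precondition.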
Using Certificate
• When the extension problem gives a solution C as a certificate, we can use it to efficiently choose the pivot
• If there is another solution to be enumerated, at least one element of C acts as a pivot (another solution doesn't include at least one element of C)
BinaryPartition(E, S, X)
1. while E ≠ S∪X
2.   choose e ∈ E\(S∪X)
3.   solve the extension problem for S∪e, X
4.   solve the extension problem for S, X∪e
5.   if yes for both problems then
       call BinaryPartition(E, S∪e, X)
       call BinaryPartition(E, S, X∪e)
       return
6.   else if yes for 3 then S := S∪e
7.   else X := X∪e
8. end while
9. if E = S∪X then output S, return
• O(|C| T(Ext)) time for each iteration
With Strong Extension
• If we can solve the "another solution problem", which is to find a solution different from a given solution, we can do better
• Any edge in the symmetric difference can be a pivot
• O(T(AnoS)) time for each iteration
BinaryPartition(E, S, X, T)
1. find a solution T' ≠ T s.t. S ⊆ T', T'∩X = ∅
2. if such T' does not exist then output T, return
3. choose e ∈ T' △ T (w.l.o.g. assume e ∈ T)
4. call BinaryPartition(E, S∪e, X, T)
5. call BinaryPartition(E, S, X∪e, T')
(T is output only when it is the unique remaining solution, so that the call in step 4, which receives the same T, does not output it twice)
2-5 Binary Partition Perfect Matching
Bipartite Perfect Matching
• A matching is an edge set such that no two edges in the set share an endpoint
• A matching is perfect if it covers all the vertices (every vertex is incident to an edge of the matching)
• A matching of a bipartite graph G = (V, E) is called a bipartite matching
• We want to enumerate all perfect matchings in G
Bipartite Perfect Matching
• For an edge e of the graph, the set of perfect matchings not including e is the set of perfect matchings in G\e (obtained by removing e)
• For an edge e of the graph, the set of perfect matchings including e is the set of perfect matchings in the graph obtained by removing all edges adjacent to e
• A perfect matching in a bipartite graph can be found in O(|V|^(1/2) |E|) time
• Combining these, we obtain an output linear time algorithm that takes O(|V|^(1/2) |E|^2) time for each solution
Bipartite Perfect Matching
• For bipartite perfect matching, we can solve the another solution problem in O(|E|) time
• The symmetric difference of any two perfect matchings is composed only of cycles; in each cycle, edges of one matching and edges of the other appear alternately → such a cycle is called an alternating cycle
• On the other hand, for a single matching, a cycle is called alternating if matching edges and non-matching edges appear alternately in the cycle
Existence of Another Solution
• If there is another solution, there is always an alternating cycle for the perfect matching
• By exchanging the edges along an alternating cycle, we obtain another perfect matching
• Then, how do we find an alternating cycle?
• Orient the matching edges from left to right, and the other edges in the opposite direction … then directed cycles and alternating cycles correspond one to one, and thus an alternating cycle can be found in O(|E|) time
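The orientation argument can be sketched in C as follows. The setting is assumed for illustration: both sides have n vertices, adj[l][r] is nonzero iff left vertex l and right vertex r are adjacent, mate[l] is the right partner of l in the given perfect matching, and the names another_perfect_matching and dfs are hypothetical. Following a matching edge and then a non-matching edge leads from a left vertex l to a left vertex l2, so a directed cycle is searched on the left vertices only. With an adjacency matrix this takes O(n^2) time rather than O(|E|).

```c
#include <assert.h>

#define MAXN 32              /* assumed upper bound on the vertices per side */

static int n_;
static int (*adj_)[MAXN];    /* adj_[l][r] != 0 iff left l and right r are adjacent */
static int *mate_;           /* mate_[l] = right vertex matched to left l */
static int color_[MAXN];     /* 0 = unvisited, 1 = on the DFS stack, 2 = finished */
static int next_[MAXN];      /* next_[l] = successor of l on the current DFS path */

/* Arc l -> l2 exists iff (l2, mate_[l]) is a non-matching edge.
   Returns a vertex on a directed cycle, or -1 if none is reachable from l. */
static int dfs(int l) {
    color_[l] = 1;
    int r = mate_[l];
    for (int l2 = 0; l2 < n_; l2++) {
        if (l2 == l || !adj_[l2][r]) continue;
        next_[l] = l2;
        if (color_[l2] == 1) return l2;        /* back arc: a cycle through l2 */
        if (color_[l2] == 0) {
            int c = dfs(l2);
            if (c >= 0) return c;
        }
    }
    color_[l] = 2;
    return -1;
}

/* Hypothetical entry point: if another perfect matching exists, rewrite mate[]
   to one of them and return 1; otherwise leave mate[] unchanged and return 0. */
int another_perfect_matching(int n, int adj[MAXN][MAXN], int *mate) {
    assert(0 < n && n <= MAXN);
    n_ = n; adj_ = adj; mate_ = mate;
    for (int l = 0; l < n; l++) color_[l] = 0;
    for (int l = 0; l < n; l++) {
        if (color_[l]) continue;
        int c = dfs(l);
        if (c < 0) continue;
        /* exchange along the cycle: next_[l1] takes the old partner of l1 */
        int l1 = c, r = mate[c];
        do {
            int l2 = next_[l1];
            int r2 = mate[l2];
            mate[l2] = r;
            r = r2;
            l1 = l2;
        } while (l1 != c);
        return 1;
    }
    return 0;
}
```

In the complete bipartite graph on 2 + 2 vertices with mate = {0, 1}, the only alternating cycle uses all four edges, and the exchange yields mate = {1, 0}.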
2-6 Seeing Difficulty of Binary Partition Algorithms
Why Difficulty
• It is of course important to study how to develop efficient binary partition algorithms, and also to seek good problems that admit polynomial time algorithms
• However, for given problems this direction is usually hard: we cannot find any good algorithm, and then we naturally want to know "why it is difficult"
• We don't have any complexity tool like NP-completeness; dualization hardness is of such a kind, but restricted
• Another way is to state that simple algorithms never work
Subproblem in the Same Analogy
• We often partition the problem into the solutions including e and the solutions not including e (e is called the pivot)
• Easy case: both subproblems can be formulated in the same way as the original problem
  ex) s-t path enumeration → s'-t path enumeration
  ex) knapsack → knapsack of set X-a
• Hard case: one of them is not formulated in the same way
  + enumerate maximal cliques including v (the same problem in the subgraph induced by N(v))
  + enumerate maximal cliques not including v (not formulated as clique enumeration)
[Figure: F partitioned into F1 and F2, and X into X1 and X2]
Extension Problem
• In general, when we choose a vertex/element and partition the problem by inclusion/exclusion, the problem we have to solve is an extension problem
Extension Problem: For an enumeration problem and sets S (to be included) and X (to be excluded), determine whether there is a solution Z to the enumeration problem s.t. S ⊆ Z and Z ∩ X = ∅.
• Even though the extension problem is hard in general, there might be a polynomial time binary partition algorithm
Extension of st-path
• Consider the extension problem on st-paths
Extension Problem for st-path: For a graph G = (V, E), vertices s and t, and edge sets S and X, determine whether there is an st-path passing through all edges in S but none in X
• Actually, it is known to be NP-complete (even in the case X = ∅): the Hamiltonian path problem can be reduced to it
• However, by carefully choosing the pivoting edge, we can have polynomial time delay
Extension of Maximal Clique
• Consider the extension problem on maximal cliques
Extension Problem for maximal clique: For a graph G = (V, E) and vertex sets S and X, determine whether there is a maximal clique Z s.t. S ⊆ Z and Z ∩ X = ∅.
• Actually, it is known to be NP-complete
• A straightforward binary partition is difficult to design
• Different from st-paths, it is hard to see a property that enables us to find a pivot vertex that makes the extension problem easy
References: st-path, cycle
D. Eppstein, Finding the k Shortest Paths, FOCS 94, 154-165 (1994)
D. B. Johnson, Finding All the Elementary Circuits of a Directed Graph, SIAM J. Comp. 4, 77-84 (1975)
R. C. Read and R. E. Tarjan, Bounds on Backtrack Algorithms for Listing Cycles, Paths, and Spanning Trees, Networks 5, 237-252 (1975)
References: Clique
E. A. Akkoyunlu, The Enumeration of Maximal Cliques of Large Graphs, SIAM J. Comp. 2, 1-6 (1973)
D. S. Johnson, M. Yannakakis, and C. H. Papadimitriou, On Generating All Maximal Independent Sets, Info. Proc. Lett. 27, 119-123 (1988)
T. Kashiwabara, S. Masuda, K. Nakajima and T. Fujisawa, Generation of Maximum Independent Sets of a Bipartite Graph and Maximum Cliques of a Circular-Arc Graph, J. Algo. 13, 161-174 (1992)
E. Tomita, A. Tanaka, H. Takahashi, The Worst-case Time Complexity for Generating All Maximal Cliques and Computational Experiments, Theoretical Computer Science 363, 28-42 (2006)
References: Perfect Matching
K. Fukuda and T. Matsui, Finding All the Perfect Matchings in Bipartite Graphs, Appl. Math. Lett. 7, 15-18 (1994)
K. Fukuda and T. Matsui, Finding All the Minimum Cost Perfect Matchings in Bipartite Graphs, Networks 22, 461-468 (1992)
C. R. Chegireddy and H. W. Hamacher, Algorithms for Finding K-best Perfect Matchings, Discrete Appl. Math. 18, 155-165 (1987)
T. Uno, Algorithms for Enumerating All Perfect, Maximum and Maximal Matchings in Bipartite Graphs, ISAAC 97, LNCS 1350, 92-101 (1997)
T. Uno, A Fast Algorithm for Enumerating Bipartite Perfect Matchings, ISAAC 2001, LNCS 2223, 367-379 (2001)
T. Uno, A Fast Algorithm for Enumerating Non-Bipartite Maximal Matchings, J. National Institute of Informatics 3, 89-97 (2001)
Exercises 2
Backtrack
2-1. Explain why the algorithm PathStartingS does not produce a path passing through the same vertex twice.
2-2. Explain why the deletion of a vertex in step 3 of the algorithm PathStartingS does not affect the other iterations.
2-3. Actually, the algorithm PathStartingS takes only constant time for each iteration, except for the outputting process. Prove this.
2-4. Give some examples of independent set systems (monotone sets).
2-5. Give some examples of problems in which a backtracking algorithm doesn't have to take care of duplications.
Backtrack
2-6. Design a backtracking algorithm for the following problem: For a given sequence of numbers a1, …, an, enumerate all its subsequences such that any two consecutive numbers ai and aj of the subsequence satisfy ai < aj.
2-7. Design a backtracking algorithm for the following problem: For a given set of points in a plane, a non-crossing graph is a graph whose vertex set is the point set and whose edge set is a set of segments with both ends on the points, such that no two segments intersect except at their ends. Enumerate all non-crossing trees for a given point set.
Backtrack
2-8. Design a backtracking algorithm for the following problem: For a given set of vectors x1, …, xn composed of positive integers, enumerate all sets of vectors such that their sum is no greater than a given vector b (subsets X s.t. Σx∈X x ≤ b).
2-9. Design a backtracking algorithm for the following problem: For a given set of rectangles and a square, enumerate all possible placements of subsets of the rectangles s.t. no two rectangles overlap. The upper-left corner of each rectangle has to be placed at an integer grid point, and the edges of the rectangles have to be axis-parallel (so, 90 degree rotation is allowed).
Backtrack
2-10. Design a backtracking algorithm for the following problem, and analyze its time complexity: For a given graph, enumerate all matchings of the graph.
2-11. Design a backtracking algorithm for the following problem, and analyze its time complexity: For a given sequence of letters, enumerate all its subsequences that form palindromes, i.e., that have the form a, b, c, …, d, d, …, c, b, a.
Binary Partition
2-12. Design a binary partition algorithm for the following problem, and analyze its time complexity: For a given set of points in a plane, enumerate all non-crossing spanning trees.
2-13. Design a binary partition algorithm for the following problem, and analyze its time complexity: For a given set of points in a plane and two points s and t, enumerate all non-crossing s-t paths (simple paths whose ends are s and t).
Binary Partition
2-14. For two perfect matchings M and M' of a bipartite graph G, the symmetric difference between M and M' is composed of disjoint cycles. Further, the symmetric difference between M and an alternating cycle, in which edges of M and edges not in M appear alternately, is a perfect matching different from M. Design a binary partition algorithm by using this fact.
2-15. Design a binary partition algorithm for the following problem, and analyze its time complexity: For a given partial order, a chain is a sequence of elements e1, …, em s.t. ei < ei+1 holds for any i. Enumerate all maximal chains, i.e., chains that are included in no other chain.
Exercises
2-16. Design a binary partition algorithm for the following problem, and analyze its time complexity: For a connected graph, a minimal cut is a partition of the vertices into two groups such that the subgraph induced by each group is connected. For two given vertices s and t, enumerate all minimal cuts s.t. one component includes s and not t.
2-17. Design a binary partition algorithm for the following problem, and analyze its time complexity: For a given graph in which each edge is colored, enumerate all matchings of the graph s.t. any two edges have different colors.
Exercises
2-18. Prove the NP-completeness of the extension problem of maximal clique.
2-19. Prove the NP-completeness of the extension problem of st-path.
2-20. Prove the NP-completeness of the extension problem of minimal dominating set (equivalent to minimal set covering, hypergraph transversal, dualization of monotone Boolean formula).