The Monte Carlo method

• (X, Y) is a point chosen uniformly at random in the 2 × 2 square centered at the origin (0, 0), i.e., the square with corners (1, 1), (−1, 1), (−1, −1), and (1, −1).
• Z = 1 if X² + Y² ≤ 1, and Z = 0 otherwise.
• Pr[Z = 1] = π/4.

• Assume we run this experiment m times, with Zi being the value of Z at the i-th run.
• If W = Σi Zi, then W′ = 4W/m is an estimate of π.
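
A minimal Python sketch of this experiment (the sample count m is an arbitrary choice for illustration):

```python
import random

def estimate_pi(m):
    """Estimate pi by sampling points uniformly from the 2x2 square
    centered at the origin and counting how many fall in the unit disk."""
    w = 0
    for _ in range(m):
        x = random.uniform(-1.0, 1.0)
        y = random.uniform(-1.0, 1.0)
        if x * x + y * y <= 1.0:   # Z = 1 exactly when the point lies inside the circle
            w += 1
    return 4.0 * w / m             # W' = 4W/m

print(estimate_pi(100_000))        # typically prints a value near 3.14
```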

• By the Chernoff bound, Pr[|W′ − π| ≥ επ] = Pr[|W − mπ/4| ≥ εmπ/4] ≤ 2e^(−mπε²/12).
• Def: A randomized algorithm gives an (ε, δ)-approximation for the value V if the output X of the algorithm satisfies Pr[|X − V| ≤ εV] ≥ 1 − δ.

• The above method for estimating π gives an (ε, δ)-approximation, as long as ε < 1 and m is large enough.

• Thm 1: Let X1, …, Xm be independent and identically distributed indicator random variables, with μ = E[Xi]. If m ≥ 3 ln(2/δ) / (ε²μ), then Pr[|(1/m) Σi Xi − μ| ≤ εμ] ≥ 1 − δ.
• I.e., m samples provide an (ε, δ)-approximation for μ.
• Pf: Exercise!
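
As a concrete check of the bound, with hypothetical parameters ε = 0.1, δ = 0.05, and μ = π/4 (as in the π-estimation experiment above):

\[ m \;\ge\; \frac{3\ln(2/\delta)}{\varepsilon^{2}\mu} \;=\; \frac{3\ln 40}{(0.1)^{2}\cdot(\pi/4)} \;\approx\; 1409, \]

so roughly 1410 samples already suffice for a 10% estimate of π with probability at least 0.95.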

• Def: FPRAS: fully polynomial randomized approximation scheme.
• A FPRAS for a problem is a randomized algorithm for which, given an input x and any parameters ε and δ with 0 < ε, δ < 1, the algorithm outputs an (ε, δ)-approximation to V(x) in time poly(1/ε, ln(1/δ), |x|).

• Def: DNF counting problem: counting the number of satisfying assignments of a Boolean formula in disjunctive normal form (DNF).
• Def: a DNF formula is a disjunction of clauses C1 ∨ C2 ∨ … ∨ Ct, where each clause is a conjunction of literals.
• E.g. (x1 ∧ x2 ∧ x3) ∨ (x2 ∧ x4) ∨ (x1 ∧ x3 ∧ x4).

• Counting the number of satisfying assignments of a DNF formula is actually #P-complete.
• Counting the number of Hamiltonian cycles in a graph and counting the number of perfect matchings in a bipartite graph are other examples of #P-complete problems.

A naïve algorithm for the DNF counting problem:
Input: A DNF formula F with n variables.
Output: Y = an approximation of c(F), the number of satisfying assignments of F.
1. X ← 0.
2. For k = 1 to m, do:
  (a) Generate an assignment for the n variables, chosen uniformly at random.
  (b) If the random assignment satisfies F, then X ← X + 1.
3. Return Y ← (X/m)·2^n.
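
A Python sketch of this naïve sampler, assuming F is represented as a list of clauses, each clause a list of (variable index, required truth value) literals with no repeated variables; this encoding is just for illustration:

```python
import random

def satisfies(assignment, formula):
    """True iff some clause has all of its literals satisfied by `assignment`."""
    return any(all(assignment[var] == sign for var, sign in clause)
               for clause in formula)

def naive_dnf_count(formula, n, m):
    """Estimate c(F): sample m uniform assignments and scale the hit rate by 2^n."""
    x = 0
    for _ in range(m):
        assignment = [random.random() < 0.5 for _ in range(n)]
        if satisfies(assignment, formula):
            x += 1
    return (x / m) * (2 ** n)

# Example: (x0 AND x1) OR (NOT x2); the true count over 3 variables is 5.
formula = [[(0, True), (1, True)], [(2, False)]]
print(naive_dnf_count(formula, n=3, m=10_000))
```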

Analysis
• Xk = 1 if the k-th iteration in the algorithm generated a satisfying assignment; 0 o/w.
• Pr[Xk = 1] = c(F)/2^n.
• Let X = Σk Xk; then E[X] = m·c(F)/2^n.

Analysis
• By Theorem 1, X/m gives an (ε, δ)-approximation of c(F)/2^n, and hence Y gives an (ε, δ)-approximation of c(F), when m ≥ 3·2^n·ln(2/δ) / (ε²·c(F)).
• If c(F) ≥ 2^n/poly(n), then this is not too bad: m is polynomial.
• But if c(F) = poly(n), then m must grow like 2^n/c(F), which is exponential in n!

Analysis
• Note that if Ci has li literals, then there are exactly 2^(n−li) satisfying assignments for Ci.
• Let SCi denote the set of assignments that satisfy clause i.
• U = {(i, a): 1 ≤ i ≤ t and a ∈ SCi}; |U| = Σi |SCi|.
• Want to estimate c(F) = |S|, where S = {(i, a): 1 ≤ i ≤ t, a ∈ SCi, and a ∉ SCj for j < i}.

DNF counting algorithm II:
Input: A DNF formula F with n variables.
Output: Y: an approximation of c(F).
1. X ← 0.
2. For k = 1 to m, do:
  (a) Choose i with probability |SCi|/|U|, and then choose an assignment a ∈ SCi uniformly at random.
  (b) If a is not in any SCj, j < i, then X ← X + 1.
3. Return Y ← (X/m)·|U|.
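
A Python sketch of algorithm II, under the same hypothetical clause encoding as before (each clause a list of (variable index, required truth value) literals, no repeated variables within a clause):

```python
import random

def dnf_count_ii(formula, n, m):
    """Estimate c(F): sample (clause, assignment) pairs uniformly from U and
    count only those where the clause is the first one satisfied."""
    sizes = [2 ** (n - len(clause)) for clause in formula]   # |SC_i| = 2^(n - l_i)
    total = sum(sizes)                                        # |U|
    x = 0
    for _ in range(m):
        # Choose clause i with probability |SC_i| / |U|.
        i = random.choices(range(len(formula)), weights=sizes)[0]
        # Choose a uniformly from SC_i: fix the clause's literals, flip coins elsewhere.
        a = [random.random() < 0.5 for _ in range(n)]
        for var, sign in formula[i]:
            a[var] = sign
        # Count the pair only if no earlier clause is also satisfied by a.
        if not any(all(a[var] == sign for var, sign in formula[j]) for j in range(i)):
            x += 1
    return (x / m) * total

formula = [[(0, True), (1, True)], [(2, False)]]   # (x0 AND x1) OR (NOT x2); c(F) = 5
print(dnf_count_ii(formula, n=3, m=10_000))
```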

DNF counting algorithm II:
• Note that |U| ≤ t·|S|. Why? (Each satisfying assignment in S appears in U at most t times, once for each clause it satisfies.)
• Let Pr[i is chosen] = |SCi|/|U|.
• Then Pr[(i, a) is chosen] = Pr[i is chosen]·Pr[a is chosen | i is chosen] = (|SCi|/|U|)·(1/|SCi|) = 1/|U|.

DNF counting algorithm II:
• Thm: DNF counting algorithm II is an FPRAS for the DNF counting problem when m = (3t/ε²)·ln(2/δ).
• Pf: Step 2(a) chooses an element of U uniformly at random. The probability that this element belongs to S is at least 1/t.
• Fix any ε, δ > 0, and let m = (3t/ε²)·ln(2/δ) = poly(t, 1/ε, ln(1/δ)).

DNF counting algorithm II:
• The processing time of each sample is poly(t).
• By Thm 1, with m samples, X/m gives an (ε, δ)-approximation of c(F)/|U|, and hence Y gives an (ε, δ)-approximation of c(F).

Counting with Approximate Sampling
• Def: Let w be the output of a sampling algorithm for a finite sample space Ω.
• The sampling algorithm generates an ε-uniform sample of Ω if, for any subset S of Ω, |Pr[w ∈ S] − |S|/|Ω|| ≤ ε.

• Def: A sampling algorithm is a fully polynomial almost uniform sampler (FPAUS) for a problem if, given an input x and ε > 0, it generates an ε-uniform sample of Ω(x) in time poly(|x|, ln(1/ε)).
• Consider an FPAUS for independent sets: it would take as input a graph G = (V, E) and a parameter ε.
• The sample space: the set of all independent sets in G.

• Goal: Given an FPAUS for independent sets, we construct an FPRAS for counting the number of independent sets.
• Assume G has m edges, and let e1, …, em be an arbitrary ordering of the edges.
• Ei: the set of the first i edges in E, and let Gi = (V, Ei).
• Ω(Gi): the set of independent sets in Gi.

• |Ω(G0)| = 2^n. Why? (G0 has no edges, so every subset of V is independent.)
• To estimate |Ω(G)|, we need good estimates for the ratios ri = |Ω(Gi)| / |Ω(Gi-1)|, since |Ω(G)| = |Ω(Gm)| = |Ω(G0)| · Πi ri = 2^n · Πi ri.

• Let r̃i be our estimate for ri; then our estimate of |Ω(G)| is 2^n · Πi r̃i.
• To evaluate the error, we need to bound the ratio R = Πi (r̃i / ri).
• To have an (ε, δ)-approximation, we want Pr[|R − 1| ≤ ε] ≥ 1 − δ.

• Lemma: Suppose that for all i, 1 ≤ i ≤ m, r̃i is an (ε/2m, δ/m)-approximation for ri. Then Pr[|R − 1| ≤ ε] ≥ 1 − δ.
• Pf: For each 1 ≤ i ≤ m, we have Pr[|r̃i − ri| > (ε/2m)·ri] ≤ δ/m.

• Equivalently, by a union bound, with probability at least 1 − δ we have, for all i, 1 − ε/2m ≤ r̃i/ri ≤ 1 + ε/2m; hence (1 − ε/2m)^m ≤ R ≤ (1 + ε/2m)^m, which gives 1 − ε ≤ R ≤ 1 + ε.

Estimating ri:
Input: Graphs Gi-1 = (V, Ei-1) and Gi = (V, Ei).
Output: r̃i = an approximation of ri.
1. X ← 0.
2. Repeat for M = (1296 m²/ε²)·ln(2m/δ) independent trials:
  (a) Generate an (ε/6m)-uniform sample from Ω(Gi-1).
  (b) If the sample is an independent set in Gi, then X ← X + 1.
3. Return r̃i ← X/M.
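
A Python sketch of the whole reduction. The FPAUS is replaced here by a placeholder exact uniform sampler (brute-force enumeration of Ω(Gi-1)), which only keeps the example runnable on tiny graphs; a real FPAUS would be substituted for it:

```python
import itertools
import random

def independent_sets(vertices, edges):
    """Enumerate all independent sets of a tiny graph (stand-in for an FPAUS)."""
    sets = []
    for r in range(len(vertices) + 1):
        for subset in itertools.combinations(vertices, r):
            s = set(subset)
            if all(not (u in s and v in s) for u, v in edges):
                sets.append(s)
    return sets

def estimate_count(vertices, edges, trials_per_ratio=2000):
    """Estimate |Omega(G)| as 2^n * prod_i r_i, each r_i estimated by sampling."""
    estimate = 2 ** len(vertices)                               # |Omega(G_0)| = 2^n
    for i in range(1, len(edges) + 1):
        prev_sets = independent_sets(vertices, edges[:i - 1])   # Omega(G_{i-1})
        u, v = edges[i - 1]                                     # the edge added in G_i
        hits = 0
        for _ in range(trials_per_ratio):
            sample = random.choice(prev_sets)                   # uniform sample
            if not (u in sample and v in sample):               # still independent in G_i
                hits += 1
        estimate *= hits / trials_per_ratio                     # multiply by r~_i
    return estimate

# Tiny example: a path on 3 vertices has exactly 5 independent sets.
print(estimate_count([0, 1, 2], [(0, 1), (1, 2)]))
```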

• Lemma: When m ≥ 1 and 0 < ε ≤ 1, the procedure for estimating ri yields an (ε/2m, δ/m)-approximation for ri.
• Pf: Suppose Gi-1 and Gi differ in that edge {u, v} is in Gi but not in Gi-1. Then Ω(Gi) ⊆ Ω(Gi-1).
• An independent set in Ω(Gi-1) \ Ω(Gi) contains both u and v.

• Associate each I ∈ Ω(Gi-1) \ Ω(Gi) with the independent set I \ {v} ∈ Ω(Gi).
• Note that each I′ ∈ Ω(Gi) is associated with no more than one independent set I′ ∪ {v} ∈ Ω(Gi-1) \ Ω(Gi); thus |Ω(Gi-1) \ Ω(Gi)| ≤ |Ω(Gi)|.
• It follows that ri = |Ω(Gi)| / |Ω(Gi-1)| = |Ω(Gi)| / (|Ω(Gi)| + |Ω(Gi-1) \ Ω(Gi)|) ≥ 1/2.

• Let Xk = 1 if the k-th sample is in Ω(Gi), and 0 o/w.
• Because our samples are generated by an (ε/6m)-uniform sampler, by definition, |Pr[Xk = 1] − |Ω(Gi)|/|Ω(Gi-1)|| ≤ ε/6m.

• By linearity of expectations, |E[X/M] − ri| ≤ ε/6m.
• Since ri ≥ 1/2, we have E[X/M] ≥ ri − ε/6m ≥ 1/2 − 1/6 = 1/3.

• If M ≥ 3 ln(2m/δ) / ((ε/12m)²·(1/3)) = (1296 m²/ε²)·ln(2m/δ), then, by Theorem 1, Pr[|X/M − E[X/M]| ≤ (ε/12m)·E[X/M]] ≥ 1 − δ/m.
• Equivalently, with probability ≥ 1 − δ/m, |X/M − E[X/M]| ≤ (ε/12m)·E[X/M]. -----(1)

• As |E[X/M] − ri| ≤ ε/6m, and using ri ≥ 1/2, we have |E[X/M]/ri − 1| ≤ ε/3m. -----(2)

• Combining (1) and (2), with probability ≥ 1 − δ/m, |X/M − ri| ≤ (ε/2m)·ri.
• This gives the desired (ε/2m, δ/m)-approximation.
• Thm: Given an FPAUS for independent sets in any graph, we can construct an FPRAS for counting the number of independent sets in a graph G.

The Markov Chain Monte Carlo Method
• The Markov Chain Monte Carlo method provides a very general approach to sampling from a desired probability distribution.
• Basic idea: Define an ergodic Markov chain whose set of states is the sample space and whose stationary distribution is the required sampling distribution.

• Lemma: For a finite state space Ω and neighborhood structure {N(x) | x ∈ Ω}, let N = max_x |N(x)|. Let M be any number such that M ≥ N. Consider a Markov chain where
  Px,y = 1/M if x ≠ y and y ∈ N(x),
  Px,y = 0 if x ≠ y and y ∉ N(x),
  Px,y = 1 − |N(x)|/M if x = y.
• If this chain is irreducible and aperiodic, then the stationary distribution is the uniform distribution.

• Pf: For any x ≠ y with y ∈ N(x), πx·Px,y = πy·Py,x, since Px,y = Py,x = 1/M and πx = πy.
• It follows that the uniform distribution πx = 1/|Ω| is the stationary distribution, by the following theorem.

Thm:
• P: transition matrix of a finite, irreducible, and ergodic Markov chain. If there are nonnegative numbers π = (π0, …, πn) such that Σi πi = 1 and if, for any pair of states i, j, πi·Pi,j = πj·Pj,i, then π is the stationary distribution corresponding to P.
• Pf: Since Σi πi·Pi,j = Σi πj·Pj,i = πj, i.e., πP = π, it follows that π is the unique stationary distribution of the Markov chain.

E.g. Markov chain with states from independent sets in G = (V, E):
1. X0 is an arbitrary independent set in G.
2. To compute Xi+1:
  (a) choose a vertex v uniformly at random from V;
  (b) if v ∈ Xi, then Xi+1 = Xi \ {v};
  (c) if v ∉ Xi and adding v to Xi still gives an independent set, then Xi+1 = Xi ∪ {v};
  (d) o/w, Xi+1 = Xi.
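
A Python sketch of one step of this chain (the graph encoding via an adjacency dictionary is just illustrative):

```python
import random

def step(current, vertices, adj):
    """One transition of the uniform independent-set chain.
    `current` is a set of vertices; `adj` maps each vertex to its neighbor set."""
    v = random.choice(vertices)                       # (a) pick v uniformly from V
    if v in current:
        return current - {v}                          # (b) removing v is always allowed
    if all(u not in current for u in adj[v]):
        return current | {v}                          # (c) add v if the set stays independent
    return current                                    # (d) otherwise stay put

# Run the chain on the path 0-1-2 and inspect the final state.
vertices = [0, 1, 2]
adj = {0: {1}, 1: {0, 2}, 2: {1}}
state = set()
for _ in range(10_000):
    state = step(state, vertices, adj)
print(state)   # roughly a uniform draw from the 5 independent sets, after mixing
```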

• The neighbors of a state Xi are independent sets that differ from Xi in just one vertex.
• Since every state is reachable from the empty set, the chain is irreducible.
• Assume G has at least one edge (u, v); then the state {v} has a self-loop (P{v},{v} > 0), so the chain is aperiodic.
• When x ≠ y, Px,y = 1/|V| or 0; by the previous lemma, the stationary distribution is the uniform distribution.

The Metropolis Algorithm (when the stationary distribution is nonuniform)
• Lemma: For a finite state space Ω and neighborhood structure {N(x) | x ∈ Ω}, let N = max_x |N(x)|. Let M be any number such that M ≥ N. For all x ∈ Ω, let πx > 0 be the desired probability of state x in the stationary distribution. Consider a Markov chain where
  Px,y = (1/M)·min(1, πy/πx) if x ≠ y and y ∈ N(x),
  Px,y = 0 if x ≠ y and y ∉ N(x),
  Px,y = 1 − Σ_{y≠x} Px,y if x = y.
• Then, if this chain is irreducible and aperiodic, the stationary distribution is given by the πx.

• Pf: For any x ≠ y, if πx ≤ πy, then Px,y = 1/M and Py,x = (1/M)·(πx/πy).
• It follows that Px,y = 1/M = (πy/πx)·Py,x, i.e., πx·Px,y = πy·Py,x. Similarly for πx > πy.
• Again, by the previous theorem, the πx's form the stationary distribution.

• E.g. Create a Markov chain where, in the stationary distribution, each independent set I has probability proportional to λ^|I|, for some λ > 0.
• I.e., πx = λ^|Ix| / B, where Ix is the independent set corresponding to state x and B = Σx λ^|Ix|.
• Note that, when λ = 1, this is the uniform distribution.

1. X0 is an arbitrary independent set in G.
2. To compute Xi+1:
  (a) choose a vertex v uniformly at random from V;
  (b) if v ∈ Xi, then Xi+1 = Xi \ {v} with probability min(1, 1/λ);
  (c) if v ∉ Xi and Xi ∪ {v} still gives an independent set, then Xi+1 = Xi ∪ {v} with probability min(1, λ);
  (d) o/w, set Xi+1 = Xi.
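
A Python sketch of one step of this Metropolis chain (λ and the adjacency-dictionary encoding are illustrative choices):

```python
import random

def metropolis_step(current, vertices, adj, lam):
    """One transition targeting pi(I) proportional to lam^|I| over independent sets."""
    v = random.choice(vertices)                       # (a) pick v uniformly from V
    if v in current:
        if random.random() < min(1.0, 1.0 / lam):     # (b) remove with prob min(1, 1/lam)
            return current - {v}
    elif all(u not in current for u in adj[v]):
        if random.random() < min(1.0, lam):           # (c) add with prob min(1, lam)
            return current | {v}
    return current                                    # (d) otherwise keep the state

vertices = [0, 1, 2]
adj = {0: {1}, 1: {0, 2}, 2: {1}}
state = set()
for _ in range(10_000):
    state = metropolis_step(state, vertices, adj, lam=2.0)
print(state)   # biased toward larger independent sets when lam > 1
```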