Optimization: Convex Relaxations. M. Pawan Kumar. Slides available online: http://mpawankumar.info
Energy Function. Random variables $V = \{V_a, V_b, \ldots\}$. Labels $L = \{l_0, l_1, \ldots\}$. Labelling $f: \{a, b, \ldots\} \to \{0, 1, \ldots\}$. [Figure: variables $V_a$, $V_b$, $V_c$, $V_d$, each with labels $l_0$ and $l_1$.]
Energy Function. Unary potential: $Q(f) = \sum_a \theta_a(f(a))$. Easy to minimize, since each variable can be labelled independently. [Figure: the same four variables with a unary cost on each label.]
Energy Function. Neighbourhood $E$: $(a, b) \in E$ iff $V_a$ and $V_b$ are neighbours. $E = \{(a, b), (b, c), (c, d)\}$.
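Energy Function. Pairwise potential: $Q(f) = \sum_a \theta_a(f(a)) + \sum_{(a,b) \in E} \theta_{ab}(f(a), f(b))$. [Figure: the four variables with unary costs on the labels and pairwise costs on the edges between neighbouring labels.]
Energy Minimization. $\min_f \sum_a \theta_a(f(a)) + \sum_{(a,b) \in E} \theta_{ab}(f(a), f(b))$. [Figure: two variables $V_1$, $V_2$ with two labels each.]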
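The energy above can be evaluated and, for tiny problems, minimized by enumeration. The sketch below is only an illustration: the chain a - b - c - d matches the neighbourhood on the slides, but every cost value and the names `unary`, `pairwise`, `energy` and `brute_force_min` are assumptions made for this example.

```python
from itertools import product

# Hypothetical chain a - b - c - d with two labels per variable.
# All cost values below are illustrative, not taken from the slides.
unary = {
    "a": [5, 2],
    "b": [2, 4],
    "c": [3, 6],
    "d": [7, 3],
}
edges = [("a", "b"), ("b", "c"), ("c", "d")]
# pairwise[(a, b)][i][k] = cost of f(a) = i together with f(b) = k.
pairwise = {
    ("a", "b"): [[0, 1], [1, 0]],
    ("b", "c"): [[0, 3], [2, 0]],
    ("c", "d"): [[0, 4], [1, 0]],
}


def energy(f):
    """Q(f): sum of unary terms plus sum of pairwise terms over the edges."""
    q = sum(unary[a][f[a]] for a in unary)
    q += sum(pairwise[(a, b)][f[a]][f[b]] for a, b in edges)
    return q


def brute_force_min():
    """Enumerate all |L|^|V| labellings; only viable for tiny problems."""
    names = list(unary)
    best = None
    for labels in product(range(2), repeat=len(names)):
        f = dict(zip(names, labels))
        q = energy(f)
        if best is None or q < best[0]:
            best = (q, f)
    return best


best_energy, best_labelling = brute_force_min()
```

Enumeration grows exponentially with the number of variables, which is what motivates the relaxations in the rest of the deck.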
Integer Program. $\min_f \sum_a \theta_a(f(a)) + \sum_{(a,b)} \theta_{ab}(f(a), f(b))$. Introduce $x_a(i) \in \{0, 1\}$: does $V_a$ take the label $l_i$ ($x_a(i) = 1$) or not ($x_a(i) = 0$)?
Constraint. $x_a(i) \in \{0, 1\}$. We need a constraint stating that $V_a$ takes exactly one label.
Constraint. $x_a(i) \in \{0, 1\}$, $\sum_i x_a(i) = 1$.
Unary Potentials. $\sum_a \theta_a(f(a)) = \sum_a \sum_i \theta_a(i) x_a(i)$, given $x_a(i) \in \{0, 1\}$ and $\sum_i x_a(i) = 1$.
Pairwise Potentials. $\sum_{(a,b)} \theta_{ab}(f(a), f(b)) = \sum_{(a,b)} \sum_{i,k} \theta_{ab}(i, k) x_a(i) x_b(k)$, given $x_a(i) \in \{0, 1\}$ and $\sum_i x_a(i) = 1$.
Integer Program. $\min_x \sum_a \sum_i \theta_a(i) x_a(i) + \sum_{(a,b)} \sum_{i,k} \theta_{ab}(i, k) x_a(i) x_b(k)$ s.t. $x_a(i) \in \{0, 1\}$, $\sum_i x_a(i) = 1$.
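To see that the integer program is just the energy rewritten in indicator variables, the sketch below compares the two on a two-variable, two-label problem. The unary costs follow the vector $u$ shown later in the deck; the pairwise table and the helper names `indicator`, `energy` and `ip_objective` are assumptions for illustration.

```python
# Two variables, two labels each.
theta_1 = [5, 2]              # unary costs of V_1
theta_2 = [2, 4]              # unary costs of V_2
theta_12 = [[0, 3], [1, 0]]   # illustrative pairwise table (assumption)


def indicator(label, num_labels=2):
    """x_a(i) = 1 for the chosen label and 0 elsewhere."""
    return [1 if i == label else 0 for i in range(num_labels)]


def energy(f1, f2):
    """Energy written directly in terms of the labelling."""
    return theta_1[f1] + theta_2[f2] + theta_12[f1][f2]


def ip_objective(x1, x2):
    """Same energy written in terms of the indicator variables."""
    unary = sum(theta_1[i] * x1[i] for i in range(2))
    unary += sum(theta_2[k] * x2[k] for k in range(2))
    pair = sum(theta_12[i][k] * x1[i] * x2[k]
               for i in range(2) for k in range(2))
    return unary + pair


# The two forms agree on every labelling.
checks = [
    ip_objective(indicator(f1), indicator(f2)) == energy(f1, f2)
    for f1 in range(2) for f2 in range(2)
]
```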
Outline • QP Relaxation (Ravikumar and Lafferty, 2006) • LP Relaxation for Potts Model • LP Relaxation for Pairwise Energy • A Hierarchy of Relaxations
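Unary Potential Vector. For $x$, total unary cost? $u^\top x$, with unary potential $u = [5, 2, 2, 4]^\top$: the first entry is the cost of $V_1 = 0$, the second the cost of $V_1 = 1$, and so on. [Figure: two variables $V_1$, $V_2$ with two labels each.]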
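Pairwise Potential Matrix. For $x$, total pairwise cost? $\tfrac{1}{2} x^\top P x$. $P$ has one row and one column per (variable, label) pair: the entry in row ($V_1 = 0$), column ($V_1 = 0$) is the cost of $V_1 = 0$ and $V_1 = 0$, and the entry in row ($V_1 = 0$), column ($V_2 = 1$) is the cost of $V_1 = 0$ and $V_2 = 1$. [Figure: the $4 \times 4$ pairwise potential matrix $P$ for the two-variable example.]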
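Integer Program. $\min_x u^\top x + \tfrac{1}{2} x^\top P x$ s.t. $x_a(i) \in \{0, 1\}$, $\sum_i x_a(i) = 1$. Is the objective convex? No. The diagonal of $P$ is 0, so a nonzero symmetric $P$ has zero trace and therefore a negative eigenvalue.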
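The vector and matrix form can be checked numerically. In the sketch below the ordering $x = [x_1(0), x_1(1), x_2(0), x_2(1)]$ and the pairwise table `theta_12` are assumptions; $P$ is built symmetric, so the factor $\tfrac{1}{2}$ undoes the double counting of each edge.

```python
# Two variables, two labels; x = [x_1(0), x_1(1), x_2(0), x_2(1)].
u = [5, 2, 2, 4]               # unary potential vector
theta_12 = [[0, 3], [1, 0]]    # illustrative pairwise table (assumption)

# Symmetric P: zero diagonal blocks, theta_12 in the off-diagonal blocks.
P = [[0] * 4 for _ in range(4)]
for i in range(2):
    for k in range(2):
        P[i][2 + k] = theta_12[i][k]
        P[2 + k][i] = theta_12[i][k]


def quad_form(M, z):
    """z^T M z for a 4 x 4 matrix given as nested lists."""
    return sum(z[r] * M[r][c] * z[c] for r in range(4) for c in range(4))


def objective(x):
    """u^T x + 1/2 x^T P x."""
    return sum(ui * xi for ui, xi in zip(u, x)) + 0.5 * quad_form(P, x)


# Labelling V_1 = 1, V_2 = 0 corresponds to x = [0, 1, 1, 0].
value = objective([0, 1, 1, 0])

# A direction with negative curvature: P is not positive semidefinite.
z = [1, 0, 0, -1]
curvature = quad_form(P, z)
```

Here `value` equals the energy of the labelling (2 + 2 + 1), and `curvature` is negative, which is one concrete way to see the non-convexity claimed on the slide.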
Integer Program. Consider a vector $d$ and define $D = \mathrm{diag}(d)$. $\min_x u^\top x + \tfrac{1}{2} x^\top P x$ s.t. $x_a(i) \in \{0, 1\}$, $\sum_i x_a(i) = 1$.
Integer Program. Add $D$ to the quadratic term: $u^\top x + \tfrac{1}{2} x^\top (P + D) x$, same constraints.
Integer Program. $\min_x (u - \tfrac{1}{2} d)^\top x + \tfrac{1}{2} x^\top (P + D) x$ s.t. $x_a(i) \in \{0, 1\}$, $\sum_i x_a(i) = 1$. Equivalent to the old problem. Why? $x_a(i) \cdot x_a(i) = x_a(i)$, so $\tfrac{1}{2} x^\top D x = \tfrac{1}{2} d^\top x$ for binary $x$.
Integer Program. Choose an appropriate $d$: $d(i)$ = sum of absolute values of the $i$-th row of $P$. The objective $(u - \tfrac{1}{2} d)^\top x + \tfrac{1}{2} x^\top (P + D) x$ is then convex. Why? Because $P + D \succeq 0$: it is symmetric and diagonally dominant with a non-negative diagonal.
QP Relaxation. $\min_x (u - \tfrac{1}{2} d)^\top x + \tfrac{1}{2} x^\top (P + D) x$ s.t. $x_a(i) \in [0, 1]$, $\sum_i x_a(i) = 1$, with $d(i)$ = sum of absolute values of the $i$-th row of $P$. Solver? Conditional Gradient (Frank-Wolfe).
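A small numerical check of the convexification step, under stated assumptions: the matrix `P` below is an illustrative symmetric matrix with zero diagonal, and the linear term is shifted by $\tfrac{1}{2} d$, which is what $x_a(i)^2 = x_a(i)$ gives when the quadratic term carries the factor $\tfrac{1}{2}$. The helper names are hypothetical.

```python
from itertools import product

u = [5, 2, 2, 4]
# Illustrative symmetric P with zero diagonal (assumption).
P = [[0, 0, 0, 3],
     [0, 0, 1, 0],
     [0, 1, 0, 0],
     [3, 0, 0, 0]]
n = len(u)

# d(i) = sum of absolute values of the i-th row of P; D = diag(d).
d = [sum(abs(v) for v in row) for row in P]
PD = [[P[r][c] + (d[r] if r == c else 0) for c in range(n)] for r in range(n)]
shifted_u = [u[i] - 0.5 * d[i] for i in range(n)]


def quad(M, z):
    return sum(z[r] * M[r][c] * z[c] for r in range(n) for c in range(n))


def original(x):
    return sum(a * b for a, b in zip(u, x)) + 0.5 * quad(P, x)


def convexified(x):
    return sum(a * b for a, b in zip(shifted_u, x)) + 0.5 * quad(PD, x)


def diagonally_dominant(M):
    """Each diagonal entry is at least the absolute sum of the rest of its row."""
    return all(
        M[r][r] >= sum(abs(M[r][c]) for c in range(n) if c != r)
        for r in range(n)
    )


# The two objectives agree on every integer labelling x = [x_1(0), x_1(1), x_2(0), x_2(1)].
labellings = [[1 - f1, f1, 1 - f2, f2] for f1, f2 in product(range(2), repeat=2)]
gaps = [abs(original(x) - convexified(x)) for x in labellings]
```

The two objectives coincide only on binary points; at fractional points they generally differ, which is why the choice of $d$ affects the quality of the relaxation.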
Outline • QP Relaxation: Conditional Gradient (Frank and Wolfe, 1956) • LP Relaxation for Potts Model • LP Relaxation for Pairwise Energy • A Hierarchy of Relaxations
Conditional Gradient. $\min_x f(x)$ s.t. $x \in X$. The objective $f(x)$ is assumed smooth (gradients defined everywhere). The feasible region $X$ is convex and bounded.
Conditional Gradient. $\min_x f(x)$ s.t. $x \in X$. Compute the gradient $g$ of $f(x)$ at the current $x_t$. Compute the conditional gradient $c_t$.
Conditional Gradient. $c_t = \arg\min_x g^\top x$ s.t. $x \in X$. Update $x_{t+1} = \eta_t x_t + (1 - \eta_t) c_t$ with $\eta_t \in [0, 1]$. Then $x_{t+1} \in X$: no need for projection. Why?
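CG for QP. Objective: $(u - \tfrac{1}{2} d)^\top x + \tfrac{1}{2} x^\top (P + D) x$.
Initialize $x_0$ and $t = 0$.
While the objective can be reduced:
  $g = (u - \tfrac{1}{2} d) + (P + D) x_t$
  $c_t = \arg\min_x g^\top x$ s.t. $x_a(i) \in [0, 1]$, $\sum_i x_a(i) = 1$. Easy. Why?
  Update $x_{t+1} = \eta_t x_t + (1 - \eta_t) c_t$. We can compute the optimal $\eta_t$. Why?
  $t = t + 1$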
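A minimal sketch of the loop above, under assumptions. The linear subproblem is easy because the feasible set is a product of simplices, so the minimizer picks, for each variable, the label with the smallest gradient entry. The step size has a closed form because the objective restricted to the segment between $x_t$ and $c_t$ is a one-dimensional quadratic. The update needs no projection because it is a convex combination of two feasible points. The data (`P`, the uniform start `x0`) and all function names are illustrative, and the linear term is taken as $u - \tfrac{1}{2} d$ so that the objective matches the original energy on binary $x$.

```python
def mat_vec(M, v):
    return [sum(m * vj for m, vj in zip(row, v)) for row in M]


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def objective(x, c, Q):
    """c^T x + 1/2 x^T Q x."""
    return dot(c, x) + 0.5 * dot(x, mat_vec(Q, x))


def linear_minimizer(g, blocks):
    """Vertex minimizing g^T x: the cheapest label for each variable."""
    s = [0.0] * len(g)
    for block in blocks:
        best = min(block, key=lambda i: g[i])
        s[best] = 1.0
    return s


def frank_wolfe(c, Q, blocks, x0, max_iters=200, tol=1e-9):
    x = list(x0)
    history = [objective(x, c, Q)]
    for _ in range(max_iters):
        g = [ci + qi for ci, qi in zip(c, mat_vec(Q, x))]
        s = linear_minimizer(g, blocks)
        p = [si - xi for si, xi in zip(s, x)]
        gap = -dot(g, p)                  # Frank-Wolfe gap, always >= 0
        if gap <= tol:
            break                         # objective cannot be reduced
        curvature = dot(p, mat_vec(Q, p))
        # Exact line search on the 1-D quadratic, clipped to [0, 1].
        step = 1.0 if curvature <= 0 else min(1.0, gap / curvature)
        eta = 1.0 - step                  # slide convention: eta weights x_t
        x = [eta * xi + (1.0 - eta) * si for xi, si in zip(x, s)]
        history.append(objective(x, c, Q))
    return x, history


# Illustrative two-variable, two-label instance (pairwise entries are assumptions).
u = [5, 2, 2, 4]
P = [[0, 0, 0, 3], [0, 0, 1, 0], [0, 1, 0, 0], [3, 0, 0, 0]]
d = [sum(abs(v) for v in row) for row in P]
Q = [[P[r][c] + (d[r] if r == c else 0) for c in range(4)] for r in range(4)]
c = [u[i] - 0.5 * d[i] for i in range(4)]
blocks = [[0, 1], [2, 3]]                 # indices belonging to V_1 and V_2

x_final, history = frank_wolfe(c, Q, blocks, x0=[0.5, 0.5, 0.5, 0.5])
```

On this instance the best integer labelling has energy 5, and the relaxed objective drops below that after the first step, so the fractional solution still has to be rounded to obtain a labelling.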
Outline • QP Relaxation • LP Relaxation for Potts Model (Kleinberg and Tardos, 1999) • LP Relaxation for Pairwise Energy • A Hierarchy of Relaxations
Integer Program. Potts model: $\theta_{ab}(i, k) = w_{ab}$ if $i \neq k$, and $0$ if $i = k$. $\min_x \sum_a \sum_i \theta_a(i) x_a(i) + \sum_{(a,b)} \sum_{i,k} \theta_{ab}(i, k) x_a(i) x_b(k)$ s.t. $x_a(i) \in \{0, 1\}$, $\sum_i x_a(i) = 1$.
Integer Program. Potts model: $\theta_{ab}(i, k) = w_{ab}$ if $i \neq k$, and $0$ if $i = k$. $\min_x \sum_a \sum_i \theta_a(i) x_a(i) + \tfrac{1}{2} \sum_{(a,b)} \sum_i w_{ab} |x_a(i) - x_b(i)|$ s.t. $x_a(i) \in \{0, 1\}$, $\sum_i x_a(i) = 1$.
LP Relaxation. Potts model: $\theta_{ab}(i, k) = w_{ab}$ if $i \neq k$, and $0$ if $i = k$. $\min_x \sum_a \sum_i \theta_a(i) x_a(i) + \tfrac{1}{2} \sum_{(a,b)} \sum_i w_{ab} |x_a(i) - x_b(i)|$ s.t. $x_a(i) \in [0, 1]$, $\sum_i x_a(i) = 1$. For 2 labels this is a min-cut problem, with integer optimal solutions.
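The rewriting of the Potts pairwise term as an absolute difference can be verified on indicator vectors: two different labels differ in exactly two coordinates, so the factor $\tfrac{1}{2}$ recovers $w_{ab}$. The sketch below assumes three labels and an illustrative weight; the names `potts_cost` and `l1_cost` are hypothetical.

```python
from itertools import product

NUM_LABELS = 3
w_ab = 2.5  # illustrative edge weight (assumption)


def indicator(label):
    return [1.0 if i == label else 0.0 for i in range(NUM_LABELS)]


def potts_cost(i, k):
    """theta_ab(i, k) = w_ab if the labels differ, 0 otherwise."""
    return w_ab if i != k else 0.0


def l1_cost(x_a, x_b):
    """1/2 * w_ab * sum_i |x_a(i) - x_b(i)|."""
    return 0.5 * w_ab * sum(abs(p - q) for p, q in zip(x_a, x_b))


# Label pairs where the two forms disagree on integer inputs (expected: none).
mismatches = [
    (i, k)
    for i, k in product(range(NUM_LABELS), repeat=2)
    if potts_cost(i, k) != l1_cost(indicator(i), indicator(k))
]

# The same expression is also defined for fractional vectors in the relaxation.
fractional = l1_cost([0.5, 0.5, 0.0], [0.0, 0.5, 0.5])
```

In a linear program the absolute values are usually handled with one auxiliary variable per term, bounded below by both signed differences.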
Outline • QP Relaxation • LP Relaxation for Potts Model • LP Relaxation for Pairwise Energy (Chekuri et al., 2001) • A Hierarchy of Relaxations
Integer Program. $\min_x \sum_a \sum_i \theta_a(i) x_a(i) + \sum_{(a,b)} \sum_{i,k} \theta_{ab}(i, k) x_a(i) x_b(k)$ s.t. $x_a(i) \in \{0, 1\}$, $\sum_i x_a(i) = 1$.
Integer Program. Replace each product $x_a(i) x_b(k)$ by a variable $x_{ab}(i, k)$: $\min_x \sum_a \sum_i \theta_a(i) x_a(i) + \sum_{(a,b)} \sum_{i,k} \theta_{ab}(i, k) x_{ab}(i, k)$ s.t. $x_a(i) \in \{0, 1\}$, $\sum_i x_a(i) = 1$, $x_{ab}(i, k) \in \{0, 1\}$, $\sum_k x_{ab}(i, k) = x_a(i)$.
LP Relaxation. $\min_x \sum_a \sum_i \theta_a(i) x_a(i) + \sum_{(a,b)} \sum_{i,k} \theta_{ab}(i, k) x_{ab}(i, k)$ s.t. $x_a(i) \in [0, 1]$, $\sum_i x_a(i) = 1$, $x_{ab}(i, k) \in [0, 1]$, $\sum_k x_{ab}(i, k) = x_a(i)$ (marginalization constraint). Guarantees: UGC hardness.
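The LP relaxation is not always tight. A standard illustration, sketched here under assumptions, is a triangle of three binary variables whose pairwise costs penalize agreement: every integer labelling pays at least 1, yet a fractional point satisfying the marginalization constraints pays 0. The instance, the helper `is_feasible`, and the choice to enforce marginalization toward both endpoints of each edge are assumptions of this example.

```python
from itertools import product

variables = ["a", "b", "c"]
edges = [("a", "b"), ("b", "c"), ("a", "c")]


def theta(i, k):
    """Illustrative pairwise cost: 1 when the two labels agree, 0 otherwise."""
    return 1.0 if i == k else 0.0


# Best integer labelling by enumeration (all unary costs are zero here).
integer_optimum = min(
    sum(theta(f[a], f[b]) for a, b in edges)
    for f in (dict(zip(variables, labels))
              for labels in product(range(2), repeat=3))
)

# Fractional candidate: uniform unaries, all pairwise mass on disagreement.
x_unary = {a: [0.5, 0.5] for a in variables}
x_pair = {e: [[0.0, 0.5], [0.5, 0.0]] for e in edges}


def is_feasible(x_unary, x_pair, tol=1e-12):
    """Check box, normalization and marginalization constraints."""
    for a in variables:
        if abs(sum(x_unary[a]) - 1.0) > tol:
            return False
        if any(v < -tol or v > 1.0 + tol for v in x_unary[a]):
            return False
    for (a, b), table in x_pair.items():
        if any(v < -tol or v > 1.0 + tol for row in table for v in row):
            return False
        for i in range(2):   # sum_k x_ab(i, k) = x_a(i)
            if abs(sum(table[i]) - x_unary[a][i]) > tol:
                return False
        for k in range(2):   # sum_i x_ab(i, k) = x_b(k)
            if abs(sum(table[i][k] for i in range(2)) - x_unary[b][k]) > tol:
                return False
    return True


lp_value = sum(
    theta(i, k) * x_pair[e][i][k]
    for e in edges for i in range(2) for k in range(2)
)
```

The gap between `lp_value` and `integer_optimum` on this cycle is the kind of looseness that higher-order marginalization constraints are meant to remove.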
Outline • QP Relaxation • LP Relaxation for Potts Model • LP Relaxation for Pairwise Energy • A Hierarchy of Relaxations (Sherali and Adams, 1990)
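LP Relaxation. $\min_x \sum_a \sum_i \theta_a(i) x_a(i) + \sum_{(a,b)} \sum_{i,k} \theta_{ab}(i, k) x_{ab}(i, k) + \sum_{(a,b,c)} \sum_{i,k,m} \theta_{abc}(i, k, m) x_{abc}(i, k, m)$ s.t. $x_a(i) \in [0, 1]$, $\sum_i x_a(i) = 1$, $x_{ab}(i, k) \in [0, 1]$, $x_{abc}(i, k, m) \in [0, 1]$, $\sum_k x_{ab}(i, k) = x_a(i)$, $\sum_{k,m} x_{abc}(i, k, m) = x_a(i)$, $\sum_m x_{abc}(i, k, m) = x_{ab}(i, k)$.
LP Relaxation. Without the third-order term, keeping the same constraints: $\min_x \sum_a \sum_i \theta_a(i) x_a(i) + \sum_{(a,b)} \sum_{i,k} \theta_{ab}(i, k) x_{ab}(i, k)$ s.t. $x_a(i) \in [0, 1]$, $\sum_i x_a(i) = 1$, $x_{ab}(i, k) \in [0, 1]$, $x_{abc}(i, k, m) \in [0, 1]$, $\sum_k x_{ab}(i, k) = x_a(i)$, $\sum_{k,m} x_{abc}(i, k, m) = x_a(i)$, $\sum_m x_{abc}(i, k, m) = x_{ab}(i, k)$.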
LP Relaxation Hierarchy. Higher and higher orders of marginalization. Eventually you will find a tight relaxation, but it may be exponential in size. Exponential blow-up is very rare in practice: the real world is full of structure.
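A small sketch of why a higher-order level can be tight, using an assumed instance: three binary variables on a triangle, zero unary costs, and a pairwise cost of 1 whenever two neighbouring labels agree. With a triple variable $x_{abc}$ covering all three variables, the constraints make $x_{abc}$ a probability distribution over joint assignments, and the pairwise cost computed from its marginals is an average of integer costs, so it cannot fall below the integer optimum. All names here are hypothetical.

```python
from itertools import product

# Positions 0, 1, 2 stand for variables a, b, c; edges of the triangle.
edges = [(0, 1), (1, 2), (0, 2)]


def cost(assignment):
    """Integer cost: 1 for every edge whose two labels agree."""
    return sum(1.0 for p, q in edges if assignment[p] == assignment[q])


assignments = list(product(range(2), repeat=3))
# Every joint assignment of three binary labels has an agreeing pair.
min_joint_cost = min(cost(s) for s in assignments)

# Example x_abc: uniform over the six non-constant assignments.
support = [s for s in assignments if len(set(s)) > 1]
x_abc = {s: (1.0 / len(support) if s in support else 0.0) for s in assignments}


def pair_marginal(p, q):
    """x_pq(i, k) obtained by summing x_abc over the remaining variable."""
    table = [[0.0, 0.0], [0.0, 0.0]]
    for s, mass in x_abc.items():
        table[s[p]][s[q]] += mass
    return table


# Pairwise LP objective evaluated on the marginals of x_abc:
# only agreeing label pairs (i, i) carry cost.
lp_value = sum(pair_marginal(p, q)[i][i] for p, q in edges for i in range(2))
```

Because the objective is linear in $x_{abc}$, any feasible choice gives a value of at least `min_joint_cost`, so on this triangle the third-order level already matches the integer optimum.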
Questions?