Approximation Algorithms for Prize-Collecting Forest Problems with Submodular Penalty Functions
Chaitanya Swamy, University of Waterloo
Joint work with Yogeshwer Sharma and David Williamson, Cornell University

Prize-collecting Steiner tree (PCST)
Given: graph G=(V, E), edge costs c_e ≥ 0, root r ∈ V, penalties p_v ≥ 0 on vertices
Goal: choose a set of edges F ⊆ E so as to minimize
  ∑_{e∈F} c_e + ∑_{v not connected to r} p_v
  (cost of edges picked + penalty cost of nodes disconnected from r)
• Bienstock et al.: gave a 3-approx. LP-rounding algorithm
• Goemans-Williamson (GW): gave a primal-dual 2-approx. algorithm
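To make the PCST objective concrete, the following minimal sketch evaluates it for a given edge set. The function name `pcst_cost` and the dictionary-based inputs are illustrative assumptions, not part of the talk.

```python
from collections import defaultdict, deque

def pcst_cost(cost, penalty, root, chosen):
    """Cost of the chosen edges plus penalties of vertices cut off from root.

    cost:    dict mapping an edge (u, v) to its cost
    penalty: dict mapping a vertex to its penalty
    chosen:  list of edges (keys of `cost`) forming the candidate set F
    """
    adjacency = defaultdict(list)
    for u, v in chosen:
        adjacency[u].append(v)
        adjacency[v].append(u)

    # Breadth-first search from the root over the chosen edges only.
    reached = {root}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for w in adjacency[u]:
            if w not in reached:
                reached.add(w)
                queue.append(w)

    edge_cost = sum(cost[e] for e in chosen)
    penalty_cost = sum(p for v, p in penalty.items() if v not in reached)
    return edge_cost + penalty_cost
```

On a path r-a-b with edge costs 1 and 5 and penalties 3 (for a) and 2 (for b), buying only the edge (r, a) and paying the penalty for b is the cheapest of the three natural choices.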

PCST with submodular penalty function
Given: graph G=(V, E), edge costs c_e ≥ 0, root r ∈ V; penalty is given by a set function p: 2^V → ℝ_{≥0}
  p(A): penalty if set A ⊆ V is disconnected from r
  p is submodular: p(A) + p(B) ≥ p(A ∪ B) + p(A ∩ B), e.g., p(A) = min(|A|, M)
Goal: choose a set of edges F ⊆ E so as to minimize ∑_{e∈F} c_e + p({v not connected to r})
• Generalizes penalty function of PCST
• Introduced by Hayrapetyan-Swamy-Tardos: gave a 2-approximation algorithm by extending the GW primal-dual algorithm
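The submodularity inequality can be checked by brute force on a small ground set. The sketch below does this for the capped-cardinality example from the slide; the helper names are hypothetical, and the check is exponential in the ground-set size, so it is only meant for tiny instances.

```python
from itertools import combinations

def all_subsets(ground):
    """Yield every subset of `ground` as a frozenset."""
    items = list(ground)
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)

def is_submodular(p, ground):
    """Check p(A) + p(B) >= p(A | B) + p(A & B) for all pairs of subsets."""
    subsets = list(all_subsets(ground))
    return all(p(a) + p(b) >= p(a | b) + p(a & b)
               for a in subsets for b in subsets)

def capped_cardinality(cap):
    """The slide's example penalty: p(A) = min(|A|, M) with M = cap."""
    return lambda a: min(len(a), cap)
```

For contrast, p(A) = |A|² fails the check: two disjoint singletons give 1 + 1 on the left and 4 + 0 on the right.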

Prize-collecting Steiner forest (PCSF)
Given: graph G=(V, E), edge costs c_e ≥ 0, source-sink pairs s_i-t_i, penalties p_i ≥ 0 on each s_i-t_i pair
Goal: choose a set of edges F ⊆ E so as to minimize ∑_{e∈F} c_e + ∑_{i: s_i not connected to t_i in F} p_i
• Generalizes connectivity function of PCST
• Introduced by Jain-Hajiaghayi: gave a 3-approx. primal-dual algorithm
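As a small illustration of the PCSF objective, the sketch below evaluates it with a union-find structure over the chosen edges. The function name `pcsf_cost` and the input layout (a list of pairs with a parallel list of penalties) are assumptions made for this example.

```python
def pcsf_cost(cost, pairs, penalty, chosen):
    """Cost of the chosen edges plus penalties of pairs left disconnected.

    cost:    dict mapping an edge (u, v) to its cost
    pairs:   list of (s_i, t_i) source-sink pairs
    penalty: list of penalties, penalty[i] belongs to pairs[i]
    chosen:  list of edges (keys of `cost`) forming the candidate set F
    """
    parent = {}

    def find(v):
        # Union-find lookup with path halving; unseen vertices are singletons.
        parent.setdefault(v, v)
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for u, v in chosen:
        parent[find(u)] = find(v)

    edge_cost = sum(cost[e] for e in chosen)
    penalty_cost = sum(penalty[i] for i, (s, t) in enumerate(pairs)
                       if find(s) != find(t))
    return edge_cost + penalty_cost
```

On the path a-b-c-d with costs 2, 2, 10 and pairs (a, c) with penalty 5 and (b, d) with penalty 3, connecting a to c and paying the penalty for (b, d) costs 7, which beats both buying nothing (8) and buying everything (14).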

General framework for Prize-Collecting Forest Problems
• Prize-collecting Steiner tree
• PCST with submodular penalty function
• Prize-collecting Steiner forest
Prize-Collecting Forest (PCF)
– connectivity function: arbitrary 0-1 function
– penalty function: submodular function on collections of sets of vertices

Prize-Collecting Forest (PCF)
Given: graph G=(V, E) (|V|=n), edge costs c_e ≥ 0,
• connectivity function f: 2^V → {0, 1}
  f(S)=1 ⇒ need an edge from the border of S, δ(S) := {(u, v) ∈ E: exactly one of u, v is in S}
• penalty function p: 2^(2^V) → ℝ_{≥0}
  p(𝒮): penalty if collection 𝒮 of subsets is violated
Goal: choose a set of edges F ⊆ E so as to minimize ∑_{e∈F} c_e + p({S ⊆ V: f(S)=1, F ∩ δ(S) = ∅})
Example: Prize-collecting Steiner forest
  f(S) = 1 iff there exists some i s.t. exactly one of s_i, t_i is in S
  p(𝒮) = ∑_{i: ∃ S ∈ 𝒮 that separates s_i-t_i} p_i
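The general PCF objective can be evaluated by brute force on tiny instances: collect every set that demands connectivity but has no chosen edge on its border, then ask the penalty oracle for the cost of that collection. The sketch below does this and instantiates f and p for the Steiner forest example. All names are hypothetical, and only nonempty proper subsets are enumerated, which assumes f is 0 on the empty set and on V.

```python
from itertools import combinations

def proper_subsets(vertices):
    """Yield all nonempty proper subsets of the vertex set as frozensets."""
    items = sorted(vertices)
    for size in range(1, len(items)):
        for combo in combinations(items, size):
            yield frozenset(combo)

def pcf_cost(vertices, cost, f, p, chosen):
    """Edge cost of `chosen` plus p(collection of violated sets)."""
    def border_is_hit(S):
        # True if some chosen edge has exactly one endpoint in S.
        return any((u in S) != (v in S) for u, v in chosen)

    violated = frozenset(S for S in proper_subsets(vertices)
                         if f(S) == 1 and not border_is_hit(S))
    return sum(cost[e] for e in chosen) + p(violated)

def pcsf_functions(pairs, penalty):
    """Connectivity and penalty functions of the Steiner forest example."""
    def separates(S, pair):
        s, t = pair
        return (s in S) != (t in S)

    def f(S):
        return 1 if any(separates(S, pair) for pair in pairs) else 0

    def p(collection):
        # A pair pays its penalty once if any set in the collection separates it.
        return sum(pen for pair, pen in zip(pairs, penalty)
                   if any(separates(S, pair) for S in collection))

    return f, p
```

With these functions the PCF objective agrees with the direct Steiner forest objective: on the path a-b-c-d with costs 2, 2, 10, pairs (a, c) and (b, d), and penalties 5 and 3, choosing the first two edges gives a total of 7.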

PCF: properties of p(.)
For any 0-1 connectivity function f, can define penalty function p_f(𝒮) = M (a very large number) if ∃ S ∈ 𝒮 with f(S)=1; and 0 otherwise.
Solving PCF with (f, p_f) ⇒ solving the network design problem with connectivity function f
⇒ need certain restrictions on p(.)
• p(∅) = 0
• Monotonicity: if 𝒮 ⊆ 𝒯 then p(𝒮) ≤ p(𝒯)
• Submodularity: p(𝒮) + p(𝒯) ≥ p(𝒮 ∪ 𝒯) + p(𝒮 ∩ 𝒯)
• Complement property: for A ⊆ V, p({A, A^c}) = p({A})
• Union property: for A, B ⊆ V, p({A, B, A ∪ B}) = p({A, B})
• Inactivity property: if f(A)=0, then p({A})=0
If f(∅)=0, then f is 0-1 proper iff p_f satisfies the above properties.
p(.) will be given as an oracle (ground set has 2^|V| elements)

Our Results
• Give a primal-dual 3-approximation algorithm
  – Requires novel ideas in implementation and analysis to overcome difficulties caused by the exponential size of the ground set of p(.)
• Give an LP-rounding 2.54-approximation algorithm
  – Solving the LP relaxation poses a significant challenge
  – LP has 2^n constraints and 2^(2^n) variables: not clear if even a basic solution has a polynomial description
  – Reformulate LP as a convex program, solve via ellipsoid method; evaluating the objective function and computing a subgradient both require solving an LP of size 2^n × 2^(2^n)
  – Overcome difficulty by proving certain structural …

An Integer Program
x_e: indicates if edge e is picked
z_𝒮: indicates if penalty is incurred for collection 𝒮 ⊆ 2^V
Minimize ∑_e c_e x_e + ∑_𝒮 p(𝒮) z_𝒮
subject to ∑_{e∈δ(S)} x_e + ∑_{𝒮: S∈𝒮} z_𝒮 ≥ f(S)  for each S ⊆ V
           x_e, z_𝒮 ∈ {0, 1}  for each e, 𝒮

A Linear Program
x_e: indicates if edge e is picked
z_𝒮: indicates if penalty is incurred for collection 𝒮 ⊆ 2^V
Minimize ∑_e c_e x_e + ∑_𝒮 p(𝒮) z_𝒮   (PCF-LP)
subject to ∑_{e∈δ(S)} x_e + ∑_{𝒮: S∈𝒮} z_𝒮 ≥ f(S)  for each S ⊆ V
           x_e, z_𝒮 ≥ 0  for each e, 𝒮   (relaxing x_e, z_𝒮 ∈ {0, 1})
• LP has 2^(2^n) variables and 2^n constraints
• Not clear if even a basic solution has a polynomial-size description: what does "solving the LP" mean?

A Compact Formulation
x_e: indicates if edge e is picked
z_𝒮: indicates if penalty is incurred for collection 𝒮 ⊆ 2^V
Minimize h(x) := ∑_e c_e x_e + g(x)  s.t. 0 ≤ x_e ≤ 1 for each e   (PCF-CP)
where g(x) := min ∑_𝒮 p(𝒮) z_𝒮   (Pen-P)
              s.t. ∑_{𝒮: S∈𝒮} z_𝒮 ≥ f(S) – ∑_{e∈δ(S)} x_e  for each S ⊆ V
                   z_𝒮 ≥ 0  for each 𝒮
g(x) is convex, so (PCF-CP) is a convex program. Equivalent to the earlier LP.

The Overall Strategy
1. Get an optimal (or (1+ε)-optimal) solution x to the convex program using the ellipsoid method.
2. Round fractional solution x to an integer solution
   – need that f is a 0-1 proper function, or is weakly-supermodular
   – use a 2-approx. algorithm for the network-design problem without penalties (Goemans-Williamson or Jain).
Obtain a 2.54-approximation algorithm for the prize-collecting forest problem.

The Ellipsoid Method
Min h(x) subject to x ∈ P.
• Start with ball containing polytope P.
• y_i = center of current ellipsoid.
• If y_i is infeasible, use violated inequality to chop off infeasible half-ellipsoid.
  New ellipsoid = min. volume ellipsoid containing "unchopped" half-ellipsoid.
• If y_i ∈ P, how to make progress?

The Ellipsoid Method (contd.)
Min h(x) subject to x ∈ P. Making progress when the center y_i ∈ P:
• Add inequality h(x) ≤ h(y_i)? Separation becomes difficult.
• Let d = subgradient at y_i. Use subgradient cut d·(x – y_i) ≤ 0. Generate new min. volume ellipsoid.
  d ∈ ℝ^m is a subgradient of h(.) at u if for every v, h(v) – h(u) ≥ d·(v – u).
• x_1, x_2, …, x_k: points in P. Can show min_{i=1…k} h(x_i) ≤ OPT + ρ.
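The loop described on these slides can be sketched in a few lines. The code below is a minimal illustration, not the talk's implementation: it uses the textbook central-cut ellipsoid update, assumes the feasible region P is a box given by `lower` and `upper`, assumes dimension at least 2, and takes the objective `h` and its subgradient oracle `subgrad` as user-supplied callables. The example objective at the bottom is hypothetical.

```python
import math

def ellipsoid_minimize(h, subgrad, lower, upper, iterations=150):
    """Minimize convex h over the box [lower, upper] with central cuts."""
    n = len(lower)
    center = [(lo + hi) / 2 for lo, hi in zip(lower, upper)]
    radius_sq = sum(((hi - lo) / 2) ** 2 for lo, hi in zip(lower, upper))
    # Shape matrix of the initial ball, which contains the whole box.
    shape = [[radius_sq if i == j else 0.0 for j in range(n)] for i in range(n)]
    best_point, best_value = None, math.inf

    for _ in range(iterations):
        cut = None
        for i in range(n):  # look for a violated box inequality
            if center[i] < lower[i]:
                cut = [-1.0 if j == i else 0.0 for j in range(n)]
            elif center[i] > upper[i]:
                cut = [1.0 if j == i else 0.0 for j in range(n)]
            if cut is not None:
                break
        if cut is None:  # feasible center: record it, use a subgradient cut
            value = h(center)
            if value < best_value:
                best_point, best_value = list(center), value
            cut = subgrad(center)

        # Keep the half {x : cut . (x - center) <= 0} of the ellipsoid.
        pg = [sum(shape[i][j] * cut[j] for j in range(n)) for i in range(n)]
        quad = sum(cut[i] * pg[i] for i in range(n))
        if quad <= 1e-20:  # zero subgradient or numerically flat ellipsoid
            break
        step = [v / math.sqrt(quad) for v in pg]
        center = [center[i] - step[i] / (n + 1) for i in range(n)]
        factor = n * n / (n * n - 1.0)
        shape = [[factor * (shape[i][j] - 2.0 / (n + 1) * step[i] * step[j])
                  for j in range(n)] for i in range(n)]
    return best_point, best_value

def example_h(x):
    """Hypothetical piecewise-linear convex objective with minimum 0."""
    return abs(x[0] - 0.3) + 2 * abs(x[1] - 0.6)

def example_subgrad(x):
    sign = lambda t: (t > 0) - (t < 0)
    return [float(sign(x[0] - 0.3)), 2.0 * sign(x[1] - 0.6)]
```

The returned point is the best feasible center seen, matching the guarantee on the slide that the minimum of h over the feasible centers approaches OPT.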

Computing a subgradient
h(x) := ∑_e c_e x_e + g(x), where
  g(x) := min ∑_𝒮 p(𝒮) z_𝒮
          s.t. ∑_{𝒮: S∈𝒮} z_𝒮 ≥ f(S) – ∑_{e∈δ(S)} x_e  ∀ S ⊆ V
               z_𝒮 ≥ 0  ∀ 𝒮
        = max ∑_S (f(S) – ∑_{e∈δ(S)} x_e) y_S
          s.t. ∑_{S∈𝒮} y_S ≤ p(𝒮)  ∀ 𝒮
               y_S ≥ 0  ∀ S
Consider a point u ∈ ℝ^m. Let y be an optimal dual solution to g(u). So
  h(u) = ∑_e c_e u_e + ∑_S (f(S) – ∑_{e∈δ(S)} u_e) y_S = ∑_e d_e u_e + ∑_S f(S) y_S,
  where d_e = c_e – ∑_{S: e∈δ(S)} y_S.
At any point v ∈ ℝ^m, y is a feasible solution to the dual of g(v), so
  h(v) ≥ ∑_e c_e v_e + ∑_S (f(S) – ∑_{e∈δ(S)} v_e) y_S = ∑_e d_e v_e + ∑_S f(S) y_S.
Lemma: For any point v ∈ ℝ^m, we have h(v) – h(u) ≥ d·(v – u) ⇒ d is a subgradient of h(.) at point u.
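The lemma follows by subtracting the expression for h(u) from the bound on h(v). Written out as a worked step, using only the symbols defined on this slide and writing \delta(S) for the border of S:

```latex
\begin{align*}
\sum_S \Bigl(\sum_{e \in \delta(S)} u_e\Bigr) y_S
  &= \sum_e u_e \sum_{S :\, e \in \delta(S)} y_S
  && \text{(swap the order of summation)} \\
h(u) &= \sum_e d_e u_e + \sum_S f(S)\, y_S
  && \text{($y$ is optimal for the dual of $g(u)$)} \\
h(v) &\ge \sum_e d_e v_e + \sum_S f(S)\, y_S
  && \text{($y$ is feasible for the dual of $g(v)$)} \\
h(v) - h(u) &\ge \sum_e d_e (v_e - u_e) = d \cdot (v - u).
\end{align*}
```

The term \sum_S f(S) y_S cancels because the same y is used at both points; feasibility of y does not depend on x, which is why one dual solution serves for every v.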

Solving the dual
Notation: x(δ(S)) = ∑_{e∈δ(S)} x_e, p_𝒮(A) = p(𝒮 ∪ {A}) – p(𝒮)
g(x) = max ∑_S [f(S) – x(δ(S))] y_S   (Pen-D)
       s.t. ∑_{S∈𝒮} y_S ≤ p(𝒮)  for all 𝒮 ⊆ 2^V
            y_S ≥ 0  for all S
Bad: Dual has 2^n variables and 2^(2^n) constraints
Good: It is a polymatroid: p(.) is a monotone submodular function ⇒ Edmonds' greedy algorithm yields an optimal solution
  – Sort the sets S in decreasing order of [f(S) – x(δ(S))]
  – For the i-th set S_i, if [f(S_i) – x(δ(S_i))] > 0, set y_{S_i} = p_{{S_1, …, S_{i-1}}}(S_i)
Bad: Reduces complexity to 2^n, but still not polytime
Good: Show that ∃ optimal solution where the sets S with …
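The greedy step can be illustrated on a generic polymatroid. The sketch below maximizes a linear objective subject to y(A) ≤ rho(A) for all A and y ≥ 0, where `rho` is assumed to be a monotone submodular function with rho(∅) = 0. In the talk's setting the elements would be the sets S, the weights f(S) – x(δ(S)), and `rho` the penalty oracle p. The function name is hypothetical.

```python
def polymatroid_greedy(weights, rho):
    """Greedy optimum of max sum_i weights[i] * y[i] over a polymatroid.

    weights: dict mapping each element to its objective coefficient
    rho:     function on frozensets of elements, assumed monotone
             submodular with rho(frozenset()) == 0
    """
    order = sorted(weights, key=weights.get, reverse=True)
    y = {elem: 0 for elem in weights}
    prefix = frozenset()
    for elem in order:
        if weights[elem] <= 0:
            break  # non-positive coefficients keep y = 0
        extended = prefix | {elem}
        # Marginal value of elem given the elements already processed.
        y[elem] = rho(extended) - rho(prefix)
        prefix = extended
    return y
```

With weights a=3, b=2, c=1, d=-1 and rho(A) = min(|A|, 2), the two heaviest elements each receive 1 and the rest receive 0, because the cap is exhausted after two elements.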

Useful properties of p(.)
• If A, B ∈ 𝒮, then p_𝒮(T) = p_𝒮(T^c) = 0 for all sets T in {A ∪ B, A ∩ B, A \ B, B \ A, A^c, B^c}
  – due to complementarity and union properties
• If p({A}) = 0, then for any B ⊆ V, p_{𝒮∪{A}}({B}) = p_𝒮({B})
  – due to submodularity
  ⇒ ordering of sets A with f(A)=0 is irrelevant
• If p_{𝒮∪{A}}({B}) = p_{𝒮∪{B}}({A}) = 0, then for any set T ⊆ V, p_{𝒮∪{A}}({T}) = p_{𝒮∪{B}}({T})
  – by submodularity

Solving the dual (contd.)
Structural lemma yields the following algorithm:
• Initialize y_S = 0 for all sets S, laminar family L ← ∅.
• While ∃ set S that does not cross any set of L
  – find T = argmin {x(δ(S)): S does not cross L}
  – if x(δ(T)) ≥ 1, return; else set y_T = p_L({T}), L ← L ∪ {T}
Theorem: y is an optimal solution to (Pen-D).
Let L' = {T ∈ L: y_T > 0} = {T_1, …, T_k}, 𝒯_i = maximal superset of {T_1, …, T_i} s.t. p(𝒯_i) = p({T_1, …, T_i})
Theorem: Setting z_{𝒯_i} = x(δ(T_{i+1})) – x(δ(T_i)) for i = 1, …, k (with x(δ(T_{k+1})) := 1), and z_𝒮 = 0 for all other 𝒮, yields an optimal solution to (Pen-P).
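The laminar-family loop can be traced on a toy instance with the brute-force sketch below. It enumerates all nonempty proper subsets, so it is exponential and only illustrates the control flow; the slide's algorithm needs a polynomial-time way to find the minimizing non-crossing set, which is not shown here. Two assumptions are made explicit: sets already in the family are skipped, and two sets are said to cross when they intersect and neither contains the other. The Steiner forest penalty helper is likewise an illustrative choice.

```python
from itertools import combinations

def crosses(a, b):
    """True if a and b overlap and neither contains the other."""
    return bool(a & b) and bool(a - b) and bool(b - a)

def cut_value(x, S):
    """Total x-value of edges with exactly one endpoint in S."""
    return sum(val for (u, v), val in x.items() if (u in S) != (v in S))

def pcsf_penalty(pairs, penalty):
    """Penalty oracle of the Steiner forest example, on collections of sets."""
    def p(collection):
        return sum(pen for (s, t), pen in zip(pairs, penalty)
                   if any((s in S) != (t in S) for S in collection))
    return p

def laminar_dual(vertices, x, p):
    """Build the laminar family and the dual values y (brute force)."""
    items = sorted(vertices)
    candidates = [frozenset(c) for size in range(1, len(items))
                  for c in combinations(items, size)]
    family, y = [], {}
    while True:
        free = [S for S in candidates
                if S not in family and not any(crosses(S, T) for T in family)]
        if not free:
            break
        T = min(free, key=lambda S: cut_value(x, S))
        if cut_value(x, T) >= 1:
            break
        current = frozenset(family)
        y[T] = p(current | {T}) - p(current)  # marginal penalty of adding T
        family.append(T)
    return family, y
```

On the path a-b-c with x = 0.4 on (a, b), x = 1.0 on (b, c), and a single pair (a, c) with penalty 5, the loop picks {a} with dual value 5, then {b, c} with dual value 0, and stops because every remaining non-crossing set has cut value at least 1. The dual objective is (1 - 0.4) * 5 = 3.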

Rounding procedure
Given: fractional solution x, sets T_1, …, T_k
– gives succinct description of collections 𝒯_1, …, 𝒯_k, and hence optimal solution z to (Pen-P)
Let α ∈ [0, 1] be a parameter.
– Define 0-1 connectivity function k(S) = 1 if f(S) = 1 and ∑_{𝒮: S∈𝒮} z_𝒮 < α; 0 otherwise.
– Solve network design problem with connectivity function k.
If f is proper or weakly-supermodular, then so is k, therefore cost of edges picked is bounded …
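The thresholding step that defines the new connectivity function can be sketched as follows. The representation of z as a dictionary from collections (frozensets of frozensets) to values, the name `thresholded_connectivity`, and the parameter name `alpha` for the threshold are assumptions made for this illustration.

```python
def thresholded_connectivity(f, z, alpha):
    """Return k with k(S) = 1 iff f(S) = 1 and the z-coverage of S is below alpha.

    f:     original 0-1 connectivity function on frozensets of vertices
    z:     dict mapping a collection (frozenset of frozensets) to its z-value
    alpha: threshold parameter in [0, 1]
    """
    def k(S):
        # Total z-value of the collections that contain S.
        coverage = sum(val for collection, val in z.items() if S in collection)
        return 1 if f(S) == 1 and coverage < alpha else 0
    return k
```

Sets whose coverage reaches the threshold are dropped from the connectivity requirement, so their demand is handled through the penalty term rather than through edges. Raising `alpha` keeps more sets in the network design problem.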

Open Questions
• Is there a compact description of the LP? Or a more efficient procedure to solve it?
• Obtaining a 2-approximation algorithm: iterative rounding may be the way to go
• Applications to 2-stage stochastic network design: can the second-stage cost be captured by a "nice" penalty function?

Thank You.