Ryan O'Donnell (CMU, IAS) Yi Wu (CMU, IBM) Yuan Zhou (CMU)

Solving linear equations
• Given a set of linear equations over the reals, is there a solution satisfying all the equations?
  – Easy: Gaussian elimination.

Noisy version
• Given a set of linear equations for which some solution satisfies 99% of the equations,
  – can we find a solution that satisfies at least 1% of the equations?
• I.e., is there a 99% vs 1% approximation algorithm for linear equations over the reals?
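
To make the objective concrete, here is a minimal Python sketch of the quantity being approximated: the fraction of equations an assignment satisfies. The helper name `satisfied_fraction` and the toy instance are illustrative assumptions, not taken from the talk.

```python
from fractions import Fraction

def satisfied_fraction(equations, assignment):
    """Fraction of equations sum_v coeffs[v] * assignment[v] == rhs that hold."""
    satisfied = sum(
        1
        for coeffs, rhs in equations
        if sum(a * assignment[v] for v, a in coeffs.items()) == rhs
    )
    return Fraction(satisfied, len(equations))

# Hypothetical toy instance; the last equation plays the role of noise.
equations = [
    ({"x": 1, "y": 1, "z": -1}, 3),  # x + y - z = 3
    ({"x": 1, "y": -1}, 1),          # x - y = 1
    ({"y": 1, "z": -1}, 0),          # y - z = 0
    ({"x": 1, "z": -1}, 5),          # x - z = 5 (inconsistent with the two above)
]
assignment = {"x": 3, "y": 2, "z": 2}
print(satisfied_fraction(equations, assignment))  # prints 3/4
```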

Hardness of Max-3Lin(q)
• Theorem. [Håstad '01] Given a set of linear equations modulo q, it is NP-hard to distinguish between
  – there is a solution satisfying (1 - ε)-fraction of the equations
  – no solution satisfies more than (1/q + ε)-fraction of the equations
• Equations are sparse, and are of the form x_i + x_j - x_k = c (mod q)
• (1 - ε) vs (1/q + ε) approx. for Max-3Lin(q) is NP-hard
• A 3-query PCP of completeness (1 - ε), soundness (1/q + ε)

Sparser equations: Max-2Lin(q)
• Theorem. [KKMO '07] Assuming the Unique Games Conjecture, for any ε, δ > 0, there exists q > 0 such that (1 - ε) vs δ approx. for Max-2Lin(q) is NP-hard

                       Max-3Lin                        Max-2Lin
over [q]               (1 - ε) vs (1/q + ε)            (1 - ε) vs δ
                       NP-hardness [Håstad '01]        UG-hardness [KKMO '07]
over integers/reals    ?                               ?

Equations over integers: Max-3Lin(Z)
• Approximate Max-3Lin/Max-2Lin over large domains?
• Intuitively, it should be harder, because when the domain size increases,
  – soundness becomes smaller in both [Håstad '01] and [KKMO '07]
• Obstacle to getting hardness
  – "Long code" becomes too long (even infinitely long)

Hardness of Max-3Lin(Z)
• Theorem. [Guruswami-Raghavendra '07] For all ε, δ > 0, it is NP-hard to (1 - ε) vs δ approximate Max-3Lin(Z)
  – 3-query PCP over integers
  – Implies the hardness for Max-3Lin(R)
• Proof follows [Håstad '01], but is much more involved
  – derandomized Long Code testing
  – Fourier analysis with respect to an exponential distribution on Z+

                       Max-3Lin                        Max-2Lin
over [q]               (1 - ε) vs (1/q + ε)            (1 - ε) vs δ
                       NP-hardness [Håstad '01]        UG-hardness [KKMO '07]
over integers/reals    (1 - ε) vs δ                    ?
                       NP-hardness [GR '07]

Unique Games over Integers?
• Can we use the techniques in [Guruswami-Raghavendra '07] to prove a (1 - ε) vs δ UG-hardness for Max-2Lin(Z)?
  – Seems difficult
  – Open question from Raghavendra's thesis [Raghavendra '09]

Our results
• Relatively easy to modify the KKMO proof to get
  – Theorem. For all ε, δ > 0, it is UG-hard to (1 - ε) vs δ approximate Max-2Lin(Z)
• Also applies to Max-2Lin over reals and large domains
  – Simpler proof (and better parameters) of Max-3Lin(Z) hardness

Dictatorship Test
• Theorem. For all ε, δ > 0, it is UG-hard to (1 - ε) vs δ approximate Max-2Lin(Z)
• By [KKMO '07], we only need to design a (1 - ε) vs δ 2-query dictatorship test over integers.

Dictatorship Test (cont'd)
• f: [q]^d -> Z is called a dictator if f(x_1, x_2, ..., x_d) = x_i (for some i)
• Dictatorship test over [q]: a distribution over equations f(x) - f(y) = c (mod q)
  – Completeness: for dictators, Pr[equation holds] ≥ 1 - ε
  – Soundness: for functions far from dictators, Pr[equation holds] < δ
• Such a test gives (1 - ε) vs δ hardness of Max-2Lin(q)

Dictatorship Test over Integers
• A distribution over equations f(x) - f(y) = c
  – Completeness: for dictators, Pr[f(x) - f(y) = c] ≥ 1 - ε
  – Soundness: for functions far from dictators, Pr[f(x) - f(y) = c mod q] < δ
• It is UG-hard to distinguish between
  – a Max-2Lin(Z) instance is (1 - ε)-satisfiable
  – the instance is not δ-satisfiable even when the equations are taken modulo q

Recap of KKMO Dictatorship Test

Back to KKMO Dictatorship Test
• Dictatorship test over [q]: a distribution over equations f(x) - f(y) = c (mod q)
  – Completeness: for dictators, Pr[equation holds] ≥ 1 - ε
  – Soundness: for functions far from dictators, Pr[equation holds] < δ
• KKMO Test
  – Pick x ∈ [q]^d at random
  – Get y by rerandomizing each coordinate of x w.p. ε
  – Test f(x) - f(y) = 0 (mod q)
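
The three steps of the KKMO test can be simulated directly. The sketch below is a Monte Carlo estimate, not part of the reduction; the function name `kkmo_pass_rate`, the parameter values, and the choice of "sum of all coordinates mod q" as an example of a function far from every dictator are assumptions made for illustration.

```python
import random

def kkmo_pass_rate(f, q, d, eps, trials, rng):
    """Estimate Pr[f(x) - f(y) = 0 (mod q)] under the KKMO test distribution."""
    passed = 0
    for _ in range(trials):
        # Pick x uniformly from [q]^d.
        x = [rng.randrange(q) for _ in range(d)]
        # Rerandomize each coordinate independently with probability eps.
        y = [rng.randrange(q) if rng.random() < eps else xi for xi in x]
        if (f(x) - f(y)) % q == 0:
            passed += 1
    return passed / trials

rng = random.Random(0)
q, d, eps, trials = 101, 50, 0.1, 5000

dictator = lambda x: x[2]        # f(x) = x_3
sum_mod = lambda x: sum(x) % q   # depends equally on all coordinates

dictator_rate = kkmo_pass_rate(dictator, q, d, eps, trials, rng)
sum_rate = kkmo_pass_rate(sum_mod, q, d, eps, trials, rng)
print(dictator_rate, sum_rate)
```

A dictator passes whenever its coordinate is left untouched, so its rate stays near 1 - ε, while the sum function passes only when essentially no coordinate is rerandomized.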

Back to KKMO Dictatorship Test (cont'd)
• KKMO Test
  – Pick x ∈ [q]^d at random
  – Get y by rerandomizing each coordinate of x w.p. ε
  – Test f(x) - f(y) = 0 (mod q)
• Soundness analysis: "Majority Is Stablest" Theorem [MOO '05]
  – If f is far from dictators and "β-balanced", then Pr[f passes the test] < β^{ε/2}
  – f is β-balanced: Pr[f(x) = a mod q] ≤ β for all 0 ≤ a < q

Back to KKMO Dictatorship Test (cont'd)
• KKMO Test
  – Pick x ∈ [q]^d at random
  – Get y by rerandomizing each coordinate of x w.p. ε
  – Test f(x) - f(y) = 0 (mod q)
• Soundness analysis
  – "Folding" trick: to make sure f is β-balanced
  – Idea: when f(x) = f(x_1, x_2, ..., x_d) is queried, return g(x) = f(0, (x_2 - x_1) mod q, ..., (x_d - x_1) mod q) + x_1
  – Dictators are not affected in the completeness analysis
  – g(x) is 1/q-balanced
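
The folding trick can be checked exhaustively for small parameters. The sketch below is illustrative only: `fold` and `max_value_probability` are hypothetical helper names, values are reduced mod q because balance is measured mod q, and q = 5 with 3 coordinates is an arbitrary small choice.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def fold(f, q):
    """Return g(x) = f(0, x_2 - x_1, ..., x_d - x_1) + x_1, with arithmetic mod q."""
    def g(x):
        shifted = tuple((xi - x[0]) % q for xi in x)  # first coordinate becomes 0
        return (f(shifted) + x[0]) % q
    return g

def max_value_probability(g, q, d):
    """Largest Pr_x[g(x) = a] over all values a, for x uniform in [q]^d."""
    counts = Counter(g(x) for x in product(range(q), repeat=d))
    return Fraction(max(counts.values()), q**d)

q, d = 5, 3
constant = lambda x: 0    # passes the unfolded test with probability 1
dictator = lambda x: x[2]

unfolded_balance = max_value_probability(constant, q, d)
folded_balance = max_value_probability(fold(constant, q), q, d)
dictator_unchanged = all(
    fold(dictator, q)(x) == dictator(x) for x in product(range(q), repeat=d)
)
print(unfolded_balance, folded_balance, dictator_unchanged)  # prints 1 1/5 True
```

The constant function is the reason balance is needed: unfolded, it passes the test always, while its folded version takes every value with probability exactly 1/q.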

Dictatorship Test for Max-2Lin(Z)
• A distribution over equations f(x) - f(y) = c
  – Completeness: for dictators, Pr[f(x) - f(y) = c] ≥ 1 - ε
  – Soundness: for functions far from dictators, Pr[f(x) - f(y) = c mod q] < δ
• If we use the KKMO test...
  – Soundness: the same
  – Completeness does not hold, because
    • when f(x) is queried, we get g(x) = (x_i - x_1) mod q + x_1
    • when f(y) is queried, we get g(y) = (y_i - y_1) mod q + y_1
• Max-2Lin(q): Pr[g(x) - g(y) = 0 mod q] ≥ 1 - ε
• Max-2Lin(Z): Pr[g(x) - g(y) ≠ 0] ≥ Pr["wrap-around" (exactly one of g(x), g(y) ≥ q)] ≈ 1/2

Our method, Step I: introducing the new "active folding"

The new "active folding"
• KKMO Test with active folding mod q
  – Pick x ∈ [q]^d at random
  – Get y by rerandomizing each coordinate of x w.p. ε
  – Pick c, c' ∈ [q] at random, test f(x_1 - c, ..., x_d - c) + c = f(y_1 - c', ..., y_d - c') + c' (mod q)
• Completeness: unchanged for dictators, since f(x_1 - c, ..., x_d - c) + c = x_i (mod q)
• Soundness:
  – Claim. g(x) = f(x_1 - c, ..., x_d - c) + c is 1/q-balanced
  – Proof.
      Pr_{x,c}[f(x_1 - c, ..., x_d - c) + c = a mod q]
        = E_c[Pr_x[f(x_1 - c, ..., x_d - c) = a - c mod q]]
        = E_c[Pr_x[f(x) = a - c mod q]]
        = E_x[Pr_c[f(x) = a - c mod q]]
        ≤ 1/q
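
The balance claim can be verified by brute force for small parameters. This is a sanity check of the displayed calculation rather than a proof; the helper name `active_fold_distribution`, the test function `lopsided`, and the small values q = 5 with 3 coordinates are assumptions made for illustration.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def active_fold_distribution(f, q, d):
    """Exact distribution of (f(x_1 - c, ..., x_d - c) + c) mod q
    for x uniform in [q]^d and c uniform in [q]."""
    counts = Counter()
    for x in product(range(q), repeat=d):
        for c in range(q):
            shifted = tuple((xi - c) % q for xi in x)  # coordinates live in Z_q
            counts[(f(shifted) + c) % q] += 1
    total = q**d * q
    return {a: Fraction(n, total) for a, n in counts.items()}

q, d = 5, 3
constant = lambda x: 0
lopsided = lambda x: (x[0] * x[1] + x[2] ** 2) % q  # an arbitrary function

for f in (constant, lopsided):
    dist = active_fold_distribution(f, q, d)
    print(sorted(dist.items()))
```

For both functions every value a in [q] appears with probability exactly 1/q, matching the final step of the proof: for a fixed x, exactly one c satisfies f(x) = a - c mod q.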

Our method, Step II: "partial active folding"

"Partial active folding" • KKMO Test with partial active folding for Max-2 Lin(Z) • Pick x ∈ [q]d by random • Get y by rerandomizing each coordinate of x w. p. ε • Pick c, c' ∈ [q 0. 5] by random, test f(x 1 - c, . . . , xn - c) + c = f(y 1 - c', . . . , yn - c') + c' • Completeness: – f(x 1 - c, . . . , xn - c) + c = (xi - c) mod q + c = (xi - c) + c = xi w. p. 1 - 1/q 0. 5 – f(y 1 - c', . . . , yn - c') + c' = yi w. p. 1 - 1/q 0. 5 Pr[f(x 1 -c, . . . , xn-c)+c = f(y 1 -c', . . . , yn-c')+c'] ≥ 1 - ε - 2/q 0. 5

"Partial active folding" (cont'd) • KKMO Test with partial active folding for Max-2 Lin(Z) • Pick x ∈ [q]d by random • Get y by rerandomizing each coordinate of x w. p. ε • Pick c, c' ∈ [q 0. 5] by random, test f(x 1 - c, . . . , xn - c) + c = f(y 1 - c', . . . , yn - c') + c' • Completeness: • Soundness: – Claim. g(x) = f(x 1 - c, . . . , xn - c) + c is 1/q 0. 5 -balanced – Proof. Prx, c[f(x 1 - c, . . . , xn - c) + c = a mod q] = Ec [Prx[f(x 1 - c, . . . , xn - c) = a - c mod q] ] = Ec [Prx[f(x) = a - c mod q] ] = Ex [Prc[f(x) = a - c mod q] ] ≤ 1/q 0. 5

"Partial active folding" (cont'd) • KKMO Test with partial active folding for Max-2 Lin(Z) • Pick x ∈ [q]d by random • Get y by rerandomizing each coordinate of x w. p. ε • Pick c, c' ∈ [q 0. 5] by random, test f(x 1 - c, . . . , xn - c) + c = f(y 1 - c', . . . , yn - c') + c' • Completeness: • Soundness: – Claim. g(x) = f(x 1 - c, . . . , xn - c) + c is 1/q 0. 5 -balanced – By Majority Is Stablest Theorem, when f is far from dictators Pr[f(x 1 -c, . . . , xn-c)+c = f(y 1 -c', . . . , yn-c')+c' mod q] < 1/qε/4

Application to Max-3Lin(Z)
• Key idea in Max-2Lin(Z): "partial folding" to deal with the "wrap-around" event

Håstad's reduction for Max-3Lin(q)
• Håstad's Matching Dictatorship Test for f: [q]^L -> Z, g: [q]^R -> Z, π: [R] -> [L]
  – Pick x ∈ [q]^L, y ∈ [q]^R at random
  – Let z ∈ [q]^R s.t. z_i = (y_i + x_{π(i)}) mod q
  – Rerandomize each coordinate of x, y, z w.p. ε
  – Test f(0, x_2 - x_1, ..., x_L - x_1) + x_1 + g(y) = g(z) mod q
• Completeness: if g is the i-th dictator and f is the π(i)-th dictator, Pr[f, g pass the test] ≥ 1 - 3ε
• Soundness: if f and g are far from being "matching dictators", Pr[f, g pass the test] < 1/q + δ
• Gives (1 - 3ε) vs (1/q + δ) NP-hardness of Max-3Lin(q)

Our reduction for Max-3Lin(Z)
• Matching Dictatorship Test with partial active folding for f: [q^2]^L -> Z, g: [q^3]^R -> Z, π: [R] -> [L]
  – Pick x ∈ [q^2]^L, y ∈ [q^3]^R at random
  – Let z ∈ [q^3]^R s.t. z_i = (y_i + x_{π(i)}) mod q^3
  – Rerandomize each coordinate of x, y, z w.p. ε
  – Pick c ∈ [q] at random
  – Test f(x_1 - c, ..., x_L - c) + c + g(y) = g(z)
• Completeness: if g is the i-th dictator and f is the π(i)-th dictator, Pr[f(x_1 - c, ..., x_L - c) + c + g(y) = g(z)] ≥ 1 - 3ε - 2/q
• Soundness: if f and g are far from being "matching dictators", Pr[f(x_1 - c, ..., x_L - c) + c + g(y) = g(z) mod q] < 1/q + δ
• Gives (1 - 3ε - 2/q) vs (1/q + δ) NP-hardness of Max-3Lin(Z)
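
The completeness claim for matching dictators can be simulated in the same way. The sketch below rests on assumptions that should be read as such: z is formed modulo q^3 so that it stays in [q^3]^R, the shifted query to f is reduced modulo q^2 (the domain of f), and the function name `max3lin_z_pass_rate`, the map `pi`, and all parameter values are hypothetical.

```python
import random

def max3lin_z_pass_rate(q, L, R, pi, i, eps, trials, rng):
    """Estimate Pr[f(x - c) + c + g(y) == g(z)] over Z when g is the i-th
    dictator and f is the pi[i]-th dictator."""
    qx, qy = q**2, q**3
    f = lambda x: x[pi[i]]
    g = lambda y: y[i]

    def rerandomize(v, m):
        return [rng.randrange(m) if rng.random() < eps else vj for vj in v]

    passed = 0
    for _ in range(trials):
        x = [rng.randrange(qx) for _ in range(L)]
        y = [rng.randrange(qy) for _ in range(R)]
        z = [(y[j] + x[pi[j]]) % qy for j in range(R)]  # assumption: mod q^3
        x, y, z = rerandomize(x, qx), rerandomize(y, qy), rerandomize(z, qy)
        c = rng.randrange(q)
        # Partial active folding on f only; equality is tested over the integers.
        if f([(xj - c) % qx for xj in x]) + c + g(y) == g(z):
            passed += 1
    return passed / trials

rng = random.Random(0)
q, L, R, eps = 50, 3, 4, 0.05
pi = [0, 2, 1, 2]  # hypothetical map from [R] to [L], 0-indexed
rate = max3lin_z_pass_rate(q, L, R, pi, i=1, eps=eps, trials=20_000, rng=rng)
bound = 1 - 3 * eps - 2 / q
print(rate, bound)
```

The two 1/q losses correspond to the two wrap-around events: x_{π(i)} < c in the folded query to f, and y_i + x_{π(i)} ≥ q^3 in the definition of z.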

The End. Any questions?