18-859S: Analysis of Boolean Functions

Administrivia
Me: Ryan O’Donnell; email: odonnell@cs.cmu.edu
Office hours: Wean 7121, by appointment
Web site: http://www.cs.cmu.edu/~odonnell/boolean-analysis
Mailing list: Please sign up! Instructions on web page.
Blog: http://boolean-analysis.blogspot.com
Evaluation:
• About 5 problem sets.
• 2 or 2.5 scribe notes, graded (each worth as much as a problem set)

The boolean function

All things to all people

What: Truth Table
To whom: Complexity theorists, Circuit designers
x: 0000 0001 0010 0011 0100 0101 0110 0111 1000 1001 1010 1011 1100 1101 1110 1111
f(x): 0 1 1 1 1 ...

What: Subset of the Discrete Cube
To whom: Geometers of the Cube: Combinatorialists, Coding theorists, Metric space types (with Hamming Distance)

What: “Concept”
To whom: Machine Learning theorists
Objects: email messages, each described by n “features”: “Viagra”, “Cialis”, “Levitra”, “.com.ng”, “Credit”, “Mortgage”, “Lottery”, ALL CAPS
Feature values for the example message: 1 1 1 0 0 1
Label: SPAM / NOT-SPAM
Example message:
Visit our new online pharmacy store and save up to 80%
From: Tami Curran <alexsoft@gmail.com>
To: <cs-251-staff@cs.cmu.edu>
Date: Nov 8 2006 - 12:55 pm
Take that!
Visit our new online pharmacy store and save up to 80%
Only we offer:
- All popular drugs are available (Viagra, Cialis, Levitra and much more)
- World Wide Shipping
- No Doctor Visits
- No Prescriptions
- 100%
CLICK TO FIND OUT ABOUT MORE SPECIAL OFFERS AND VISIT OUR NEW ONLINE PHARMACY STORE

What: Set System
To whom: Combinatorialists, Extremal & Algebraic
U: an n-element “universe”
$X \subseteq U$: a set
$\mathcal{A}$: a collection of subsets
“Set System” or “Hypergraph” or “Simplicial Complex” (if f monotone)

What: Graph Property
To whom: Statistical physicists, Probabilists, Random k-SAT-ers
Input: an actual graph, chosen from a graph with n “potential” edges
Function: a property of graphs; e.g., percolation (left-right crossing)
Also good for: Ising Model; Erdős-Rényi random graph model; Random k-SAT satisfiability (for k-reg. hypergraphs)

What: Voting Scheme / Social Choice
To whom: Econometricists, Political scientists
Input: the votes of n voters, where 0 and 1 are the two candidates
Output: the winner
Examples: majority, electoral college, dictatorship

What: Set of integers
To whom: Number theorists, Additive combinatoricists
• “How dense a set do you need to guarantee an arithmetic progression of length k?”
• “Suppose f indicates the primes; is there a nontrivial solution to $f(x)\, f(x+a)\, f(x+2a) = 1$?”

“Fourier / Harmonic Analysis of Boolean Functions” = A set of techniques for studying structural properties of boolean functions.

What does it mean for f to be…
• “simple”
• “fair”
• “symmetric”
• “spread out / concentrated”
• “pseudo- or quasi-random”
• “low-degree”
?

When is a Boolean function “simple”?

“Juntas”
Definition: $f : \{0,1\}^n \to \{0,1\}$ is called an r-junta if f depends on only some subset of r out of the n coordinates.
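For small n, the junta definition can be checked by brute force: a coordinate is relevant exactly when flipping it changes f on at least one input. The following is a minimal Python sketch of that check; the function names and the example f are illustrative assumptions, not part of the course material.

```python
from itertools import product

def relevant_coordinates(f, n):
    """Return the sorted list of coordinates i on which f actually depends."""
    relevant = []
    for i in range(n):
        for x in product((0, 1), repeat=n):
            y = list(x)
            y[i] ^= 1  # flip bit i
            if f(x) != f(tuple(y)):
                relevant.append(i)
                break
    return relevant

def is_junta(f, n, r):
    """True if f depends on at most r of its n coordinates."""
    return len(relevant_coordinates(f, n)) <= r

# Hypothetical example: f reads 5 bits but depends only on bits 0 and 3.
f = lambda x: x[0] ^ x[3]
print(relevant_coordinates(f, 5))
print(is_junta(f, 5, 2))
```

The check takes time about $n \cdot 2^n$, so it is only meant for toy sizes.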

Fourier Analysis: Temperature
As $t \to \infty$: changes according to the Heat Equation, a differential equation.
Basic solutions: $1, \sin(2\pi x), \cos(2\pi x), \sin(4\pi x), \cos(4\pi x), \sin(6\pi x), \ldots$
Every $f : \mathbb{R}^n \to \mathbb{R}$ expressible as a linear combination of these “frequencies”.

Fourier Analysis of Boolean Functions: Displacement?
As $t \to \infty$: changes via a “Diffusion” differential equation.
Basic solutions: Parity (XOR) functions on the $2^n$ subsets of coordinates.
Every $f : \{0,1\}^n \to \mathbb{R}$ expressible as a linear combination of these “frequencies”.
Fourier expansion of f, Fourier coefficients of f.
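To make the expansion concrete, here is a small Python sketch that computes every Fourier coefficient of a function on $\{0,1\}^n$ by brute force. It assumes the common convention that the parity on a subset S is written in ±1 form, $\chi_S(x) = (-1)^{\sum_{i \in S} x_i}$, and that the coefficient on S is the average of $f(x)\chi_S(x)$ over uniform x; the helper names and the 3-bit majority example are illustrative.

```python
from itertools import product

def chi(S, x):
    """Parity (XOR) function on the coordinates in S, in +1/-1 form."""
    return -1 if sum(x[i] for i in S) % 2 else 1

def fourier_coefficients(f, n):
    """Map each subset S (as a tuple of indices) to the average of f(x) * chi_S(x)."""
    cube = list(product((0, 1), repeat=n))
    coeffs = {}
    for mask in cube:  # mask is the indicator vector of S
        S = tuple(i for i in range(n) if mask[i])
        coeffs[S] = sum(f(x) * chi(S, x) for x in cube) / len(cube)
    return coeffs

def evaluate_expansion(coeffs, x):
    """Evaluate the linear combination of parities at the point x."""
    return sum(c * chi(S, x) for S, c in coeffs.items())

maj3 = lambda x: 1 if sum(x) >= 2 else 0
coeffs = fourier_coefficients(maj3, 3)
for S, c in sorted(coeffs.items(), key=lambda item: (len(item[0]), item[0])):
    print(S, c)
```

Evaluating `evaluate_expansion(coeffs, x)` at any x returns `maj3(x)`, which is the sense in which every f is a linear combination of the parities.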

Hallmarks of Fourier Analysis
1. Uniform probability distribution on $\{0,1\}^n$.
2. Discrete cube graph structure.

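Energy
Definition: For $f : \{0,1\}^n \to \{0,1\}$, $I(f) = \mathbf{E}_x\big[\#\{i : f(x) \ne f(x^{(i)})\}\big]$, where x is uniform on $\{0,1\}^n$ and $x^{(i)}$ is x with the i-th bit flipped, is the average sensitivity, or edge-boundary (normalized), or total influence, or energy.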

Energy
Highest energy f? Parity on all bits / its negation. $I = n$.
Lowest energy f? Constants. $I = 0$.
Lowest energy balanced f? (Homework: f “balanced” $\Rightarrow I(f) \ge 1$.) $\Rightarrow f(x) = x_i$, or $\neg x_i$. (“Dictator”)
Majority?
Random function? $I \approx n/2$.
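The extreme cases listed above can be confirmed by brute force for small n. The sketch below assumes the average-sensitivity reading of energy (the expected number of coordinates whose flip changes f(x), over uniform x); the function names are illustrative.

```python
from itertools import product

def total_influence(f, n):
    """Average, over uniform x, of the number of coordinates whose flip changes f(x)."""
    total = 0
    for x in product((0, 1), repeat=n):
        for i in range(n):
            y = x[:i] + (1 - x[i],) + x[i + 1:]  # x with bit i flipped
            if f(x) != f(y):
                total += 1
    return total / 2 ** n

n = 5
parity = lambda x: sum(x) % 2
constant = lambda x: 0
dictator = lambda x: x[0]
majority = lambda x: 1 if sum(x) > len(x) // 2 else 0

for name, f in [("parity", parity), ("constant", constant),
                ("dictator", dictator), ("majority", majority)]:
    print(name, total_influence(f, n))
```

On 5 bits this gives energy 5 for parity, 0 for the constant, 1 for the dictator, and 1.875 for majority, which sits between the dictator and parity.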

Connection to Circuit Complexity
Theorem: [Linial-Mansour-Nisan + Håstad] If f is computable by a circuit of size S and depth D, then $I(f) \le O(\log^{D-1} S)$.
In particular, $f \in \mathrm{AC}^0 \Rightarrow I(f) \le \mathrm{polylog}(n)$.
Hence:
• Parity $\notin \mathrm{AC}^0$. Majority $\notin \mathrm{AC}^0$.
• Pseudorandom function generators $\notin \mathrm{AC}^0$.

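Lowest Possible Energy
Lowest energy balanced function that “depends essentially on all n inputs”?
Example: “Tribes”, an OR ($\vee$) of $\approx n / \log_2 n$ ANDs ($\wedge$), each on $\approx \log_2 n$ inputs (up to a lower-order correction).
$I(\mathrm{Tribes}) = \Theta(\log n)$.
Friedgut’s Theorem: For all $f : \{0,1\}^n \to \{0,1\}$ and all $\varepsilon > 0$, f is $\varepsilon$-close to a $2^{O(I(f)/\varepsilon)}$-junta.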

When is a boolean function “fair”?

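Influences
Definition: The influence of the i-th coordinate on f is $\mathrm{Inf}_i(f) = \Pr_x[f(x) \ne f(x^{(i)})]$, where x is uniform on $\{0,1\}^n$ (the Impartial Culture (IC) assumption) and $x^{(i)}$ is x with the i-th bit flipped.
I.e., the probability that the i-th voter is a “swing voter”. AKA Banzhaf Power Index.
Proposition: $I(f) = \sum_i \mathrm{Inf}_i(f)$.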

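Influences
$\mathrm{Inf}_i(\mathrm{Parity}) = 1$
$\mathrm{Inf}_i(\mathrm{Majority}_n) = \Theta(1/\sqrt{n})$
$\mathrm{Inf}_i(x_j) = 1$ if $i = j$, else 0
$\mathrm{Inf}_i(\mathrm{Tribes}_n) = \Theta(\log n / n)$
For a fair voting scheme, do you want influences large or small?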
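For toy sizes, the individual influences can be computed exactly by counting the inputs on which voter i is a swing voter. This Python sketch assumes uniformly random votes; the tiny 2-by-2 Tribes instance and all names are illustrative choices.

```python
from fractions import Fraction
from itertools import product

def influence(f, n, i):
    """Probability over uniform x that flipping bit i changes f(x), as an exact fraction."""
    cube = list(product((0, 1), repeat=n))
    swings = sum(1 for x in cube if f(x) != f(x[:i] + (1 - x[i],) + x[i + 1:]))
    return Fraction(swings, len(cube))

def influences(f, n):
    return [influence(f, n, i) for i in range(n)]

parity4 = lambda x: sum(x) % 2
dictator4 = lambda x: x[1]                            # f(x) = x_j with j = 1 (0-indexed)
majority3 = lambda x: 1 if sum(x) >= 2 else 0
tribes_2x2 = lambda x: (x[0] & x[1]) | (x[2] & x[3])  # OR of two ANDs of width 2

for name, f, n in [("Parity", parity4, 4), ("Dictator", dictator4, 4),
                   ("Majority", majority3, 3), ("Tribes", tribes_2x2, 4)]:
    infs = influences(f, n)
    print(name, [str(v) for v in infs], "total:", sum(infs))
```

Summing each list gives the total influence of that function, so the printed totals also illustrate the identity $I(f) = \sum_i \mathrm{Inf}_i(f)$ on these examples.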

Influential Coalitions
Theorem: [Kahn-Kalai-Linial] If $f : \{0,1\}^n \to \{0,1\}$ is any balanced voting scheme, at least one candidate can bribe an o(1) fraction of the voters and win with probability 1 − o(1).
Corollary of:
KKL Theorem: For every balanced f, there is an i with $\mathrm{Inf}_i(f) \ge \Omega(\log n / n)$.
After collecting an o(1) fraction of the voters, they control the outcome with probability 1 − o(1).
(Both theorems sharp: Tribes.)

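Miscounted Votes
Definition: Noise sensitivity of f at $\varepsilon$: $\mathrm{NS}_\varepsilon(f) = \Pr[f(x) \ne f(y)]$, where x is uniform on $\{0,1\}^n$ and y is formed by flipping each bit of x indep. w.p. $\varepsilon$.
Aside: In the diffusion process, this corresponds to time t, where $\varepsilon = \tfrac12 - \tfrac12 \exp(-t)$.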
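For small n, noise sensitivity can be computed exactly by summing over every input x and every pattern of flipped bits. The Python sketch below assumes the usual reading of the definition (x uniform, each bit flipped independently with probability eps, and NS is the probability that f changes); the names are illustrative.

```python
from itertools import product

def noise_sensitivity(f, n, eps):
    """Exact Pr[f(x) != f(y)], x uniform, y = x with each bit flipped independently w.p. eps."""
    cube = list(product((0, 1), repeat=n))
    total = 0.0
    for s in cube:  # s marks which bits get flipped
        k = sum(s)
        weight = eps ** k * (1 - eps) ** (n - k)  # probability of this flip pattern
        disagreements = sum(
            1 for x in cube if f(x) != f(tuple(a ^ b for a, b in zip(x, s)))
        )
        total += weight * disagreements / len(cube)
    return total

dictator = lambda x: x[0]
parity3 = lambda x: sum(x) % 2

print(noise_sensitivity(dictator, 3, 0.1))
print(noise_sensitivity(parity3, 3, 0.1))
```

A dictator changes its output exactly when its own bit is miscounted, while parity changes whenever an odd number of bits are miscounted, so parity comes out far more sensitive at the same eps.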

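The Best Scheme Against Miscounts
Theorems: For all $0 \le \varepsilon \le \tfrac12$,
$\mathrm{NS}_\varepsilon(\mathrm{Dictator}) = \varepsilon$
as $n \to \infty$, $\mathrm{NS}_\varepsilon(\mathrm{Majority}_n) \to \arccos(1 - 2\varepsilon)/\pi$
as $n \to \infty$, $\mathrm{NS}_\varepsilon(\mathrm{Electoral\ College}) \approx \ldots$
as $n \to \infty$, $\mathrm{NS}_\varepsilon(\mathrm{Tribes}_n) \to \tfrac12$
Majority Is Stablest Theorem: If f is balanced and $\mathrm{NS}_\varepsilon(f) \le \arccos(1 - 2\varepsilon)/\pi - \delta$, then $\mathrm{Inf}_i(f) \ge \tau$ for at least one i, where $\tau > 0$ depends only on $\varepsilon$ and $\delta$.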
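The Majority limit can be sanity-checked numerically. Because Majority depends only on the number of 1s, its noise sensitivity can be computed exactly in about $n^3$ steps by grouping inputs by their number of 1s and flip patterns by how many 1s and 0s they flip. This is a sketch for odd n with illustrative names, not a proof of the limit.

```python
from math import acos, comb, pi

def ns_majority(n, eps):
    """Exact noise sensitivity of Majority on n bits (n odd) at noise rate eps."""
    maj = lambda ones: ones > n // 2
    total = 0.0
    for k in range(n + 1):              # k = number of 1s in x
        for a in range(k + 1):          # a = number of 1s flipped to 0
            for b in range(n - k + 1):  # b = number of 0s flipped to 1
                if maj(k) != maj(k - a + b):
                    count = comb(n, k) * comb(k, a) * comb(n - k, b)
                    total += count * eps ** (a + b) * (1 - eps) ** (n - a - b)
    return total / 2 ** n

eps = 0.1
limit = acos(1 - 2 * eps) / pi
for n in (1, 3, 11, 101):
    print(n, ns_majority(n, eps))
print("limit", limit)
```

The printed values climb from eps at n = 1 (a single voter is a dictator) toward the arccos limit as n grows.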

Applications of NS to P vs. NP
Q: Is it possible that for every language L in NP, there is a poly-size family of circuits computing L on 100% of all inputs (of length n, for each n)?
A: No. Assuming $\mathrm{NP} \not\subseteq \mathrm{P/poly}$.
What about 99%? What about 75%? What about 51%?
How hard is NP on average?

Avg. case NP: Slightly hard $\Rightarrow$ Very hard
Say $f \in \mathrm{NP}$, balanced, and “slightly hard”: best poly-size circuit is 99% right.
Impagliazzo’s Hard Core Theorem: $\exists H \subset \{0,1\}^n$ of size $\ge 2\% \cdot 2^n$ such that no poly-sized circuit can compute f on a (½ + negl.) fraction of H.
Let $F : \{0,1\}^{10^6 n} \to \{0,1\}$ be $\mathrm{Tribes}_{10^6}$ applied to $10^6$ copies of f, each on its own n-bit input. $F \in \mathrm{NP}$. (Why?)
On a typical input to F, about $2\% \cdot 10^6$ of the f-inputs come from H.
$\mathrm{NS}_{2\%}(\mathrm{Tribes}_{10^6}) \approx 49\% \Rightarrow$ Theorem: F is not 51%-computable by poly-circuits.

When is a boolean function “pseudo-” or “quasi-random”?

The Opposite of Pseudorandom
Given f’s value on M random points, can you predict f at other points?
Examples $x_i$: 01010011 11010111 00101011 010100101001 11101111 ···
$f(x_i)$: 1 1 1 0 0 1 ···
Predict: f(00010101) = ?
“Learning f (from random examples)”
One idea: Take some weighted majority of known f-values, based on Hamming distance.
Can this work with $M \ll 2^n$?

Learning from Random Examples
Works if f has “long-range correlations”, e.g., small $I(f)$ or $\mathrm{NS}_\varepsilon(f)$.
LMN Algorithm: This will work (using an appropriate weighted majority) if $M \ge n^{O(I(f))}$.
E.g., depth-D, poly-size circuits are predictable after only $n^{\mathrm{polylog}(n)}$ examples.
A similar theorem exists for functions with small NS.
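One way to read “an appropriate weighted majority” is the low-degree approach usually associated with LMN: estimate every Fourier coefficient of degree at most d from the random examples, then predict with the sign of the resulting low-degree polynomial. The Python sketch below is a toy version under that assumption. The target function, the sample size M, the degree d, and all names are illustrative and not taken from the lecture.

```python
import random
from itertools import combinations, product

def chi(S, x):
    """Parity on the coordinates in S, in +1/-1 form."""
    return -1 if sum(x[i] for i in S) % 2 else 1

def low_degree_learn(examples, n, d):
    """Estimate all Fourier coefficients of degree <= d of (-1)^f, return a predictor."""
    estimates = {}
    for k in range(d + 1):
        for S in combinations(range(n), k):
            # Empirical average of (-1)^f(x) * chi_S(x) over the examples.
            estimates[S] = sum((1 - 2 * y) * chi(S, x) for x, y in examples) / len(examples)

    def hypothesis(x):
        g = sum(c * chi(S, x) for S, c in estimates.items())
        return 0 if g >= 0 else 1  # g >= 0 means (-1)^f is guessed to be +1, i.e. f = 0

    return hypothesis

rng = random.Random(0)
n, d, M = 6, 3, 2000
target = lambda x: 1 if x[0] + x[2] + x[5] >= 2 else 0  # hypothetical 3-junta target

examples = []
for _ in range(M):
    x = tuple(rng.randint(0, 1) for _ in range(n))
    examples.append((x, target(x)))

h = low_degree_learn(examples, n, d)
cube = list(product((0, 1), repeat=n))
accuracy = sum(h(x) == target(x) for x in cube) / len(cube)
print("accuracy on all 2^n points:", accuracy)
```

Expanding the sum shows the prediction at x is a vote over the seen examples in which each example's weight depends only on its Hamming distance from x, which matches the weighted-majority idea.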

Learning with Queries
Goldreich-Levin Theorem: From any “one-way function” $g : \{0,1\}^n \to \{0,1\}^n$, can produce a “hard-core predicate” $f : \{0,1\}^{2n} \to \{0,1\}$.
Proof by contraposition: gives a learning algorithm, using queries, for learning f’s large Fourier coefficients.
GL algorithm put to positive use in Learning Theory:
Theorem: [Mansour] Poly-size DNF (depth-2 circuits) learnable with queries in time $n^{O(\log n)}$; Fourier techniques.
Jackson’s Theorem: Improved to poly time & queries, by adding an ML technique.

Quasirandomness
Fix a small set of simple statistical tests; an object is quasirandom if it passes all of them.
For graphs: Graph G with edge density p is quasirandom if, for each O(1)-size graph H, G has roughly the “expected” number of copies of H.
For boolean functions (one possible weak notion): Function f with $\mathbf{E}[f] = p$ is quasirandom if, for each O(1)-junta $h : \{0,1\}^n \to \{0,1\}$, f has roughly 0 correlation with h. (I.e., given h(x), you’d still guess p for $\Pr[f(x) = 1]$.)

Quasirandomness & “Tests”
Håstad’s Test:
● Pick $x \sim \{0,1\}^n$ uniformly.
● Pick $y \sim \{0,1\}^n$ uniformly.
● Set $z = x \oplus y$.
● Set $w \sim_\varepsilon z$ (a noisy copy of z: flip each bit indep. w.p. $\varepsilon$).
● Test whether $f(x) \oplus f(y) \oplus f(w) = 0$.
Example:
x = 101000011011111
y = 000100000101110
z = 101100011110001
w = 001100010100000
f balanced and random: would pass with probability ½.
f a Dictator: would pass with probability $1 - \varepsilon$.
Theorem: If f is balanced and quasirandom, it passes the test with probability $\le \tfrac12 + o(1)$.
Almost the canonical Fourier Analysis problem; where we’ll start.
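The pass probabilities can be checked exactly on a toy cube by enumerating every x, every y, and every noise pattern. This Python sketch assumes that w is obtained from z by flipping each bit independently with probability eps; the function names are illustrative.

```python
from itertools import product

def hastad_pass_probability(f, n, eps):
    """Exact probability that f(x) XOR f(y) XOR f(w) = 0, where w is a noisy copy of x XOR y."""
    cube = list(product((0, 1), repeat=n))
    prob = 0.0
    for s in cube:  # s marks which bits of z = x XOR y are flipped to get w
        k = sum(s)
        weight = eps ** k * (1 - eps) ** (n - k)
        passes = 0
        for x in cube:
            for y in cube:
                w = tuple(a ^ b ^ c for a, b, c in zip(x, y, s))
                if f(x) ^ f(y) ^ f(w) == 0:
                    passes += 1
        prob += weight * passes / len(cube) ** 2
    return prob

dictator = lambda x: x[0]
parity3 = lambda x: x[0] ^ x[1] ^ x[2]

print(hastad_pass_probability(dictator, 3, 0.1))
print(hastad_pass_probability(parity3, 3, 0.1))
```

A dictator fails only when its own bit of z is flipped, so it passes with probability 1 − eps. Parity on several bits fails whenever an odd number of bits are flipped, which is how the noise penalizes functions that depend on many coordinates.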

Håstad’s “Hardness of Approximation”
Corollary: [Håstad’s Test + “PCP machinery”] Given a system of 3-variable linear equations mod 2, e.g.,
$x_1 \oplus x_3 \oplus x_7 = 0$
$x_2 \oplus x_4 \oplus x_7 = 1$
$x_1 \oplus x_5 \oplus x_6 = 0$
$x_6 \oplus x_8 \oplus x_9 = 0$
which is 99%-satisfiable, no efficient algorithm can find a solution satisfying 51% of the equations. (Unless P = NP.)
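For context on the 51% threshold: a uniformly random assignment satisfies each such equation with probability ½, so 50% is available with no effort. The Python sketch below encodes the four example equations and confirms this by averaging over all assignments. The data layout and helper names are illustrative assumptions.

```python
from itertools import product

# The four example equations, stored as (i, j, k, b) meaning x_i XOR x_j XOR x_k = b.
EQUATIONS = [(1, 3, 7, 0), (2, 4, 7, 1), (1, 5, 6, 0), (6, 8, 9, 0)]
NUM_VARS = 9

def fraction_satisfied(assignment, equations=EQUATIONS):
    """assignment maps each 1-based variable index to 0 or 1."""
    ok = sum(1 for i, j, k, b in equations
             if assignment[i] ^ assignment[j] ^ assignment[k] == b)
    return ok / len(equations)

def average_over_all_assignments():
    """Average satisfied fraction over all 2^9 assignments, i.e. a random assignment."""
    total = 0.0
    count = 0
    for bits in product((0, 1), repeat=NUM_VARS):
        assignment = dict(zip(range(1, NUM_VARS + 1), bits))
        total += fraction_satisfied(assignment)
        count += 1
    return total / count

good = {v: 0 for v in range(1, NUM_VARS + 1)}
good[2] = 1  # setting only x_2 = 1 happens to satisfy all four example equations

print(fraction_satisfied(good))
print(average_over_all_assignments())
```

The corollary says that beating this trivial 50% baseline by even one percentage point is already NP-hard, even when the system is 99%-satisfiable.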

Proof Idea
Test yields an NP-hardness gadget for reductions, m-coloring graphs $\to$ 3-variable mod-2 equations.
Each vertex becomes a block of $2^m$ vars, $x_{00\cdots0}, x_{00\cdots1}, \ldots$
Prob. .05 of testing $f(000) \oplus f(011) = 0$ becomes weight .05 on the equation $x_{000} \oplus x_{011} = 0$.
Blocks are 99%-satisfiable because of Dictators; these “encode” the m colors.
Håstad’s Test Theorem: any f satisfying $\ge 51\%$ of a block is noticeably correlated with O(1) coordinates $\Rightarrow$ “decodable” to O(1) Dictators/colors.

Thursday: When is a boolean function “linear”? And what is its Fourier expansion?