Canonical depth-three Boolean circuits for multi-linear functions, multi-linear circuits with general ML gates, and matrix rigidity
Oded Goldreich, Weizmann Institute of Science
Based on joint works with Avi Wigderson and Avishay Tal (see ECCC TR13-043 and TR15-079, resp.)
AviFest, Oct 2016

Constant Depth Boolean Circuits
$\mathrm{Parity}_n$ requires depth-$d$ circuits of size $\exp(\Omega(n^{1/(d-1)}))$, and this is tight.
Famous frontier: Stronger circuit models.
Another frontier: Stronger lower bounds (i.e., $\exp(\Omega(n))$). This will be our focus here.
Suggestion 1: Consider multi-linear (e.g., bilinear) functions.
Suggestion 2: Focus on depth three.
Suggestion 3: Study a restricted class of (canonical) Boolean circuits, which corresponds to (ML) Arithmetic circuits with general (ML) gates.
About the latter model: sanity checks, connection to matrix rigidity, an explicit trilinear function that is harder than parity.

Suggestion 1: Consider multilinear functions
Recall: Depth-$d$ circuits for $\mathrm{Parity}_n$ require size $\exp(\Omega(n^{1/(d-1)}))$. We seek stronger lower bounds (i.e., $\exp(\Omega(n))$).
Multi-linear functions: $x=(x^{(1)},\dots,x^{(t)})$ with $x^{(i)}\in\{0,1\}^n$, and
$F(x^{(1)},\dots,x^{(t)}) = \sum_{(i_1,\dots,i_t)\in T} x^{(1)}_{i_1}\cdots x^{(t)}_{i_t}$,
associated with a tensor $T\subseteq[n]^t$. Think of $t=2,\dots,\log n$.
The complexity of computing $F$ is supposed to arise from the "complexity" of the tensor.
Conj (sanity check): For every $t>1$, there exists a $t$-linear function that requires depth-three circuits of size $\exp(\Omega(t\,n^{t/(t+1)}))$. [holds for t=1…]
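To make the tensor view concrete, here is a minimal sketch of evaluating a $t$-linear function from its tensor. It assumes the arithmetic is over GF(2) (as the parity example suggests); the function name `eval_multilinear` and the 0-based indexing are illustrative choices, not taken from the slides.

```python
from math import prod

def eval_multilinear(tensor, blocks):
    """Evaluate F(x^(1),...,x^(t)) = sum over (i_1,...,i_t) in T of
    x^(1)_{i_1} * ... * x^(t)_{i_t}, with the sum taken mod 2 (assumption).

    tensor: set of t-tuples of 0-based indices (the support T).
    blocks: list of t lists of bits, each of length n.
    """
    total = 0
    for idx in tensor:
        # A product of bits is their AND; XOR accumulates the sum mod 2.
        total ^= prod(blocks[j][i] for j, i in enumerate(idx))
    return total

# Inner product mod 2: the bilinear function with tensor T = {(i, i)}.
n = 4
diag = {(i, i) for i in range(n)}
x = [1, 0, 1, 1]
y = [1, 1, 1, 0]
print(eval_multilinear(diag, [x, y]))  # 1*1 + 0*1 + 1*1 + 1*0 = 2, i.e. 0 mod 2
```

With $t=1$ and $T=[n]$ the same routine computes parity, which is the case where the conjectured bound is known to hold.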

Suggestion 2: Focus on depth three
Conj (1st sanity check): For every $t>1$, there exists a $t$-linear function that requires depth-three circuits of size $\exp(\Omega(t\,n^{t/(t+1)}))$. [holds for t=1]
Goal (assuming conj.): For every $t>1$, present an explicit $t$-linear function that requires depth-three circuits of size $\exp(\Omega(t\,n^{t/(t+1)}))$. [holds for t=1]
A 2nd sanity check: Consider a restricted model of (depth-three) circuits, and prove the L.B. in it.
Recall ($t$-linear functions): $x=(x^{(1)},\dots,x^{(t)})$, $|x^{(i)}|=n$, and $F(x^{(1)},\dots,x^{(t)}) = \sum_{(i_1,\dots,i_t)\in T} x^{(1)}_{i_1}\cdots x^{(t)}_{i_t}$.

Suggestion 3: Canonical Boolean circuits and MultiLinear circuits with general gates
Recall (the 2nd sanity check): Consider a restricted model of (depth-three) circuits, and prove the L.B. in it.
Motivation: the standard construction of a depth-three circuit for $n$-way Parity. The top gate $\mathrm{Par}_{\sqrt{n}}$ is a CNF of size $\exp(\sqrt{n})$, and it is fed by $\sqrt{n}$ gates $\mathrm{Par}_{\sqrt{n}}$, each a DNF of size $\exp(\sqrt{n})$.
In general, use arbitrary ML gates of bounded arity: A gate of arity $a$ is implemented by a CNF/DNF of size $\exp(a)$.
A depth-two ML circuit with $m$ such gates yields a (canonical) depth-three Boolean circuit of size $m\cdot\exp(a)$.
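The two-level parity construction above can be sketched as follows. This is only an illustration of the "sum of sums" structure and of the $m\cdot\exp(a)$ size estimate; it reads $\exp(a)$ as $2^a$, which is an assumption, and the helper names are hypothetical.

```python
from math import isqrt

def block_parities(bits, block):
    """Bottom layer: one ML gate (a parity) per block of `block` consecutive bits."""
    return [sum(bits[i:i + block]) % 2 for i in range(0, len(bits), block)]

def parity_two_level(bits):
    """Parity of n bits computed as a sqrt(n)-way sum of sqrt(n)-way sums."""
    block = max(1, isqrt(len(bits)))
    return sum(block_parities(bits, block)) % 2

def canonical_size_estimate(num_gates, arity):
    """Size m * exp(a) of the canonical circuit, reading exp(a) as 2**a (assumption)."""
    return num_gates * 2 ** arity

bits = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 1, 0, 0, 1, 0, 1]  # n = 16
print(parity_two_level(bits) == sum(bits) % 2)
# n = 16: four bottom gates plus one top gate, each of arity 4.
print(canonical_size_estimate(num_gates=5, arity=4))
```

For $n=16$ the estimate is $5\cdot 2^4$, to be compared with the roughly $2^{n-1}$ terms of a depth-two formula for parity.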

Suggestion 3: Canonical Boolean circuits and MultiLinear circuits with general gates
In general, use arbitrary ML gates of bounded arity: A gate of arity $a$ is implemented by a CNF/DNF of size $\exp(a)$.
An ML circuit (of arbitrary depth) with $m$ such gates yields a (canonical) depth-three Boolean circuit of size $\exp(m+a)$.
Goal: For every $t>1$, present an explicit $t$-linear function that requires canonical (depth-three) circuits of size $\exp(\Omega(t\,n^{t/(t+1)}))$.
Def (AN-complexity): The complexity of a (ML) circuit (of arbitrary depth) with arbitrary ML gates equals the maximum between the arity of its gates and the number of gates.
E.g., Parity has AN-complexity $\Theta(\sqrt{n})$, and we wish to bypass this.
Goal (restated): For every $t>1$, present an explicit $t$-linear function that requires AN-complexity $\Omega(t\,n^{t/(t+1)})$. (Holds for t=1.)

Are canonical Boolean circuits as powerful as general depth-three circuits (w.r.t. computing multilinear functions)?
Oded: I believe so. Avi: I don't know.
In any case, the model begs for lower bounds (equivalently, lower bounds on AN-complexity): Firstly, because lower bounds on the size of depth-three circuits for ML functions require lower bounds on AN-complexity, and secondly because such lower bounds exist (i.e., hold for non-explicit functions) and are seemingly within reach.

On the AN-complexity of MultiLinear functions
Thm 3: Explicit 3-linear and 4-linear functions of AN-complexity $\Omega(n^{0.6})$ and $\Omega(n^{0.666})$, resp.
Recall (AN-complexity): The complexity of a circuit with arbitrary ML gates equals the maximum between the arity of its gates and the number of gates.
Goal: For every $t>1$, present an explicit $t$-linear function that requires AN-complexity $\Omega(t\,n^{t/(t+1)})$.

Proofs of Thm 1&2
Recall ($t$-linear functions): $x=(x^{(1)},\dots,x^{(t)})$, $|x^{(i)}|=n$, and $F(x^{(1)},\dots,x^{(t)}) = \sum_{(i_1,\dots,i_t)\in T} x^{(1)}_{i_1}\cdots x^{(t)}_{i_t}$.
Upper bound: Decompose the tensor $T$ into $m=O(t\,n^{t/(t+1)})$ sub-tensors, each having side of length $n^{t/(t+1)}$. Compute each sub-tensor by a single gate of arity $m$, and use a top gate to compute their sum.
A counting argument: The number of $t$-ML circuits of AN-complexity $m$ is dominated by $(\exp(m^t))^m = \exp(m^{t+1})$.
Recall (AN-complexity): The complexity of a circuit with arbitrary ML gates equals the maximum between the arity of its gates and the number of gates.
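The counting argument can be spelled out as the following rough calculation. It is a sketch under simplifying assumptions: each gate of arity $m$ is taken to be described by $O(m^t)$ bits, the cost of specifying which inputs feed each gate is treated as lower-order, and constants (including the dependence on $t$) are suppressed.

```latex
\#\{\text{circuits of AN-complexity } m\}
  \;\le\; \bigl(2^{O(m^{t})}\bigr)^{m} \;=\; 2^{O(m^{t+1})},
\qquad
\#\{t\text{-linear functions on } t \text{ blocks of } n \text{ bits}\}
  \;=\; 2^{n^{t}} .

\text{If every } t\text{-linear function has AN-complexity at most } m,
\text{ then } 2^{O(m^{t+1})} \ge 2^{n^{t}},
\text{ so } m^{t+1} = \Omega(n^{t})
\text{ and } m = \Omega\bigl(n^{t/(t+1)}\bigr).
```

This recovers the exponent $t/(t+1)$ in the bound; the leading factor $t$ in $\Omega(t\,n^{t/(t+1)})$ needs the more careful count from the paper and is not captured by this sketch.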

Proof of Thm 3
Thm 3: Explicit 3-linear and 4-linear functions of AN-complexity $\Omega(n^{0.6})$ and $\Omega(n^{0.666})$, resp.
Lem 3.1: If a bilinear function has AN-complexity $m$, then the corresponding matrix does not have (ss)-rigidity $m^3$ for rank $m$. (ss = super-structured.)
Lem 3.2: A random Toeplitz matrix $M$ has rigidity $\Omega(n^{1.8})$ for rank $n^{0.6}$. Use $F(x,y) = \sum_{(i,j)\in T} x_i y_j = \sum_{i,j} M_{i,j}\, x_i y_j$.
Lem 3.3: A "pseudorandom" matrix $M$ has ss-rigidity $\Omega(n^{1.999})$ for rank $n^{0.666}$. Use $F(x,y) = \sum_{i,j} M_{i,j}\, x_i y_j$ for a bilinear form $M$.
Recall ($t$-linear functions): $x=(x^{(1)},\dots,x^{(t)})$, $|x^{(i)}|=n$, and $F(x^{(1)},\dots,x^{(t)}) = \sum_{(i_1,\dots,i_t)\in T} x^{(1)}_{i_1}\cdots x^{(t)}_{i_t}$.
Recall (AN-complexity): The complexity of a circuit with arbitrary ML gates equals the maximum between the arity of its gates and the number of gates.
Goal: For every $t>1$, present an explicit $t$-linear function that requires AN-complexity $\Omega(t\,n^{t/(t+1)})$.

AN-complexity of 2-linear functions and matrix rigidity
Lem 3.1: If a bilinear function has AN-complexity $m$, then the corresponding matrix does not have (super-structured) rigidity $m^3$ for rank $m$.
Lem 3.2: A random Toeplitz matrix $M$ has rigidity $\Omega(n^{1.8})$ for rank $n^{0.6}$. Use $F(x,y) = \sum_{(i,j)\in T} x_i y_j = \sum_{i,j} M_{i,j}\, x_i y_j$.
Lem 3.3: A "pseudorandom" matrix $M$ has super-structured rigidity $\Omega(n^{1.999})$ for rank $n^{0.666}$. Use $F(x,y) = \sum_{i,j} M_{i,j}\, x_i y_j$ for a bilinear form $M$.
Def 1: A matrix $M$ does not have rigidity $s$ for rank $r$ if $M=S+R$ such that $S$ has at most $s$ ones (is $s$-sparse) and $R$ has rank $r$.
Def 2: $M$ does not have structured rigidity $s$ for rank $r$ if $M=S+R$ as in Def 1 with the ones of $S$ residing in $s$ rectangles of side-length $s$.
Def 3: $M$ does not have super-structured rigidity $s$ for rank $r$ if $M=S+R$ as in Def 2 such that $R$ is spanned by $s$-sparse rows and columns.
Note: SS-rigidity is equivalent to AN-complexity.
Recall (AN-complexity): The AN-complexity of a circuit with arbitrary ML gates equals the maximum between the arity of its gates and the number of gates.
Recall ($t$-linear functions): $x=(x^{(1)},\dots,x^{(t)})$, $|x^{(i)}|=n$, and $F(x^{(1)},\dots,x^{(t)}) = \sum_{(i_1,\dots,i_t)\in T} x^{(1)}_{i_1}\cdots x^{(t)}_{i_t}$.

The notions of (matrix) rigidity
Def 1: A matrix $M$ does not have rigidity $s$ for rank $r$ if $M=S+R$ such that $S$ has at most $s$ ones (is $s$-sparse) and $R$ has rank $r$.
Def 2: $M$ does not have structured rigidity $s$ for rank $r$ if $M=S+R$ as in Def 1 with the ones of $S$ residing in $s$ rectangles of side-length $s$.
Def 3: $M$ does not have super-structured rigidity $s$ for rank $r$ if $M=S+R$ as in Def 2 such that $R$ is spanned by $s$-sparse rows and columns.
Informally:
Def 1: non-rigid = sparse + low-rank.
Def 2: non-structured-rigid = structured-sparse + low-rank, where structured-sparse = sum of few matrices such that in each matrix the 1's are confined to small rectangles.
Def 3: non-super-structured-rigid = structured-sparse + low-rank-spanned-by-sparse-rows + low-rank-spanned-by-sparse-columns.
Note: Super-structured rigidity is equivalent to AN-complexity.
Recall (AN-complexity): The AN-complexity of a circuit with arbitrary ML gates equals the maximum between the arity of its gates and the number of gates.
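Def 1 can be read as a checkable certificate: a decomposition $M=S+R$ witnesses that $M$ is not rigid. The sketch below verifies such a witness over GF(2), reading "rank $r$" as "rank at most $r$"; both that reading and the helper names (`gf2_rank`, `witnesses_non_rigidity`) are assumptions made for illustration. It does not attempt to decide rigidity itself, which would require searching over all decompositions.

```python
def gf2_rank(rows):
    """Rank over GF(2) of a 0/1 matrix given as a list of rows."""
    # Pack each row into an int so that row addition over GF(2) is XOR.
    vecs = [int("".join(map(str, row)), 2) for row in rows]
    rank = 0
    while vecs:
        pivot = vecs.pop()
        if pivot == 0:
            continue
        rank += 1
        low = pivot & -pivot  # lowest set bit of the pivot row
        # Clear that bit in every remaining row.
        vecs = [v ^ pivot if v & low else v for v in vecs]
    return rank

def witnesses_non_rigidity(M, S, R, s, r):
    """Check Def 1: M = S + R over GF(2), S has at most s ones, rank(R) <= r."""
    n_rows, n_cols = len(M), len(M[0])
    sums_ok = all(M[i][j] == S[i][j] ^ R[i][j]
                  for i in range(n_rows) for j in range(n_cols))
    sparse_ok = sum(map(sum, S)) <= s
    return sums_ok and sparse_ok and gf2_rank(R) <= r

# The all-ones matrix with one entry flipped is 1-sparse plus rank-1.
R = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
S = [[1, 0, 0], [0, 0, 0], [0, 0, 0]]
M = [[0, 1, 1], [1, 1, 1], [1, 1, 1]]
print(witnesses_non_rigidity(M, S, R, s=1, r=1))
```

Defs 2 and 3 add structural constraints on $S$ and $R$; checking them would need the rectangles and the sparse spanning rows and columns as part of the witness.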

Open problems
Recall (AN-complexity): The complexity of a circuit with arbitrary ML gates equals the maximum between the arity of its gates and the number of gates.
Thm 3: Explicit 3-linear and 4-linear functions of AN-complexity $\Omega(n^{0.6})$ and $\Omega(n^{0.666})$, resp.
Goal: For every $t>1$, present an explicit $t$-linear function that requires AN-complexity $\Omega(t\,n^{t/(t+1)})$.
Open problem: Explicit 2-linear functions of AN-complexity $\Omega(n^{0.51})$, then $\Omega(n^{0.666})$.
Open problem: Explicit $O(1)$-linear functions of AN-complexity $\Omega(n^{0.667})$. Then, $\Omega(n^{0.999})$. Probably without using the rigidity connection.
Conj (1st sanity check): For every $t>1$, there exists a $t$-linear function that requires depth-three circuits of size $\exp(\Omega(t\,n^{t/(t+1)}))$. [holds for t=1]

END
Slides available at http://www.wisdom.weizmann.ac.il/~oded/T/avi-kk.pptx
Papers available at http://www.wisdom.weizmann.ac.il/~oded/p_kk.html and http://www.wisdom.weizmann.ac.il/~oded/p_rigid.html

OLD SLIDES (for 1st paper)
Slides available at http://www.wisdom.weizmann.ac.il/~oded/T/kk.pptx
Paper available at http://www.wisdom.weizmann.ac.il/~oded/p_kk.html

Constant Depth Boolean Circuits
$\mathrm{Parity}_n$ requires depth-$d$ circuits of size $\exp(\Omega(n^{1/(d-1)}))$.
Famous frontier: Stronger circuit models.
Another frontier: Stronger lower bounds (i.e., $\exp(\Omega(n))$).
Multi-linear functions: $x=(x^{(1)},\dots,x^{(t)})$ with $x^{(i)}\in\{0,1\}^n$, and
$F(x^{(1)},\dots,x^{(t)}) = \sum_{(i_1,\dots,i_t)\in T} x^{(1)}_{i_1}\cdots x^{(t)}_{i_t}$,
associated with a tensor $T\subseteq[n]^t$. Think of $t=2,\dots,\log n$.
Conj (sanity check): For every $t>1$, there exists a $t$-linear function that requires depth-three circuits of size $\exp(\Omega(t\,n^{t/(t+1)}))$. [holds for t=1…]

The Program*
Recall ($t$-linear functions): $x=(x^{(1)},\dots,x^{(t)})$, $|x^{(i)}|=n$, and $F(x^{(1)},\dots,x^{(t)}) = \sum_{(i_1,\dots,i_t)\in T} x^{(1)}_{i_1}\cdots x^{(t)}_{i_t}$.
Conj (1st sanity check): For every $t>1$, there exists a $t$-linear function that requires depth-three circuits of size $\exp(\Omega(t\,n^{t/(t+1)}))$. [holds for t=1]
Goal: For every $t>1$, present an explicit $t$-linear function that requires depth-three circuits of size $\exp(\Omega(t\,n^{t/(t+1)}))$. [holds for t=1]
A 2nd sanity check: Consider a restricted model of (depth-three) circuits, and prove the L.B. in it.
*) Taking advantage of Avi's absence.

Arithmetic Circuits with General Gates
Motivation: Depth-three Boolean circuits for $\mathrm{Parity}_n$ are obtained by implementing a $\sqrt{n}$-way sum of $\sqrt{n}$-way sums. In general, depth-three BC are obtained via depth-two AC with general ML gates. We get depth-three BC for $F$ of size exponential in $C_2(F)$.
Model: Depth-two (set-)multi-linear circuits with arbitrary (set-)multi-linear gates.
Complexity measure ($C_2$) = the (max.) arity of a gate.
Recall: We use a fixed partition of the variables, and multi-linear means being linear in each variable-block.
Depth-three BC obtained this way are restricted in (1) their structure, arising from direct composition, and (2) their use of ML gates.

Arithmetic Circuits with General Gates (cont.)
Model: Unbounded-depth (set-)multi-linear circuits with arbitrary (set-)multi-linear gates.
Complexity measure ($C$) = max(arity, #gates).
PROP: Every ML function $F$ has a depth-three BC of size $\exp(O(C(F)))$. PF: guess & verify.
THM: There exist bilinear functions $F$ such that $C(F)=\sqrt{n}$ but $C_2(F)=\Omega(n^{2/3})$.
OBS: For every $t$-linear $F$, $C_{t+1}(F) \le 2\,C(F)$.

Arith. Circuits with General Gates: Results
Model: Unbounded-depth (set-)multi-linear circuits with arbitrary (set-)multi-linear gates.
Complexity measure ($C$) = max(arity, #gates); $C_2$ for depth-two.
THM 1: There exist bilinear functions $F$ such that $C(F)=\sqrt{n}$ but $C_2(F)=\Omega(n^{2/3})$.
THM 2: For every $t$-linear function $F$ it holds that $C(F) \le C_2(F) = O(t\,n^{t/(t+1)})$.
THM 3: Almost all $t$-linear functions $F$ satisfy $C_2(F) \ge C(F) = \Omega(t\,n^{t/(t+1)})$.
Open: An explicit function as in Thm 3; for starters $\Omega(t\,n^{0.51})$.

Arith. Circuits with General Gates: Results (cont.)
Model: Unbounded-depth (set-)multi-linear circuits with arbitrary (set-)multi-linear gates.
Complexity measure ($C$) = max(arity, #gates); $C_2$ for depth-two.
Open: An explicit function as in Thm 3; for starters $\Omega(t\,n^{0.51})$.
An approach (a candidate): The 3-linear function associated with the tensor $T=\{(i,j,k): |i-(n/2)|+|j-(n/2)|+|k-(n/2)| \le n/2\}$.
PROP: The complexity of the above 3-linear function is lower bounded by the maximum complexity of all bilinear functions associated with Toeplitz matrices.
THM: If matrix $M$ has rigidity $m^3$ for rank $m$, then the corresponding bilinear function has complexity $\Omega(m)$.
Note: A restricted notion of ("structured") rigidity suffices.
Open: Show a Toeplitz matrix with rigidity $n^{1.51}$ for rank $n^{0.51}$.

Comments on the proofs
Model: Multi-linear circuits with arbitrary multi-linear gates.
Complexity measure ($C$) = max(arity, #gates); $C_2$ for depth-two.
THM 1: There exist bilinear functions $F$ such that $C(F)=\sqrt{n}$ but $C_2(F)=\Omega(n^{2/3})$. PF idea: $s=\sqrt{n}$, $f(x,y)=g(x,L_1(y),\dots,L_s(y))$.
THM 2: For every $t$-linear function $F$ it holds that $C(F) \le C_2(F) = O(t\,n^{t/(t+1)})$. PF: Covering by $m$ cubes of side $m$.
THM 3: Almost all $t$-linear functions $F$ satisfy $C_2(F) \ge C(F) = \Omega(t\,n^{t/(t+1)})$. PF: A counting argument.
THM 4: If matrix $M$ has rigidity $m^3$ for rank $m$, then the corresponding bilinear function has complexity $\Omega(m)$. PF idea: The $m$ linear functions yield a rank-$m$ matrix, whereas the $m$ quadratic forms (each in at most $m$ variables) cover at most $m^3$ entries.

Add'l comments on the proof of THM 1
Model: Multi-linear circuits with arbitrary multi-linear gates.
Complexity measure ($C$) = max(arity, #gates); $C_2$ for depth-two.
THM 1: There exist bilinear functions $F$ such that $C(F)=\sqrt{n}$ but $C_2(F)=\Omega(n^{2/3})$.
PF: For $s=\sqrt{n}$, let $f(x,y)=g(x,L_1(y),\dots,L_s(y))$, where $g$ is generic (over $n+s$ bits) and each $L_i$ computes the sum of $s$ variables in $y$.
A generic depth-two ML circuit of complexity $m$ computes $f$ as
$B(F_1(x),\dots,F_m(x),G_1(y),\dots,G_m(y)) + \sum_{i\in[m]} B_i(x,y)$,
where the $B_i$'s are quadratic and each function has arity $m$.
Hitting $y$ with a random restriction that leaves one variable alive in each block, we get
$B(F_1(x),\dots,F_m(x),G'_1(y'),\dots,G'_m(y')) + \sum_{i\in[m]} B'_i(x,y')$,
where each $B'_i$ (and $G'_i$) depends on $O(m/s)$ variables.
Hence, the description length is $O(m^3/s)$; compare to $ns=n^2/s$.

Structured Rigidity
DEF: Matrix $M$ has $(m_1,m_2,m_3)$-structured rigidity for rank $r$ if for every matrix $R$ of rank $r$ the non-zeros of $M-R$ cannot be covered by $m_1$ (gen.) $m_2$-by-$m_3$ rectangles.
Rigidity $m_1 m_2 m_3$ implies $(m_1,m_2,m_3)$-structured rigidity for the same rank, but not vice versa.
THM 5: For every $m\in[n^{0.51},n^{0.66}]$, there exist matrices of $(m,m,m)$-structured rigidity for rank $m$ that do not have rigidity $3mn$ for rank 0 (let alone for rank $m$). PF: Consider a random matrix with $3mn$ one-entries.
THM 4': If matrix $M$ has $(m,m,m)$-structured rigidity for rank $m$, then the corresponding bilinear function has complexity $\Omega(m)$. PF idea: The proof of Thm 4 goes through without any change.

Summary
Recall ($t$-linear functions): $x=(x^{(1)},\dots,x^{(t)})$, $|x^{(i)}|=n$, and $F(x^{(1)},\dots,x^{(t)}) = \sum_{(i_1,\dots,i_t)\in T} x^{(1)}_{i_1}\cdots x^{(t)}_{i_t}$.
Long-term/dream goal: For every $t>1$, present an explicit $t$-linear function that requires depth-three circuits of size $\exp(\Omega(t\,n^{t/(t+1)}))$. [holds for t=1]
Conj (1st sanity check): For every $t>1$, there exists a $t$-linear function that requires depth-three circuits of size $\exp(\Omega(t\,n^{t/(t+1)}))$. [holds for t=1]
2nd sanity check: Prove L.B. for a restricted model of (depth-three) circuits; specifically, Arithmetic circuits with general gates.
Current goal: Show that explicit $t$-linear functions $F$ satisfy $C_2(F) \ge C(F) = \Omega(t\,n^{t/(t+1)})$ or so (i.e., super-sqrt).