The Complexity of Matrix Completion Nick Harvey David
































- Slides: 32
The Complexity of Matrix Completion Nick Harvey David Karger Sergey Yekhanin
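What is matrix completion?
- Given a matrix containing variables, substitute values for the variables to get full rank.
- For $\begin{pmatrix} 1 & 1 \\ x & y \end{pmatrix}$: setting $x=1, y=0$ gives $\begin{pmatrix} 1 & 1 \\ 1 & 0 \end{pmatrix}$ (full rank); setting $x=1, y=1$ gives $\begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix}$ (bad).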
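A minimal sketch of the example above, assuming the matrix is read row by row as [[1, 1], [x, y]]. The helper names `det2` and `completions` are illustrative, not from the talk. It lists every 0/1 assignment and reports whether it yields full rank.

```python
from itertools import product

def det2(m):
    """Determinant of a 2x2 matrix given as nested lists."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def completions(values=(0, 1)):
    """Yield (x, y, det) for each assignment to the matrix [[1, 1], [x, y]]."""
    for x, y in product(values, repeat=2):
        yield x, y, det2([[1, 1], [x, y]])

for x, y, d in completions():
    status = "full rank" if d != 0 else "singular"
    print(f"x={x}, y={y}: det={d} ({status})")
```

Since the determinant is y - x, an assignment is a completion exactly when x and y differ.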
Why should I care? Combinatorics
- Many combinatorial problems relate to matrices of variables.
Problem: Relation to Algebra
- Graph Matching: Tutte '47, Edmonds '67, Lovász '79
- Matroid Intersection: Tomizawa-Iri '74, Murota '00
- Counting paths in DAG: God (i.e., the BOOK), Gessel-Viennot '85
Why should I care? Algorithms
- Often yields highly efficient algorithms.
Problem: Algorithms
- Graph Matching: RNC (KUW '86, MVV '87); sequential $O(n^{2.38})$ time (MS '04, H '06)
- Matroid Intersection: $O(nr^{1.38})$ time (H '06)
- Counting paths in DAG: Random Network Codes (Koetter-Medard '03, Ho et al. '03)
Why should I care? Complexity
- Depending on parameters, can be NP-complete, in RP, or in P.
  - Key parameters: field size, # variables, # occurrences of each variable.
- Contains polynomial identity testing (PIT) as a special case (Valiant '79).
  - Derandomizing PIT implies strong circuit lower bounds (Kabanets-Impagliazzo '03).
Field Size
Why care about field size?
- Relevant to complexity: a random assignment works over large fields.
- Understanding smaller fields may provide insight into derandomization.
- Important for network coding efficiency (i.e., complexity of routers).
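Complexity Regions
[Figure: complexity regions. Horizontal axis: field size (2, 3, $2^2$, 5, 7, ..., $n+1$). Vertical axis: # occurrences of a variable (1 to 9). Labels: NP-hard (Buss et al. '99), RP (Lovász '79), P (H., Karger, Murota '05), P (Geelen '99), and a region marked "???".]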
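Complexity Regions
[Figure: complexity regions. Horizontal axis: field size (2, 3, $2^2$, 5, 7, ..., $n+1$). Vertical axis: # occurrences of a variable (1 to 9). Labels: NP-hard (two regions), RP, P (two regions).]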
Variant: Simultaneous Completion
- We have a set of matrices $\mathcal{A} := \{A_1, \ldots, A_d\}$.
  - Each variable appears at most once per matrix.
  - A variable can appear in several matrices.
- Def: A simultaneous completion for $\mathcal{A}$ assigns values to variables while preserving the rank of all matrices.
- The RP algorithm still works over a large field.
- The application to network coding uses simultaneous completion.
Relationship to Single Matrix Completion
- Hardness for simultaneous completion $\Rightarrow$ hardness for single matrix completion with many occurrences of variables.
[Figure: small matrices with entries (1, A, B, C), (1, A, D, E), and (1, B, C, D); panels labeled "Simultaneous Completion" and "Single Matrix Completion".]
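One way to see why a simultaneous instance can be packed into a single matrix is the block-diagonal construction sketched below. It is assumed here as the standard reduction, and the slide does not spell it out. The helpers `det` and `block_diag` and the sample values are illustrative. The combined matrix is nonsingular exactly when every block is, and a variable shared by several blocks now occurs several times in one matrix.

```python
def det(m):
    """Integer determinant by Laplace expansion along the first row (tiny matrices only)."""
    if not m:
        return 1
    total = 0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total

def block_diag(*blocks):
    """Place square blocks along the diagonal of one larger matrix."""
    size = sum(len(b) for b in blocks)
    out = [[0] * size for _ in range(size)]
    offset = 0
    for b in blocks:
        for i, row in enumerate(b):
            for j, value in enumerate(row):
                out[offset + i][offset + j] = value
        offset += len(b)
    return out

# Hypothetical assignment; the variable a is shared by both blocks.
a, b, c, d, e = 1, 0, 1, 1, 0
A1 = [[1, a], [b, c]]
A2 = [[1, a], [d, e]]
single = block_diag(A1, A2)
print(det(single), det(A1) * det(A2))
```

The determinant of the block-diagonal matrix is the product of the block determinants, so one assignment completes the single matrix if and only if it completes every block.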
Simultaneous Completion Algorithm
- Simple self-reducibility algorithm.
- Operates over field $\mathbb{F}_q$, where $d := \#\text{matrices} < q$.
Input: d matrices
  Compute rank of all matrices (Non-trivial! Murota '93.)
  Pick a variable x
  for i ∈ {0, …, d}
    Set x := i
    If all matrices have unchanged rank
      Recurse (# variables has decreased)
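The self-reduction above can be sketched as follows, under loudly stated assumptions: the field is a large prime field F_P (so P > d holds), matrix entries are integers or variable names given as strings, and the rank of a matrix with unassigned variables is estimated by random substitution. That Monte Carlo step is a stand-in chosen for brevity, not the deterministic rank computation the slide attributes to Murota. All function names are illustrative.

```python
import random

P = 10007  # assumed prime field size; the sketch needs P > d

def rank_mod_p(rows, p=P):
    """Rank of an integer matrix over F_p by Gaussian elimination."""
    m = [[v % p for v in row] for row in rows]
    rank = 0
    for col in range(len(m[0])):
        pivot = next((r for r in range(rank, len(m)) if m[r][col]), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        inv = pow(m[rank][col], -1, p)  # modular inverse (Python 3.8+)
        m[rank] = [(v * inv) % p for v in m[rank]]
        for r in range(len(m)):
            if r != rank and m[r][col]:
                factor = m[r][col]
                m[r] = [(a - factor * b) % p for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank

def generic_rank(matrix, assignment, trials=5, seed=0):
    """Monte Carlo rank: unassigned variables receive random field elements."""
    rng = random.Random(seed)

    def value(entry):
        if not isinstance(entry, str):
            return entry
        return assignment[entry] if entry in assignment else rng.randrange(P)

    return max(rank_mod_p([[value(e) for e in row] for row in matrix])
               for _ in range(trials))

def simultaneous_completion(matrices, variables):
    """Fix variables one at a time, trying the values 0..d and keeping every rank unchanged."""
    d = len(matrices)
    assignment = {}
    target = [generic_rank(m, assignment) for m in matrices]
    for x in variables:
        for candidate in range(d + 1):
            assignment[x] = candidate
            if [generic_rank(m, assignment) for m in matrices] == target:
                break
        else:
            raise ValueError(f"no value in 0..{d} preserves all ranks for {x}")
    return assignment

mats = [[[1, 1], ["x", "y"]], [["x", 1], [1, "y"]]]
result = simultaneous_completion(mats, ["x", "y"])
print(result)
```

The recursion on the slide is written here as a loop over the variables. Trying d + 1 values is enough because each variable occurs at most once per matrix, so each matrix can rule out at most one value.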
A Sharp Threshold
- Simple self-reducibility algorithm.
- Operates over field $\mathbb{F}_q$, where $d := \#\text{matrices} < q$.
Thm: Simultaneous completion for $d$ matrices over $\mathbb{F}_q$ is:
- in P if $q > d$ [HKM '05]
- NP-hard if $q \le d$ [This paper]
A Sharp Threshold
Thm: Simultaneous completion for $d$ matrices over $\mathbb{F}_q$ is:
- in P if $q > d$ [HKM '05]
- NP-hard if $q \le d$ [This paper]
Cor: Single matrix completion with $d$ occurrences of variables over $\mathbb{F}_q$ is NP-hard if $q \le d$.
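Approach
- Reduction from Circuit-SAT.
- NAND gate with inputs $A, B$ and output $C$:
  $C = \lnot(A \land B) \iff C = 1 - A \cdot B$ (if $A, B, C \in \{0,1\}$) $\iff \det \begin{pmatrix} 1 & B \\ A & C \end{pmatrix} \ne 0$ (if $A, B, C \in \{0,1\}$)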
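A quick truth-table check of the NAND gadget, assuming the 2x2 matrix is [[1, B], [A, C]] so that its determinant is C - A*B. The function name `nand_gadget_det` and the choice of a prime modulus `p` are illustrative.

```python
from itertools import product

def nand_gadget_det(a, b, c, p=2):
    """det [[1, b], [a, c]] = c - a*b, reduced modulo the prime p."""
    return (c - a * b) % p

for a, b, c in product((0, 1), repeat=3):
    is_nand = c == 1 - a * b
    nonsingular = nand_gadget_det(a, b, c) != 0
    assert is_nand == nonsingular
    print(a, b, c, "gate satisfied" if is_nand else "gate violated",
          "| det != 0:", nonsingular)
```

The gadget is nonsingular exactly on the four rows where C equals NAND(A, B), provided all three values are restricted to 0 and 1.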
What have we shown so far?
- Simultaneous completion of an unbounded number of matrices over $\mathbb{F}_2$ is NP-hard.
- Can we use fewer?
  - Combine small matrices into a huge matrix?
  - Problem: variables appear too many times.
  - Need to somehow make "copies" of a variable.
- Coming up next:
  - Completing two matrices over $\mathbb{F}_2$ is NP-hard.
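A Curious Matrix
\[ R_n := \begin{pmatrix} 1 & 1 & \cdots & 1 & 0 \\ x_1 & 1 & \cdots & 1 & 1 \\ & x_2 & \ddots & \vdots & \vdots \\ & & \ddots & 1 & 1 \\ 0 & & & x_n & 1 \end{pmatrix} \]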
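A Curious Matrix
Thm: $\det R_n = \prod_{i=1}^{n} (1 - x_i) - (-1)^n \prod_{i=1}^{n} x_i$
\[ R_n := \begin{pmatrix} 1 & 1 & \cdots & 1 & 0 \\ x_1 & 1 & \cdots & 1 & 1 \\ & x_2 & \ddots & \vdots & \vdots \\ & & \ddots & 1 & 1 \\ 0 & & & x_n & 1 \end{pmatrix} \]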
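Linearity of Determinant
\[ \det \begin{pmatrix} 1 & 1 & \cdots & 1 & 0 \\ x_1 & 1 & \cdots & 1 & 1 \\ & x_2 & \ddots & \vdots & \vdots \\ & & \ddots & 1 & 1 \\ 0 & & & x_n & 1 \end{pmatrix} = \det \begin{pmatrix} 1 & 1 & \cdots & 1 & 1 \\ x_1 & 1 & \cdots & 1 & 1 \\ & x_2 & \ddots & \vdots & \vdots \\ & & \ddots & 1 & 1 \\ 0 & & & x_n & 1 \end{pmatrix} + \det \begin{pmatrix} 1 & 1 & \cdots & 1 & -1 \\ x_1 & 1 & \cdots & 1 & 0 \\ & x_2 & \ddots & \vdots & \vdots \\ & & \ddots & 1 & 0 \\ 0 & & & x_n & 0 \end{pmatrix} \]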
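Column Expansion
\[ \det \begin{pmatrix} 1 & 1 & \cdots & 1 & 1 \\ x_1 & 1 & \cdots & 1 & 1 \\ & x_2 & \ddots & \vdots & \vdots \\ & & \ddots & 1 & 1 \\ 0 & & & x_n & 1 \end{pmatrix} + \det \begin{pmatrix} 1 & 1 & \cdots & 1 & -1 \\ x_1 & 1 & \cdots & 1 & 0 \\ & x_2 & \ddots & \vdots & \vdots \\ & & \ddots & 1 & 0 \\ 0 & & & x_n & 0 \end{pmatrix} = (-1)^{n+1} \det \begin{pmatrix} x_1 & 1 & \cdots & 1 \\ & x_2 & \ddots & \vdots \\ & & \ddots & 1 \\ 0 & & & x_n \end{pmatrix} + \det \begin{pmatrix} 1 & 1 & \cdots & 1 & 1 \\ x_1 & 1 & \cdots & 1 & 1 \\ & x_2 & \ddots & \vdots & \vdots \\ & & \ddots & 1 & 1 \\ 0 & & & x_n & 1 \end{pmatrix} \]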
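\[ \det \begin{pmatrix} 1 & 1 & \cdots & 1 & 1 \\ x_1 & 1 & \cdots & 1 & 1 \\ & x_2 & \ddots & \vdots & \vdots \\ & & \ddots & 1 & 1 \\ 0 & & & x_n & 1 \end{pmatrix} \]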
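Schur Complement Identity
\[ \det \begin{pmatrix} 1 & 1 & \cdots & 1 & 1 \\ x_1 & 1 & \cdots & 1 & 1 \\ & x_2 & \ddots & \vdots & \vdots \\ & & \ddots & 1 & 1 \\ 0 & & & x_n & 1 \end{pmatrix} = (-1)^n \det(1) \cdot \det \left( \begin{pmatrix} x_1 & 1 & \cdots & 1 \\ & x_2 & \ddots & \vdots \\ & & \ddots & 1 \\ 0 & & & x_n \end{pmatrix} - \begin{pmatrix} 1 \\ \vdots \\ 1 \end{pmatrix} \begin{pmatrix} 1 & \cdots & 1 \end{pmatrix} \right) \]
The sign $(-1)^n$ comes from moving the last column to the front, which puts the pivot entry 1 in the top-left corner.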
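Applying Outer Product
\[ \det \left( \begin{pmatrix} x_1 & 1 & \cdots & 1 \\ & x_2 & \ddots & \vdots \\ & & \ddots & 1 \\ 0 & & & x_n \end{pmatrix} - \begin{pmatrix} 1 & \cdots & 1 \\ \vdots & \ddots & \vdots \\ 1 & \cdots & 1 \end{pmatrix} \right) = \det \begin{pmatrix} x_1 - 1 & & & 0 \\ -1 & x_2 - 1 & & \\ \vdots & \ddots & \ddots & \\ -1 & \cdots & -1 & x_n - 1 \end{pmatrix} \]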
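Finishing up
\[ \det \begin{pmatrix} x_1 - 1 & & & 0 \\ -1 & x_2 - 1 & & \\ \vdots & \ddots & \ddots & \\ -1 & \cdots & -1 & x_n - 1 \end{pmatrix} = \prod_{i=1}^{n} (x_i - 1) \]
Hence $\det R_n = (-1)^n \prod_{i=1}^{n} (x_i - 1) + (-1)^{n+1} \prod_{i=1}^{n} x_i = \prod_{i=1}^{n} (1 - x_i) - (-1)^n \prod_{i=1}^{n} x_i$. QED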
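Replicating Variables
Corollary: If $x_1, x_2, \ldots, x_n \in \{0,1\}$ then $\det R_n \ne 0 \iff x_i = x_j \ \forall i, j$.
Proof: $\det R_n = \prod_i (1 - x_i) - (-1)^n \prod_i x_i$, which (up to the sign of the second term) is the arithmetization of $\bigwedge_i x_i \lor \bigwedge_i \lnot x_i$. So either all variables are true, or all are false.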
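The corollary can be checked by brute force for small n. The sketch below assumes one reading of the garbled slide matrix: R_n is (n+1) x (n+1), with first row (1, ..., 1, 0) and row i equal to zeros, then x_i, then ones. Both that layout and the closed form $\prod_i (1 - x_i) - (-1)^n \prod_i x_i$ are reconstructions, and the helper names are illustrative.

```python
from itertools import product
from math import prod

def det(m):
    """Integer determinant by Laplace expansion along the first row (tiny matrices only)."""
    if not m:
        return 1
    total = 0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total

def curious_matrix(xs):
    """Assumed layout of R_n for the values xs = (x_1, ..., x_n)."""
    n = len(xs)
    m = [[0] * (n + 1) for _ in range(n + 1)]
    m[0] = [1] * n + [0]
    for i, x in enumerate(xs, start=1):
        m[i][i - 1] = x               # x_i sits just below the main diagonal
        for j in range(i, n + 1):
            m[i][j] = 1               # ones to the right of x_i
    return m

def closed_form(xs):
    return prod(1 - x for x in xs) - (-1) ** len(xs) * prod(xs)

for n in range(1, 4):
    for xs in product(range(-1, 3), repeat=n):
        assert det(curious_matrix(xs)) == closed_form(xs)
    for bits in product((0, 1), repeat=n):
        all_equal = len(set(bits)) == 1
        assert (det(curious_matrix(bits)) != 0) == all_equal
print("closed form and corollary hold for n = 1, 2, 3")
```

For 0/1 inputs the determinant is 1 or -1 when all variables agree and 0 otherwise, so the test also behaves the same way over $\mathbb{F}_2$.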
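Replicating Variables
Corollary: If $x_1, x_2, \ldots, x_n \in \{0,1\}$ then $\det R_n \ne 0 \iff x_i = x_j \ \forall i, j$.
Consequence: over $\mathbb{F}_2$, need only 2 matrices.
[Figure: matrix $A$ assembled from NAND gadget blocks and matrix $B$ assembled from $R_n$ blocks.]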
What have we shown so far?
Simultaneous completion of:
- an unbounded number of matrices over $\mathbb{F}_2$ is NP-hard
- two matrices over $\mathbb{F}_2$ is NP-hard
Next:
- $q$ matrices over $\mathbb{F}_q$ is NP-hard
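Handling Fields $\mathbb{F}_q$
- Previous gadgets only work if each $x \in \{0,1\}$. How can we ensure this over $\mathbb{F}_q$?
- Introduce $q-2$ auxiliary variables: $x = x^{(1)}, x^{(2)}, \ldots, x^{(q-1)}$.
- Sufficient to enforce that: $x^{(i)} \ne x^{(j)} \ \forall i \ne j$ and $x^{(i)} \notin \{0,1\} \ \forall i \ge 2$.
  - $\det \begin{pmatrix} 1 & 1 \\ x^{(i)} & x^{(j)} \end{pmatrix} \ne 0$, etc.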
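Why these constraints suffice can be checked exhaustively for small fields. The sketch assumes the constraints are pairwise distinctness of $x^{(1)}, \ldots, x^{(q-1)}$ together with $x^{(i)} \notin \{0,1\}$ for $i \ge 2$. Only equality of field elements matters, so elements are modeled as the labels 0..q-1. The function name `forced_values` is illustrative.

```python
from itertools import product

def forced_values(q):
    """Values x^(1) can take when all constraints hold, with field elements labeled 0..q-1."""
    allowed = set()
    for xs in product(range(q), repeat=q - 1):
        pairwise_distinct = len(set(xs)) == q - 1
        aux_avoid_01 = all(v not in (0, 1) for v in xs[1:])
        if pairwise_distinct and aux_avoid_01:
            allowed.add(xs[0])
    return allowed

for q in (2, 3, 4, 5):
    print(q, sorted(forced_values(q)))
```

The $q-2$ auxiliary variables must occupy all $q-2$ elements other than 0 and 1, which leaves only 0 and 1 for $x^{(1)}$.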
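Handling Fields $\mathbb{F}_q$
$x^{(i)} \ne x^{(j)} \ \forall i \ne j$ and $x^{(i)} \notin \{0,1\} \ \forall i \ge 2$
[Figure: graph on vertices 0, 1, $x^{(1)}$, $x^{(2)}$, $x^{(3)}$, $x^{(4)}$, ..., $x^{(q-1)}$. An edge indicates that its endpoints are non-equal.]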
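Handling Fields $\mathbb{F}_q$
$x^{(i)} \ne x^{(j)} \ \forall i \ne j$ and $x^{(i)} \notin \{0,1\} \ \forall i \ge 2$
- Pack these constraints into few matrices.
- Each variable is used once per matrix.
- Amounts to edge-coloring.
- From $\chi'(K_n)$, the edge chromatic number of $K_n$, conclude that $q$ matrices suffice.
[Figure: graph on vertices 0, 1, $x^{(1)}$, $x^{(2)}$, $x^{(3)}$, $x^{(4)}$, ..., $x^{(q-1)}$.]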
What have we shown so far?
Simultaneous completion of:
- an unbounded number of matrices over $\mathbb{F}_2$ is NP-hard
- two matrices over $\mathbb{F}_2$ is NP-hard
- $q$ matrices over $\mathbb{F}_q$ is NP-hard
Main Results
Thm: Simultaneous completion for $d$ matrices over $\mathbb{F}_q$ is NP-hard if $q \le d$.
Cor: Completion of a single matrix, with variables appearing $d$ times, is NP-hard if $q \le d$.
Cor: Completion of a skew-symmetric matrix, with variables appearing $d$ times, is NP-hard if $q \le d$.
Open Questions
- Improved hardness results / algorithms for matrix completion?
- Lower bounds / hardness for field size in network coding?
- More combinatorial uses of matrix completion?