
Improved Rectangular Matrix Multiplication using Powers of the Coppersmith-Winograd Tensor
François Le Gall (Kyoto University)
Florent Urrutia (IRIF, Université Paris Diderot)
WACT 2018

Overview of the Result
• Improvements on the complexity of square matrix multiplication have been obtained in the past few years by analyzing powers of a construction called "the Coppersmith-Winograd tensor".
• The best known algorithms for rectangular (non-square) matrix multiplication are based on the second power of the Coppersmith-Winograd tensor.
Our result: we improve the complexity of rectangular (non-square) matrix multiplication by analyzing the fourth power of the Coppersmith-Winograd tensor.

Square Matrix Multiplication
Compute the product C = A × B of two n × n matrices A and B over a field. For all 1 ≤ i ≤ n and 1 ≤ j ≤ n, computing the entry c_ij takes n multiplications and (n - 1) additions.
• One of the most fundamental problems in mathematics and computer science.
• Trivial algorithm: n²(2n - 1) = O(n³) arithmetic operations.
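The trivial algorithm mentioned above can be made concrete in a few lines (a minimal Python sketch; the function name and the explicit operation counter are my own additions for illustration):

```python
# Naive square matrix multiplication over a field (here: floats),
# counting arithmetic operations to illustrate the n^2 * (2n - 1) bound.
def naive_matmul(A, B):
    n = len(A)
    ops = 0
    C = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            # Each entry c_ij uses n multiplications and n - 1 additions.
            acc = A[i][0] * B[0][j]
            ops += 1
            for k in range(1, n):
                acc += A[i][k] * B[k][j]
                ops += 2  # one multiplication, one addition
            C[i][j] = acc
    return C, ops

n = 4
A = [[float(i + j) for j in range(n)] for i in range(n)]
B = [[float(i * j + 1) for j in range(n)] for i in range(n)]
C, ops = naive_matmul(A, B)
assert ops == n * n * (2 * n - 1)  # 112 operations for n = 4
```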

Exponent of Square Matrix Multiplication
Compute the product of two n × n matrices A and B over a field.
The exponent of square matrix multiplication, ω, is the smallest value such that "the product can be computed using O(n^(ω+ε)) arithmetic operations for any ε > 0".
Trivial algorithm: O(n³) arithmetic operations, hence ω ≤ 3.
Trivial lower bound: ω ≥ 2.

History of the main improvements on the exponent of square matrix multiplication

  Upper bound    Year        Authors
  ω ≤ 3          (trivial)
  ω < 2.81       1969        Strassen
  ω < 2.79       1979        Pan
  ω < 2.78       1979        Bini, Capovani, Romani and Lotti
  ω < 2.55       1981        Schönhage
  ω < 2.53       1981        Pan
  ω < 2.52       1982        Romani
  ω < 2.50       1982        Coppersmith and Winograd
  ω < 2.48       1986        Strassen
  ω < 2.376      1987        Coppersmith and Winograd
  ω < 2.373      2010–2014   Stothers, Vassilevska Williams, LG

Some of these bounds have been recovered using a group-theoretic approach [Cohn, Umans 2003] [Cohn, Kleinberg, Szegedy, Umans 2005] [Cohn, Umans 2013].

How are these upper bounds obtained?
1. Find a basic construction (of constant size) with small arithmetic complexity (a "tensor" of "small rank").
2. Apply some generic technique to derive an upper bound on ω.

Example 1: ω < 2.81 [Strassen 1969]
• basic construction: the product of two 2 × 2 matrices can be computed with only 7 multiplications
• generic technique: recursion (two n × n matrices can be multiplied with O(n^2.81) operations)

Example 2: ω < 2.55 [Schönhage 1981]
• basic construction: the direct sum of two small matrix products
• generic technique: the asymptotic sum inequality

Example 3: ω < 2.388 [Coppersmith and Winograd 1987]
• basic construction: the Coppersmith-Winograd tensor (CW tensor)
• generic technique: the laser method
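Strassen's basic construction and the recursive generic technique can be sketched as follows (a minimal Python sketch for matrices whose side is a power of two; helper names are mine, and no attempt is made at the optimizations a real implementation would use):

```python
# Strassen's recursion: 7 multiplications of half-size matrices instead of 8,
# giving O(n^log2(7)) = O(n^2.81) operations.
import math

def add(X, Y): return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]
def sub(X, Y): return [[x - y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def strassen(A, B):
    n = len(A)
    if n == 1:
        return [[A[0][0] * B[0][0]]]
    h = n // 2
    A11 = [row[:h] for row in A[:h]]; A12 = [row[h:] for row in A[:h]]
    A21 = [row[:h] for row in A[h:]]; A22 = [row[h:] for row in A[h:]]
    B11 = [row[:h] for row in B[:h]]; B12 = [row[h:] for row in B[:h]]
    B21 = [row[:h] for row in B[h:]]; B22 = [row[h:] for row in B[h:]]
    # The 7 products of Strassen's 1969 construction.
    M1 = strassen(add(A11, A22), add(B11, B22))
    M2 = strassen(add(A21, A22), B11)
    M3 = strassen(A11, sub(B12, B22))
    M4 = strassen(A22, sub(B21, B11))
    M5 = strassen(add(A11, A12), B22)
    M6 = strassen(sub(A21, A11), add(B11, B12))
    M7 = strassen(sub(A12, A22), add(B21, B22))
    C11 = add(sub(add(M1, M4), M5), M7)
    C12 = add(M3, M5)
    C21 = add(M2, M4)
    C22 = add(sub(add(M1, M3), M2), M6)
    top = [r1 + r2 for r1, r2 in zip(C11, C12)]
    bot = [r1 + r2 for r1, r2 in zip(C21, C22)]
    return top + bot

# Sanity check against the definition of matrix product for a 2 x 2 example.
assert strassen([[1, 2], [3, 4]], [[5, 6], [7, 8]]) == [[19, 22], [43, 50]]
# The resulting exponent: log2(7) ≈ 2.807.
assert abs(math.log2(7) - 2.807) < 0.001
```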

How are these upper bounds obtained?
1. Find a basic construction (of constant size) with small arithmetic complexity (a "tensor" of "small rank").
2. Apply some generic technique to derive an upper bound on ω.

Example 3: ω < 2.388 [Coppersmith and Winograd 1987]
• basic construction: the Coppersmith-Winograd tensor (CW tensor)
• generic technique: the laser method

Example 4: ω < 2.376 [Coppersmith and Winograd 1987]
• basic construction: the second tensor power of the CW tensor
• generic technique: the laser method

Higher powers of the CW tensor
What about higher powers? Analyzing the third power was explicitly mentioned as an open problem by Coppersmith and Winograd in 1990.
The third power does not (seem to) give any improvement, but higher powers actually do!

Higher powers of the CW tensor

  m    Upper bound      Authors
  1    ω < 2.3871900    Coppersmith and Winograd (1987)
  2    ω < 2.3754770    Coppersmith and Winograd (1987)
  4    ω < 2.3729269    Stothers (2010), Vassilevska Williams (2012)
  8    ω < 2.3729       Vassilevska Williams (2012)
  16   ω < 2.3728640    LG (2014)
  32   ω < 2.3728639    LG (2014)

Can this analysis (for powers 64, 128, ...) converge to 2? No: the same analysis for these powers cannot show ω < 2.372 [Ambainis, Filmus and LG, 2015].
Similar limitations are also known for the analysis of any power of the CW tensor [Ambainis, Filmus and LG, 2015] [Alman and Vassilevska Williams, 2018].

Overview of the Result
• Improvements on the complexity of square matrix multiplication have been obtained in the past few years by analyzing powers of a construction called "the Coppersmith-Winograd tensor".
• The best known algorithms for rectangular (non-square) matrix multiplication are based on the second power of the Coppersmith-Winograd tensor.
Our result: we improve the complexity of rectangular (non-square) matrix multiplication by analyzing the fourth power of the Coppersmith-Winograd tensor.

Rectangular Matrix Multiplication
Compute the product C = A × B of an n × m matrix A by an m × n matrix B.
Applications:
• linear algebra problems
• all-pairs shortest paths problems
• dynamic computation of the transitive closure of a graph
• detecting directed cycles in a graph
• computational geometry (colored intersection searching)

Exponent of Rectangular Matrix Multiplication
Compute the product of an n × n^k matrix A and an n^k × n matrix B, for any fixed k ≥ 0.
The exponent of rectangular matrix multiplication, ω(k), is the smallest value such that "the product can be computed using O(n^(ω(k)+ε)) arithmetic operations for any ε > 0".
Trivial algorithm: O(n^(2+k)) arithmetic operations, hence ω(k) ≤ 2 + k.
Square matrices: ω(1) = ω ≤ 2.38.
Trivial lower bounds: ω(k) ≥ 2 and ω(k) ≥ 1 + k.
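The trivial O(n^(2+k)) bound comes from computing each of the n² output entries with n^k multiplications; this can be checked directly (a small Python sketch; the function name and the dimensions are illustrative):

```python
# Naive n x n^k by n^k x n multiplication: each of the n^2 output entries
# needs n^k multiplications, so roughly n^(2+k) operations in total.
def naive_rect_matmul(A, B):
    n, m = len(A), len(A[0])            # A is n x m, with m = n^k
    assert len(B) == m and all(len(row) == n for row in B)
    mults = 0
    C = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for t in range(m):
                C[i][j] += A[i][t] * B[t][j]
                mults += 1
    return C, mults

n, k = 8, 2
m = n ** k                              # m = n^k = 64
A = [[1.0] * m for _ in range(n)]
B = [[1.0] * n for _ in range(m)]
C, mults = naive_rect_matmul(A, B)
assert mults == n ** (2 + k)            # n^(2+k) multiplications
assert C[0][0] == float(m)
```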

Exponent of Rectangular Matrix Multiplication
Property [Lotti 83]: ω(k) is a convex function.
[Figure: upper bounds on ω(k) as a function of k, showing the trivial line ω(k) ≤ 2 + k and the point ω(1) ≤ 2.38.]
Trivial algorithm: O(n^(2+k)) arithmetic operations, hence ω(k) ≤ 2 + k.
Square matrices: ω(1) = ω ≤ 2.38.
Trivial lower bounds: ω(k) ≥ 2 and ω(k) ≥ 1 + k.

Exponent of Rectangular Matrix Multiplication
Property [Lotti 83]: ω(k) is a convex function.
[Figure: upper bounds on ω(k) as a function of k, with the points k = 0.172 and k = 0.294 where ω(k) = 2 marked.]
[Coppersmith 1982]: ω(0.172) = 2, i.e., the product of an n × n^0.172 matrix by an n^0.172 × n matrix can be computed using O(n^(2+ε)) arithmetic operations for any ε > 0.
[Coppersmith 1997]: ω(0.294) = 2.

Exponent of Rectangular Matrix Multiplication
[Coppersmith 1997]: ω(0.294) = 2
• basic construction: the Coppersmith-Winograd tensor
• generic technique: the laser method, applied in an asymmetric way

Exponent of Rectangular Matrix Multiplication
Property [Lotti 83]: ω(k) is a convex function.
[Figure: upper bounds on ω(k); the line through the points (0.294, 2) and (1, 2.38) has been used in most applications of rectangular matrix multiplication; the curve below it is obtained by doing the same analysis for any value of k.]
[Coppersmith 1997]: ω(0.294) = 2 (obtained from the analysis of the first power of the CW tensor)
[Ke, Zeng, Han, Pan 2008]: ω(0.5356) < 2.0712, ω(0.8) < 2.2356, ω(2) < 3.2699 (slightly improving a bound from [Huang, Pan 1998]), obtained by a similar asymmetric analysis of the first power of the CW tensor
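The reason a line through two known points is useful is the convexity property above: for any k between two points with known upper bounds, ω(k) is at most the value of the chord. A small Python sketch of this interpolation, using the endpoints from the slide (the helper name is mine):

```python
# Convexity of ω(k) [Lotti 83]: for k0 <= k <= k1,
#   ω(k) <= (1 - t) * w0 + t * w1,  where t = (k - k0) / (k1 - k0)
# and w0, w1 are known upper bounds on ω(k0), ω(k1).
def chord_bound(k, k0, w0, k1, w1):
    t = (k - k0) / (k1 - k0)
    return (1 - t) * w0 + t * w1

# The line used in most applications: through ω(0.294) = 2 and ω(1) <= 2.38.
bound = chord_bound(0.5, 0.294, 2.0, 1.0, 2.38)
assert 2.0 < bound < 2.38   # roughly 2.111 for k = 0.5
```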

Exponent of Rectangular Matrix Multiplication
[Figure: upper bounds on ω(k), with the region k ≤ α where ω(k) = 2 highlighted.]
Dual exponent of matrix multiplication: α, the supremum of all k such that ω(k) = 2. Proving that α = 1 is equivalent to proving that ω = 2.
[Coppersmith 1997]: ω(0.294) = 2, hence α > 0.294 (obtained from the analysis of the first power of the CW tensor).
[LG 2012]: ω(0.302) = 2, hence α > 0.302, from the analysis of the second power of the CW tensor.

Exponent of Rectangular Matrix Multiplication
Results from [LG 2012]:
• exactly the same bound ω < 2.376 as the one obtained by Coppersmith and Winograd for square matrix multiplication
• a curve of the same shape, but slightly below the previous curve
• ω(0.302) = 2, hence α > 0.302, from the analysis of the second power of the CW tensor

Overview of the Result
• Improvements on the complexity of square matrix multiplication have been obtained in the past few years by analyzing powers of a construction called "the Coppersmith-Winograd tensor".
• The best known algorithms for rectangular (non-square) matrix multiplication are based on the second power of the Coppersmith-Winograd tensor.
Our result: we improve the complexity of rectangular (non-square) matrix multiplication by analyzing the fourth power of the Coppersmith-Winograd tensor.

Higher Powers of the CW Tensor
Dual exponent of matrix multiplication:
• first power of the CW tensor: ω < 2.3872 [CW 1987], α > 0.294 [Coppersmith 1997]
• second power of the CW tensor: ω < 2.3755 [CW 1987], α > 0.302 [LG 2012]
• fourth power of the CW tensor: ω < 2.3730 [Stothers 2010] [Vassilevska Williams 2012], α > 0.313 [this work]

Higher Powers of the CW Tensor
Analysis of the second power of the CW tensor [LG 12] vs. analysis of the fourth power of the CW tensor [this work]:
• a (slight) improvement everywhere, giving α > 0.313
• for square matrix multiplication, exactly the same bound as the one obtained by Stothers and Vassilevska Williams

Higher Powers of the CW Tensor
Dual exponent of matrix multiplication:
• first power of the CW tensor: ω < 2.3872 [CW 1987], α > 0.294 [Coppersmith 1997]
• second power of the CW tensor: ω < 2.3755 [CW 1987], α > 0.302 [LG 2012] (improvement: 0.008)
• fourth power of the CW tensor: ω < 2.3730 [Stothers 2010] [Vassilevska Williams 2012], α > 0.313 [this work] (improvement: 0.011)
• eighth power of the CW tensor: ω < 2.3729 [Vassilevska Williams 2012], α > ???
• higher powers cannot show ω < 2.372 [Ambainis, Filmus and LG, 2015]
What will happen with higher powers? Can we show similar limitation results for α?

Overview of our Approach for the Fourth Power
• Each term should be analyzed before applying the laser method.
• A key observation is that the analysis can be done recursively; this was done for the square case in [Stothers 2010] and [Vassilevska Williams 2012] using the concept of "value of a term".
• For the rectangular case, an asymmetric version of this term-by-term analysis does not (seem to) give any improvement for α.
Our main technical contribution shows how to perform the recursive analysis on all the terms together. This gives an improvement, after numerically solving a non-convex optimization problem.

Conclusion
• We showed how to analyze the fourth power of the CW tensor in an asymmetric way.
• This leads to improvements on the complexity of rectangular (non-square) matrix multiplication; in particular, we obtain the new lower bound α > 0.313 on the dual exponent of matrix multiplication.
• Most pressing open questions:
  - study the eighth power (this should be doable using our technique)
  - prove limitations on the improvements achievable using powers of the CW tensor, extending what has been done for the square case [Ambainis, Filmus and LG, 2015] [Alman and Vassilevska Williams, 2018]

Higher Powers of the CW Tensor
Dual exponent of matrix multiplication:
• first power of the CW tensor: α > 0.294 [Coppersmith 1997], ω < 2.3872 [CW 1987]
• second power of the CW tensor: α > 0.302 [LG 2012], ω < 2.3755 [CW 1987] (the gap increases)
• fourth power of the CW tensor: α > 0.313 [this work], ω < 2.3730 [Stothers 2010] [Vassilevska Williams 2012]
For any fixed value of k, the improvements on ω(k) obtained by analyzing successive powers seem to decrease, similarly to the square case. But since the curve of ω(k) has a horizontal asymptote at the lower bound on α, even small improvements on ω(k) can lead to significant improvements on α.