Loco: Distributing Ridge Regression with Random Projections Yang Song Department of Statistics
Introduction • In the last few years there has been great interest in solving large-scale optimization and estimation problems. • Some datasets are so large that they are impractical to store and process on a single machine, so the problem must be solved in a distributed manner on a computing cluster. • Two obvious questions: • 1. Distribution. • 2. Communication.
Ridge Regression • Linear Regression: • To minimize the error: • OLS: • Ridge Regression:
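As a minimal sketch of the two estimators named above, assume the ridge objective is written as ||y - X b||^2 + lam * ||b||^2 (the scaling convention is an assumption; some texts divide the squared error by n). The helper name `ridge_fit` and the toy data are hypothetical.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Closed-form ridge estimate: solve (X^T X + lam * I) b = X^T y."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

# Toy data (illustrative only): n = 50 observations, p = 10 features.
rng = np.random.default_rng(0)
n, p = 50, 10
X = rng.standard_normal((n, p))
beta_true = rng.standard_normal(p)
y = X @ beta_true + 0.1 * rng.standard_normal(n)

beta_ols = ridge_fit(X, y, lam=0.0)    # lam = 0 recovers OLS when X^T X is invertible
beta_ridge = ridge_fit(X, y, lam=5.0)  # lam > 0 shrinks the coefficients toward zero
```

When p > n, X^T X is singular and the lam = 0 solve fails, whereas any lam > 0 keeps the system solvable.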
Distributed ridge regression
A low-dimensional approximation
Subsampled Randomized Hadamard Transform (SRHT)
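A minimal sketch of an SRHT, assuming the common form sqrt(p/k) * D * H * S: D is a diagonal matrix of random signs, H is the orthonormal Hadamard matrix, and S keeps k columns chosen uniformly at random. The function names are hypothetical, the number of columns p is assumed to be a power of 2, and sampling without replacement is an assumption. The dense Hadamard matrix is used only for readability.

```python
import numpy as np

def hadamard(m):
    """Sylvester construction of the m x m Hadamard matrix (m must be a power of 2)."""
    if m < 1 or m & (m - 1):
        raise ValueError("m must be a power of 2")
    H = np.ones((1, 1))
    while H.shape[0] < m:
        H = np.block([[H, H], [H, -H]])
    return H

def srht(X, k, rng):
    """Project the p columns of X down to k columns: X @ (sqrt(p/k) * D @ H @ S)."""
    p = X.shape[1]
    signs = rng.choice([-1.0, 1.0], size=p)      # D: random sign flip per column
    H = hadamard(p) / np.sqrt(p)                 # H: orthonormal Hadamard matrix
    cols = rng.choice(p, size=k, replace=False)  # S: keep k columns uniformly at random
    return np.sqrt(p / k) * (X * signs) @ H[:, cols]

rng = np.random.default_rng(0)
X = rng.standard_normal((5, 64))
X_proj = srht(X, k=16, rng=rng)  # 5 x 16 compressed representation of X
```

Building H explicitly costs O(p^2) memory; a fast Walsh-Hadamard transform would apply the same map in O(p log p) per row.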
Algorithm
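A minimal single-process sketch of how a block-wise scheme of this kind could look, under stated assumptions: the p features are split into blocks, each worker compresses its block with a random projection, every worker stacks its raw features next to the other blocks' projections, solves a local ridge problem, and keeps only the coefficients of its raw features. A Gaussian projection is swapped in for the SRHT to keep the example short. The names `loco_sketch`, `n_blocks`, and `k` are hypothetical, and the loop stands in for parallel workers.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Closed-form ridge estimate: solve (X^T X + lam * I) b = X^T y."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

def loco_sketch(X, y, n_blocks, k, lam, rng):
    """Feature-wise distributed ridge sketch; returns a length-p coefficient vector."""
    n, p = X.shape
    blocks = np.array_split(np.arange(p), n_blocks)
    # Each worker compresses its own block to k random features
    # (Gaussian projection here, an assumption in place of the SRHT).
    projected = [
        X[:, idx] @ rng.standard_normal((len(idx), k)) / np.sqrt(k)
        for idx in blocks
    ]
    beta = np.zeros(p)
    for i, idx in enumerate(blocks):
        # Local design: raw features of block i plus projections of all other blocks.
        others = [projected[j] for j in range(n_blocks) if j != i]
        X_local = np.hstack([X[:, idx]] + others)
        beta_local = ridge_fit(X_local, y, lam)
        # Keep only the coefficients that belong to the raw features.
        beta[idx] = beta_local[: len(idx)]
    return beta

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 40))
y = X @ rng.standard_normal(40) + rng.standard_normal(100)
beta_hat = loco_sketch(X, y, n_blocks=4, k=5, lam=1.0, rng=rng)
```

With a single block there are no projections to append, so the sketch reduces to ordinary ridge regression on the full design matrix.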
Computational, memory and communication costs • The cost of computing the random projection in each block: • The memory cost:
Benefits of Loco • The problem each worker solves becomes easier in a computational sense. • Each local problem becomes easier in a statistical sense. • The size of the random projections to be communicated by each worker decreases.
Analysis • Are the coefficients estimated by Loco close to the full ridge regression solution? • Risk: • Natural assumption: • Most of the important signal lies in the direction of the first J principal components of X.
Assumption
Theorem 1
Experimental Results • n = 4,000 • p = 150,000 • Rank r = 150 • n_test = 1,000 • Within-block correlation: 0.7 • Signal-to-noise ratio: 1 • Loco 1, Loco 5, Loco 10
• n = 8,000 • p = 500,000 • Rank r = 500 • Loco 1, Loco 2
Climate data • The data we consider are part of the CMIP5 climate modeling ensemble; specifically, they are taken from control simulations of the GISS global circulation model. • p = 10,368 • n = 1,062 • n_test: 213, n_train: 849
Conclusion • When p >> n, ridge regression should be used rather than ordinary (unregularized) linear regression. • Loco is a distributed algorithm that substantially reduces time and memory costs while incurring only a small additional prediction error. • Loco can be generalized to a larger class of estimation problems.
Thank You!