Provable Non-convex Optimization for ML. Prateek Jain, Microsoft Research India
Overview
Our Results
Foreground/Background Separation • [figure: video frame = background + foreground]
Non-convexity of Low-rank manifold • [matrix example: 0.5 · A + 0.5 · B = C]
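A quick numerical check of the non-convexity claim. This is a minimal sketch with two hypothetical rank-1 matrices `A` and `B` chosen for illustration (they are not taken from the slide): each has rank 1, yet their midpoint has rank 2, so the set of rank-1 matrices is not convex.

```python
import numpy as np

# Hypothetical rank-1 matrices, chosen only for illustration.
A = np.diag([1.0, 0.0, 0.0])
B = np.diag([0.0, 1.0, 0.0])

# Convex combination with weights 0.5 and 0.5.
midpoint = 0.5 * A + 0.5 * B

rank_A = np.linalg.matrix_rank(A)
rank_B = np.linalg.matrix_rank(B)
rank_mid = np.linalg.matrix_rank(midpoint)
print(rank_A, rank_B, rank_mid)
```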
Projection onto set of Low-rank Matrices • [figure: SVD, M = U × Σ × Vᵀ; rank-k projection = U_k × Σ_k × V_kᵀ]
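The projection onto rank-k matrices can be sketched with a truncated SVD: keeping the k largest singular values gives the closest rank-k matrix in Frobenius norm. The function name `project_rank_k` and the random test matrix are illustrative assumptions, not part of the slides.

```python
import numpy as np

def project_rank_k(M, k):
    """Closest rank-k matrix to M in Frobenius norm, via truncated SVD."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    # Keep only the k leading singular triplets.
    return (U[:, :k] * s[:k]) @ Vt[:k, :]

rng = np.random.default_rng(0)
M = rng.standard_normal((6, 5))   # arbitrary test matrix
P = project_rank_k(M, 2)

# The projection error is the energy in the discarded singular values.
err = np.linalg.norm(M - P)
print(np.linalg.matrix_rank(P), err)
```

Although the set of rank-k matrices is non-convex, this projection costs only one partial SVD.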
Convex-projections vs Non-convex Projections
Low-rank Matrix Regression
Matrix Linear Regression
Low-rank Matrix Estimation
Restricted Isometry Property
Approach 1: Trace-norm minimization
Approach 2: Alternating Minimization
Approach 3: Projected Gradient based Methods
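A projected gradient method for low-rank matrix regression can be sketched as follows: take a gradient step on the least-squares loss, then project back onto rank-k matrices with a truncated SVD. This is a minimal sketch under stated assumptions, not the exact algorithm analysed in the talk. The Gaussian measurement matrices, the step size, the problem sizes and all function names are hypothetical choices made here for illustration.

```python
import numpy as np

def project_rank_k(M, k):
    """Closest rank-k matrix to M in Frobenius norm (truncated SVD)."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return (U[:, :k] * s[:k]) @ Vt[:k, :]

def projected_gradient(A, y, k, step=0.9, iters=200):
    """Minimise (1/2m) * sum_i (<A_i, X> - y_i)^2 over rank-k matrices X."""
    m, d1, d2 = A.shape
    X = np.zeros((d1, d2))
    for _ in range(iters):
        residual = np.einsum("ijk,jk->i", A, X) - y      # <A_i, X> - y_i
        grad = np.einsum("i,ijk->jk", residual, A) / m   # gradient of the loss
        X = project_rank_k(X - step * grad, k)           # non-convex projection
    return X

# Synthetic instance (assumed): rank-1 ground truth, Gaussian measurements.
rng = np.random.default_rng(0)
d, k, m = 8, 1, 600
X_true = np.outer(rng.standard_normal(d), rng.standard_normal(d))
A = rng.standard_normal((m, d, d))
y = np.einsum("ijk,jk->i", A, X_true)

X_hat = projected_gradient(A, y, k)
rel_err = np.linalg.norm(X_hat - X_true) / np.linalg.norm(X_true)
print(rel_err)
```

The number of measurements here is deliberately generous so that a fixed step size behaves well; tighter sampling regimes would need more care.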
Guarantees [J., Meka, Dhillon 2009]
Extensions
Summary
Low-rank Matrix Completion
Low-rank Matrix Completion • Task: Complete ratings matrix • Applications: recommendation systems, PCA with missing entries
Low-rank
Low-rank Matrix Completion • [matrix example]
Approach 1
Incoherence?
Alternating Minimization • If X has rank k: X (m × n) = U (m × k) × Vᵀ (k × n)
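With the factorization X = U Vᵀ, alternating minimization fixes one factor and solves a least-squares problem for the other, using only the observed entries. The sketch below assumes an SVD of the zero-filled matrix as the starting point and uniformly random observations. The names `altmin_complete`, `mask` and the problem sizes are hypothetical, and this is not claimed to be the exact procedure of [JNS’13].

```python
import numpy as np

def altmin_complete(M_obs, mask, k, iters=50):
    """Alternating least squares on the observed entries (mask == True)."""
    m, n = M_obs.shape
    # Assumed initialization: top-k left singular vectors of the zero-filled matrix.
    U0, _, _ = np.linalg.svd(np.where(mask, M_obs, 0.0), full_matrices=False)
    U = U0[:, :k].copy()
    V = np.zeros((n, k))
    for _ in range(iters):
        # Fix U, solve for each row of V from the observed entries of column j.
        for j in range(n):
            rows = mask[:, j]
            V[j] = np.linalg.lstsq(U[rows], M_obs[rows, j], rcond=None)[0]
        # Fix V, solve for each row of U from the observed entries of row i.
        for i in range(m):
            cols = mask[i, :]
            U[i] = np.linalg.lstsq(V[cols], M_obs[i, cols], rcond=None)[0]
    return U @ V.T

# Synthetic instance (assumed): rank-2 matrix, 60% of entries observed.
rng = np.random.default_rng(0)
m, n, k = 30, 30, 2
M_true = rng.standard_normal((m, k)) @ rng.standard_normal((k, n))
mask = rng.random((m, n)) < 0.6
M_obs = np.where(mask, M_true, 0.0)

M_hat = altmin_complete(M_obs, mask, k)
rel_err = np.linalg.norm(M_hat - M_true) / np.linalg.norm(M_true)
print(rel_err)
```

Each inner problem is convex even though the joint problem in (U, V) is not, which is what makes the alternating scheme cheap per iteration.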
Initialization [JNS’13] • [matrix example: [0 3 0; 2 5 0; 0 0 2]]
Results [JNS’13]
Proof Sketch
Proof Sketch • Power Method Term + Error Term • Tools: 1. Spectral gap of random graphs 2. Bernstein-type concentration bounds
Bernstein?
Power Method?
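For reference, the power method repeatedly multiplies a vector by the matrix and renormalizes, which converges to the top eigenvector when there is a gap between the two largest eigenvalues. The sketch below is generic and assumes a symmetric positive semidefinite input; the 2 × 2 test matrix and the function name are hypothetical.

```python
import numpy as np

def power_method(S, iters=500, seed=0):
    """Top eigenvalue and eigenvector of a symmetric PSD matrix S."""
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(S.shape[0])
    v /= np.linalg.norm(v)
    for _ in range(iters):
        w = S @ v                      # multiply
        v = w / np.linalg.norm(w)      # renormalize
    return v @ S @ v, v                # Rayleigh quotient, unit eigenvector

# Test matrix with eigenvalues 3 and 1; the top eigenvector is (1, 1) / sqrt(2).
S = np.array([[2.0, 1.0], [1.0, 2.0]])
lam, v = power_method(S)
print(lam, v)
```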
Approach 3: Singular Value Projection • [matrix example]
Guarantees
Setting up the proof (Rank-one Case)
Key Step 1
Key Step 2
Key Step 3
Guarantee for SVP [Netrapalli, J. ’14]
Stagewise-SVP
Guarantees [Netrapalli, J. ’14]
Simulations
Summary
Robust Principal Component Analysis
Principal Component Analysis
PCA with Corruption?
Sparse Corruptions?
Robust PCA
Foreground + Background Separation • Original Video = Background + Foreground
Foreground + Background Separation
Robust PCA
Identifiability? • [matrix example: one matrix split as a sum of two matrices]
Existing Method
Our Approach: Alternating Projections • [figure: alternating between the set of Low-rank Matrices and the set of Sparse Matrices]
Projection onto Low-rank Matrices • [figure: SVD, M = U × Σ × Vᵀ; rank-k projection = U_k × Σ_k × V_kᵀ]
Projection onto Sparse Matrices • [matrix example]
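One common way to realize a projection onto sparse matrices is entrywise hard thresholding: entries whose magnitude exceeds a threshold are kept and the rest are set to zero. The sketch below assumes that form. The threshold name `zeta` and the example matrix are hypothetical and are not the values from the slide.

```python
import numpy as np

def hard_threshold(M, zeta):
    """Keep entries with |M_ij| > zeta, set all other entries to zero."""
    return np.where(np.abs(M) > zeta, M, 0.0)

# Hypothetical example matrix: two large entries, the rest small.
M = np.array([[1.0, 0.2, 0.1],
              [0.05, 0.9, 0.02]])
S = hard_threshold(M, 0.5)
print(S)
```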
Non-convex RPCA
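Alternating the two projections gives a simple non-convex scheme for splitting M into low-rank plus sparse parts. The sketch below is a simplified illustration with a fixed threshold `zeta` and a fixed rank, both assumptions made here for brevity. It does not reproduce the threshold schedule or the stagewise variant of [NUSAJ’14], and the synthetic data (a ±1 rank-1 matrix with a few large corruptions) is deliberately easy.

```python
import numpy as np

def project_rank_k(M, k):
    """Closest rank-k matrix to M in Frobenius norm (truncated SVD)."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return (U[:, :k] * s[:k]) @ Vt[:k, :]

def hard_threshold(M, zeta):
    """Keep entries with |M_ij| > zeta, zero out the rest."""
    return np.where(np.abs(M) > zeta, M, 0.0)

def alt_proj_rpca(M, k, zeta, iters=50):
    """Alternate projections onto sparse and rank-k matrices."""
    S = hard_threshold(M, zeta)            # initial sparse estimate
    for _ in range(iters):
        L = project_rank_k(M - S, k)       # low-rank step
        S = hard_threshold(M - L, zeta)    # sparse step
    return L, S

# Synthetic instance (assumed): entries of L_true are +/-1, corruptions are +/-10.
rng = np.random.default_rng(0)
n, k = 60, 1
L_true = np.outer(rng.choice([-1.0, 1.0], size=n), rng.choice([-1.0, 1.0], size=n))
support = rng.random((n, n)) < 0.03
S_true = np.where(support, 10.0 * rng.choice([-1.0, 1.0], size=(n, n)), 0.0)
M = L_true + S_true

L_hat, S_hat = alt_proj_rpca(M, k, zeta=5.0)
rel_err = np.linalg.norm(L_hat - L_true) / np.linalg.norm(L_true)
print(rel_err)
```

Each iteration costs one rank-k SVD plus an entrywise pass over the matrix, so the per-iteration cost is comparable to that of PCA.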
Computation Time
Results [NUSAJ’14]
Remove Condition No. Dependence? • [figure labels: 1st Stage, 2nd Stage, Rank-2 Matrices, Rank-1 Matrices, Sparse Matrices]
Result [NUSAJ’14]
Missing Entries?
Proof Technique
A Novel Perturbation Lemma
Proof Sketch (Rank-1 case)
Proof Sketch
Empirical Results (Synthetic Datasets)
Empirical Results • Convex Method. Runtime: 1700 sec • Non-Convex Method. Runtime: 70 sec • [figures: original = background + foreground for each method]
Summary • Robust PCA • Low-rank + Sparse Decomposition • Alternating Projection Method • Under standard assumptions • Linear rate of convergence • Computation time: Recovery in O(PCA), for constant rank matrices • Key analysis tool: a strong perturbation bound for SVD
Future Work
Thanks!