Robustness in Unsupervised and Supervised Machine Learning. Gautam Kamath, University of Waterloo. May 12, 2020. University of Waterloo Algorithms and Complexity Seminar.

Modern Challenge: Robustness How can we ensure machine learning systems are prepared for when the real world doesn’t match our models?

Classic Motivations • Model misspecification • Nature doesn’t sample from Gaussians • Truly i.i.d. samples are rare • Dirty datasets

Modern Motivations • Data poisoning and adversarial attacks • Current machine learning systems are surprisingly brittle (Figure from [Goodfellow, Shlens, Szegedy ’14])

Modern Motivations From [Gu, Dolan-Gavitt, Garg ’17]

Modern Motivations From [Athalye, Engstrom, Ilyas, Kwok ’17]

Robustness • Can we develop algorithms which are provably robust to worst-case noise? Today: Provably robust and effective algorithms for parameter estimation and fundamental supervised learning tasks

Robust High-Dimensional Parameter Estimation: Robust Estimators in High Dimensions without the Computational Intractability [Diakonikolas, K, Kane, Li, Moitra, Stewart, FOCS ’16]

Classic Parameter Estimation •

Robust Parameter Estimation Given corrupted samples from a 1D Gaussian, can we accurately estimate its parameters? (Figure: ideal model + corruptions = observed model)

Corrupted Samples? •

Formal Problem Statement •

Robust Parameter Estimation Do the empirical mean and variance work? No! A single corrupted sample can arbitrarily corrupt estimates

Robust Parameter Estimation But the median and interquartile range (IQR) work! • Median as a proxy for the mean • IQR as a proxy for the standard deviation (Figure: density with the median and IQR marked)
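
The contrast between the empirical estimates and the median/IQR can be checked numerically. The following is a minimal sketch, not code from the talk: the function name `robust_gaussian_1d`, the 10% corruption rate, and the corruption value are illustrative assumptions. It uses the standard fact that for a Gaussian the IQR is about 1.349 standard deviations.

```python
import numpy as np

def robust_gaussian_1d(samples):
    """Estimate (mean, std) of a 1D Gaussian via the median and the IQR."""
    samples = np.asarray(samples, dtype=float)
    q25, q50, q75 = np.percentile(samples, [25, 50, 75])
    # For N(mu, sigma^2), IQR = 2 * 0.6745 * sigma, i.e. about 1.349 * sigma.
    return q50, (q75 - q25) / 1.349

rng = np.random.default_rng(0)
clean = rng.normal(loc=0.0, scale=1.0, size=9000)   # samples from N(0, 1)
corrupt = np.full(1000, 1e6)                         # assumed adversary: 10% of points at a huge value
data = np.concatenate([clean, corrupt])

naive_mean, naive_std = data.mean(), data.std()      # dragged arbitrarily far by the corruptions
robust_mean, robust_std = robust_gaussian_1d(data)   # stays within a small distance of (0, 1)
print(naive_mean, naive_std)
print(robust_mean, robust_std)
```

The robust estimates are still biased by an amount that grows with the corrupted fraction, but they cannot be moved arbitrarily far, whereas the empirical mean and standard deviation scale with the corruption value.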

Robust Parameter Estimation •

What about in 100 dimensions? •

Compendium of approaches: Tukey Median [Tukey ’75] • Tukey depth of a point: the minimum number of data points on one side of a hyperplane through the point • Tukey median of a dataset: a point with maximum Tukey depth (not necessarily in the dataset!) • Running time: NP-hard
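
To make the definition of Tukey depth concrete, here is a small sketch that only tries random hyperplanes through the point, so it returns an upper bound on the depth rather than the exact value (computing the exact depth is the expensive part). The function name, the number of directions, and the seed are assumptions made for illustration.

```python
import numpy as np

def tukey_depth_upper_bound(point, data, num_directions=2000, seed=0):
    """Upper bound on the Tukey depth of `point` within `data`.

    The exact depth minimizes over all hyperplanes through the point; this
    sketch only tries `num_directions` random ones, so it can overestimate.
    """
    data = np.asarray(data, dtype=float)
    point = np.asarray(point, dtype=float)
    rng = np.random.default_rng(seed)
    directions = rng.normal(size=(num_directions, data.shape[1]))
    # Signed position of every data point relative to each hyperplane.
    proj = (data - point) @ directions.T        # shape (n, num_directions)
    # Count the points in each closed halfspace and keep the smaller side.
    counts_pos = (proj >= 0).sum(axis=0)
    counts_neg = (proj <= 0).sum(axis=0)
    return int(np.minimum(counts_pos, counts_neg).min())

rng = np.random.default_rng(1)
cloud = rng.normal(size=(500, 2))
center_depth = tukey_depth_upper_bound([0.0, 0.0], cloud)    # deep point: close to half the data
far_depth = tukey_depth_upper_bound([10.0, 10.0], cloud)     # outlying point: depth 0
print(center_depth, far_depth)
```

A Tukey median would be a point maximizing this quantity over all candidate locations, which is where the intractability in high dimensions comes from.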

Compendium of approaches
Approach | Error guarantee | Running time
Tukey Median [Tukey ’75] | | NP-hard
Geometric Median | | Near-linear! [CLMPS ’16]
Naïve Pruning | | Linear
“Tournament” | |
… | … | …

Price of robustness? • All known approaches for high-dimensional mean estimation either (1) are computationally intractable in high dimensions, or (2) lose accuracy factors which depend polynomially on the dimension • Equivalently: computationally efficient estimators in high dimensions can only handle vanishing amounts of noise • Is robust estimation possible in high dimensions?

Main Result •

Why do naïve methods get stuck? Consider the following simple algorithm: Naïve pruning: Remove all the points which are obviously too far to be from the Gaussian, then take the empirical mean of the remaining points.
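
A minimal sketch of naïve pruning, under assumptions that are not from the talk: the inliers are taken to be N(mu, I), the coordinate-wise median serves as a rough center, and `radius_factor` is a hypothetical threshold. It relies on the fact that samples from N(mu, I) in d dimensions lie at distance about sqrt(d) from mu.

```python
import numpy as np

def naive_prune_mean(data, radius_factor=2.0):
    """Drop points that are obviously too far away, then average the rest."""
    data = np.asarray(data, dtype=float)
    n, d = data.shape
    center = np.median(data, axis=0)              # coordinate-wise median as a rough center
    dist = np.linalg.norm(data - center, axis=1)
    # Inliers from N(mu, I) sit at distance about sqrt(d) from mu, so keep
    # everything within a constant factor of that radius (assumed threshold).
    keep = dist <= radius_factor * np.sqrt(d)
    return data[keep].mean(axis=0), keep

rng = np.random.default_rng(1)
d = 100
clean = rng.normal(size=(9000, d))                # true mean is the zero vector

# Obvious corruptions: far from everything, so pruning removes them.
far = np.full((1000, d), 1e3)
est_far, keep_far = naive_prune_mean(np.vstack([clean, far]))
err_far = np.linalg.norm(est_far)

# Subtle corruptions: at distance sqrt(d), just like a typical inlier.
subtle = np.ones((1000, d))
est_subtle, keep_subtle = naive_prune_mean(np.vstack([clean, subtle]))
err_subtle = np.linalg.norm(est_subtle)
print(err_far, err_subtle)
```

In this sketch the far corruptions are removed and the error stays small, while the subtle corruptions all survive the pruning step and shift the estimate by roughly the corrupted fraction times sqrt(d), which is the dimension-dependent loss the slide refers to.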

How far is too far? •

Corruptions in 2 dimensions

Corruptions in high dimensions •

Global corruptions? Idea: If the corruptions move the mean… They also shift the covariance matrix!
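
The idea can be turned into a simple iterative procedure: look for a direction of unusually large variance in the empirical covariance, and remove the points that are most extreme along it. The sketch below is a simplified illustration under stated assumptions, not the algorithm from the paper: it assumes inliers close to N(mu, I), and `eig_threshold`, `remove_frac`, and `max_iters` are hypothetical parameters chosen for this example.

```python
import numpy as np

def filter_mean(data, eig_threshold=1.5, remove_frac=0.02, max_iters=100):
    """Simplified spectral filter; assumes inliers are roughly N(mu, I).

    eig_threshold, remove_frac, and max_iters are illustrative choices.
    """
    points = np.asarray(data, dtype=float)
    for _ in range(max_iters):
        mean = points.mean(axis=0)
        cov = np.cov(points, rowvar=False)
        eigvals, eigvecs = np.linalg.eigh(cov)    # eigenvalues in ascending order
        if eigvals[-1] <= eig_threshold:
            break                                  # no direction has suspiciously large variance
        top = eigvecs[:, -1]
        # Score each point by how far it sits along the suspicious direction.
        scores = np.abs((points - mean) @ top)
        cutoff = np.quantile(scores, 1.0 - remove_frac)
        points = points[scores < cutoff]           # drop the most extreme points
    return points.mean(axis=0)

rng = np.random.default_rng(0)
d = 100
clean = rng.normal(size=(9000, d))                 # inliers from N(0, I)
bad = 1.0 + 0.1 * rng.normal(size=(1000, d))       # corruptions near the all-ones vector
data = np.vstack([clean, bad])

naive_error = np.linalg.norm(data.mean(axis=0))    # true mean is the zero vector
filter_error = np.linalg.norm(filter_mean(data))
print(naive_error, filter_error)
```

The corruptions here have about the same norm as a typical inlier, so no single point looks suspicious on its own. Together, however, they inflate the variance along the all-ones direction well above 1, and that is the signal this sketch uses to find and remove them.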

Filtering •

Unsupervised Learning: Being Robust (in High Dimensions) Can Be Practical [Diakonikolas, K, Kane, Li, Moitra, Stewart, ICML ’17]

Does the filter “work”? • 90% Gaussian data, 10% adversarial noise • Isotropic Gaussian: estimate mean, estimate covariance • Skewed Gaussian: estimate covariance

Synthetic Experiments, Unknown Mean Code: https://github.com/hoonose/robust-filter

Synthetic Experiments, Unknown Covariance Code: https://github.com/hoonose/robust-filter

Exploratory Data Analysis: Being Robust (in High Dimensions) Can Be Practical [Diakonikolas, K, Kane, Li, Moitra, Stewart, ICML ’17]

Robust PCA • Our setting: incomparable with the Robust PCA setting of Candès et al.

Gene Expression PCA Contains Europe • Genes Mirror Geography in Europe. [Novembre et al.], Nature ’08 Code: https://github.com/hoonose/robust-filter

Naively, Corruptions Destroy Europe • Genes Mirror Geography in Europe. [Novembre et al.], Nature ’08 Code: https://github.com/hoonose/robust-filter

Europe is RANSACked • Genes Mirror Geography in Europe. [Novembre et al.], Nature ’08 Code: https://github.com/hoonose/robust-filter

Robust PCA SDPs couldn’t save them… • Genes Mirror Geography in Europe. [Novembre et al.], Nature ’08 Code: https://github.com/hoonose/robust-filter

Our Algorithms Fix Europe! • Genes Mirror Geography in Europe. [Novembre et al.], Nature ’08 Code: https://github.com/hoonose/robust-filter

Supervised Learning: Sever: A Robust Meta-Algorithm for Stochastic Optimization [Diakonikolas, K, Kane, Li, Steinhardt, Stewart, ICML ’19]

Beyond Robust Statistics •

Stochastic Optimization •

Sever: Robust Stochastic Optimization •

Sever •

Making it practical •

Experiments •

Ridge Regression Code: https://github.com/hoonose/sever

SVMs, synthetic data Code: https://github.com/hoonose/sever

SVMs, Enron dataset Code: https://github.com/hoonose/sever

Conclusions • Methods for robust estimation • Applicable in many settings • Computationally efficient • Sample efficient • Accurate in high dimensions • Realizable! • Still a lot to explore…