A General Characterization of Statistical Query Complexity Vitaly Feldman


What is the computational complexity?

[Figure: word cloud of statistical problems. Legible terms include sparse PCA, clustering, parameter estimation, regression, halfspaces, learning juntas, Gaussian mixtures, and DNF. Caption: "HELP!!!"]

We are working on it! In the meantime, try a reduction …

Statistical query (SQ) oracle

[Figure: word cloud of statistical problems (sparse PCA, clustering, parameter estimation, regression, halfspaces, learning juntas, Gaussian mixtures, DNF). Caption: "HELP!!!"]

Convex opt., boosting, moment matching, decision trees, SVM

Thm: For any problem, its SQ complexity can be nearly tightly characterized by a “simple” dimension.

[Figure: word cloud of statistical problems (sparse PCA, clustering, parameter estimation, regression, halfspaces, learning juntas, Gaussian mixtures, DNF). Caption: "SQ dimension?"]

Convex opt., boosting, moment matching, decision trees, SVM

Statistical problems


Statistical query model [Kearns '93]

SQ algorithm

Known aliases:
  • Counting query
  • Linear statistical functional/estimator
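As a rough illustration of the model, the sketch below simulates the standard statistical query oracle: given a query function φ with values in [-1, 1] and a tolerance τ, the oracle may return any value within τ of the expectation of φ under the input distribution. The names `make_stat_oracle`, `samples`, and `tau` are illustrative assumptions, not taken from the slides, and the empirical distribution of a finite sample stands in for the unknown distribution.

```python
import random


def make_stat_oracle(samples, tau, rng=None):
    """Return a simulated STAT(tau) oracle backed by a finite sample.

    Assumption: the empirical distribution of `samples` plays the role
    of the unknown input distribution.
    """
    rng = rng or random.Random(0)

    def oracle(phi):
        values = [phi(x) for x in samples]
        if any(v < -1.0 or v > 1.0 for v in values):
            raise ValueError("query function must map into [-1, 1]")
        mean = sum(values) / len(values)
        # Any answer within tau of the expectation is valid, so the
        # simulation perturbs the mean by a random amount in [-tau, tau].
        return mean + rng.uniform(-tau, tau)

    return oracle


# Toy data: labeled examples (x, y) with y = 1 when x > 0.5.
data_rng = random.Random(1)
xs = [data_rng.random() for _ in range(1000)]
samples = [(x, 1 if x > 0.5 else 0) for x in xs]

oracle = make_stat_oracle(samples, tau=0.05)

# A counting query: what fraction of examples carry a positive label?
answer = oracle(lambda z: float(z[1]))
print(answer)
```

The algorithm never sees individual examples, only the tolerant answers, which is why the same query is also called a counting query or a linear statistical functional.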

SQ applications
  • Noise-tolerant learning [Kearns 93; BFKV 96; F, Balcan 13]
  • Differentially private data analysis [Dinur, Nissim 03; BDMN 05; DMNS 06]
    o Local model [KLNRS 08]
  • Distributed/low communication/streaming ML
    o Theory [Ben-David, Dichterman 98; BBFM 12; FGRVX 13; Steinhardt, G. Valiant, Wager 15]
    o Practice [CKLYBNO 06; RSKSW 10; SLB+ 11; ACDL 14 …]
  • Evolvability [F 08; F 09; Kanade, Wortman, L. Valiant 10; Kanade 11; P. Valiant 11; …]
  • Adaptive data analysis [DFHPRR 14; Hardt, Ullman 14; Steinke, Ullman 15; F, Steinke 17; …]

SQ complexity

SQ equivalents:
  • Decision trees/lists [Kearns 93]
  • Linear thresholds [BFKV 96; Dunagan, Vempala 01; F, Guzman, Vempala 15]
  • Method of moments
  • Boosting [Aslam, Decatur 93]
  • Stochastic convex optimization [F, Guzman, Vempala 15]

Possible to analyze and prove lower bounds!
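To make one of these equivalences concrete, here is a minimal sketch of how the moment-estimation step of the method of moments can be phrased purely in terms of statistical queries. It assumes a distribution supported on [-1, 1] so that every power x^j is a valid query. The helper names `stat_oracle` and `estimate_moments` and the toy distribution are hypothetical, chosen only for illustration.

```python
import random


def stat_oracle(support, probs, tau, rng):
    """Simulated STAT(tau) oracle for a known finite distribution."""

    def query(phi):
        exact = sum(p * phi(x) for x, p in zip(support, probs))
        # The oracle may answer with any value within tau of the truth.
        return exact + rng.uniform(-tau, tau)

    return query


def estimate_moments(query, k):
    """Estimate the first k raw moments using k statistical queries."""
    # x**j stays in [-1, 1] whenever x is in [-1, 1], so each is a valid query.
    return [query(lambda x, j=j: x ** j) for j in range(1, k + 1)]


# Toy distribution on [-1, 1] (an assumption for this example).
support = [-1.0, 0.0, 0.5, 1.0]
probs = [0.1, 0.4, 0.3, 0.2]
tau = 0.01

query = stat_oracle(support, probs, tau, random.Random(0))
moments = estimate_moments(query, 3)
true_moments = [
    sum(p * x ** j for x, p in zip(support, probs)) for j in (1, 2, 3)
]
print(moments, true_moments)
```

Each moment costs one query, so the query count and the tolerance together measure the cost of the procedure, which is what makes lower bounds tractable to state.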

SQ dimension
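The slide's own definition did not survive extraction. As background only, one classical notion is the SQ dimension of Blum et al. (1994) for learning over a fixed distribution: roughly, the largest d such that the concept class contains d functions whose pairwise correlations are small (at most 1/d in absolute value in a common formulation). This may differ from the definition the slide showed. The sketch below checks the textbook example, parity functions under the uniform distribution on {0,1}^n, which are pairwise uncorrelated. The helper names are illustrative.

```python
from itertools import combinations, product


def parity(subset):
    """Return the +/-1 parity function over the coordinates in `subset`."""

    def chi(x):
        return -1 if sum(x[i] for i in subset) % 2 else 1

    return chi


def correlation(f, g, n):
    """Exact E[f(x) g(x)] for x uniform over {0,1}^n."""
    points = list(product((0, 1), repeat=n))
    return sum(f(x) * g(x) for x in points) / len(points)


n = 4
# All non-empty subsets of coordinates, one parity function per subset.
subsets = [c for r in range(1, n + 1) for c in combinations(range(n), r)]
parities = [parity(s) for s in subsets]

# Largest absolute correlation between two distinct parities.
max_corr = max(
    abs(correlation(f, g, n)) for f, g in combinations(parities, 2)
)
print(len(parities), max_corr)
```

Because the 2^n - 1 non-trivial parities are pairwise uncorrelated, the class has exponentially large dimension under this notion, which is the usual route to SQ lower bounds for learning parities.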

Main result SQ lower bounds [FGRVX 13; F, Perkins, Vempala 15]


Decision problems

(Almost) General case

Applications

Conclusions
  • SQ is a restricted yet powerful model of data access
    ✓ An algebraic parameter tightly characterizes the (randomized) SQ complexity
  • Open problems:
    o Simpler versions for specific problems (e.g. PAC learning)
    o Analysis techniques
    o SQ complexity of specific problems