Stochastic Streams: Sample Complexity vs. Space Complexity
- Slides: 23
Stochastic Streams: Sample Complexity vs. Space Complexity • David Woodruff, IBM Almaden • Joint work with Michael Crouch, Andrew McGregor, and Greg Valiant
Motivation • (Well-studied) Statistics question: how many samples from a distribution are needed to estimate a property of a distribution? • (Well-studied) Streaming question: for a given fixed stream of samples, how much space is needed to estimate a property of a distribution? • Our work: understand the tradeoff between the sample and space complexity
Model • Example stream: 4, 3, 7, 3, 1, 1, 2, … • Algorithm sees a stream of i.i.d. samples from a distribution • Algorithm only has 1 pass over the samples • Goal: understand the tradeoff between the number t of samples needed to solve a problem, versus the space s of the algorithm
Problems •
Talk Outline • Sample/Space Tradeoff for Collision Probability Estimation • Sample/Space Tradeoff for Deciding Connectivity • Sample/Space Tradeoff for Determining if a Subspace is Full Rank
Collision Probability •
Collision Probability Algorithm • Break the t samples into t/w contiguous groups of w samples • [Figure: the sample stream partitioned into contiguous blocks labeled Group 1, Group 2, Group 3, …]
Collision Probability Algorithm •
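The grouping step on the Collision Probability Algorithm slide can be illustrated with a minimal sketch. The rest of the slide's algorithm was not recoverable, so everything beyond "split the stream into contiguous groups of w samples" is an assumption: here each group is scored by the fraction of its w(w-1)/2 sample pairs that collide, and the scores are averaged. The function name `collision_probability_estimate` is hypothetical, and this is one natural estimator, not necessarily the one analyzed in the talk.

```python
from collections import Counter


def collision_probability_estimate(stream, w):
    """Estimate the collision probability from a one-pass stream.

    Assumption (not from the slides): within each contiguous group of w
    samples, count the colliding pairs, divide by the number of pairs,
    and average over all complete groups. Only one group's value counts
    are held at a time, so at most w distinct values are stored.
    """
    if w < 2:
        raise ValueError("w must be at least 2 to form a pair")
    pairs_per_group = w * (w - 1) // 2
    counts = Counter()   # value -> occurrences within the current group
    in_group = 0         # samples read so far in the current group
    colliding = 0        # colliding pairs seen in the current group
    groups = 0           # number of complete groups processed
    total = 0.0          # sum of per-group collision fractions
    for x in stream:
        # x collides with every earlier equal sample in the same group.
        colliding += counts[x]
        counts[x] += 1
        in_group += 1
        if in_group == w:
            total += colliding / pairs_per_group
            groups += 1
            counts.clear()
            in_group = 0
            colliding = 0
    if groups == 0:
        raise ValueError("stream is shorter than one group of w samples")
    # A trailing incomplete group, if any, is ignored.
    return total / groups
```

Under this assumed estimator, a larger w uses more memory per group but compares more pairs per sample read, which is the kind of sample-versus-space tradeoff the talk studies.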
Collision Probability Lower Bound •
Collision Probability Lower Bound • … …
Collision Probability Lower Bound •
Talk Outline • Sample/Space Tradeoff for Collision Probability Estimation • Sample/Space Tradeoff for Deciding Connectivity • Sample/Space Tradeoff for Determining if a Subspace is Full Rank
Graph Connectivity • Given t independent edges chosen with replacement from graph G, decide if G is connected • Simulate a random walk starting at node 1 • Store the current vertex • If an edge is not incident to the current vertex, discard it • Remember the first node i that has not yet been visited. Finish when i > n
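Graph Connectivity • Example on vertices 1 to 4, starting at vertex 1 • IID stream: {1, 4}, {2, 3}, {1, 4}, {3, 4}, {1, 2}, {2, 3}, {1, 2}, {3, 4} • [Animation: after each edge the slide updates "Current Vertex" and "First Untouched Vertex"; edges not incident to the current vertex are marked "do nothing", and the run ends with "done"]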
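The random-walk procedure from the Graph Connectivity slides can be sketched as follows. This is a minimal illustration, not the authors' implementation. It assumes vertices are labeled 1 to n, that the walk moves along the first streamed edge incident to the current vertex, and that the "first untouched vertex" counter advances only when the walk stands on that vertex. The function name `walk_confirms_connected` is hypothetical.

```python
def walk_confirms_connected(edges, n):
    """Simulate the one-pass random walk on a stream of sampled edges.

    Stores only the current vertex and the first untouched vertex.
    Returns True once the walk has touched vertices 1..n in increasing
    order of first-untouched index. Returns False if the stream ends
    first, which does not prove that the graph is disconnected.
    """
    current = 1           # the walk starts at vertex 1
    first_untouched = 2   # vertex 1 is touched at the start
    for u, v in edges:
        if first_untouched > n:
            return True
        if u == current:
            current = v   # step along the incident edge
        elif v == current:
            current = u
        else:
            continue      # edge not incident to the current vertex: discard
        if current == first_untouched:
            first_untouched += 1
    return first_untouched > n


# The stream shown on the example slide, on 4 vertices.
slide_stream = [(1, 4), (2, 3), (1, 4), (3, 4), (1, 2), (2, 3), (1, 2), (3, 4)]
slide_result = walk_confirms_connected(slide_stream, 4)  # True
```

On the slide's stream the walk visits 4, 1, 2, 3, 4 and the first untouched vertex advances 2, 3, 4, done, so the call returns True after the last edge. The state is two vertex labels, which is why this version is cheap in space but can need many samples.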
The Loopy Graph •
Use More Space and Fewer Samples •
Space/Time Tradeoff for Connectivity [Feige] • Otherwise, suppose we are in phase 2 • Will sample a vertex from each group of k vertices
Implementation in the IID Model •
Talk Outline • Sample/Space Tradeoff for Collision Probability Estimation • Sample/Space Tradeoff for Deciding Connectivity • Sample/Space Tradeoff for Determining if a Subspace is Full Rank
Determining if a Subspace Has Full Rank •
Statistical Query Framework •
Statistical Query Framework •
Conclusions • Studied space versus sample tradeoffs in the data stream model • Obtained tradeoffs for statistical, graph, and linear algebra problems • Open questions: tighten our bounds • General question: unify the techniques for the different problems