Sketching (1) Alex Andoni (Columbia University) MADALGO Summer School on Streaming Algorithms 2015
Challenge: log statistics of the data, using small space

Stream: 131.107.65.14, 18.0.1.12, 131.107.65.14, 80.97.56.20, 18.0.1.12, 80.97.56.20, 131.107.65.14

IP             Frequency
131.107.65.14  3
18.0.1.12      2
80.97.56.20    2
127.0.0.1      9
192.168.0.1    8
257.2.5.7      0
Streaming statistics

IP             Frequency
131.107.65.14  3
18.0.1.12      2
80.97.56.20    2
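The frequency table above can be reproduced with a plain exact counter, which is the baseline that small-space sketches try to beat. This is a minimal sketch: the stream order is taken from the extracted slide text and may not match the original figure, and the function name `exact_frequencies` is hypothetical.

```python
from collections import Counter

# Stream of IP addresses, in the order recovered from the slide text
# (the order is an assumption; only the counts matter here).
stream = [
    "131.107.65.14",
    "18.0.1.12",
    "131.107.65.14",
    "80.97.56.20",
    "18.0.1.12",
    "80.97.56.20",
    "131.107.65.14",
]


def exact_frequencies(items):
    """Count each item exactly.

    Space grows with the number of distinct items, which is what
    a streaming algorithm wants to avoid.
    """
    counts = Counter()
    for item in items:  # single pass over the stream
        counts[item] += 1
    return counts


freq = exact_frequencies(stream)
for ip, count in freq.most_common():
    print(f"{ip:15s} {count}")
```

Exact counting is trivially correct but stores one counter per distinct IP, so it does not meet the "small space" requirement on long streams.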
2nd frequency moment
Correctness
Proof [sketch]
2nd frequency moment: overall
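The slide bodies for this part did not survive extraction, so the following is only an illustration of the standard "tug-of-war" estimator of Alon, Matias and Szegedy for the second frequency moment (the sum of squared frequencies), not necessarily the exact algorithm on the slides. Each counter adds a random sign per item, and the squared counter is an unbiased estimate of the moment; averaging and then taking a median reduces the error. The function name, parameters, and the lazily stored signs are assumptions made for readability; a space-efficient version would derive the signs from a 4-wise independent hash function instead of a dictionary.

```python
import random
import statistics


def ams_second_moment(stream, counters_per_group=200, groups=5, seed=0):
    """Estimate sum_i f_i^2 with random-sign counters (median of means)."""
    rng = random.Random(seed)
    total = counters_per_group * groups
    # signs[j][item] is the +1/-1 sign of `item` for counter j.
    # Storing signs explicitly is for illustration only; it is NOT small space.
    signs = [{} for _ in range(total)]
    z = [0] * total
    for item in stream:  # one pass over the stream
        for j in range(total):
            if item not in signs[j]:
                signs[j][item] = rng.choice((-1, 1))
            z[j] += signs[j][item]
    # Mean of z^2 within each group, then median across groups.
    means = []
    for g in range(groups):
        chunk = z[g * counters_per_group:(g + 1) * counters_per_group]
        means.append(sum(v * v for v in chunk) / counters_per_group)
    return statistics.median(means)


stream = (["131.107.65.14"] * 3) + (["18.0.1.12"] * 2) + (["80.97.56.20"] * 2)
exact = 3 ** 2 + 2 ** 2 + 2 ** 2  # frequencies 3, 2, 2
estimate = ams_second_moment(stream)
print("exact:", exact, "estimate:", estimate)
```

The estimate is random and only close to the exact value with good probability; more counters per group tighten it.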
More efficient sketches?
Streaming Scenario 2

Stream: 131.107.65.14, 80.97.56.20, 18.0.1.12

IP             Frequency
131.107.65.14  1
18.0.1.12      2
80.97.56.20    1

Similar questions: average delay/variance in a network; differential statistics between logs at different servers; etc.
Definition: Sketching

IP             Frequency
131.107.65.14  1
18.0.1.12      2

IP             Frequency
131.107.65.14  1
18.0.1.12      1
80.97.56.20    1

Sketches (bit strings shown on the slide): 010110, 010101
A task: estimate sum
[Figure: values a_1, a_2, a_3, a_4]
Precision Sampling Framework
[Figure: each value a_i paired with a precision u_i: (u_1, a_1), (u_2, a_2), (u_3, a_3), (u_4, a_4)]
Formalization: Sum Estimator and Adversary
Precision Sampling Lemma [A-Krauthgamer-Onak ’11]
Precision Sampling Algorithm
Correctness (cont)
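The algorithm and proof on these slides were lost in extraction, so what follows is only one known way to instantiate precision sampling, not necessarily the version presented. It assumes the precisions u_i are drawn i.i.d. from the exponential distribution Exp(1). In that case min_i u_i / a_i is exponential with rate equal to the sum of the a_i, so max_i a_i / u_i has median (sum of a_i) / ln 2, and a median over independent repetitions recovers the sum. The function name, the `noise` parameter, and the noise model (each a_i known only up to additive error proportional to u_i) are hypothetical choices for illustration.

```python
import math
import random
import statistics


def precision_sampling_sum(a, repetitions=2001, noise=0.1, seed=0):
    """Estimate sum(a) from approximations of each a_i with precision ~u_i."""
    rng = random.Random(seed)
    maxima = []
    for _ in range(repetitions):
        best = 0.0
        for a_i in a:
            # Precision u_i ~ Exp(1); guard against an (astronomically
            # unlikely) exact zero to avoid dividing by zero.
            u_i = max(rng.expovariate(1.0), 1e-300)
            # Assumed noise model: the estimator only sees a_i up to an
            # additive error of at most noise * u_i.
            approx = a_i + noise * u_i * rng.uniform(-1.0, 1.0)
            best = max(best, approx / u_i)
        maxima.append(best)
    # The median of max_i a_i/u_i is sum(a)/ln 2, so rescale by ln 2.
    return statistics.median(maxima) * math.log(2)


a = [0.5, 1.5, 3.0, 5.0]
estimate = precision_sampling_sum(a)
print("true sum:", sum(a), "estimate:", round(estimate, 2))
```

The point of the framework is that a value with a large u_i may be known very coarsely, yet the overall sum is still recovered up to a constant-factor or better approximation.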
- Slides: 24