SPACE EFFICIENCY OF SYNOPSIS CONSTRUCTION ALGORITHMS
Sudipto Guha, UPENN
Synopses
- Given n input numbers, summarize the input using B numbers, minimizing some error.
- Examples
  - Histograms: piecewise constant representation.
  - Wavelets: uses the wavelet basis.
  - Fourier, Bessel, SVD, what have you…
Space efficiency in synopsis construction algorithms, VLDB 2005
Why space efficiency
- "Interestingly, according to modern astronomers, space is finite. This is a very comforting thought – particularly for people who can never remember where they left things." Woody Allen.
- From a computational viewpoint, however…
Space is the cruelest resource
- Resources
  - Time: twiddle your thumbs.
  - Access (stream): make more passes.
- Without enough space, the program simply will not run, or, if data is shifted to disk, it will run considerably slower.
- Further, with more space we might be able to compute a better (more accurate) synopsis.
Examples - I
- Histograms
  - Many error measures.
  - V-OPT, Jagadish et al., 1998: $O(n^2 B)$ time, $O(nB)$ space.
  - Only $O(n)$ space is needed at a time (working space).
    - $O(n^2 B^2)$ time and $O(n)$ space.
  - Is that the best?
  - Here: $O(n^2 B)$ time, $O(n)$ space.
Example - II
- (Haar) Wavelets
  - Orthonormal systems.
  - For $\ell_2$ error, store the largest B coefficients of the input.
  - This does not work for non-$\ell_2$ errors.
  - Find the best B coefficients to retain (note: restricted). Garofalakis & Kumar, 04: $O(n^2 B \log B)$ time and $O(n^2 B)$ space, but only $O(nB)$ needed at a time (for $\ell_1$).
  - Here: $O(n)$ space and $O(n^2)$ time.
Example - III
- Extended Wavelets
  - Multiple measures.
  - Optimization is similar to Knapsack with choices.
  - Previous best:
    - Deligiannakis and Roussopoulos, 04: $O(Mn(B + \log n))$ time and $O(MnB)$ space, but needing $O(nM + MB)$ at a time.
    - Guha, Kim, Shim, 04: reduced space to $O(BM + \min\{nM, B^2\})$.
    - Here: $O(BM)$ space.
What we will not talk about
- Approximation algorithms for histograms.
- Range query histograms.
Basically, an improvement of a factor of B in space across the board. B is not always small, especially when n is large.
The main idea
- Can we solve the problem using a non-DP paradigm? Well, divide and conquer…
- Small details: how do we divide? Interaction.
  - Does a partitioning with small interaction exist?
  - How much space does it take to represent it?
  - How easy is it to find (in the given representation)?
A case study - Histograms
- Formally: given a signal X, find a piecewise constant representation H with at most B pieces, minimizing $\|X - H\|_2$.
- Consider one bucket: the mean is the best value. A natural DP…
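Why the mean is the best value for one bucket, and why a bucket's error is cheap to evaluate, can be written out as below. This is a standard identity rather than something taken from the slides; the symbols $\mu$, $S_i$, and $Q_i$ are names introduced here for illustration, and the squared error is used because minimizing it is equivalent to minimizing $\|X - H\|_2$.

```latex
\begin{align*}
\mathrm{error}(j+1, i)
  &= \min_{v} \sum_{k=j+1}^{i} (x_k - v)^2
   = \sum_{k=j+1}^{i} (x_k - \mu)^2,
   \qquad \mu = \frac{1}{i-j} \sum_{k=j+1}^{i} x_k \\
  &= (Q_i - Q_j) - \frac{(S_i - S_j)^2}{i - j},
   \qquad S_i = \sum_{k \le i} x_k, \quad Q_i = \sum_{k \le i} x_k^2 .
\end{align*}
```

With the prefix sums $S$ and $Q$ precomputed in $O(n)$ time and space, each bucket error can then be evaluated in $O(1)$.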
The DP for histograms
Err[i, b] = error of approximating x_1, …, x_i using b buckets
Err[i, 1] = error(1, i)

For i = 1 to n do
  For b = 2 to B do
    For j = 1 to i-1 do
      Err[i, b] = min(Err[i, b], Err[j, b-1] + error(j+1, i))

(Figure: the table of Err values, B by n.)
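The recurrence above can be sketched in Python as follows. This is a minimal illustration of the textbook full-table DP (the $O(nB)$-space version), not code from the talk; the function name `vopt_histogram`, the 0-based half-open indexing, and the back-pointer table `back` are assumptions made for this example.

```python
from itertools import accumulate

def vopt_histogram(xs, B):
    """Full-table DP: returns (squared error, right ends of the buckets).

    Bucket k covers xs[bounds[k-1]:bounds[k]] (with bounds[-1] read as 0).
    Uses O(n*B) space for the tables, which is what the talk aims to avoid.
    """
    n = len(xs)
    B = min(B, n)
    s = [0.0] + list(accumulate(xs))                 # prefix sums
    q = [0.0] + list(accumulate(v * v for v in xs))  # prefix sums of squares

    def sse(j, i):
        # Squared error of xs[j:i] around its mean, in O(1).
        t = s[i] - s[j]
        return (q[i] - q[j]) - t * t / (i - j)

    INF = float("inf")
    err = [[INF] * (B + 1) for _ in range(n + 1)]
    back = [[0] * (B + 1) for _ in range(n + 1)]
    err[0][0] = 0.0
    for i in range(1, n + 1):
        for b in range(1, min(i, B) + 1):
            for j in range(b - 1, i):
                c = err[j][b - 1] + sse(j, i)
                if c < err[i][b]:
                    err[i][b], back[i][b] = c, j

    # Walk the back pointers from (n, B) to recover the bucket boundaries.
    bounds, i = [], n
    for b in range(B, 0, -1):
        bounds.append(i)
        i = back[i][b]
    return err[n][B], bounds[::-1]
```

Computing only the optimum error needs just two columns of `err`; it is recovering the boundaries that forces the full `back` table here.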
What if
- We could figure out what the story was at the midpoint!
- Two questions
  - So what?
  - How? (Use a DP.)
Wait a minute…
- We just replaced one DP by another and claimed something…!
- Exactly. The second DP needs only $O(n)$ space. Since the conquer steps reuse the same space, the total space is $O(n)$ too.
- The idea is to use divide and conquer, with a (small) DP to find the divide step.
- Is it really that simple?
The code
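A minimal Python sketch of the divide-and-conquer scheme described above, not the code from the original slide. It assumes the split is made on the bucket count: a small DP that keeps only two rows finds where bucket number b//2 ends in some optimal solution, and the two sides are then solved recursively. The names `make_sse`, `split_point`, and `histogram_boundaries` are hypothetical.

```python
from itertools import accumulate

def make_sse(xs):
    """Return sse(i, j): squared error of xs[i:j] around its mean, in O(1)."""
    s = [0.0] + list(accumulate(xs))
    q = [0.0] + list(accumulate(v * v for v in xs))
    def sse(i, j):
        t = s[j] - s[i]
        return (q[j] - q[i]) - t * t / (j - i)
    return sse

def split_point(sse, lo, hi, b):
    """Index where bucket number b//2 ends in an optimal b-bucket
    histogram of xs[lo:hi] (b >= 2, b <= hi - lo). Two O(n) rows only."""
    m, h = hi - lo, b // 2
    INF = float("inf")
    # Row for k = 1 bucket: err[i] is the error of the prefix of length i.
    err = [INF] + [sse(lo, lo + i) for i in range(1, m + 1)]
    # mid[i] = end of bucket h in the best k-bucket solution of the prefix;
    # while k <= h it is just i itself.
    mid = list(range(m + 1))
    for k in range(2, b + 1):
        new_err = [INF] * (m + 1)
        new_mid = list(range(m + 1))
        for i in range(k, m + 1):
            for j in range(k - 1, i):
                c = err[j] + sse(lo + j, lo + i)
                if c < new_err[i]:
                    new_err[i] = c
                    if k > h:
                        new_mid[i] = mid[j]  # carry the midpoint along
        err, mid = new_err, new_mid
    return lo + mid[m]

def histogram_boundaries(xs, B):
    """Right ends of the buckets of an optimal histogram, in O(n) space."""
    n = len(xs)
    if n == 0:
        return []
    sse = make_sse(xs)
    bounds = []
    def solve(lo, hi, b):
        if b == 1:
            bounds.append(hi)
            return
        s = split_point(sse, lo, hi, b)
        solve(lo, s, b // 2)       # the rows above are garbage by now,
        solve(s, hi, b - b // 2)   # so both calls reuse the same space
    solve(0, n, min(B, n))
    return bounds
```

For example, `histogram_boundaries([1, 1, 1, 5, 5, 5, 9, 9], 3)` returns `[3, 6, 8]`. Using exactly b non-empty buckets is a simplification that is safe for squared error, since splitting a bucket never increases the error.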
The end of working space
- If you can partition a problem using the working space, you can recompute the solution of the parts at a little extra cost.
- Working space = total space.
How much is little?
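One way to make "little" concrete for the histogram case is the recurrence below. It is a sketch under stated assumptions, not taken from the slide: $T(n, B)$ denotes the total running time, the divide step is assumed to cost at most $c\,n^2 B$ for some constant $c$, and the split at position $j$ is assumed to leave $B/2$ buckets on each side.

```latex
\begin{align*}
T(n, B) &\le c\,n^2 B + T(j, B/2) + T(n - j, B/2) \\
\intertext{Assuming inductively that $T(m, b) \le 2c\,m^2 b$ on smaller instances,}
T(n, B) &\le c\,n^2 B + 2c\,\frac{B}{2}\,\bigl(j^2 + (n - j)^2\bigr) \\
        &\le c\,n^2 B + c\,B\,n^2 \;=\; 2c\,n^2 B ,
\end{align*}
```

using $j^2 + (n - j)^2 \le n^2$. Under these assumptions the recomputation costs only a constant factor, which matches the $O(n^2 B)$ time and $O(n)$ space quoted for histograms.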
Wavelets
- A set of vectors (shown for n = 8, unnormalized)
  - {1, -1, 0, 0, 0, 0, 0, 0}, {0, 0, 1, -1, 0, 0, 0, 0}, {0, 0, 0, 0, 1, -1, 0, 0}, {0, 0, 0, 0, 0, 0, 1, -1}
  - {1, 1, -1, -1, 0, 0, 0, 0}, {0, 0, 0, 0, 1, 1, -1, -1}
  - {1, 1, 1, 1, -1, -1, -1, -1}, {1, 1, 1, 1, 1, 1, 1, 1}
- A natural multi-resolution.
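The vectors above can be generated level by level. The following is a minimal sketch that builds the normalized Haar basis and leaves orthonormality to be checked numerically; the function names `haar_basis` and `dot` are hypothetical, and n is assumed to be a power of two.

```python
import math

def haar_basis(n):
    """Orthonormal Haar basis for length-n signals (n a power of two).

    The first vector is the constant one; the rest go from the coarsest
    resolution (support n) down to the finest (support 2).
    """
    assert n >= 1 and n & (n - 1) == 0, "n must be a power of two"
    basis = [[1.0 / math.sqrt(n)] * n]
    width = n
    while width >= 2:
        half = width // 2
        scale = 1.0 / math.sqrt(width)  # makes each vector unit length
        for start in range(0, n, width):
            v = [0.0] * n
            for k in range(start, start + half):
                v[k] = scale            # +1 on the left half of the support
            for k in range(start + half, start + width):
                v[k] = -scale           # -1 on the right half
            basis.append(v)
        width = half
    return basis

def dot(u, v):
    """Plain inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))
```

Each halving of `width` doubles the number of vectors and halves their support, which is the multi-resolution structure the slide refers to.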
Wavelet Synopsis Construction
- Formally: given a signal X and the Haar basis $\{\psi_i\}$, find a representation $F = \sum_i z_i \psi_i$ with at most B non-zero $z_i$, minimizing some error which is a function of $X - F$.
- Restriction: $z_i$ is either 0 or $\langle X, \psi_i \rangle$.
- Debate: unrestricted or restricted. Omitted.
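For the $\ell_2$ case only, the restricted problem has a simple solution because the basis is orthonormal: keep the B coefficients of largest magnitude. The sketch below illustrates that special case; it is not the algorithm of the talk, which targets non-$\ell_2$ errors. The names `haar_transform`, `inverse_haar`, and `top_b_synopsis` are hypothetical, and the input length is assumed to be a power of two.

```python
import math

SQRT2 = math.sqrt(2.0)

def haar_transform(x):
    """Orthonormal Haar coefficients <X, psi_i>, coarsest first."""
    coeffs, cur = [], list(x)
    while len(cur) > 1:
        avg = [(cur[i] + cur[i + 1]) / SQRT2 for i in range(0, len(cur), 2)]
        det = [(cur[i] - cur[i + 1]) / SQRT2 for i in range(0, len(cur), 2)]
        coeffs = det + coeffs   # finer details end up later in the list
        cur = avg
    return cur + coeffs

def inverse_haar(c):
    """Rebuild the signal F = sum_i z_i psi_i from its coefficients."""
    cur, pos = list(c[:1]), 1
    while pos < len(c):
        det = c[pos:pos + len(cur)]
        nxt = []
        for a, d in zip(cur, det):
            nxt += [(a + d) / SQRT2, (a - d) / SQRT2]
        pos += len(cur)
        cur = nxt
    return cur

def top_b_synopsis(x, B):
    """Restricted synopsis: each z_i is 0 or <X, psi_i>; keeps the B largest."""
    c = haar_transform(x)
    keep = sorted(range(len(c)), key=lambda i: abs(c[i]), reverse=True)[:B]
    z = [0.0] * len(c)
    for i in keep:
        z[i] = c[i]
    return z
```

A signal such as `[2, 2, 2, 2, 6, 6, 6, 6]` has only two non-zero Haar coefficients, so `inverse_haar(top_b_synopsis(x, 2))` reproduces it up to rounding.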
Wavelets
- $\|X - F\|_1$
- Long history
  - Matias, Vitter, Wang '98
  - Garofalakis, Gibbons '02
  - Garofalakis, Kumar '04
- State of the art
  - $O(n^2 B \log B)$ time
  - $O(n^2 B)$ space
  - $O(nB)$ working space
- Here: $O(n^2 \log B)$ time, $O(n)$ space
- See also the next talk…
What happens to wavelets [GK 04]?
Extensions
- Approximation algorithms
- Range query histograms
- Extended wavelets
Histograms
- Saves space across all algorithms, except those that extend to general error measures over streams.
Range Query
- Same story.
- Open question:
  - A faster algorithm obeying the synopsis size.
That's all folks