Dynamic Data Structures: Orthogonal Range Queries and Update Efficiency

Dynamic Data Structures: Orthogonal Range Queries and Update Efficiency
Konstantinos Tsakalidis
Ph.D. Defense, 23 September 2011

Κωνσταντίνος Τσακαλίδης (Konstantinos Tsakalidis)
  • 2000-2006: B.Eng., Computer Engineering and Informatics Dpt., University of Patras, Greece
  • Summer 2007: Intern, Google Inc., Mountain View, California, USA
  • 2007-2009: Ph.D. Student (Part A), MADALGO, Aarhus University, Denmark
  • Summer 2010: Visiting Prof. Ian Munro, D. Cheriton School of Computer Science, University of Waterloo, Canada
  • 2009-2011: Ph.D. Student (Part B)

Overview
  • Dynamic Planar Orthogonal 3-Sided Range Reporting Queries
      • [ISAAC '09] "Dynamic 3-Sided Planar Range Queries with Expected Doubly Logarithmic Time"
      • [ICDT '10] "Efficient Processing of 3-Sided Range Queries with Probabilistic Guarantees"
  • Dynamic Planar Orthogonal Range Maxima Reporting Queries
      • [ICALP '11] "Dynamic Planar Range Maxima Queries"
  • Multi-Versioned Indexed Databases
      • [SODA '12] "Fully Persistent B-Trees"

Databases and Geometry

Name      Age   Salary   …
Andreas   30    5.500    …
Maria     38    6.500    …
John      25    3.000    …
Helen     34    4.000    …
Jacob     28    7.000    …

Further columns: Date (5/2011, 1/2000, 11/1989, 4/1998, 2/2010), Phone (555-2143, 555-1432, 555-1234, 555-4321, 555-3214), …

  • The N tuples are N points in Euclidean space with D dimensions, one per column (Name, Age, Salary, Date, Phone, …); planar means D = 2.
  • Query Operation: question about stored data
  • Update Operation/Transaction: insert/delete tuple, change value

Models of Computation
  • Pointer Machine: memory of N records with O(1) fields each. Space = #occupied records. Time = #arithmetic operations + #pointer traversals.
  • word-RAM: memory of N cells with w bits per cell. Space = #occupied cells. Time = #arithmetic operations + #cell READ/WRITEs.
  • I/O Model [Aggarwal, Vitter '88]: internal memory of M < N words (M/B blocks) and a disk of N/B blocks with B words each; an I/O operation transfers one block. Space = #occupied blocks. Time = #I/O operations.
  • Slide labels: O(1) time; specialized; database.

Orthogonal Range Reporting Queries
Employees plotted by Salary (x-axis, marks at 1000 and 2000) and Age (y-axis, mark at 35).
  • Contour Query: report all points with Salary > 1000
  • Dominance Query: report all points with Salary > 1000 and Age > 35
  • 3-Sided Query: report all points with 2000 > Salary > 1000 and Age > 35
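The three query types above can be written as plain predicates. This is only a brute-force sketch that scans every point; the employee data and function names are hypothetical and not taken from the slides, and the data structures in the talk answer the same queries far faster.

```python
from typing import List, Tuple

Point = Tuple[int, int]  # (salary, age); hypothetical sample data

EMPLOYEES: List[Point] = [(800, 41), (1200, 29), (1500, 38), (1900, 52), (2500, 36)]


def contour_query(points: List[Point], salary_min: int) -> List[Point]:
    """Report all points with salary > salary_min (one-sided range)."""
    return [p for p in points if p[0] > salary_min]


def dominance_query(points: List[Point], salary_min: int, age_min: int) -> List[Point]:
    """Report all points with salary > salary_min and age > age_min (two-sided)."""
    return [p for p in points if p[0] > salary_min and p[1] > age_min]


def three_sided_query(points: List[Point], salary_min: int,
                      salary_max: int, age_min: int) -> List[Point]:
    """Report all points with salary_min < salary < salary_max and age > age_min."""
    return [p for p in points if salary_min < p[0] < salary_max and p[1] > age_min]


print(contour_query(EMPLOYEES, 1000))
print(dominance_query(EMPLOYEES, 1000, 35))
print(three_sided_query(EMPLOYEES, 1000, 2000, 35))
```

Each call costs O(n) time here, regardless of the output size t.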

Dynamic 3-Sided Range Reporting

Worst-Case Efficient (columns: Space, Query Time or I/Os, Update Time or I/Os)
  • Pointer Machine: Priority Search Tree [McCreight '85]
  • word-RAM: Fusion Tree [Willard '00], [Mortensen '06]
  • I/O Model: External Priority Search Tree [Arge '99]

Average-Case Efficient (bounds are expected, with high probability (w.h.p.), or amortized expected)
  • word-RAM: [ISAAC '09], [ICDT '10]
  • I/O Model: [ISAAC '09], [ICDT '10]
  • Input distributions: X, Y: μ-random; X: smooth, Y: restricted; X: smooth

Probabilistic Distributions
  • μ-Random: unknown, non-changing probabilistic distribution
  • Smooth: (f, g)-smooth distribution
      • The probability mass of a subinterval does not exceed a specific bound, no matter how small the subinterval
      • Includes regular and uniform distributions
      • Any distribution is (f, Θ(n))-smooth
  • Restricted: restricted class of distributions
      • Few elements occur very often
      • Many elements occur rarely
      • Zipfian and power-law distributions
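For reference, the smoothness condition is usually stated as follows in the interpolation-search literature. The slide does not spell it out, so the notation here (the density μ on [a, b], the constant β, the points c_1, c_2, c_3) is an assumption taken from that literature rather than from the talk.

```latex
% A density \mu on [a,b] is (f,g)-smooth if there is a constant \beta such that
% for all a \le c_1 < c_2 < c_3 \le b and all integers n:
\[
  \int_{c_2 - \frac{c_3 - c_1}{f(n)}}^{c_2} \mu[c_1,c_3](x)\,dx
  \;\le\; \frac{\beta \cdot g(n)}{n},
\]
% where \mu[c_1,c_3] is \mu conditioned on the interval [c_1,c_3]:
\[
  \mu[c_1,c_3](x) =
  \begin{cases}
    \mu(x)/p & \text{if } c_1 \le x \le c_3, \\
    0        & \text{otherwise,}
  \end{cases}
  \qquad p = \int_{c_1}^{c_3} \mu(x)\,dx .
\]
```

With g(n) = Θ(n) the right-hand side is a constant, which a probability never needs to exceed; this matches the remark that any distribution is (f, Θ(n))-smooth.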

Priority Search Tree [McCreight '85] (Pointer Machine)
  • Update: move up maximum Y
  • Space: O(n)
  • Update: O(log n)
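The "move up maximum Y" idea can be illustrated with a small static priority search tree. This is a minimal sketch under stated assumptions, not the structure from the talk: it is built once from a sorted list and has no insert or delete, and the names `Node`, `build` and `query` are hypothetical. Each node keeps the point with the largest y of its subtree, and the remaining points are split by x.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Point = Tuple[float, float]  # (x, y)


@dataclass
class Node:
    point: Point                     # point with maximum y in this subtree
    split: float                     # left subtree has x <= split, right has x >= split
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def build(points: List[Point]) -> Optional[Node]:
    """Build from points sorted by x; the maximum-y point moves up to the root."""
    if not points:
        return None
    top = max(range(len(points)), key=lambda i: points[i][1])
    rest = points[:top] + points[top + 1:]
    mid = len(rest) // 2
    split = rest[mid][0] if rest else points[top][0]
    return Node(points[top], split, build(rest[:mid]), build(rest[mid:]))


def query(node: Optional[Node], xl: float, xr: float, yb: float,
          out: List[Point]) -> None:
    """Append every point with xl <= x <= xr and y >= yb to out."""
    if node is None or node.point[1] < yb:
        return  # heap order on y: nothing below this node can qualify
    if xl <= node.point[0] <= xr:
        out.append(node.point)
    if xl <= node.split:
        query(node.left, xl, xr, yb, out)
    if xr >= node.split:
        query(node.right, xl, xr, yb, out)


pts = sorted([(1, 5), (2, 9), (3, 2), (4, 7), (5, 4), (6, 8), (7, 1)])
root = build(pts)
found: List[Point] = []
query(root, 2, 6, 4, found)
print(sorted(found))  # [(2, 9), (4, 7), (5, 4), (6, 8)]
```

The y-pruning at the top of `query` is what keeps the search close to the two x-boundary paths plus the reported points.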

Query by X-Coordinate: log n + t (Pointer Machine)
  • SubtreesInX(Paths): O(log n)

Query by Y-Coordinate: log n + t
  • Pointer Machine: find next point to be reported in O(1) time (node u with children ul, ur)
  • word-RAM: 1D Range Maximum Queries (Children) [Alstrup, Brodal, Rauhe '00]

[ISAAC '09] (word-RAM)
  • Update: O(log log n) expected amortized
  • Query: O(log log n + t) expected w.h.p.
  • Space: O(n)
  • Weight_i = Θ(2^(2^i)) [Andersson, Thorup '07]: O(1) expected amortized
  • RMQ
  • O(log log n) expected w.h.p. [Mehlhorn, Tsakalidis '93, Kaporis et al. '06]

Average-Case Efficient Dynamic 3-Sided Range Reporting (X: smooth)
  • word-RAM [ISAAC '09]: Space; Query Time: expected w.h.p.; Update Time: expected amortized
  • I/O Model [ISAAC '09]: Space; Query I/Os: expected w.h.p.; Update I/Os: amortized expected

Orthogonal Range MAXIMA Reporting Queries, or "Generalized Planar SKYLINE Operator"
Employees plotted by Salary (x-axis, marks xl, xr) and Age (y-axis, marks yb, yt).
  • Dominates: is "above" (older and better paid)
  • Maximal point: is NOT dominated; the maximal points are the interesting points
  • Dominance Maxima Queries: report all maximal points among points with x in [xl, +∞) and y in [yb, +∞)
  • Contour Maxima Queries: report all maximal points among points with x in (-∞, xl]
  • 3-Sided Maxima Queries: report all maximal points among points with x in [xl, xr] and y in [yb, +∞)
  • 4-Sided Maxima Queries: report all maximal points among points with x in [xl, xr] and y in [yb, yt]
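The query definitions above can be checked against a brute-force baseline. The sketch below filters the points to the query rectangle and then sweeps them from right to left; it assumes distinct coordinates, the function names are hypothetical, and it spends O(n log n) time per query, which is exactly the cost the dynamic structures in the talk avoid.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (x, y)


def maxima(points: List[Point]) -> List[Point]:
    """Return the maximal points: those with no other point above and to the right."""
    result: List[Point] = []
    best_y = float("-inf")
    # Sweep from the largest x to the smallest; a point is maximal exactly
    # when its y exceeds every y seen so far.
    for x, y in sorted(points, reverse=True):
        if y > best_y:
            result.append((x, y))
            best_y = y
    return result


def range_maxima(points: List[Point], xl: float, xr: float,
                 yb: float, yt: float) -> List[Point]:
    """4-sided maxima query; pass infinities for 3-sided, dominance or contour."""
    inside = [(x, y) for x, y in points if xl <= x <= xr and yb <= y <= yt]
    return maxima(inside)


INF = float("inf")
pts = [(1, 5), (2, 9), (3, 2), (4, 7), (5, 4), (6, 8), (7, 1)]
print(maxima(pts))                      # [(7, 1), (6, 8), (2, 9)]
print(range_maxima(pts, 1, 5, 2, 8))    # [(5, 4), (4, 7)]
print(range_maxima(pts, 3, 6, 2, INF))  # [(6, 8)]
```

Results are listed by decreasing x, which is also increasing y along the skyline.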

Worst-Case Efficient Dynamic Range MAXIMA Reporting

Pointer Machine (query bound per row)
  • Overmars, van Leeuwen '81: log n + t
  • Frederickson, Rodger '90: log n + t
  • Janardan '91: log n + t
  • Kapoor '00: log n + t amo.
  • [ICALP '11]: log n + t
word-RAM
  • [ICALP '11]

Further table entries (remaining query columns, Insert, Delete): log^2 n + t, log n (1 + t), log n, log^2 n, -

Tournament Tree (Pointer Machine)
  • Copy up maximum Y
  • Y-winning paths

Tournament Tree (Pointer Machine)
  • Node u stores MAX(Right(u))
  • Find next point to be reported in O(1) time
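A tournament tree in its simplest form can be sketched with an array: leaves hold the points in x-order and every internal node holds a copy of the child with the larger y. This is an illustrative assumption-laden sketch, not the talk's structure; in particular it does not maintain MAX(Right(u)), and the class and method names are hypothetical.

```python
from typing import List, Optional, Tuple

Point = Tuple[float, float]  # (x, y)


class TournamentTree:
    def __init__(self, points: List[Point]) -> None:
        self.size = 1
        while self.size < len(points):
            self.size *= 2
        # tree[1] is the root; leaves live at tree[size .. 2*size - 1]
        self.tree: List[Optional[Point]] = [None] * (2 * self.size)
        for i, p in enumerate(sorted(points)):
            self.tree[self.size + i] = p
        for i in range(self.size - 1, 0, -1):
            self.tree[i] = self._winner(self.tree[2 * i], self.tree[2 * i + 1])

    @staticmethod
    def _winner(a: Optional[Point], b: Optional[Point]) -> Optional[Point]:
        """Copy up the point with the larger y; None means an empty slot."""
        if a is None:
            return b
        if b is None:
            return a
        return a if a[1] >= b[1] else b

    def remove(self, index: int) -> None:
        """Empty the leaf at position index (in x-order) and replay its path."""
        i = self.size + index
        self.tree[i] = None
        i //= 2
        while i >= 1:
            self.tree[i] = self._winner(self.tree[2 * i], self.tree[2 * i + 1])
            i //= 2

    def champion(self) -> Optional[Point]:
        """Point with maximum y over all leaves."""
        return self.tree[1]


t = TournamentTree([(1, 5), (2, 9), (3, 2), (4, 7), (5, 4), (6, 8), (7, 1)])
print(t.champion())  # (2, 9)
t.remove(1)          # delete the point with the second-smallest x, i.e. (2, 9)
print(t.champion())  # (6, 8)
```

Only the nodes on one leaf-to-root path are recomputed, so an update touches O(log n) nodes.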

3-Sided Range Maxima Queries (Pointer Machine)
  • Query Time: log n + t
  • MAX(Subtrees(Paths)): O(log n)

Update Operation (Pointer Machine)
  • Previous update: O(log^2 n)

Update Operation (Pointer Machine)
  • Node u with children uL, uR: MAX(Right(u)) is obtained from MAX(Right(uL)) and MAX(Right(uR))
  • Priority Queue with Attrition [Sundar '89]: O(1) time
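The interface of a priority queue with attrition can be sketched with a double-ended queue. The version below is a simplification and not Sundar's structure: it gives O(1) amortized time per operation instead of O(1) worst-case time, and the class and method names are hypothetical. InsertAndAttrite(e) inserts e and discards every stored element that is not smaller than e.

```python
from collections import deque
from typing import Deque, Optional


class AttritionQueue:
    def __init__(self) -> None:
        # Invariant: strictly increasing from front to back.
        self._items: Deque[float] = deque()

    def find_min(self) -> Optional[float]:
        """Return the minimum without removing it, or None if empty."""
        return self._items[0] if self._items else None

    def delete_min(self) -> Optional[float]:
        """Remove and return the minimum, or None if empty."""
        return self._items.popleft() if self._items else None

    def insert_and_attrite(self, e: float) -> None:
        """Insert e and remove all elements >= e (they form a suffix)."""
        while self._items and self._items[-1] >= e:
            self._items.pop()
        self._items.append(e)


q = AttritionQueue()
for value in (5, 7, 9, 6):
    q.insert_and_attrite(value)   # inserting 6 attrites 9 and 7
print(q.find_min())    # 5
print(q.delete_min())  # 5
print(q.find_min())    # 6
```

Every element is appended once and popped at most once, which is where the amortized constant bound comes from.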

Update Operation (Pointer Machine)
  • Rollback, Reconstruct
  • Space: O(n)
  • Update: O(log n)
  • Partially Persistent Priority Queue with Attrition: O(1) time and space overhead per update step, amortized with [Driscoll et al. '89] and worst-case with [Brodal '96]

[ICALP '11] Results
  • 3-sided range maxima, Pointer Machine: space n, query log n + t, Insert/Delete log n
  • 3-sided range maxima, word-RAM: space n
  • 4-sided range maxima, Pointer Machine: space n log n, query log^2 n + t, Insert/Delete log^2 n

Rectangular Visibility Queries
  • Proximity Queries/Similarity Search
  • 4 x 4-Sided Range Maxima Queries, one towards each corner: (-∞, +∞), (+∞, +∞), (-∞, -∞), (+∞, -∞)

Worst-Case Efficient 4-Sided Range MAXIMA Reporting and Rectangular Visibility Queries (Pointer Machine)

                      Space      Query                Insert     Delete
Overmars, Wood '88    n log n    log^2 n + t          log^2 n    log^3 n
Overmars, Wood '88    n log n    log^2 n + t log n    log^2 n    log^2 n
[ICALP '11]           n log n    log^2 n + t          log^2 n    log^2 n

B-Trees [Bayer, McCreight '72]
  • Space: O(N/B) blocks
  • Update: O(log_B N) I/Os
  • Access: O(log_B N) I/Os

Name      Age   Salary   …
Andreas   30    5.500    …
Maria     38    6.500    …
John      25    3.000    …
Helen     34    4.000    …
Jacob     28    7.000    …

Slide labels: Indexed Database, Multi-Versioned Databases, Btrfs, Data Platform

Fully Persistent B-Trees (I/O Model)

                   Space    Query I/Os                   Update I/Os (amortized)
Lanka, Mays '91    n/B      (log_B n + t/B) log_B m      log_B n log_B m
[SODA '12]         n/B      log_B n + t/B                log_B n + log_2 B

n: elements in one version; m: update operations = #versions; B: block size
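To get a feel for the two rows of the table, the expressions can be evaluated for concrete parameters. The values of n, m, B and t below are hypothetical, and the hidden constants of the O-notation are ignored, so the numbers only illustrate the shape of the bounds and not actual I/O counts.

```python
import math


def log_base(x: float, base: float) -> float:
    """Logarithm of x to the given base, via base-2 logarithms."""
    return math.log2(x) / math.log2(base)


def lanka_mays_query(n: float, m: float, B: float, t: float) -> float:
    return (log_base(n, B) + t / B) * log_base(m, B)


def lanka_mays_update(n: float, m: float, B: float) -> float:
    return log_base(n, B) * log_base(m, B)


def soda12_query(n: float, B: float, t: float) -> float:
    return log_base(n, B) + t / B


def soda12_update(n: float, B: float) -> float:
    return log_base(n, B) + math.log2(B)


# Hypothetical parameters: 2^32 elements per version, 2^32 versions,
# blocks of 16 elements, 64 reported elements.
n, m, B, t = 2 ** 32, 2 ** 32, 16, 64
print(lanka_mays_query(n, m, B, t), soda12_query(n, B, t))   # 96.0 12.0
print(lanka_mays_update(n, m, B), soda12_update(n, B))       # 64.0 12.0
```

The [SODA '12] expressions no longer carry the multiplicative log_B m factor; the update bound instead pays an additive log_2 B term, so which expression is smaller depends on how B compares with the number of versions.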

[SODA '12]

I/O-Efficient Full Persistence
  • Interface of primitive operations: READ, WRITE, ACCESS, NEW_NODE, NEW_VERSION
  • Input is a pointer-based structure
      • Node occupies O(1) blocks
      • Node has indegree O(1)
  • O(1) I/O-overhead per access to a block
  • O(log_2 B) I/O-overhead per change to a block
  • [Driscoll et al. '89] Node-Splitting Method

Incremental B-Trees
  • Lazy updates
      • O(log_B N) READs
      • O(1) WRITEs that make O(1) changes to a block

Result
  • Space: O(N/B)
  • Query: O(log_B N + t/B) I/Os
  • Update: O(log_B N + log_2 B) I/Os
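What full persistence means for the user can be shown with a toy example: every version stays readable, and any version, not only the newest, can be updated to create a new one. The sketch below uses path copying on an unbalanced binary search tree, which is a deliberately simpler technique than the node-splitting method named above and has none of its I/O guarantees. The method names echo the NEW_VERSION and ACCESS primitives, but their signatures are assumptions.

```python
from typing import List, NamedTuple, Optional


class Node(NamedTuple):
    key: int
    left: Optional["Node"]
    right: Optional["Node"]


def insert(root: Optional[Node], key: int) -> Node:
    """Return the root of a new tree; nodes of the old tree are never modified."""
    if root is None:
        return Node(key, None, None)
    if key < root.key:
        return Node(root.key, insert(root.left, key), root.right)
    if key > root.key:
        return Node(root.key, root.left, insert(root.right, key))
    return root  # key already present: the version is unchanged


def keys(root: Optional[Node]) -> List[int]:
    """In-order traversal: the sorted keys of one version."""
    if root is None:
        return []
    return keys(root.left) + [root.key] + keys(root.right)


class VersionedSet:
    def __init__(self) -> None:
        self.versions: List[Optional[Node]] = [None]  # version 0 is empty

    def new_version(self, parent: int, key: int) -> int:
        """Insert key into version `parent`; return the id of the new version."""
        self.versions.append(insert(self.versions[parent], key))
        return len(self.versions) - 1

    def access(self, version: int) -> List[int]:
        return keys(self.versions[version])


s = VersionedSet()
v1 = s.new_version(0, 5)
v2 = s.new_version(v1, 3)
v3 = s.new_version(v1, 8)   # branches off v1, not off the newest version v2
print(s.access(v1), s.access(v2), s.access(v3))  # [5] [3, 5] [5, 8]
```

Because v3 is derived from v1 after v2 already exists, the versions form a tree rather than a line, which is the distinction between full and partial persistence.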

Tsakalidis K., et al.
  • [ISAAC '09] "Dynamic 3-Sided Planar Range Queries with Expected Doubly Logarithmic Time"
  • [ICDT '10] "Efficient Processing of 3-Sided Range Queries with Probabilistic Guarantees"
  • [ICALP '11] "Dynamic Planar Range Maxima Queries"
  • [SODA '12] "Fully Persistent B-Trees"

Many thanks
Konstantinos Tsakalidis, Ph.D. Student
tsakalid@madalgo.au.dk