ITCS 6114: Universal Hashing, Dynamic Order Statistics
David Luebke, 10/31/2020


Choosing A Hash Function
● Choosing the hash function well is crucial
  ■ A bad hash function puts all elements in the same slot
  ■ A good hash function:
    ○ Should distribute keys uniformly into slots
    ○ Should not depend on patterns in the data
● We discussed three methods:
  ■ Division method
  ■ Multiplication method
  ■ Universal hashing

Universal Hashing
● When attempting to foil a malicious adversary, randomize the algorithm
● Universal hashing: pick a hash function randomly when the algorithm begins (not upon every insert!)
  ■ Guarantees good performance on average, no matter what keys the adversary chooses
  ■ Needs a family of hash functions to choose from

Universal Hashing
● A family $\mathcal{H}$ of hash functions is said to be universal if:
  ■ With a random hash function from $\mathcal{H}$, the chance of a collision between x and y is exactly 1/m ($x \ne y$)
● We can use this to get good expected performance:
  ■ Choose h from a universal family of hash functions
  ■ Hash n keys into a table of m slots, $n \le m$
  ■ Then the expected number of collisions involving a particular key x is less than 1
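The last bullet follows from linearity of expectation. A short sketch, assuming the n keys are distinct, h is drawn uniformly from the family (written $\mathcal{H}$ here), and each pair of distinct keys collides with probability 1/m as in the definition above:

```latex
\begin{align*}
E[\text{collisions involving } x]
  &= \sum_{y \ne x} \Pr_{h \in \mathcal{H}}[\,h(x) = h(y)\,] \\
  &= (n - 1) \cdot \frac{1}{m} \\
  &< 1 \qquad \text{since } n \le m .
\end{align*}
```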

A Universal Hash Function
● Choose table size m to be prime
● Decompose key x into r+1 bytes, so that $x = \{x_0, x_1, \ldots, x_r\}$
  ■ Only requirement is that the max value of a byte is < m
  ■ Let $a = \{a_0, a_1, \ldots, a_r\}$ denote a sequence of r+1 elements chosen randomly from $\{0, 1, \ldots, m-1\}$
  ■ Define the corresponding hash function $h_a \in \mathcal{H}$: $h_a(x) = \sum_{i=0}^{r} a_i x_i \bmod m$
  ■ With this definition, $\mathcal{H}$ has $m^{r+1}$ members

A Universal Hash Function
● $\mathcal{H}$ is a universal collection of hash functions (Theorem 12.4)
● How to use:
  ■ Pick r based on m and the range of keys in U
  ■ Pick a hash function by (randomly) picking the a's
  ■ Use that hash function on all keys
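The "how to use" steps can be sketched in Python. This is a minimal illustration, not code from the lecture: the function name `make_universal_hash`, the choice m = 257, and r = 3 (for 32-bit keys) are assumptions made for the example, and the code does not check that m is prime.

```python
import random

def make_universal_hash(m, r, rng=random):
    """Pick one h_a at random: a is r+1 values drawn from {0, ..., m-1}.

    Assumes m is prime and m > 255 so that every byte value is < m,
    and that keys are non-negative integers below 256**(r+1).
    """
    a = [rng.randrange(m) for _ in range(r + 1)]

    def h(key):
        # Decompose the key into r+1 bytes x_0..x_r (low byte first).
        x = [(key >> (8 * i)) & 0xFF for i in range(r + 1)]
        # h_a(x) = sum of a_i * x_i, taken mod m.
        return sum(a_i * x_i for a_i, x_i in zip(a, x)) % m

    return h

# Pick the function once, when the table is created, then use it for all keys.
h = make_universal_hash(m=257, r=3, rng=random.Random(2020))
slots = [h(k) for k in (0, 1, 255, 65536, 2**32 - 1)]
```

The random choice happens only inside `make_universal_hash`; after that, `h` is an ordinary deterministic function, which is what lets lookups find previously inserted keys.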

Order Statistic Trees
● OS Trees augment red-black trees:
  ■ Associate a size field with each node in the tree
  ■ x->size records the size of the subtree rooted at x, including x itself:

              M:8
            /     \
         C:5       P:2
        /   \         \
     A:1    F:3       Q:1
           /   \
         D:1   H:1

Selection On OS Trees
[Figure: OS tree with node:size labels M:8; C:5, P:2; A:1, F:3, Q:1; D:1, H:1]
● How can we use this property to select the ith element of the set?

OS-Select(x, i)
{
    r = x->left->size + 1;    // rank of x within its own subtree
    if (i == r)
        return x;
    else if (i < r)
        return OS-Select(x->left, i);
    else
        return OS-Select(x->right, i-r);
}

OS-Select Example
[Figure: OS tree with node:size labels M:8; C:5, P:2; A:1, F:3, Q:1; D:1, H:1]
● Example: show OS-Select(root, 5):
  ■ At M (size 8): i=5, r=6, so i < r; recurse into the left subtree
  ■ At C (size 5): i=5, r=2, so i > r; recurse into the right subtree with i=3
  ■ At F (size 3): i=3, r=2, so i > r; recurse into the right subtree with i=1
  ■ At H (size 1): i=1, r=1, so i == r; return H

OS-Select: A Subtlety
● Oops… the first line of OS-Select reads x->left->size
● What happens at the leaves?
● How can we deal elegantly with this?
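One common way to handle the leaves is a sentinel nil node whose size is 0, so that x->left->size is always defined. The Python sketch below assumes that approach; the `Node` class, the `NIL` sentinel, and the `make` helper are hypothetical names introduced for this example, and the tree is the one from the slides.

```python
class Node:
    """Tree node with a subtree-size field (illustrative helper)."""
    def __init__(self, key=None, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right
        self.size = 0

# Sentinel: stands in for every missing child and has size 0.
NIL = Node()
NIL.left = NIL.right = NIL

def make(key, left=NIL, right=NIL):
    """Build a node and compute its size from its children."""
    n = Node(key, left, right)
    n.size = left.size + right.size + 1
    return n

def os_select(x, i):
    """Return the node with the i-th smallest key (1-based) in x's subtree."""
    if x is NIL:
        raise IndexError("rank out of range")
    r = x.left.size + 1          # safe at leaves: NIL.size == 0
    if i == r:
        return x
    elif i < r:
        return os_select(x.left, i)
    else:
        return os_select(x.right, i - r)

# The example tree: M:8; C:5, P:2; A:1, F:3, Q:1; D:1, H:1
root = make("M",
            make("C", make("A"), make("F", make("D"), make("H"))),
            make("P", NIL, make("Q")))
fifth = os_select(root, 5).key
```

With this tree, `os_select(root, 5)` follows the path M, C, F, H described in the example.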

OS-Select
● What will be the running time?

Determining The Rank Of An Element
[Figure: OS tree with node:size labels M:8; C:5, P:2; A:1, F:3, Q:1; D:1, H:1; each question points at a highlighted node]
● What is the rank of this element?
● Of this one? Why?
● Of the root? What's the pattern here?
● What about the rank of this element?
● This one? What's the pattern here?

OS-Rank(T, x)
{
    r = x->left->size + 1;    // rank of x within its own subtree
    y = x;
    while (y != T->root) {
        if (y == y->p->right)
            r = r + y->p->left->size + 1;    // parent and its left subtree precede y
        y = y->p;
    }
    return r;
}
● What will be the running time?
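The rank computation can be checked against the example tree with a small Python sketch. It assumes parent pointers and a size-0 sentinel for missing children; `Node`, `NIL`, and `make` are hypothetical helpers for this illustration only.

```python
class Node:
    """Node with parent pointer and subtree size (illustrative helper)."""
    def __init__(self, key=None):
        self.key = key
        self.left = self.right = self.p = None
        self.size = 0

NIL = Node()  # sentinel with size 0 for missing children

def make(key, left=NIL, right=NIL):
    """Build a node, link its children back to it, and compute its size."""
    n = Node(key)
    n.left, n.right, n.p = left, right, NIL
    for child in (left, right):
        if child is not NIL:
            child.p = n
    n.size = left.size + right.size + 1
    return n

def os_rank(root, x):
    """Return the 1-based rank of node x among all keys in the tree."""
    r = x.left.size + 1
    y = x
    while y is not root:
        if y is y.p.right:
            # Everything in the parent's left subtree, plus the parent, precedes y.
            r += y.p.left.size + 1
        y = y.p
    return r

# The example tree: M:8; C:5, P:2; A:1, F:3, Q:1; D:1, H:1
d, h = make("D"), make("H")
a, f = make("A"), make("F", d, h)
c = make("C", a, f)
q = make("Q")
p = make("P", NIL, q)
root = make("M", c, p)
ranks = {n.key: os_rank(root, n) for n in (a, c, d, f, h, root, p, q)}
```

The loop climbs one level per iteration, so the work is proportional to the depth of x.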

OS-Trees: Maintaining Sizes
● So we've shown that with subtree sizes, order statistic operations can be done in O(lg n) time
● Next step: maintain sizes during Insert() and Delete() operations
  ■ How would we adjust the size fields during insertion on a plain binary search tree?
  ■ A: increment sizes of nodes traversed during search
  ■ Why won't this work on red-black trees?
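The answer for a plain binary search tree can be sketched as follows. This is an illustration only: the `Node` class and `insert` function are hypothetical, keys are assumed distinct, and no rebalancing is done, which is exactly why the same idea is not enough for red-black trees.

```python
class Node:
    """Plain BST node with a subtree-size field (illustrative helper)."""
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None
        self.size = 1   # a new leaf is a subtree of size 1

def insert(root, key):
    """Insert key into a plain BST and return the root. Assumes key is new."""
    new = Node(key)
    if root is None:
        return new
    x = root
    while True:
        x.size += 1                 # every node on the search path gains a descendant
        if key < x.key:
            if x.left is None:
                x.left = new
                break
            x = x.left
        else:
            if x.right is None:
                x.right = new
                break
            x = x.right
    return root

# Inserting in this order rebuilds the example tree from the slides.
root = None
for k in "MCPAFQDH":
    root = insert(root, k)
```

Each insertion touches only the nodes on one root-to-leaf path, so the size updates add constant work per level.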

Maintaining Size Through Rotation
[Figure: before the rotation, y (size 19) has left child x (size 11) and a right subtree of size 7; x has subtrees of sizes 6 and 4. rightRotate(y) makes x the root (size 19) with a left subtree of size 6 and right child y (size 12), whose subtrees have sizes 4 and 7. leftRotate(x) reverses this.]
● Salient point: rotation invalidates only x and y
● Can recalculate their sizes in constant time
  ■ Why?
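The constant-time size fix can be sketched in Python. The reasoning assumed here: the new subtree root covers exactly the nodes the old root covered, and the demoted node's size follows from its two children, whose sizes the rotation does not change. The `Node` class and helper names are hypothetical, parent pointers and colors are omitted, and the three untouched subtrees are represented by stand-in nodes carrying only a size.

```python
class Node:
    """Node with a subtree-size field (illustrative helper)."""
    def __init__(self, key, size=1, left=None, right=None):
        self.key = key
        self.size = size
        self.left = left
        self.right = right

def size(n):
    """Size of a possibly missing subtree."""
    return n.size if n is not None else 0

def right_rotate(y):
    """Rotate right around y; return the new subtree root x."""
    x = y.left
    y.left = x.right
    x.right = y
    x.size = y.size                              # x now roots the same set of nodes
    y.size = size(y.left) + size(y.right) + 1    # recomputed from unchanged children
    return x

def left_rotate(x):
    """Rotate left around x; return the new subtree root y."""
    y = x.right
    x.right = y.left
    y.left = x
    y.size = x.size
    x.size = size(x.left) + size(x.right) + 1
    return y

# Sizes from the slide: subtrees of 6, 4, and 7 nodes hang off x and y.
s6, s4, s7 = Node("s6", 6), Node("s4", 4), Node("s7", 7)
x = Node("x", 11, s6, s4)
y = Node("y", 19, x, s7)
top = right_rotate(y)
after_right = (top.key, x.size, y.size)
top = left_rotate(top)
after_left = (top.key, x.size, y.size)
```

Only two size fields are written per rotation, so the red-black fix-up still costs O(lg n) overall.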

Augmenting Data Structures: Methodology
● Choose underlying data structure
  ■ E.g., red-black trees
● Determine additional information to maintain
  ■ E.g., subtree sizes
● Verify that information can be maintained for operations that modify the structure
  ■ E.g., Insert(), Delete() (don't forget rotations!)
● Develop new operations
  ■ E.g., OS-Rank(), OS-Select()