Dictionaries CS 105 10/02/05

Definition
• The Dictionary Data Structure
  • structure that facilitates searching
  • objects are stored with search keys; insertion of an object must include a key
  • searching requires a key and returns the key-object pair
  • removal also requires a key
• Need an Entry interface/class
  • Entry encapsulates the key-object pair (just like with priority queues)

Copyright 2005, by the authors of these slides, and Ateneo de Manila University. All rights reserved.
L7: Dictionaries
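The key-object pair described above could be sketched as a small immutable class. This is only an illustration: it assumes int keys, and the accessor name getValue() is an assumption (getKey() is the name used in the binary search pseudocode).

```java
// Sketch of an Entry: an immutable key-object pair.
// Assumptions: int keys; getValue() is an illustrative accessor name.
final class Entry {
    private final int key;
    private final Object value;

    Entry(int key, Object value) {
        this.key = key;
        this.value = value;
    }

    int getKey() { return key; }

    Object getValue() { return value; }

    @Override
    public String toString() { return "(" + key + ", " + value + ")"; }
}
```

Keeping the key inside the entry means a search can hand back both the key and the stored object in one return value.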

Sample Applications
• An actual dictionary
  • key: word
  • object: word record (definition, pronunciation, etc.)
• Record keeping applications
  • Bank account records (key: account number, object: holder and bank account info)
  • Student records (key: id number, object: student info)

Dictionary Interface

    public interface Dictionary {
        public int size();
        public boolean isEmpty();
        public Entry insert( int key, Object value ) throws DuplicateKeyException;
        public Entry find( int key );    // return null if not found
        public Entry remove( int key );  // return null if not found
    }

Dictionary details/variations
• Key types
  • For simplicity, we assume that the keys are ints
  • But the keys can be any kind of object as long as they can be ordered (e.g., strings with alphabetical ordering)
• Duplicate entries (entries with the same key) may be allowed
  • Our textbook calls the data structure that does not allow duplicates a Map, while a Dictionary allows duplicates
  • For purposes of this discussion, we assume that dictionaries do not allow duplicates

Dictionary Implementations
• Unordered list (section 8.3.1)
• Ordered table (section 8.3.3)
• Binary search tree (section 9.1)

Unordered list
• Strategy: store the entries in the order that they arrive
  • O( 1 ) insert operation (append at the end; checking for a duplicate key first would itself need an O( n ) scan)
  • Can use an array, ArrayList, or linked list
• Find operation requires scanning the list until a matching key value is found
  • Scanning implies an O( n ) operation
• Remove operation similar to find operation
  • Entries need to be adjusted if using array/ArrayList
  • O( n ) operation
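The unordered-list strategy might look roughly like the following sketch. It assumes an ArrayList as the backing store and a small nested Entry class; the class and field names are illustrative, not taken from the textbook.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of an unordered-list dictionary with int keys.
// Assumptions: ArrayList storage; nested Entry class with package-visible fields.
class UnorderedListDictionary {
    static final class Entry {
        final int key;
        final Object value;
        Entry(int key, Object value) { this.key = key; this.value = value; }
    }

    private final List<Entry> entries = new ArrayList<>();

    int size() { return entries.size(); }

    // O(1) amortized: append at the end. No duplicate check is done here;
    // enforcing unique keys would require calling find() first, which is O(n).
    Entry insert(int key, Object value) {
        Entry e = new Entry(key, value);
        entries.add(e);
        return e;
    }

    // O(n): scan until a matching key is found; null if not found.
    Entry find(int key) {
        for (Entry e : entries) {
            if (e.key == key) return e;
        }
        return null;
    }

    // O(n): scan for the key, then ArrayList shifts the later entries left.
    Entry remove(int key) {
        for (int i = 0; i < entries.size(); i++) {
            if (entries.get(i).key == key) return entries.remove(i);
        }
        return null;
    }
}
```

With a linked list the removal itself would avoid the shifting step, but the scan to locate the key would still be O( n ).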

Ordered table
• Idea: if the list is kept ordered by key, searching is simpler/easier
• Just like for priority queues, insertion is slightly more complex
  • Need to search for the proper position of the element and shift the later entries -> O( n )
• Find: don't do a linear scan; instead, do a binary search
  • Note: use array/ArrayList; not a linked list
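As a rough sketch of why insertion into an ordered table is O( n ), the following keeps a list of int keys sorted on every insert. The class name and the choice to store only keys are assumptions made to keep the example short.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of an ordered table keyed by int, kept sorted on every insert.
// Assumption: only keys are stored, to keep the shifting cost visible.
class OrderedTable {
    private final List<Integer> keys = new ArrayList<>();

    // O(n): walk to the first key that is not smaller, then insert there.
    // ArrayList.add(index, element) shifts the later keys one slot right.
    // Returns false if the key is already present (no duplicates).
    boolean insert(int key) {
        int pos = 0;
        while (pos < keys.size() && keys.get(pos) < key) pos++;
        if (pos < keys.size() && keys.get(pos) == key) return false;
        keys.add(pos, key);
        return true;
    }

    List<Integer> keys() { return keys; }
}
```

Finding the position with a binary search instead of a linear walk would not change the overall bound, because the shift of later entries is still linear.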

Binary search
• Take advantage of the fact that the elements are ordered
• Compare the target key with the middle element to cut the search space in half
• Repeat the process until the element is found or the search space is exhausted
• Arithmetic on array indexes facilitates easy computation of the middle position
  • Middle of S[low] and S[high] is S[(low+high)/2]
  • Not possible with linked lists

Binary Search Algorithm

    BinarySearch( S, k, low, high )    // S: array of Entries, k: target key
        if low > high then
            return null;    // not found
        else
            mid <- (low+high)/2
            e <- S[mid];
            if k = e.getKey() then
                return e;
            else if k < e.getKey() then
                return BinarySearch( S, k, low, mid-1 )
            else
                return BinarySearch( S, k, mid+1, high )

    Initial call: BinarySearch( S, someKey, 0, size-1 );
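The pseudocode translates fairly directly into Java. The sketch below assumes a minimal nested Entry class with int keys, and uses the sixteen keys from the find(22) example as sample data; the class name and the stored values are illustrative.

```java
// Sketch of the recursive binary search over a sorted array of entries.
// Assumption: Entry is a minimal key-object pair with int keys.
class BinarySearchDemo {
    static final class Entry {
        private final int key;
        private final Object value;
        Entry(int key, Object value) { this.key = key; this.value = value; }
        int getKey() { return key; }
        Object getValue() { return value; }
    }

    // Returns the entry with key k in s[low..high], or null if not found.
    static Entry binarySearch(Entry[] s, int k, int low, int high) {
        if (low > high) return null;        // empty range: not found
        int mid = low + (high - low) / 2;   // same value as (low+high)/2, without int overflow
        Entry e = s[mid];
        if (k == e.getKey()) return e;
        if (k < e.getKey()) return binarySearch(s, k, low, mid - 1);
        return binarySearch(s, k, mid + 1, high);
    }

    public static void main(String[] args) {
        int[] keys = {2, 4, 5, 7, 8, 9, 12, 14, 17, 19, 22, 25, 27, 28, 33, 37};
        Entry[] s = new Entry[keys.length];
        for (int i = 0; i < keys.length; i++) s[i] = new Entry(keys[i], "value-" + keys[i]);
        Entry hit = binarySearch(s, 22, 0, s.length - 1);
        System.out.println(hit == null ? "not found" : hit.getValue());
    }
}
```

Writing the midpoint as low + (high - low) / 2 is a defensive habit for very large arrays; for the sizes discussed here it gives the same index as (low+high)/2.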

Binary Search Algorithm: find(22)

    index:  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15
    S:      2  4  5  7  8  9 12 14 17 19 22 25 27 28 33 37

    mid = (low+high)/2

    Step 1: low = 0, high = 15, mid = 7     S[7] = 14 < 22, continue in the right half
    Step 2: low = 8, high = 15, mid = 11    S[11] = 25 > 22, continue in the left half
    Step 3: low = 8, high = 10, mid = 9     S[9] = 19 < 22, continue in the right half
    Step 4: low = mid = high = 10           S[10] = 22, found

Time complexity of binary search
• Search space reduces by half until it becomes 1
  • n -> n/2 -> n/4 -> … -> 1
  • log n steps
• Find operation using binary search is O( log n )
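The halving argument can be checked with a few lines of code. This sketch simply counts how many integer halvings it takes to go from n down to 1; the class and method names are illustrative.

```java
// Sketch: count how many times n can be halved before the search space is 1.
class HalvingSteps {
    static int halvings(int n) {
        int steps = 0;
        while (n > 1) {
            n /= 2;      // each comparison discards half of the remaining keys
            steps++;
        }
        return steps;
    }

    public static void main(String[] args) {
        System.out.println(halvings(16));        // 16 -> 8 -> 4 -> 2 -> 1: prints 4
        System.out.println(halvings(1_000_000)); // prints 19, since 2^19 <= 1,000,000 < 2^20
    }
}
```

The count equals the base-2 logarithm of n rounded down, which is why a table of a million entries needs only about twenty comparisons per find.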

Time complexity

    Operation   Unsorted List   Ordered Table
    insert()    O( 1 )          O( n )
    find()      O( n )          O( log n )
    remove()    O( n )          O( n )

Binary Search Tree (BST)
• Strategy: store entries as nodes in a tree such that an inorder traversal of the entries would list them in increasing order
• Search, remove, and insert are all O( log n ) operations when the tree is reasonably balanced
  • All operations require a search that mimics binary search: go to left or right subtree depending on target key value

Traversing a BST
• Insert, remove, and find operations all require a key
• First step involves checking for a matching key in the tree
  • Start with the root, go to left or right child depending on key value
  • Repeat the process until key is found or a null child is encountered (not found)
• For insert operation, duplicate key error occurs if key already exists
• Operation is proportional to height of tree (usually O( log n ))

Insertion in BST (insert 78)
[Figure: a BST holding the keys 44, 17, 88, 32, 28, 29, 65, 97, 54, 82, 76, 80, shown before and after 78 is added as a new leaf]
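Search and insertion in a BST could be sketched as follows. The node layout and method names are assumptions, and a duplicate key is reported with IllegalArgumentException instead of a custom exception type so that the example stays self-contained.

```java
// Sketch of BST search and insertion with int keys.
// Assumptions: the node layout and method names are illustrative; a duplicate
// key is reported with IllegalArgumentException rather than a custom exception.
class SimpleBst {
    static final class Node {
        final int key;
        Node left, right;
        Node(int key) { this.key = key; }
    }

    Node root;

    // Walk down from the root; a null result means a null child was reached.
    Node find(int key) {
        Node cur = root;
        while (cur != null && cur.key != key) {
            cur = (key < cur.key) ? cur.left : cur.right;
        }
        return cur;
    }

    // Same walk as find(); the new node is attached where the walk leaves the tree.
    void insert(int key) {
        if (root == null) { root = new Node(key); return; }
        Node cur = root;
        while (true) {
            if (key == cur.key) throw new IllegalArgumentException("duplicate key " + key);
            if (key < cur.key) {
                if (cur.left == null) { cur.left = new Node(key); return; }
                cur = cur.left;
            } else {
                if (cur.right == null) { cur.right = new Node(key); return; }
                cur = cur.right;
            }
        }
    }

    // Inorder traversal: left subtree, node, right subtree.
    void inorder(Node n, StringBuilder out) {
        if (n == null) return;
        inorder(n.left, out);
        out.append(n.key).append(' ');
        inorder(n.right, out);
    }
}
```

Whatever order the keys are inserted in, the inorder traversal lists them in increasing order; only the shape of the tree, and therefore its height, depends on the insertion order.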

Removal from BST (Ex. 1): Remove 32
[Figure, three frames: w marks the node holding 32 and z marks a child position of w; the last frame shows the tree after 32 has been removed, with the keys 44, 17, 28, 29, 88, 65, 97, 54, 82, 76, 80, 78 remaining]

Removal from BST (Ex. 2): Remove 65
[Figure, three frames: w marks the node holding 65; y marks the node holding 76, with x marking a child position of y; in the last frame 76 has taken the place of 65]
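One common way to code BST removal is sketched below. It is not necessarily the textbook's formulation: it assumes a recursive style, and for a node with two children it copies in the inorder successor (the smallest key in the right subtree) and then removes that successor. All names are illustrative.

```java
// Sketch of BST removal with int keys, written recursively.
// Assumption: the two-children case copies in the inorder successor (the
// smallest key of the right subtree); the node layout is illustrative.
class BstRemoval {
    static final class Node {
        int key;
        Node left, right;
        Node(int key) { this.key = key; }
    }

    static Node insert(Node n, int key) {
        if (n == null) return new Node(key);
        if (key < n.key) n.left = insert(n.left, key);
        else if (key > n.key) n.right = insert(n.right, key);
        return n;                        // equal keys are silently ignored here
    }

    // Returns the root of the subtree after removing key (unchanged if absent).
    static Node remove(Node n, int key) {
        if (n == null) return null;
        if (key < n.key) {
            n.left = remove(n.left, key);
        } else if (key > n.key) {
            n.right = remove(n.right, key);
        } else if (n.left == null) {
            return n.right;              // zero or one child: splice the node out
        } else if (n.right == null) {
            return n.left;
        } else {
            Node succ = n.right;         // two children: find the inorder successor
            while (succ.left != null) succ = succ.left;
            n.key = succ.key;            // move its key up, then remove it below
            n.right = remove(n.right, succ.key);
        }
        return n;
    }

    static String inorder(Node n) {
        return n == null ? "" : inorder(n.left) + n.key + " " + inorder(n.right);
    }
}
```

Removing a node with at most one child only relinks one pointer, while the two-children case reduces to that simpler case after the successor's key is moved up.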

Time complexity for BSTs
• O( log n ) operations not guaranteed since resulting tree is not necessarily “balanced”
• If tree is excessively skewed, operations would be O( n ) since the structure degenerates to a list
• Tree could be periodically reordered to prevent skewedness
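The effect of skew can be observed by comparing tree heights for two insertion orders. The sketch below assumes a plain, non-balancing BST and defines height as the number of nodes on the longest root-to-leaf path; the seed and the key count are arbitrary choices.

```java
// Sketch: compare BST height for sorted versus shuffled insertion order.
// Assumptions: plain (non-balancing) BST; height counts nodes on the longest path.
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

class SkewDemo {
    static final class Node {
        final int key;
        Node left, right;
        Node(int key) { this.key = key; }
    }

    // Iterative insert; keys equal to an existing key go to the right.
    static Node insert(Node root, int key) {
        Node fresh = new Node(key);
        if (root == null) return fresh;
        Node cur = root;
        while (true) {
            if (key < cur.key) {
                if (cur.left == null) { cur.left = fresh; return root; }
                cur = cur.left;
            } else {
                if (cur.right == null) { cur.right = fresh; return root; }
                cur = cur.right;
            }
        }
    }

    static int height(Node n) {
        return n == null ? 0 : 1 + Math.max(height(n.left), height(n.right));
    }

    static int heightAfterInserting(List<Integer> keys) {
        Node root = null;
        for (int k : keys) root = insert(root, k);
        return height(root);
    }

    public static void main(String[] args) {
        List<Integer> keys = new ArrayList<>();
        for (int i = 1; i <= 1000; i++) keys.add(i);
        // Sorted input: every key becomes a right child, so the height is 1000.
        System.out.println("sorted order:   height " + heightAfterInserting(keys));
        Collections.shuffle(keys, new Random(105));
        System.out.println("shuffled order: height " + heightAfterInserting(keys));
    }
}
```

Sorted input is the degenerate case in which the tree is effectively a linked list, while a shuffled order typically gives a much shorter tree.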

Time complexity (average case)

    Operation   Unsorted List   Ordered Table   BST
    insert()    O( 1 )          O( n )          O( log n )
    find()      O( n )          O( log n )      O( log n )
    remove()    O( n )          O( n )          O( log n )

Time complexity (worst case)

    Operation   Unsorted List   Ordered Table   BST
    insert()    O( 1 )          O( n )          O( n )
    find()      O( n )          O( log n )      O( n )
    remove()    O( n )          O( n )          O( n )

About BSTs
• AVL tree: BST that “self-balances”
  • Ensures that after every operation, at every node, the difference between the left subtree height and the right subtree height is at most 1
  • O( log n ) operation is guaranteed
• Many efficient searching methods are variants of binary search trees
  • Database indexes are commonly B-trees (number of children > 2, but the same principles apply)
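The AVL height condition can be expressed as a small check over an existing tree. This sketch only verifies the condition; it does not perform the rotations an AVL tree uses to restore balance. The class and method names are illustrative.

```java
// Sketch: check the AVL height condition on an existing binary tree.
// Assumption: this only verifies balance; it does not perform AVL rotations.
class AvlCheck {
    static final class Node {
        final int key;
        final Node left, right;
        Node(int key, Node left, Node right) {
            this.key = key;
            this.left = left;
            this.right = right;
        }
    }

    // Returns the height of the subtree, or -1 if some node violates the condition.
    static int checkedHeight(Node n) {
        if (n == null) return 0;
        int lh = checkedHeight(n.left);
        if (lh < 0) return -1;
        int rh = checkedHeight(n.right);
        if (rh < 0) return -1;
        if (Math.abs(lh - rh) > 1) return -1;   // heights differ by more than 1
        return 1 + Math.max(lh, rh);
    }

    static boolean isHeightBalanced(Node root) {
        return checkedHeight(root) >= 0;
    }
}
```

Computing the height and the balance verdict in the same pass keeps the check linear in the number of nodes.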