Topic 25 Tries

“In 1959, (Edward) Fredkin recommended that BBN (Bolt, Beranek and Newman, now BBN Technologies) purchase the very first PDP-1 to support research projects at BBN. The PDP-1 came with no software whatsoever. Fredkin wrote a PDP-1 assembler called FRAP (Free of Rules Assembly Program);”

Tries were first described by René de la Briandais in “File searching using variable length keys”.

Clicker 1
• How would you pronounce “Trie”?
  A. “tree”
  B. “tri – ee”
  C. “try”
  D. “tiara”
  E. something else
CS 314 Tries

Tries aka Prefix Trees
• Pronunciation:
• From retrieval
• Name coined by computer scientist Edward Fredkin
• Retrieval, so “tree”
• … but that is very confusing, so most people pronounce it “try”

Predictive Text and AutoComplete
• Search engines and texting applications guess what you want after typing only a few characters

AutoComplete
• So do other programs such as IDEs

Searching a Dictionary
• How?
• Could search a set for all values that start with the given prefix.
• Naively O(N) (search the whole data structure).
• Could improve if it is possible to do a binary search for the prefix and then localize the search to that location.
• May be more difficult if the prefix is not actually in the set or dictionary
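The two approaches above can be sketched in Java as follows. This is only an illustration under stated assumptions: the class and method names (PrefixSearch, scan, sortedSearch) are hypothetical, and sortedSearch assumes the words are already in a sorted array with no duplicates. When the prefix itself is not in the array, Arrays.binarySearch returns a negative value that encodes the insertion point, which is one way to handle the "prefix not in the dictionary" case.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class PrefixSearch {
    // Naive approach: check every word, O(N) comparisons.
    static List<String> scan(String[] words, String prefix) {
        List<String> result = new ArrayList<>();
        for (String w : words) {
            if (w.startsWith(prefix)) {
                result.add(w);
            }
        }
        return result;
    }

    // Assumes 'sorted' is in natural String order with no duplicates.
    // Binary search finds where the prefix is (or would be), then we
    // walk forward while the words still start with the prefix.
    static List<String> sortedSearch(String[] sorted, String prefix) {
        int pos = Arrays.binarySearch(sorted, prefix);
        if (pos < 0) {
            pos = -(pos + 1); // prefix itself absent: use its insertion point
        }
        List<String> result = new ArrayList<>();
        while (pos < sorted.length && sorted[pos].startsWith(prefix)) {
            result.add(sorted[pos]);
            pos++;
        }
        return result;
    }
}
```

The sorted version still needs the array to stay sorted as words are added, which is part of the motivation for a structure organized around prefixes.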

Tries
• A general tree
• Root node (or possibly a list of root nodes)
• Nodes can have many children – not a binary tree
• In simplest form each node stores a character and a data structure (list?) to refer to its children
• Stores all the words or phrases in a dictionary.
• How?

René de la Briandais Original Paper

?? Picture of a Dinosaur

Can

Candy

Fox

Clicker 2
• Is “fast” in the dictionary represented by this Trie?
  A. No
  B. Yes
  C. It depends

Clicker 3
• Is “fist” in the dictionary represented by this Trie?
  A. No
  B. Yes
  C. It depends

Tries
• Another example of a Trie
• Each node stores:
  – A char
  – A boolean indicating if the string ending at that node is a word
  – A list of children
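A minimal Java sketch of a node with those three fields, plus adding a word and checking membership, might look like the following. The names SimpleTrie, getChild, add, and contains are assumptions made for illustration, not the course's actual code, and the dummy root that holds no letter is one possible design among several.

```java
import java.util.LinkedList;
import java.util.List;

class SimpleTrie {
    // One node per character; the field names are illustrative assumptions.
    static class TNode {
        char ch;
        boolean isWord; // true if the path to this node spells a word
        List<TNode> children = new LinkedList<>();

        TNode(char ch) {
            this.ch = ch;
        }

        // Linear scan of the child list; returns null if no child holds c.
        TNode getChild(char c) {
            for (TNode child : children) {
                if (child.ch == c) {
                    return child;
                }
            }
            return null;
        }
    }

    private final TNode root = new TNode('\0'); // dummy root, holds no letter

    // Walk down one node per character, creating nodes that are missing.
    void add(String word) {
        TNode cur = root;
        for (int i = 0; i < word.length(); i++) {
            char c = word.charAt(i);
            TNode next = cur.getChild(c);
            if (next == null) {
                next = new TNode(c);
                cur.children.add(next);
            }
            cur = next;
        }
        cur.isWord = true;
    }

    // A word is present only if its path exists AND ends at a marked node.
    boolean contains(String word) {
        TNode cur = root;
        for (int i = 0; i < word.length() && cur != null; i++) {
            cur = cur.getChild(word.charAt(i));
        }
        return cur != null && cur.isWord;
    }
}
```

With "can" and "candy" added, contains("cand") is false even though the path c, a, n, d exists, because the node for d is not marked as the end of a word.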

Predictive Text and AutoComplete
• As characters are entered we descend the Trie
• … and from the current node …
• … we can descend to terminators and leaves to see all possible words based on current prefix
• b, e, e -> bee, been, bees
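The descent described above could be sketched in Java roughly as follows. AutoCompleteTrie, wordsWithPrefix, and collect are hypothetical names; the sketch assumes the same node layout as the slides (a char, a boolean, a list of children). Results come back in the order the children were inserted, not necessarily alphabetical order.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

class AutoCompleteTrie {
    private static class TNode {
        final char ch;
        boolean isWord;
        final List<TNode> children = new LinkedList<>();

        TNode(char ch) {
            this.ch = ch;
        }

        TNode getChild(char c) {
            for (TNode child : children) {
                if (child.ch == c) {
                    return child;
                }
            }
            return null;
        }
    }

    private final TNode root = new TNode('\0'); // dummy root, holds no letter

    void add(String word) {
        TNode cur = root;
        for (char c : word.toCharArray()) {
            TNode next = cur.getChild(c);
            if (next == null) {
                next = new TNode(c);
                cur.children.add(next);
            }
            cur = next;
        }
        cur.isWord = true;
    }

    // Descend along the prefix, then collect every word below that node.
    List<String> wordsWithPrefix(String prefix) {
        List<String> result = new ArrayList<>();
        TNode cur = root;
        for (char c : prefix.toCharArray()) {
            cur = cur.getChild(c);
            if (cur == null) {
                return result; // no word starts with this prefix
            }
        }
        collect(cur, new StringBuilder(prefix), result);
        return result;
    }

    // Pre-order traversal; 'path' always holds the characters from the
    // root down to 'node'.
    private void collect(TNode node, StringBuilder path, List<String> out) {
        if (node.isWord) {
            out.add(path.toString());
        }
        for (TNode child : node.children) {
            path.append(child.ch);
            collect(child, path, out);
            path.deleteCharAt(path.length() - 1);
        }
    }
}
```

The cost of the descent depends on the prefix length and on how many children are scanned at each level, not on the total number of words stored.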

Tries
• Stores words and phrases.
  – other values possible, but typically Strings
• The whole word or phrase is not actually stored in a single node.
• … rather the path in the tree represents the word.

Implementing a Trie

TNode Class
• Basic implementation uses a LinkedList of TNode objects for children
• Other options?
  – ArrayList?
  – Something more exotic?
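As one possible answer to "something more exotic", the children could be kept in a map keyed by character. The sketch below is an assumption for illustration, not the course's TNode: MapTrieNode is a hypothetical name, and it uses java.util.TreeMap so that child lookup is O(log k) for k children and the children iterate in sorted order.

```java
import java.util.Map;
import java.util.TreeMap;

class MapTrieNode {
    boolean isWord;
    // Key = the child's character, so a node need not store its own char.
    final Map<Character, MapTrieNode> children = new TreeMap<>();

    // Adds the part of 'word' starting at index 'pos' below this node.
    void add(String word, int pos) {
        if (pos == word.length()) {
            isWord = true;
            return;
        }
        children.computeIfAbsent(word.charAt(pos), k -> new MapTrieNode())
                .add(word, pos + 1);
    }
}
```

Compared with a linked list, the map avoids a linear scan at each level, at the price of more memory per node.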

Basic Operations
• Adding a word to the Trie
• Getting all words with given prefix
• Demo in IDE

Compressed Tries
• Some words, especially long ones, lead to a chain of nodes with a single child, followed by a single child:
[Figure: trie with node letters s b e a r i l l d u o y y e t l o l c k p]

Compressed Trie
• Reduce number of nodes by having nodes store Strings
• A chain of single child followed by single child (followed by single child …) is compressed to a single node with that String
• Does not have to be a chain that terminates in a leaf node
  – Can be an internal chain of nodes

Original, Uncompressed
[Figure: uncompressed trie with node letters s b e a r i l l d u y s e t l o l y c p k]

Compressed Version
[Figure: compressed trie with node labels s b e ar id ll ell u sy to ck p y; label: s–t–o–c–k]
• 8 fewer nodes compared to uncompressed version
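The compression step can be sketched in Java as below. Everything here is an assumption for illustration: the names CompressedTrieSketch, compress, and countBelow are hypothetical, and the word list used in the usage note (bear, bell, bid, busy, buy, sell, stock, stop) is inferred from the node labels in the two figures rather than stated on the slides. The sketch builds an ordinary one-character-per-node trie first and then merges each node with its only child, stopping at nodes that end a word so no word is lost.

```java
import java.util.ArrayList;
import java.util.List;

class CompressedTrieSketch {
    static class Node {
        String label; // one character before compression, maybe more after
        boolean isWord;
        List<Node> children = new ArrayList<>();

        Node(String label) {
            this.label = label;
        }
    }

    // Standard insertion; must be used BEFORE compress, while every
    // label is still a single character.
    static void add(Node root, String word) {
        Node cur = root;
        for (char c : word.toCharArray()) {
            Node next = null;
            for (Node child : cur.children) {
                if (child.label.charAt(0) == c) {
                    next = child;
                }
            }
            if (next == null) {
                next = new Node(String.valueOf(c));
                cur.children.add(next);
            }
            cur = next;
        }
        cur.isWord = true;
    }

    // Repeatedly absorb an only child into its parent, unless the parent
    // is the root or already ends a word, then recurse into the children.
    static void compress(Node node, boolean isRoot) {
        while (!isRoot && !node.isWord && node.children.size() == 1) {
            Node only = node.children.get(0);
            node.label += only.label;
            node.isWord = only.isWord;
            node.children = only.children;
        }
        for (Node child : node.children) {
            compress(child, false);
        }
    }

    // Counts the nodes below 'node', so the root itself is not counted.
    static int countBelow(Node node) {
        int count = 0;
        for (Node child : node.children) {
            count += 1 + countBelow(child);
        }
        return count;
    }
}
```

For the assumed word list, countBelow on the root gives 21 before compress(root, true) and 13 after, and the chain s, t, o, c, k becomes the three nodes s, to, ck.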

Data Structures
• Data structures we have studied
  – arrays, array based lists, linked lists, maps, sets, stacks, queues, trees, binary search trees, graphs, hash tables, red-black trees, priority queues, heaps, tries
• Most programming languages have some built-in data structures, native or library
• Must be familiar with performance of data structures
  – best learned by implementing them yourself

Data Structures
• We have not covered every data structure
http://en.wikipedia.org/wiki/List_of_data_structures

Data Structures
• deque, b-trees, quad-trees, binary space partition trees, skip list, sparse matrix, union-find data structure, Bloom filters, AVL trees, 2-3-4 trees, and more!
• Must be able to learn and apply new data structures