The TRIE
Amihood Amir
Labeled Trees
Edge labeled tree: T = (V, E, ℓ), where ℓ: V → Σ and Σ is the alphabet.
Example: Σ = {A, B, C} (figure: a tree whose labels are A, A, B, B, C)
Path String
A path v_0, …, v_i in an edge labeled tree defines the path string ℓ(v_0), …, ℓ(v_i), the sequence of labels of the vertices on the path.
Example (figure: a path marked in a labeled tree; surviving labels A, B, C): Path string: AAB
Root Paths
A root path v_0, …, v_i in an edge labeled tree is a path that starts at the root, i.e. v_0 is the root of the tree.
Example (figure: one path marked "Root Path" and one marked "Not Root Path" in a tree with labels A, B, A, B, C)
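The definitions of path string and root path can be illustrated with a minimal sketch. The tree below is hypothetical (its shape is an assumption, not the figure from the slide), paths are assumed to run downward from parent to child, and the names `parent`, `label`, `is_path`, `path_string` and `is_root_path` are introduced only for this illustration.

```python
# Hypothetical five-vertex tree (its shape is an assumption, not the slide's figure).
# Vertex 0 is the root; parent[v] is the parent of v, label[v] plays the role of ℓ(v).
parent = {0: None, 1: 0, 2: 0, 3: 1, 4: 1}
label = {0: "A", 1: "A", 2: "B", 3: "B", 4: "C"}


def is_path(vertices):
    """True if each vertex is the parent of the next one (a downward path)."""
    return all(parent[child] == par for par, child in zip(vertices, vertices[1:]))


def path_string(vertices):
    """Concatenate the labels of the vertices on the path."""
    return "".join(label[v] for v in vertices)


def is_root_path(vertices):
    """A root path is a path whose first vertex is the root."""
    return bool(vertices) and is_path(vertices) and parent[vertices[0]] is None


print(path_string([0, 1, 3]))   # AAB
print(is_root_path([0, 1, 3]))  # True
print(is_root_path([1, 4]))     # False: the path starts below the root
```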
Longest Common Prefix
Let S = S[1], …, S[m] and T = T[1], …, T[n] be two strings over alphabet Σ. The Longest Common Prefix (LCP) of S and T is the string a[1], …, a[k] such that a[i] = S[i] = T[i] for i = 1, …, k, and such that either S[k+1] ≠ T[k+1] or k = min(m, n).
Example: The LCP of ABCAABCDABCCC and ABCAABCDACACC is ABCAABCDA.
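The LCP definition translates directly into a character-by-character scan. The following is a minimal sketch; the function name `longest_common_prefix` is chosen for illustration, and Python strings are 0-indexed whereas the slide indexes from 1.

```python
def longest_common_prefix(s: str, t: str) -> str:
    """Return the longest string that is a prefix of both s and t."""
    k = 0
    # Advance while both strings still have a character at position k
    # and those characters agree.
    while k < len(s) and k < len(t) and s[k] == t[k]:
        k += 1
    return s[:k]


print(longest_common_prefix("ABCAABCDABCCC", "ABCAABCDACACC"))  # ABCAABCDA
```

The scan stops at the first mismatch or when the shorter string is exhausted, so it takes time proportional to the length of the LCP.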
reTRIEval
We define a trie T of n strings
S_1 = S_1[1], …, S_1[m_1]
S_2 = S_2[1], …, S_2[m_2]
…
S_n = S_n[1], …, S_n[m_n]
over alphabet Σ by induction on n as follows. Let Λ, $ ∉ Σ.
reTRIEval – base case
For n = 1: S_1 = S_1[1], …, S_1[m_1]
The trie is the single path: Λ, S_1[1], …, S_1[m_1], $
reTRIEval – inductive case (1)
Assume we have defined the trie T_n of n strings. The trie T_{n+1} of the n+1 strings
S_1 = S_1[1], …, S_1[m_1]
S_2 = S_2[1], …, S_2[m_2]
…
S_n = S_n[1], …, S_n[m_n]
S_{n+1} = S_{n+1}[1], …, S_{n+1}[m_{n+1}]
is defined as follows:
reTRIEval – inductive case (2)
Let T_n be the trie of the n strings
S_1 = S_1[1], …, S_1[m_1]
S_2 = S_2[1], …, S_2[m_2]
…
S_n = S_n[1], …, S_n[m_n]
and let a[1], …, a[k] be the longest of the strings LCP(S_{n+1}, S_i), i = 1, …, n.
reTRIEval – inductive case (3)
Concatenate the path S_{n+1}[k+1], …, S_{n+1}[m_{n+1}], $ to the node where the root path of string a[1], …, a[k] ends. The resulting tree is T_{n+1}.
Trie construction Example
ABCABC, ABBA, ABCB, BBAB, BABC
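The inductive definition can be sketched on these five strings. This is one possible representation, not necessarily the one intended in the lecture: each node is a Python dict mapping a label to a child node, the empty dict at the top plays the role of the root Λ, and `END`, `build_trie` and `count_nodes` are names introduced for illustration. Walking down existing nodes corresponds to following the longest common prefix a[1], …, a[k]; the nodes created afterwards form the appended path that ends in $.

```python
END = "$"  # end-of-string marker, assumed not to occur in the input strings


def build_trie(strings):
    """Build a trie by adding the strings one at a time."""
    root = {}  # plays the role of the root node Λ
    for s in strings:
        node = root
        for ch in s:
            # Reuse the child if it exists (still inside the common prefix);
            # otherwise create it (the newly appended path).
            node = node.setdefault(ch, {})
        node[END] = {}  # terminate the string with $
    return root


def count_nodes(node):
    """Count this node and all nodes below it."""
    return 1 + sum(count_nodes(child) for child in node.values())


trie = build_trie(["ABCABC", "ABBA", "ABCB", "BBAB", "BABC"])
print(sorted(trie))       # ['A', 'B']
print(count_nodes(trie))  # 22
```

The count of 22 is the root plus 7, 3, 2, 5 and 4 new nodes for the five strings in order, since their longest common prefixes with the earlier strings have lengths 0, 2, 3, 0 and 1.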
Trie construction Time
For a trie T of n strings
S_1 = S_1[1], …, S_1[m_1]
S_2 = S_2[1], …, S_2[m_2]
…
S_n = S_n[1], …, S_n[m_n]
over fixed finite alphabet Σ:
Trie Insertion, Lookup, Deletion Time
For a string S = S[1], …, S[m]:
Over fixed finite alphabet Σ: O(m)
Over unbounded alphabet Σ: O(m log n)
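A minimal sketch of the three operations, each visiting at most m + 1 nodes for a string of length m. The dict-of-dicts representation and the names `insert`, `lookup` and `delete` are assumptions made for illustration. Python dicts give expected constant-time child access, which matches the fixed-alphabet bound; the O(m log n) bound would presumably correspond to keeping each node's children in a balanced search structure instead, which is not shown here.

```python
END = "$"  # end-of-string marker, assumed not to occur in the stored strings


def insert(root, s):
    """Add s to the trie rooted at root."""
    node = root
    for ch in s:
        node = node.setdefault(ch, {})
    node[END] = {}


def lookup(root, s):
    """Return True if s was stored as a whole string (not merely as a prefix)."""
    node = root
    for ch in s:
        if ch not in node:
            return False
        node = node[ch]
    return END in node


def delete(root, s):
    """Remove s if present; return True if something was removed."""
    path = [root]  # path[i] is the node reached after reading s[:i]
    for ch in s:
        if ch not in path[-1]:
            return False
        path.append(path[-1][ch])
    if END not in path[-1]:
        return False
    del path[-1][END]
    # Prune, bottom-up, nodes that no longer lead to any stored string.
    for i in range(len(s), 0, -1):
        if path[i]:
            break
        del path[i - 1][s[i - 1]]
    return True


root = {}
for word in ["ABCABC", "ABBA", "ABCB"]:
    insert(root, word)
print(lookup(root, "ABBA"))  # True
print(lookup(root, "AB"))    # False: only a prefix of stored strings
print(delete(root, "ABBA"))  # True
print(lookup(root, "ABBA"))  # False
```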
How do we deal with numbers?
A k-digit number is treated as the string composed of its k digits.
Insertion/deletion/lookup time of number m: O(log m), since m has O(log m) digits.
Compare with an AVL tree: O(log n), where n is the number of stored keys.
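As a small illustration of treating a number as its digit string, the sketch below stores non-negative integers in decimal. The representation (nested dicts) and the names `insert_number` and `contains_number` are assumptions for this example. The number of trie steps equals the number of digits of m, which is where the O(log m) bound comes from.

```python
END = "$"  # end marker, distinct from the digit characters 0-9


def insert_number(root, m):
    """Insert the non-negative integer m as the string of its decimal digits."""
    node = root
    for digit in str(m):
        node = node.setdefault(digit, {})
    node[END] = {}


def contains_number(root, m):
    """Return True if m was inserted; takes one step per decimal digit of m."""
    node = root
    for digit in str(m):
        if digit not in node:
            return False
        node = node[digit]
    return END in node


root = {}
for m in (7, 42, 421, 1024):
    insert_number(root, m)
print(contains_number(root, 421))  # True
print(contains_number(root, 4))    # False: 4 is only a prefix of 42 and 421
print(len(str(1024)))              # 4 digits, so 4 steps down the trie
```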