Binary Search Trees CMPT 225
Objectives
  Understand tree terminology
  Understand and implement tree traversals
  Define the binary search tree property
  Implement binary search trees
  Implement the Tree Sort algorithm
October 2004 John Edgar
Trees
Trees
  A tree is a set of nodes (or vertices) with a single starting point called the root
    ▪ Each node is connected by an edge to another node
  A tree is a connected graph
    ▪ There is a path to every node in the tree
    ▪ A tree has one fewer edge than it has nodes
Is it a Tree?
  [Figure: five example graphs, captioned as follows]
    ▪ NO! Not all the nodes are connected
    ▪ yes!
    ▪ yes! (but not a binary tree)
    ▪ NO! There is an extra edge (5 nodes and 5 edges)
    ▪ yes! (it’s actually the same graph as the blue one)
Tree Relationships
  Node v is said to be a child of u, and u the parent of v, if
    ▪ There is an edge between the nodes u and v, and
    ▪ u is above v in the tree
  This relationship can be generalized
    ▪ E and F are descendants of A
    ▪ D and A are ancestors of G
    ▪ B, C and D are siblings
    ▪ F and G are?
  [Diagram: a tree with root A, which is the parent of B, C and D; the remaining nodes are E, F and G]
More Tree Terminology
  A leaf is a node with no children
  A path is a sequence of nodes v1 … vn where vi is a parent of vi+1 (1 ≤ i ≤ n-1)
  A subtree is any node in the tree along with all of its descendants
  A binary tree is a tree with at most two children per node
    ▪ The children are referred to as left and right
    ▪ We can also refer to left and right subtrees
Tree Terminology Example
  [Diagram: tree with nodes A to G, annotated as follows]
    ▪ subtree rooted at B
    ▪ path from A to D to G
    ▪ leaves: C, E, F, G
Binary Tree Terminology
  [Diagram: binary tree with nodes A to J, annotated as follows]
    ▪ left subtree of A
    ▪ right child of A (node C)
    ▪ right subtree of C
Measuring Trees
  The height of a node v is the length of the longest path from v to a leaf
    ▪ The height of the tree is the height of the root
  The depth of a node v is the length of the path from v to the root
    ▪ This is also referred to as the level of a node
  Note that there is a slightly different formulation of the height of a tree
    ▪ Where the height of a tree is said to be the number of different levels of nodes in the tree (including the root)
Height of a Binary Tree
  [Diagram: binary tree with nodes A to J; the levels below the root are labelled level 1, level 2 and level 3]
    ▪ height of the tree is? 3
    ▪ height of node B is? 2
    ▪ depth of node E is? 2
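The height and depth definitions can be sketched recursively. This is only a minimal illustration: the Node type with an int payload, the function names height and depth, and the convention that an empty tree has height -1 (so that a leaf has height 0) are assumptions, not part of the slides.

```cpp
#include <algorithm>

// Hypothetical node type; the int payload is an assumption for illustration.
struct Node {
    int data;
    Node *leftChild;
    Node *rightChild;
};

// Height of node n: number of edges on the longest path from n to a leaf.
// An empty subtree is given height -1 so that a leaf has height 0.
int height(const Node *n) {
    if (n == nullptr) {
        return -1;
    }
    return 1 + std::max(height(n->leftChild), height(n->rightChild));
}

// Depth of target: number of edges on the path from root to target.
// Returns -1 if target is not in the tree rooted at root.
int depth(const Node *root, const Node *target) {
    if (root == nullptr) {
        return -1;
    }
    if (root == target) {
        return 0;
    }
    int d = depth(root->leftChild, target);
    if (d < 0) {
        d = depth(root->rightChild, target);
    }
    return (d < 0) ? -1 : d + 1;
}
```

Under the alternative formulation (height as the number of levels), each result would simply be one larger.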
Beautiful Trees
Is it a Tree (II)?
  yes!
  However, these trees are not “beautiful” (for some applications)
Perfect Binary Trees
  A binary tree is perfect if
    ▪ No node has only one child
    ▪ And all the leaves have the same depth
  A perfect binary tree of height h has how many nodes?
    ▪ 2^(h+1) - 1 nodes, of which 2^h are leaves
  [Diagram: perfect binary tree with nodes A to G]
Height of a Perfect Tree
  Each level doubles the number of nodes
    ▪ The root level (level 0) has 1 node (2^0)
    ▪ Level 1 has 2 nodes (2^1)
    ▪ Level 2 has 4 nodes (2^2), or 2 times the number in level 1
  Therefore a tree of height h (levels 0 to h) has 2^(h+1) - 1 nodes
    ▪ The bottom level has 2^h nodes, that is, just over ½ the nodes are leaves
  [Diagram: perfect binary tree whose nodes are labelled by level and position]
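The node count can be checked by building a perfect tree and counting. The sketch below assumes the root is at level 0 (so a single node has height 0); the Node type and the helper names buildPerfect, countNodes, countLeaves and destroy are hypothetical.

```cpp
// Hypothetical node type; the int payload is an assumption for illustration.
struct Node {
    int data;
    Node *leftChild;
    Node *rightChild;
};

// Builds a perfect binary tree of height h (h = 0 is a single node).
Node *buildPerfect(int h) {
    if (h < 0) {
        return nullptr;
    }
    return new Node{0, buildPerfect(h - 1), buildPerfect(h - 1)};
}

// Counts every node in the tree.
int countNodes(const Node *n) {
    if (n == nullptr) {
        return 0;
    }
    return 1 + countNodes(n->leftChild) + countNodes(n->rightChild);
}

// Counts the nodes that have no children.
int countLeaves(const Node *n) {
    if (n == nullptr) {
        return 0;
    }
    if (n->leftChild == nullptr && n->rightChild == nullptr) {
        return 1;
    }
    return countLeaves(n->leftChild) + countLeaves(n->rightChild);
}

// Frees the whole tree (children before the node itself).
void destroy(Node *n) {
    if (n != nullptr) {
        destroy(n->leftChild);
        destroy(n->rightChild);
        delete n;
    }
}
```

For h = 3 this gives 2^4 - 1 = 15 nodes, 2^3 = 8 of them leaves.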
Complete Binary Trees
  A binary tree is complete if
    ▪ The leaves are on at most two different levels,
    ▪ The second to bottom level is completely filled in, and
    ▪ The leaves on the bottom level are as far to the left as possible
  Perfect trees are also complete
  [Diagram: complete binary tree with nodes A to F]
Balanced Binary Trees
  A binary tree is balanced if
    ▪ Leaves are all about the same distance from the root
    ▪ The exact specification varies
  Sometimes trees are balanced by comparing the height of nodes
    ▪ e.g. the height of a node’s right subtree differs by at most one from the height of its left subtree
  Sometimes a tree's height is compared to the number of nodes
    ▪ e.g. red-black trees
Balanced Binary Trees
  [Diagrams: two example balanced trees, one with nodes A to F and one with nodes A to G]
Unbalanced Binary Trees
  [Diagrams: two example unbalanced trees]
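One of the balance definitions above compares subtree heights at every node. A minimal sketch of that check follows; it is only one possible specification of "balanced", and the Node type and the names checkedHeight and isBalanced are hypothetical.

```cpp
#include <algorithm>
#include <cstdlib>

// Hypothetical node type; the int payload is an assumption for illustration.
struct Node {
    int data;
    Node *leftChild;
    Node *rightChild;
};

// Returns the height of the subtree rooted at n (empty tree = -1),
// or -2 if some node in it has subtrees whose heights differ by more than one.
int checkedHeight(const Node *n) {
    if (n == nullptr) {
        return -1;
    }
    int lh = checkedHeight(n->leftChild);
    if (lh == -2) {
        return -2;
    }
    int rh = checkedHeight(n->rightChild);
    if (rh == -2) {
        return -2;
    }
    if (std::abs(lh - rh) > 1) {
        return -2;  // this node violates the height condition
    }
    return 1 + std::max(lh, rh);
}

// True if every node's left and right subtree heights differ by at most one.
bool isBalanced(const Node *root) {
    return checkedHeight(root) != -2;
}
```

Each node is visited once, so the check runs in O(n) time.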
Tree Traversals
Binary Tree Traversals
  A traversal algorithm for a binary tree visits each node in the tree
    ▪ Typically, it will do something while visiting each node!
  Traversal algorithms are naturally recursive
  There are three traversal methods
    ▪ Inorder
    ▪ Preorder
    ▪ Postorder
InOrder Traversal Algorithm (C++)
  // InOrder traversal: left subtree, then the node, then right subtree
  void inOrder(Node *n) {
      if (n != nullptr) {
          inOrder(n->leftChild);
          visit(n);
          inOrder(n->rightChild);
      }
  }
InOrder Traversal
  [Diagram: example binary tree with nodes A to F]
PreOrder Traversal Algorithm (C++)
  // PreOrder traversal: the node, then left subtree, then right subtree
  void preOrder(Node *n) {
      if (n != nullptr) {
          visit(n);
          preOrder(n->leftChild);
          preOrder(n->rightChild);
      }
  }
PreOrder Traversal
  Each call performs: visit(n), preOrder(n->leftChild), preOrder(n->rightChild)
  [Diagram: the tree with root 17 annotated with the order in which nodes are visited]
    ▪ Visit order: 17, 13, 9, 11, 16, 27, 20, 39
PostOrder Traversal Algorithm (C++)
  // PostOrder traversal: left subtree, then right subtree, then the node
  void postOrder(Node *n) {
      if (n != nullptr) {
          postOrder(n->leftChild);
          postOrder(n->rightChild);
          visit(n);
      }
  }
PostOrder Traversal
  Each call performs: postOrder(n->leftChild), postOrder(n->rightChild), visit(n)
  [Diagram: the tree with root 17 annotated with the order in which nodes are visited]
    ▪ Visit order: 11, 9, 16, 13, 20, 39, 27, 17
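The three traversals can be compared side by side on the eight-node example tree. In this sketch, "visiting" a node is assumed to mean appending its value to a vector, and the tree shape is reconstructed from the visit numbers on the traversal diagrams; the Node type and buildExample are hypothetical.

```cpp
#include <vector>

// Hypothetical node type; the int payload is an assumption for illustration.
struct Node {
    int data;
    Node *leftChild;
    Node *rightChild;
};

// Node first, then left subtree, then right subtree.
void preOrder(const Node *n, std::vector<int> &out) {
    if (n != nullptr) {
        out.push_back(n->data);
        preOrder(n->leftChild, out);
        preOrder(n->rightChild, out);
    }
}

// Left subtree, then the node, then right subtree.
void inOrder(const Node *n, std::vector<int> &out) {
    if (n != nullptr) {
        inOrder(n->leftChild, out);
        out.push_back(n->data);
        inOrder(n->rightChild, out);
    }
}

// Left subtree, then right subtree, then the node.
void postOrder(const Node *n, std::vector<int> &out) {
    if (n != nullptr) {
        postOrder(n->leftChild, out);
        postOrder(n->rightChild, out);
        out.push_back(n->data);
    }
}

// Builds the example tree: 17 at the root, 13 and 27 below it,
// then 9, 16, 20, 39, with 11 as the right child of 9.
Node *buildExample() {
    Node *n11 = new Node{11, nullptr, nullptr};
    Node *n9 = new Node{9, nullptr, n11};
    Node *n16 = new Node{16, nullptr, nullptr};
    Node *n13 = new Node{13, n9, n16};
    Node *n20 = new Node{20, nullptr, nullptr};
    Node *n39 = new Node{39, nullptr, nullptr};
    Node *n27 = new Node{27, n20, n39};
    return new Node{17, n13, n27};
}
```

On this tree the in-order result comes out sorted, because the tree satisfies the binary search tree property.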
Binary Tree Implementation and Binary Search Trees
Binary Tree Implementation
  The binary tree can be implemented using a number of data structures
    ▪ Reference structures (similar to linked lists)
    ▪ Arrays
  We will look at three implementations
    ▪ Binary search trees (references / pointers)
    ▪ Red-black trees (references / pointers)
    ▪ Heap (arrays)
Problem: Accessing Sorted Data
  Consider maintaining data in some order
    ▪ The data is to be frequently searched on the sort key, e.g. a dictionary
  Possible solutions might be:
    ▪ A sorted array: access in O(log n) using binary search, but insertion and deletion in linear time
    ▪ An ordered linked list: access, insertion and deletion in linear time
  Neither of these is efficient for all three operations
Dictionary Operations
  The data structure should be able to perform all these operations efficiently
    ▪ Create an empty dictionary
    ▪ Insert
    ▪ Delete
    ▪ Look up
  The insert, delete and look up operations should be performed in at most O(log n) time
Binary Search Tree Property
  A binary search tree (BST) is a binary tree with a special property
  For all nodes in the tree:
    ▪ All nodes in a left subtree have labels less than the label of the node
    ▪ All nodes in a right subtree have labels greater than or equal to the label of the node
  Binary search trees are fully ordered
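BST Example
  [Diagram: BST with root 17; 13 and 27 are the children of 17; 9 and 16 are the children of 13; 20 and 39 are the children of 27; 11 is the right child of 9]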
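The BST property can be checked recursively by passing down the bounds that each ancestor imposes. A minimal sketch, assuming int labels; the Node type and the function name isBST are hypothetical, and the bounds follow the property as stated in these slides (left labels strictly less, right labels greater than or equal).

```cpp
// Hypothetical node type; the int label is an assumption for illustration.
struct Node {
    int data;
    Node *leftChild;
    Node *rightChild;
};

// True if every label in the subtree rooted at n satisfies lo <= label < hi
// and the BST property holds at every node. A null bound means "unbounded".
bool isBST(const Node *n, const int *lo = nullptr, const int *hi = nullptr) {
    if (n == nullptr) {
        return true;
    }
    if (lo != nullptr && n->data < *lo) {
        return false;  // too small for a right subtree of an ancestor
    }
    if (hi != nullptr && n->data >= *hi) {
        return false;  // too large for a left subtree of an ancestor
    }
    // Left labels must be < n->data; right labels must be >= n->data.
    return isBST(n->leftChild, lo, &n->data) &&
           isBST(n->rightChild, &n->data, hi);
}
```

Checking only each node against its immediate children is not enough: a value can sit correctly relative to its parent while violating the bound set by a grandparent, which is why the bounds are carried down.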
BST InOrder Traversal
  Each call performs: inOrder(n->leftChild), visit(n), inOrder(n->rightChild)
  [Diagram: the example BST with root 17 annotated with the order in which nodes are visited]
    ▪ Visit order: 9, 11, 13, 16, 17, 20, 27, 39
  An inorder traversal retrieves the data in sorted order
BST Implementation
  Binary search trees can be implemented using a reference structure
  Tree nodes contain data and two pointers to nodes
  [Diagram: a node with three fields]
    ▪ Node *leftChild and Node *rightChild: pointers to Nodes
    ▪ data: data to be stored in the tree
BST Search
  To find a value in a BST search from the root node:
    ▪ If the target is less than the value in the node search its left subtree
    ▪ If the target is greater than the value in the node search its right subtree
    ▪ Otherwise return true, or return data, etc.
  How many comparisons?
    ▪ One for each node on the path
    ▪ Worst case: height of the tree + 1
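The search procedure can be sketched as a loop that walks down from the root. This is an illustration only: the Node type, the name bstSearch and the optional visited counter (added here to make the "one comparison step per node on the path" claim observable) are assumptions.

```cpp
// Hypothetical node type; the int payload is an assumption for illustration.
struct Node {
    int data;
    Node *leftChild;
    Node *rightChild;
};

// Returns true if target is stored in the BST rooted at n.
// If visited is non-null it receives the number of nodes examined.
bool bstSearch(const Node *n, int target, int *visited = nullptr) {
    int count = 0;
    bool found = false;
    while (n != nullptr && !found) {
        ++count;
        if (target < n->data) {
            n = n->leftChild;   // target can only be in the left subtree
        } else if (target > n->data) {
            n = n->rightChild;  // target can only be in the right subtree
        } else {
            found = true;
        }
    }
    if (visited != nullptr) {
        *visited = count;
    }
    return found;
}
```

In a tree of height 3, a search that reaches the deepest leaf examines 4 nodes, matching the worst case of height + 1.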
BST Insertion
  The BST property must hold after insertion
  Therefore the new node must be inserted in the correct position
    ▪ This position is found by performing a search
    ▪ If the search ends at the (null) left child of a node make its left child refer to the new node
    ▪ If the search ends at the (null) right child of a node make its right child refer to the new node
  The cost is about the same as the cost for the search algorithm, O(height)
BST Insertion Example
  insert 43
    ▪ create new node
    ▪ find position
    ▪ insert new node
  [Diagram: BST with root 47; the search for 43 passes through 47, 32, 41 and 44, and the new node 43 becomes the left child of 44]
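Insertion can be sketched as a recursive search that creates the node where the search runs off the tree. This is a minimal illustration; the Node type and the name bstInsert are hypothetical, and equal values are sent to the right to match the BST property as stated in these slides.

```cpp
// Hypothetical node type; the int payload is an assumption for illustration.
struct Node {
    int data;
    Node *leftChild;
    Node *rightChild;
};

// Inserts value into the BST rooted at n and returns the (possibly new) root.
Node *bstInsert(Node *n, int value) {
    if (n == nullptr) {
        // The search ended at a null child: this is the new node's position.
        return new Node{value, nullptr, nullptr};
    }
    if (value < n->data) {
        n->leftChild = bstInsert(n->leftChild, value);
    } else {
        // Values greater than or equal to the node go right.
        n->rightChild = bstInsert(n->rightChild, value);
    }
    return n;
}
```

A caller would write `root = bstInsert(root, 43);` so that inserting into an empty tree also works.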
BST Deletion
  After deletion the BST property must hold
  Deletion is not as straightforward as search or insertion
    ▪ So much so that sometimes it is not even implemented!
    ▪ Instead, deleted nodes are marked as deleted in some way
  There are a number of different cases that must be considered
BST Deletion Cases
  The node to be deleted has no children
  The node to be deleted has one child
  The node to be deleted has two children
BST Deletion Cases
  The node to be deleted has no children
    ▪ Remove it (assigning null to its parent’s reference)
BST Deletion: target is a leaf
  delete 30
  [Diagram: the example BST with root 47; the target, 30, is a leaf]
BST Deletion Cases
  The node to be deleted has one child
    ▪ Replace the node with its subtree
BST Deletion: target has one child
  delete 79
    ▪ replace with subtree
  [Diagram: the example BST with root 47; the target, 79, has a single child, 96, whose children are 91 and 97]
BST Deletion: target has one child
  delete 79
    ▪ after deletion
  [Diagram: the same BST with 79 removed and the subtree rooted at 96 in its place]
BST Deletion Cases
  The node to be deleted has two children
  Replace the node with its successor, the leftmost node of its right subtree
    ▪ It is also possible to replace the node with its predecessor, the rightmost node of its left subtree
  If the replacement node has a child (and it can have at most one child) attach that child to the replacement node’s parent
    ▪ Why can a predecessor or successor have at most one child?
BST Deletion: target has 2 children
  delete 32
    ▪ find successor and detach
  [Diagram: the example BST with root 47; temp refers to the successor, 37, the leftmost node of 32's right subtree]
BST Deletion: target has 2 children
  delete 32
    ▪ find successor
    ▪ attach target node’s children to successor
  [Diagram: 19 and 41, the children of 32, are attached to 37]
BST Deletion: target has 2 children
  delete 32
    ▪ find successor
    ▪ attach target’s children to successor
    ▪ make successor child of target’s parent
  [Diagram: 37 becomes the left child of 47]
BST Deletion: target has 2 children
  delete 32
    ▪ note: successor had no subtree
  [Diagram: the BST after deletion, with 37 in the position formerly occupied by 32]
BST Deletion: target has 2 children
  delete 63
    ▪ find predecessor*: note it has a subtree
  *predecessor used instead of successor to show its location; an implementation would have to pick one or the other
  [Diagram: the example BST with root 47; temp refers to the predecessor, 59, which has a subtree containing 57]
BST Deletion: target has 2 children
  delete 63
    ▪ find predecessor
    ▪ attach predecessor’s subtree to its parent
  [Diagram: 57 is attached to 54, the parent of 59]
BST Deletion: target has 2 children
  delete 63
    ▪ find predecessor
    ▪ attach subtree
    ▪ attach target’s children to predecessor
  [Diagram: 54 and 79, the children of 63, are attached to 59]
BST Deletion: target has 2 children
  delete 63
    ▪ find predecessor
    ▪ attach subtree
    ▪ attach children
    ▪ attach predecessor to target’s parent
  [Diagram: 59 becomes the right child of 47]
BST Deletion: target has 2 children
  delete 63
  [Diagram: the BST after deletion, with 59 in the position formerly occupied by 63]
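The three deletion cases can be sketched in one recursive function. Note that this sketch handles the two-children case by copying the successor's value into the target node and then deleting the successor node, which is a common simplification and not the pointer relinking shown on the slides. The Node type and the names bstInsert and bstRemove are hypothetical.

```cpp
// Hypothetical node type; the int payload is an assumption for illustration.
struct Node {
    int data;
    Node *leftChild;
    Node *rightChild;
};

// Helper used to build trees for the example (equal values go right).
Node *bstInsert(Node *n, int value) {
    if (n == nullptr) {
        return new Node{value, nullptr, nullptr};
    }
    if (value < n->data) {
        n->leftChild = bstInsert(n->leftChild, value);
    } else {
        n->rightChild = bstInsert(n->rightChild, value);
    }
    return n;
}

// Removes one node holding value from the BST rooted at n (if present)
// and returns the root of the resulting subtree.
Node *bstRemove(Node *n, int value) {
    if (n == nullptr) {
        return nullptr;  // value is not in the tree
    }
    if (value < n->data) {
        n->leftChild = bstRemove(n->leftChild, value);
    } else if (value > n->data) {
        n->rightChild = bstRemove(n->rightChild, value);
    } else if (n->leftChild != nullptr && n->rightChild != nullptr) {
        // Two children: find the successor, the leftmost node of the
        // right subtree, copy its value here, then remove that node.
        Node *succ = n->rightChild;
        while (succ->leftChild != nullptr) {
            succ = succ->leftChild;
        }
        n->data = succ->data;
        n->rightChild = bstRemove(n->rightChild, succ->data);
    } else {
        // No children or one child: replace the node with its subtree
        // (which is null when the node is a leaf).
        Node *child = (n->leftChild != nullptr) ? n->leftChild : n->rightChild;
        delete n;
        return child;
    }
    return n;
}
```

As with insertion, the caller reassigns the root, for example `root = bstRemove(root, 32);`, because the root itself may be the node that is removed.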
BST Efficiency
  The efficiency of BST operations depends on the height of the tree
    ▪ All three operations (search, insert and delete) are O(height)
  If the tree is complete the height is floor(log2(n)), which is O(log n)
  What if it isn’t complete?
Height of a BST
  Insert 7, Insert 4, Insert 1, Insert 9, Insert 5
  [Diagram: 7 at the root; 4 and 9 are the children of 7; 1 and 5 are the children of 4]
  It’s a complete tree!
    ▪ height = floor(log2(5)) = 2
Height of a BST
  Insert 9, Insert 1, Insert 7, Insert 4, Insert 5
  [Diagram: a single chain of nodes: 9, then 1, then 7, then 4, then 5]
  It’s a linked list with a lot of extra pointers!
    ▪ height = n - 1 = 4 = O(n)
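The effect of insertion order on height can be reproduced directly. The sketch below inserts a sequence of values into an empty BST and reports the resulting height (counted in edges, empty tree = -1); the Node type and the names bstInsert, height, destroy and heightAfterInserts are hypothetical.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical node type; the int payload is an assumption for illustration.
struct Node {
    int data;
    Node *leftChild;
    Node *rightChild;
};

// Standard BST insertion (equal values go right); returns the subtree root.
Node *bstInsert(Node *n, int value) {
    if (n == nullptr) {
        return new Node{value, nullptr, nullptr};
    }
    if (value < n->data) {
        n->leftChild = bstInsert(n->leftChild, value);
    } else {
        n->rightChild = bstInsert(n->rightChild, value);
    }
    return n;
}

// Height in edges; an empty tree has height -1.
int height(const Node *n) {
    if (n == nullptr) {
        return -1;
    }
    return 1 + std::max(height(n->leftChild), height(n->rightChild));
}

// Frees the whole tree.
void destroy(Node *n) {
    if (n != nullptr) {
        destroy(n->leftChild);
        destroy(n->rightChild);
        delete n;
    }
}

// Inserts the values in the given order and returns the resulting height.
int heightAfterInserts(const std::vector<int> &values) {
    Node *root = nullptr;
    for (int v : values) {
        root = bstInsert(root, v);
    }
    int h = height(root);
    destroy(root);
    return h;
}
```

The same five values give height 2 when inserted as 7, 4, 1, 9, 5 and height 4 when inserted as 9, 1, 7, 4, 5.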
Balanced BSTs
  It would be ideal if a BST was always close to complete
    ▪ i.e. balanced
  How do we guarantee a balanced BST?
    ▪ We have to make the insertion and deletion algorithms more complex
    ▪ e.g. red-black trees
Sorting and Binary Search Trees
  It is possible to sort an array using a binary search tree
    ▪ Insert the array items into an empty tree
    ▪ Write the data from the tree back into the array using an InOrder traversal
  Running time = n * (insertion cost) + traversal
    ▪ Insertion cost is O(h)
    ▪ Traversal is O(n)
    ▪ Total = O(n) * O(h) + O(n), i.e. O(n * h)
    ▪ If the tree is balanced = O(n * log(n))
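The two steps of the sort can be sketched as follows. This is a minimal illustration using a plain (unbalanced) BST, so its running time is O(n * h) and degrades to O(n^2) on already sorted input; the Node type, the use of std::vector<int> in place of an array, and the names bstInsert, writeInOrder, destroy and treeSort are assumptions.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical node type; the int payload is an assumption for illustration.
struct Node {
    int data;
    Node *leftChild;
    Node *rightChild;
};

// Standard BST insertion (equal values go right); returns the subtree root.
Node *bstInsert(Node *n, int value) {
    if (n == nullptr) {
        return new Node{value, nullptr, nullptr};
    }
    if (value < n->data) {
        n->leftChild = bstInsert(n->leftChild, value);
    } else {
        n->rightChild = bstInsert(n->rightChild, value);
    }
    return n;
}

// In-order traversal that writes each value to out[pos] and advances pos.
void writeInOrder(const Node *n, std::vector<int> &out, std::size_t &pos) {
    if (n != nullptr) {
        writeInOrder(n->leftChild, out, pos);
        out[pos++] = n->data;
        writeInOrder(n->rightChild, out, pos);
    }
}

// Frees the whole tree.
void destroy(Node *n) {
    if (n != nullptr) {
        destroy(n->leftChild);
        destroy(n->rightChild);
        delete n;
    }
}

// Sorts values into ascending order using a temporary BST.
void treeSort(std::vector<int> &values) {
    Node *root = nullptr;
    for (int v : values) {
        root = bstInsert(root, v);       // step 1: n insertions, O(h) each
    }
    std::size_t pos = 0;
    writeInOrder(root, values, pos);     // step 2: O(n) traversal
    destroy(root);
}
```

Duplicates are preserved because equal values are inserted into the right subtree rather than discarded.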
Tree Quiz
Tree Quiz I
  Write a recursive function to print the items in a BST in descending order
  class Node {
  public:
      int data;
      Node *leftc;
      Node *rightc;
  };
Tree Quiz II
  Write a recursive function to delete a BST stored in dynamic memory
  class Node {
  public:
      int data;
      Node *leftc;
      Node *rightc;
  };
Summary
Summary
  Trees
    ▪ Terminology: paths, height, node relationships, …
  Binary search trees
    ▪ Traversal: post-order, pre-order, in-order
    ▪ Operations: insert, delete, search
  Balanced trees
    ▪ Binary search tree operations are efficient for balanced trees
Readings
  Carrano Ch. 10