MCA 301: Design and Analysis of Algorithms
Instructor: Neelima Gupta (ngupta@cs.du.ac.in)

Expected Running Times and Randomized Algorithms

Expected Running Time of Insertion Sort (at r-th position)
  • Input: $x_1, x_2, \ldots, x_{i-1}, x_i, \ldots, x_n$
  • For $i = 2$ to $n$: insert the $i$-th element $x_i$ into the partially sorted list $x_1, x_2, \ldots, x_{i-1}$.

Expected Running Time of Insertion Sort
  • Let $X_i$ be the random variable that represents the number of comparisons required to insert the $i$-th element of the input array into the sorted subarray of the first $i-1$ elements.
  • $X_i$ takes values in $1, \ldots, i-1$. Let $x_{ij}$ denote the number of comparisons made when $x_i$ is inserted at the $j$-th position, $1 \le j \le i$.
  • $E(X_i) = \sum_{j=1}^{i} x_{ij} \, p(x_{ij})$, where $E(X_i)$ is the expected value of $X_i$ and $p(x_{ij})$ is the probability of inserting $x_i$ at the $j$-th position.

Expected Running Time of Insertion Sort (at j-th position)
  • Input: $x_1, x_2, \ldots, x_{i-1}, x_i, \ldots, x_n$
  • How many comparisons does it take to insert the $i$-th element at the $j$-th position?

Position          i    i-1   i-2   ...   2     1
# of comparisons  1    2     3     ...   i-1   i-1

Note: positions 2 and 1 both need $i-1$ comparisons. Why? To place the element at position 2 we must compare it with the element currently in first place, and that same comparison also tells us which of the two comes first and which comes second.

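Thus,
$E(X_i) = \frac{1}{i} \left\{ \sum_{k=1}^{i-1} k + (i-1) \right\}$,
where $1/i$ is the probability of inserting at the $j$-th position among the $i$ possible positions (all positions are equally likely).
For $n$ elements,
$E(X_2 + \cdots + X_n) = \sum_{i=2}^{n} E(X_i) = \sum_{i=2}^{n} \frac{1}{i} \left\{ \sum_{k=1}^{i-1} k + (i-1) \right\} = \frac{n(n+3)}{4} - H_n$,
where $H_n = \sum_{i=1}^{n} 1/i$ is the $n$-th harmonic number.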
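The closed form can be checked step by step. The following is a worked sketch that uses only the sum on this slide; the harmonic number $H_n = \sum_{i=1}^{n} 1/i$ is notation introduced here for convenience.

```latex
\begin{align*}
E(X_i) &= \frac{1}{i}\left\{\frac{i(i-1)}{2} + (i-1)\right\}
        = \frac{i-1}{2} + 1 - \frac{1}{i},\\
\sum_{i=2}^{n} E(X_i) &= \frac{n(n-1)}{4} + (n-1) - (H_n - 1)
        = \frac{n(n+3)}{4} - H_n .
\end{align*}
```

For $n = 2$ this gives $\frac{2 \cdot 5}{4} - \frac{3}{2} = 1$, which matches the single comparison needed to order two elements.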

For $n$ elements, the expected number of comparisons is
$T = \sum_{i=2}^{n} \frac{1}{i} \left\{ \sum_{k=1}^{i-1} k + (i-1) \right\}$,
where $1/i$ is the probability of inserting at the $r$-th position among the $i$ possible positions.
By linearity of expectation, $E(X_2 + \cdots + X_n) = \sum_{i=2}^{n} E(X_i)$, where $X_i$ is the number of comparisons made while inserting the $i$-th element.
$T = \frac{n(n+3)}{4} - H_n$, with $H_n = \sum_{i=1}^{n} 1/i$.
Therefore the average case of insertion sort takes $\Theta(n^2)$ time.
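The expected count can be checked empirically for small $n$. The sketch below is not from the slides: it assumes every permutation of the input is equally likely, counts key comparisons in a plain insertion sort, and compares the exact average with $n(n+3)/4 - H_n$, the value obtained by evaluating the double sum. The function names are illustrative.

```python
from fractions import Fraction
from itertools import permutations
from math import factorial

def insertion_sort_comparisons(items):
    """Sort a copy of items by insertion; return the number of key comparisons."""
    a = list(items)
    comparisons = 0
    for i in range(1, len(a)):
        x = a[i]
        j = i - 1
        while j >= 0:
            comparisons += 1      # compare x with a[j]
            if a[j] <= x:
                break             # x belongs right after a[j]
            a[j + 1] = a[j]       # shift the larger element one place right
            j -= 1
        a[j + 1] = x
    return comparisons

def average_comparisons(n):
    """Exact average over all n! orderings of n distinct keys."""
    total = sum(insertion_sort_comparisons(p) for p in permutations(range(n)))
    return Fraction(total, factorial(n))

def closed_form(n):
    """n(n+3)/4 - H_n, the value of the double sum for n elements."""
    harmonic = sum(Fraction(1, i) for i in range(1, n + 1))
    return Fraction(n * (n + 3), 4) - harmonic

for n in range(1, 7):
    print(n, average_comparisons(n), closed_form(n))
```

Exact fractions are used so the two columns can be compared without rounding error.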

Quick-Sort
  • Pick the first item from the array and call it the pivot.
  • Partition the items in the array around the pivot so that all elements to the left are ≤ the pivot and all elements to the right are greater than the pivot.
  • Use recursion to sort the two partitions.
  partition 1: items ≤ pivot | pivot | partition 2: items > pivot
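The three steps can be sketched as follows. This is a minimal illustration rather than the course's own code: it returns a new list instead of partitioning in place, and the function name is made up for this example.

```python
def quicksort_first_pivot(items):
    """Return a sorted copy of items, always using the first item as the pivot."""
    if len(items) <= 1:
        return list(items)
    pivot, rest = items[0], items[1:]
    left = [x for x in rest if x <= pivot]    # partition 1: items <= pivot
    right = [x for x in rest if x > pivot]    # partition 2: items > pivot
    return quicksort_first_pivot(left) + [pivot] + quicksort_first_pivot(right)

print(quicksort_first_pivot([3, 1, 2, 3, 0]))
```

On an already sorted input the left partition is always empty, so the recursion depth grows linearly with the input size.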

Quicksort: Expected number of comparisons
  • Partition may generate any of the splits (0 : n-1, 1 : n-2, 2 : n-3, ..., n-2 : 1, n-1 : 0), each with probability 1/n (assuming every ordering of the input is equally likely).
  • If T(n) is the expected running time, it satisfies a recurrence that averages over these n splits.
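The recurrence itself did not survive on this slide. A standard form, assuming that partitioning $n$ items costs $n-1$ comparisons and that $T(0) = T(1) = 0$ (both are assumptions, not taken from the slide), is:

```latex
\begin{align*}
T(n) &= (n-1) + \frac{1}{n}\sum_{k=0}^{n-1}\bigl(T(k) + T(n-1-k)\bigr)
      = (n-1) + \frac{2}{n}\sum_{k=0}^{n-1} T(k),\\
T(n) &= 2(n+1)H_n - 4n = O(n \log n), \qquad H_n = \sum_{i=1}^{n} \frac{1}{i}.
\end{align*}
```

As a check, $T(2) = 1$ and $T(3) = 2 + \frac{2}{3}(0 + 0 + 1) = \frac{8}{3}$, and the closed form gives the same values.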

Randomized Quick-Sort
  • Pick an element from the array at random and call it the pivot.
  • Partition the items in the array around the pivot so that all elements to the left are ≤ the pivot and all elements to the right are greater than the pivot.
  • Use recursion to sort the two partitions.
  partition 1: items ≤ pivot | pivot | partition 2: items > pivot
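A minimal sketch of the randomized variant, assuming the pivot is drawn uniformly at random from the current subarray. The function name and the `rng` parameter are illustrative; `rng` exists only so that a seeded generator can be passed in for repeatable runs.

```python
import random

def randomized_quicksort(items, rng=random):
    """Return a sorted copy of items, choosing each pivot uniformly at random."""
    if len(items) <= 1:
        return list(items)
    i = rng.randrange(len(items))             # the coin tosses happen here
    pivot = items[i]
    rest = items[:i] + items[i + 1:]
    left = [x for x in rest if x <= pivot]    # partition 1: items <= pivot
    right = [x for x in rest if x > pivot]    # partition 2: items > pivot
    return randomized_quicksort(left, rng) + [pivot] + randomized_quicksort(right, rng)

print(randomized_quicksort([5, 2, 9, 2, 7], random.Random(1)))
```

The output is the same sorted list whatever the random choices are; only the amount of work varies from run to run.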

Remarks
  • Randomized quicksort is not much different from quicksort, except that earlier the algorithm was deterministic and the bounds were probabilistic.
  • Here the algorithm itself is also randomized: we pick the pivot at random. Note that from that point onwards the algorithm behaves exactly as before.
  • In the earlier case we can identify a worst-case input. Here no particular input is the worst case.

Randomized Select


Randomized Algorithms
  • A randomized algorithm performs coin tosses (i.e., uses random bits) to control its execution:
      i ← random()
      if i = 0
          do A ...
      else    { i.e. i = 1 }
          do B ...
  • Its running time depends on the outcomes of the coin tosses.

Assumptions
  • Coins are unbiased, and
  • coin tosses are independent.
  • The worst-case running time of a randomized algorithm may be large, but it occurs with very low probability (e.g., it occurs when all the coin tosses give "heads").

Monte Carlo Algorithms
  • Running times are guaranteed, but the output may not be completely correct.
  • The probability of error is low.

Las Vegas Algorithms
  • Output is guaranteed to be correct.
  • Bounds on running times hold with high probability.
  • What type of algorithm is randomized quicksort?
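The contrast between the two classes can be illustrated with a toy search problem that is not part of the slides: find the index of a target value by probing random positions. Both function names are hypothetical. The Las Vegas version assumes the target is present, otherwise it never terminates.

```python
import random

def las_vegas_find(items, target, rng=random):
    """Always returns a correct index; the number of probes is random.

    Assumes target occurs in items, otherwise this loops forever.
    """
    while True:
        i = rng.randrange(len(items))
        if items[i] == target:
            return i

def monte_carlo_find(items, target, trials, rng=random):
    """Makes at most `trials` probes; may give up (return None) even if target is present."""
    for _ in range(trials):
        i = rng.randrange(len(items))
        if items[i] == target:
            return i
    return None

items = ["a", "b"] * 50                  # half of the entries are "a"
rng = random.Random(0)
print(las_vegas_find(items, "a", rng))
print(monte_carlo_find(items, "a", 5, rng))
```

With half of the entries equal to the target, each probe succeeds with probability 1/2, so the bounded version fails with probability $(1/2)^5$ and the unbounded version needs two probes on average.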

Why expected running times?
  • Markov's inequality: for a nonnegative random variable $X$ and any $k > 0$, $P(X \ge k\,E(X)) \le 1/k$.
  • That is, the probability that the algorithm takes more than $2E(X)$ time is at most 1/2, and the probability that it takes more than $10E(X)$ time is at most 1/10.
  • This is one reason why quicksort does well in practice.
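For reference, Markov's inequality follows in one line from the definition of expectation. The sketch below assumes $X$ is a discrete, nonnegative random variable with $E(X) > 0$.

```latex
\begin{align*}
E(X) = \sum_{x} x \, P(X = x)
     &\ge \sum_{x \ge k E(X)} x \, P(X = x)
     \ge k\,E(X) \cdot P\bigl(X \ge k\,E(X)\bigr),\\
\text{so}\quad P\bigl(X \ge k\,E(X)\bigr) &\le \frac{1}{k}.
\end{align*}
```

The bound uses nothing about $X$ except its mean, which is why it is weak compared with bounds that exploit independence.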

  • Markov's bound: $P(X \ge k\mu) \le 1/k$, where $k$ is a constant and $\mu = E(X)$. For example, $P(X \ge 2\mu) \le 1/2$.
  • Chernoff's bound: a much stronger result, of the form $P(X > k\mu) < 1/n^k$, where $k$ is a constant.

Binary Search Tree
  • What is a binary search tree?
  • A BST is a possibly empty rooted tree with a key value, a possibly empty left subtree and a possibly empty right subtree.
  • Each of the left subtree and the right subtree is a BST.
  • Keys in the left subtree are ≤ the root's key, and keys in the right subtree are greater than it.

Binary Search Tree
  • Pick the first item from the array and call it the pivot. It becomes the root of the BST.
  • Partition the items in the array around the pivot so that all elements to the left are ≤ the pivot and all elements to the right are greater than the pivot.
  • Recursively build a BST on each partition. They become the left and the right subtrees of the root.

Binary Search Tree
  • Consider the following input: 1, 2, 3, ..., 10,000.
  • What is the time for construction?
  • Search time?

Randomly Built Binary Search Tree
  • Pick an item from the array at random and call it the pivot. It becomes the root of the BST.
  • Partition the items in the array around the pivot so that all elements to the left are ≤ the pivot and all elements to the right are greater than the pivot.
  • Recursively build a BST on each partition. They become the left and the right subtrees of the root.
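The two construction rules (first item as root versus random item as root) can be compared with a small sketch. This is an illustration, not the course's code: trees are represented as nested tuples, the helper names are made up, and the input is 1..200 rather than 1..10,000 so that the recursion stays well within Python's default recursion limit.

```python
import random

def build_bst(keys, pick_index):
    """Build a BST as nested tuples (left, key, right); None is the empty tree.

    pick_index(n) chooses which of the n remaining keys becomes the root.
    """
    if not keys:
        return None
    i = pick_index(len(keys))
    pivot = keys[i]
    rest = keys[:i] + keys[i + 1:]
    left = [k for k in rest if k <= pivot]
    right = [k for k in rest if k > pivot]
    return (build_bst(left, pick_index), pivot, build_bst(right, pick_index))

def height(tree):
    """Number of nodes on the longest root-to-leaf path (0 for the empty tree)."""
    if tree is None:
        return 0
    left, _, right = tree
    return 1 + max(height(left), height(right))

def inorder(tree):
    """Keys in sorted order."""
    if tree is None:
        return []
    left, key, right = tree
    return inorder(left) + [key] + inorder(right)

keys = list(range(1, 201))
first_pivot_tree = build_bst(keys, lambda n: 0)      # always pick the first item
random_pivot_tree = build_bst(keys, random.Random(2009).randrange)

print(height(first_pivot_tree))    # sorted input degenerates into a path
print(height(random_pivot_tree))   # typically a small multiple of log2(200)
```

With the first-item rule, sorted input puts every key in the right subtree of its predecessor, so the height equals the number of keys.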

Example
  • Consider the input 10, 20, 30, 40, 50, 60, 70, 80, 90, 100.

Height of the RBST
  • WLOG, assume that the keys are distinct. (What if they are not?)
  • Rank(x) = number of elements < x.
  • Let $X_i$ be the height of the tree rooted at a node with rank $i$.
  • Let $Y_i = 2^{X_i}$ be the exponential height of that tree.
  • Let $H$ be the height of the entire BST. Then $H = \max\{H_1, H_2\} + 1$, where $H_1$ is the height of the left subtree and $H_2$ is the height of the right subtree.

  • $Y = 2^H = 2 \cdot \max\{2^{H_1}, 2^{H_2}\}$.
  • $E(EH(T(n)))$: expected value of the exponential height of the tree with $n$ nodes.
  • $E(EH(T(n))) = \frac{2}{n} \sum_{k=0}^{n-1} E(\max\{EH(T(k)), EH(T(n-1-k))\}) = O(n^3)$.
  • $E(H(T(n))) = E(\log EH(T(n))) \le \log E(EH(T(n))) = O(\log n)$, by Jensen's inequality, since $\log$ is concave.

  • Construction time?
  • Search time?
  • What is the worst-case input?

Acknowledgements
  • Kunal Verma
  • Nidhi Aggarwal
  • And other students of the MSc(CS) batch of 2009.

Hashing
  • Motivation: symbol tables
  • A compiler uses a symbol table to relate symbols to associated data
      • Symbols: variable names, procedure names, etc.
      • Associated data: memory location, call graph, etc.
  • For a symbol table (also called a dictionary), we care about search, insertion, and deletion
  • We typically don't care about sorted order

Hash Tables
  • More formally: given a table T and a record x, with key (= symbol) and satellite data, we need to support:
      • Insert(T, x)
      • Delete(T, x)
      • Search(T, x)
  • We want these to be fast, but don't care about sorting the records
  • The structure we will use is a hash table
  • Supports all the above in O(1) expected time!

Hash Functions
  • Next problem: collision
  [Figure: a hash function h maps the actual keys K = {k1, ..., k5}, a subset of the universe of keys U, to slots 0 to m-1 of table T; k2 and k5 collide because h(k2) = h(k5).]

Resolving Collisions
  • How can we solve the problem of collisions?
  • One solution is chaining.
  • Another solution is open addressing.

Chaining
  • Chaining puts elements that hash to the same slot of T in a linked list.
  [Figure: table T with one linked list per slot, holding the actual keys k1, ..., k8 (the set K, a subset of the universe of keys U); keys that hash to the same slot share a list.]

Chaining
  • How do we insert an element?
  [Figure: the chained hash table T with keys k1, ..., k8, as on the previous slide.]

Chaining
  • How do we delete an element?
  [Figure: the chained hash table T with keys k1, ..., k8, as on the previous slide.]

Chaining
  • How do we search for an element with a given key?
  [Figure: the chained hash table T with keys k1, ..., k8, as on the previous slide.]
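One possible answer to the three questions (insert, delete, search) is sketched below. It is an illustration under stated assumptions, not the slides' own solution: each chain is a Python list rather than a hand-built linked list, Python's built-in `hash` reduced modulo the number of slots stands in for h, and the class and method names are made up.

```python
class ChainedHashTable:
    """Hash table with chaining; each slot holds a list of (key, value) pairs."""

    def __init__(self, num_slots=8):
        self.slots = [[] for _ in range(num_slots)]
        self.count = 0

    def _chain(self, key):
        # h(key): built-in hash reduced to a slot index in 0 .. m-1
        return self.slots[hash(key) % len(self.slots)]

    def insert(self, key, value):
        chain = self._chain(key)
        for i, (k, _) in enumerate(chain):
            if k == key:                 # key already present: overwrite its data
                chain[i] = (key, value)
                return
        chain.append((key, value))
        self.count += 1

    def search(self, key):
        for k, v in self._chain(key):    # scan only the chain of this key's slot
            if k == key:
                return v
        return None

    def delete(self, key):
        chain = self._chain(key)
        for i, (k, _) in enumerate(chain):
            if k == key:
                del chain[i]
                self.count -= 1
                return True
        return False

    def load_factor(self):
        return self.count / len(self.slots)   # n / m


table = ChainedHashTable(num_slots=4)
for key, name in [(1, "one"), (5, "five"), (2, "two")]:
    table.insert(key, name)              # 1 and 5 land in the same slot
print(table.search(5), table.load_factor())
```

Every operation touches a single chain, so its cost is governed by the length of that chain.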

Analysis of Chaining
  • Assume simple uniform hashing: each key in the table is equally likely to be hashed to any slot.
  • Given n keys and m slots in the table, the load factor α = n/m = average number of keys per slot.
  • What will be the average cost of an unsuccessful search for a key? A: O(1 + α)
  • What will be the average cost of a successful search? A: O(1 + α/2) = O(1 + α)
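The cost of an unsuccessful search can be derived in two lines under simple uniform hashing. In this sketch $\alpha = n/m$ is the load factor, and $n_j$, the length of the chain in slot $j$, is notation introduced here.

```latex
\begin{align*}
E(n_j) &= \sum_{i=1}^{n} P\bigl(h(k_i) = j\bigr)
        = \sum_{i=1}^{n} \frac{1}{m} = \frac{n}{m} = \alpha,\\
E(\text{cost of an unsuccessful search})
       &= \underbrace{1}_{\text{compute } h(k)}
        + \underbrace{E\bigl(n_{h(k)}\bigr)}_{\text{scan the whole chain}}
        = 1 + \alpha .
\end{align*}
```

A successful search stops partway through a chain, which is why its expected cost has a smaller constant but the same $O(1 + \alpha)$ bound.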

Analysis of Chaining Continued
  • So the cost of searching = O(1 + α), where α = n/m is the load factor.
  • If the number of keys n is proportional to the number of slots in the table, what is α? A: α = O(1)
  • In other words, we can make the expected cost of searching constant if we make α constant.

If we could prove one of these:
  • $P(\text{failure}) < 1/k$ (we are sort of happy)
  • $P(\text{failure}) < 1/n^k$ (most of the time this is true, and we are happy)
  • $P(\text{failure}) < 1/2^n$ (this is difficult, but we still want it)


END
