Computer Algorithms, CISC 4080
CIS, Fordham Univ.
Instructor: X. Zhang
Lecture 1

Outline
• What is an algorithm: word origin, first algorithms, algorithms of today’s world
• Pseudocode: convention and examples
• Introduction to algorithm analysis: Fibonacci sequence calculation
  • counting the number of “computer steps”
  • recursive formula for the running time of a recursive algorithm
  • math help: mathematical induction
  • asymptotic notations
• Algorithm running time classes: P, NP

What are Algorithms?

Algorithms Etymology

Decimal System & algorithms
• Imagine adding two Roman numerals. What is …?
• Decimal system, invented in India around AD 600
  • uses only 10 symbols (0, 1, …, 9)
  • writes large numbers compactly
  • makes arithmetic operations easy to perform

Oldest Algorithms
• Al Khwarizmi laid out basic methods for
  • adding, multiplying and dividing numbers
  • extracting square roots
  • calculating digits of pi, …
• These procedures were precise, unambiguous, mechanical, efficient, and correct; i.e., they were algorithms, a term coined to honor Al Khwarizmi after the decimal system was adopted in Europe many centuries later.

Some Key Concepts
• Algorithm: a finite, definite, effective procedure that takes some inputs and generates some output
• Problem: describes what the input is and what the desired output is
• Problem instance: a particular input
  • e.g., a[1…9] = {3, 4, 1, 2, 5, 6, 0, 10, 7} for a sorting problem
  • or the above a, together with 3, as input to a searching problem
• Pseudocode: language-neutral, data-type neutral

Algorithms that you’ve seen
• Linear Search: search for an item with a matching key in an (unsorted) array
• Binary Search
• Bubble Sort, Insertion Sort, Selection Sort, Radix Sort
• Search in a binary search tree: algorithm + data structure
• Graph algorithms: BFS or DFS traversal
• Group Practice: idea (how does it work), correctness, efficiency

Why study algorithms?
• Internet: web search, packet routing, …
• Biology: bioinformatics combines mathematics, statistics and computer science to study biological molecules, such as DNA, RNA, and protein structures
  • Edit distance (6.3) for suggesting corrections in a spell checker ==> sequence alignment (DNA, RNA, protein)
• Multimedia, computer graphics
  • Line-generating algorithms: given the coordinates of two points A(x1, y1) and B(x2, y2), the task is to find all the intermediate points required for drawing line AB on a computer screen of pixels. Note that every pixel has integer coordinates.
  • Compression algorithms used in MP3 and JPEG technology
  • One component in MP3 is the Huffman method (a greedy algorithm)

Why study algorithms?
• Computer systems: circuit layout, scheduling algorithms for OS or data center, compiler (code optimization), ...
• Security:
  • encryption algorithms (such as RSA public/private key algorithms, among many others)
  • cryptographic hash functions: generate a checksum to verify the authenticity of data
• Social networks:
  • social network analysis
  • link prediction: predict whether there will be a link between two nodes ==> graph problem
• Machine learning algorithms are used everywhere: advertisement, credit risk management, intrusion detection, …

Learning Goals
• Learn algorithms (sorting and searching, arithmetic, graph algorithms, linear programming algorithms), and practice implementing them in C++ (C++ STL)
• Algorithm analysis: correctness, efficiency (running time and space requirement)
• Complexity analysis of the problem itself:
  • Lower bound analysis: comparison-based sorting cannot do better than n log n
  • NP-complete problems
• Learn paradigms such as divide and conquer, greedy algorithms, randomization, dynamic programming and linear programming for designing algorithmic solutions to new problems

Example: Selection Sort
• Input: a list of elements, L[1…n]
• Output: rearrange the elements in the list, so that L[1] <= L[2] <= L[3] <= … <= L[n]
  • Note that “list” is an ADT (could be implemented using an array or a linked list)
• Idea (in two sentences)
  • First, find the location of the smallest element in sublist L[1…n], and swap it with the first element in the sublist
  • Repeat the same procedure for sublists L[2…n], L[3…n], …, L[n-1…n]

Selection Sort (idea => pseudocode)
    for i = 1 to n-1
        // find location of smallest element in sublist L[i…n]
        minIndex = i;
        for k = i+1 to n
            if L[k] < L[minIndex]:
                minIndex = k
        // swap it with first element in the sublist
        if (minIndex != i)
            swap(L[i], L[minIndex]);
        // Correctness: L[i] is now the i-th smallest element
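
The pseudocode above can be turned into C++ fairly directly. The following is a minimal sketch, assuming the list is stored in a `std::vector<int>` and using 0-based indices instead of L[1…n]; the function name `selectionSort` is chosen here for illustration only.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Sort L in non-decreasing order using selection sort.
// Assumption: 0-based indices, so the pseudocode's L[i…n] becomes L[i…n-1].
void selectionSort(std::vector<int>& L) {
    const std::size_t n = L.size();
    for (std::size_t i = 0; i + 1 < n; ++i) {
        // find location of smallest element in sublist L[i…n-1]
        std::size_t minIndex = i;
        for (std::size_t k = i + 1; k < n; ++k) {
            if (L[k] < L[minIndex]) {
                minIndex = k;
            }
        }
        // swap it with the first element of the sublist
        if (minIndex != i) {
            std::swap(L[i], L[minIndex]);
        }
    }
}
```

Calling `selectionSort` on the instance {3, 4, 1, 2, 5, 6, 0, 10, 7} from the key-concepts slide rearranges it into {0, 1, 2, 3, 4, 5, 6, 7, 10}.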

Overview of Lab 1
• Implement and test three basic sorting algorithms
• Start from the idea of each algorithm and keep refining it until it becomes code
• Performance measurement: how long does each sorting algorithm (function) take to sort an array of a given size?
• Basis of lab 2 and lab 3

Introduction to algorithm analysis
• Consider the calculation of the Fibonacci sequence, in particular the n-th number in the sequence:
  0, 1, 1, 2, 3, 5, 8, 13, 21, 34, …

Fibonacci Sequence
• 0, 1, 1, 2, 3, 5, 8, 13, 21, 34, …
• Formally, $F_0 = 0$, $F_1 = 1$, and $F_n = F_{n-1} + F_{n-2}$ for $n > 1$
• Problem: how to calculate the n-th term, e.g., what is $F_{100}$, $F_{200}$?

A recursive algorithm
• Three questions:
  • Is it correct?
    • yes, as the code mirrors the definition…
  • How much time does it take?
  • Can we do better (faster)?
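
The code for this slide did not survive in the transcript. A minimal C++ sketch of a recursive function that mirrors the definition F(0) = 0, F(1) = 1, F(n) = F(n-1) + F(n-2) might look like the following; the name `fib1` follows the later slides, while the parameter and return types are assumptions made here.

```cpp
#include <cstdint>

// Recursive Fibonacci, written to mirror the definition line by line.
// Assumption: 64-bit unsigned result, which is exact for n <= 93.
std::uint64_t fib1(unsigned n) {
    if (n == 0) return 0;              // first step
    if (n == 1) return 1;              // second step
    return fib1(n - 1) + fib1(n - 2);  // two recursive calls, then one addition
}
```

The function is correct for the same reason given on the slide: each line corresponds to one case of the definition. Its running time is the subject of the slides that follow.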

In pursuit of better algorithms
• We want to solve problems using fewer resources:
  • Space: how much memory is needed?
  • Time: how fast can we get the result?
• Usually, the bigger the input, the more memory and the more time it takes
  • it takes longer to calculate the 200th number in the Fibonacci sequence than the 10th number
  • it takes longer to sort a larger array
• Efficient algorithms are critical for large input sizes/problem instances
  • finding $F_{100}$, searching the Web, …
• Two different approaches to evaluating the efficiency of algorithms: measurement vs. analysis

Experimental approach
• Measure how much time elapses from when the algorithm starts to when it finishes
  • need to implement, instrument and deploy
  • e.g.,
      gettimeofday(&t1, NULL);
      BubbleSort(a, size);
      gettimeofday(&t2, NULL);
      double timeInSeconds = (t2.tv_sec - t1.tv_sec) + (t2.tv_usec - t1.tv_usec)/1000000.0;
• Results are realistic, specific and random
  • specific to language, run time system (Java VM, OS), caching effect, other processes running
  • cannot shed light on: larger input sizes (not always possible to test all input sizes), faster CPU, …
• Measurement is important for a “production” system/end product, but not informative for algorithm efficiency studies/comparison/prediction
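
The same measurement can also be written with the standard C++ `<chrono>` library instead of `gettimeofday`. This is a sketch of one way to do it, not the method used in the course labs; the helper name `secondsTaken` is hypothetical.

```cpp
#include <chrono>

// Hypothetical helper: run work() once and return the elapsed wall-clock
// time in seconds, measured with a monotonic clock.
template <typename Work>
double secondsTaken(Work work) {
    const auto start = std::chrono::steady_clock::now();
    work();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}
```

A call such as `secondsTaken([&] { BubbleSort(a, size); })` would time one run of the sort. As the slide notes, the number obtained is specific to the machine, the compiler and whatever else is running at the time.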

Example (Fib1: recursive)

n     T(n) of Fib1 (seconds)   F(n)
10    3e-06                    55
11    2e-06                    89
12    4e-06                    144
13    7e-06                    233
14    1.1e-05                  377
15    1.7e-05                  610
16    2.9e-05                  987
17    4.7e-05                  1597
18    7.6e-05                  2584
19    0.000122                 4181
20    0.000198                 6765
21    0.000318                 10946
22    0.000515                 17711
23    0.000842                 28657
24    0.001413                 46368
25    0.002261                 75025
26    0.003688                 121393
27    0.006264                 196418
28    0.009285                 317811
29    0.014995                 514229
30    0.02429                  832040
31    0.039288                 1346269
32    0.063543                 2178309
33    0.102821                 3524578
34    0.166956                 5702887
35    0.269394                 9227465
36    0.435607                 14930352
37    0.701372                 24157817
38    1.15612                  39088169
39    1.84103                  63245986
40    2.9964                   102334155
41    4.85536                  165580141
42    7.85187                  267914296
43    12.6805                  433494437
44    20.513                   701408733

[Plot: time (in seconds) vs. n]
• Running time seems to grow exponentially as n increases
• Model fitting to find out T(n)?

Example (Fib2: iterative)

n     Time (seconds)   F(n)
10    1e-06            55
11    1e-06            89
12    0                144
13    0                233
14    0                377
15    0                610
16    0                987
17    0                1597
18    0                2584
19    0                4181
20    0                6765
21    0                10946
22    0                17711
23    0                28657
24    0                46368
25    0                75025
26    0                121393
27    0                196418
28    1e-06            317811
29    0                514229
30    1e-06            832040
31    0                1346269
32    0                2178309
33    0                3524578
34    1e-06            5702887
35    0                9227465
36    0                14930352
37    0                24157817
38    1e-06            39088169
39    0                63245986
40    0                102334155
41    1e-06            165580141
42    0                267914296
43    1e-06            433494437
44    0                701408733
…
1000  8e-06            …

[Plot: time (in seconds) vs. n]
• Running time increases very slowly as n increases
• Model fitting to find out T(n)?

Analytic approach
• Is it possible to find out analytically how running time grows when input size grows?
  • Is running time constant, or does it increase linearly, logarithmically, quadratically, … exponentially?
• Approach: analyze pseudocode, express the total number of steps in terms of input size, and study its order of growth
  • results are general: not specific to language, run time system, caching effect, other processes sharing the computer
  • sheds light on effects of larger problem size, faster CPU, …
• Analysis is appropriate for algorithm efficiency studies, comparison/prediction

Running time analysis
• Given an algorithm in pseudocode or an actual program
• Express the total number of computer steps (primitive operations) executed as a function of the size of the input (n)
  • size of input: size of an array, polynomial degree, # of elements in a matrix, vertices and edges in a graph, or # of bits in the binary representation of the input
  • computer steps: arithmetic operations, data movement, control, decision making (if, while), comparison, …
  • each step takes a constant amount of time, i.e., independent of input size

Case Studies: Fib1(n)
• Let T(n) be the number of computer steps needed to compute fib1(n)
  • T(0) = 1: when n = 0, the first step is executed
  • T(1) = 2: when n = 1, the first two steps are executed
  • For n > 1, T(n) = T(n-1) + T(n-2) + 3: the first two steps are executed, fib1(n-1) is called (with T(n-1) steps), fib1(n-2) is called (T(n-2) steps), and the return values are added (1 step)
• Can you see that T(n) > $F_n$? How big is T(n)?

Running Time analysis
• Let T(n) be the number of computer steps to compute fib1(n)
  • T(0) = 1
  • T(1) = 2
  • T(n) = T(n-1) + T(n-2) + 3, n > 1
• Analyze the running time of a recursive algorithm
  • first, write a recursive formula for its running time
  • then, recursive formula => closed formula, asymptotic result
• How fast does T(n) grow? Can you see that T(n) > $F_n$? How big is T(n)?
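
One way to get a feel for the recurrence is to tabulate it next to the Fibonacci numbers. The sketch below does this with two small helper functions whose names (`stepCounts`, `fibonacciNumbers`) are invented here; it only evaluates the recurrence from the slide and does not prove anything.

```cpp
#include <cstdint>
#include <vector>

// T[n] = number of steps of fib1(n), from the recurrence on the slide:
// T(0) = 1, T(1) = 2, T(n) = T(n-1) + T(n-2) + 3 for n > 1.
std::vector<std::uint64_t> stepCounts(unsigned maxN) {
    std::vector<std::uint64_t> T(maxN + 1);
    T[0] = 1;
    if (maxN >= 1) T[1] = 2;
    for (unsigned n = 2; n <= maxN; ++n) {
        T[n] = T[n - 1] + T[n - 2] + 3;
    }
    return T;
}

// F[n] = n-th Fibonacci number, F(0) = 0, F(1) = 1.
std::vector<std::uint64_t> fibonacciNumbers(unsigned maxN) {
    std::vector<std::uint64_t> F(maxN + 1);
    F[0] = 0;
    if (maxN >= 1) F[1] = 1;
    for (unsigned n = 2; n <= maxN; ++n) {
        F[n] = F[n - 1] + F[n - 2];
    }
    return F;
}
```

The first few values are T(2) = 6, T(3) = 11, T(4) = 20, against F(2) = 1, F(3) = 2, F(4) = 3. Since T satisfies the Fibonacci recurrence with an extra +3 and starts from larger values, T(n) stays above F(n) for every n.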

Mathematical Induction
• $F_0 = 0$, $F_1 = 1$, $F_n = F_{n-1} + F_{n-2}$
• We will show that $F_n \ge 2^{0.5n}$ for $n \ge 6$, using the strong mathematical induction technique
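
The slide only states the claim. One possible write-up of the strong-induction argument, with two base cases because the recurrence reaches back two terms, is sketched below.

```latex
\textbf{Base cases.} $F_6 = 8 = 2^{0.5 \cdot 6}$ and
$F_7 = 13 \ge 2^{0.5 \cdot 7} \approx 11.31$.

\textbf{Inductive step.} Let $n \ge 8$ and assume $F_k \ge 2^{0.5k}$
for all $6 \le k < n$. Since $n-1 \ge 6$ and $n-2 \ge 6$,
\begin{align*}
F_n &= F_{n-1} + F_{n-2} \\
    &\ge 2^{0.5(n-1)} + 2^{0.5(n-2)} \\
    &= 2^{0.5(n-2)} \left( 2^{0.5} + 1 \right) \\
    &\ge 2^{0.5(n-2)} \cdot 2 \\
    &= 2^{0.5n}.
\end{align*}
```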

Exponential algorithms
• Running time of Fib1: T(n) > $2^{0.694n}$
• Running time of Fib1 is exponential in n
  • to calculate $F_{200}$, it takes at least $2^{138}$ computer steps
  • on the NEC Earth Simulator, which executes 40 trillion ($40 \times 10^{12}$) steps per second, this takes at least $2^{92}$ seconds
• Moore’s law (computer speeds double about every 18 months) only allows us to calculate one more term next year…
• Algorithms with exponential running time are not efficient
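
The two figures on this slide can be checked with a short calculation; the rounding of $\log_2 10$ below is approximate.

```latex
\begin{align*}
0.694 \times 200 &= 138.8 \ge 138
  &&\Rightarrow\quad T(200) > 2^{138}, \\
\log_2\!\left(40 \times 10^{12}\right) &= 2 + 13 \log_2 10 \approx 45.2 < 46
  &&\Rightarrow\quad 40 \times 10^{12} < 2^{46}, \\
\frac{2^{138}}{40 \times 10^{12}} &> \frac{2^{138}}{2^{46}} = 2^{92}
  &&\text{(seconds).}
\end{align*}
```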

Can we do better?
• Correctness?
• Analyze running time of the iterative (non-recursive) algorithm:
    T(n) = 1        // if n=0 return 0
         + n        // create an array f[0…n]
         + 2        // f[0]=0, f[1]=1
         + (n-1)    // for loop: repeated n-1 times
         = 2n + 2
• T(n) is a linear function of n, i.e., fib2(n) has linear running time
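
The iterative code itself is not in the transcript. A C++ sketch consistent with the step annotations above (check for n = 0, create f[0…n], set f[0] and f[1], loop n-1 times) could look as follows; the types are assumptions made here.

```cpp
#include <cstdint>
#include <vector>

// Iterative Fibonacci: fills a table f[0…n] from the bottom up.
// Assumption: 64-bit unsigned result, which is exact for n <= 93.
std::uint64_t fib2(unsigned n) {
    if (n == 0) return 0;                 // 1 step
    std::vector<std::uint64_t> f(n + 1);  // create an array f[0…n]: n steps
    f[0] = 0;                             // 2 steps for the two assignments
    f[1] = 1;
    for (unsigned i = 2; i <= n; ++i) {   // repeated n-1 times
        f[i] = f[i - 1] + f[i - 2];
    }
    return f[n];
}
```

Each table entry is computed once, which is why the step count is linear in n rather than exponential as for fib1.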

Alternatively…
• Estimation based upon CPU: the n=0 check takes 1000 us, creating the array takes 200n us, each assignment takes 60 us, each loop iteration (addition and assignment) takes 800 us, …
• How long does it take for fib2(n) to finish?
    T(n) = 1000 + 200n + 2*60 + (n-1)*800 = 1000n + 320   // in units of us
• What about the for loop itself?
  • We could take that into account by estimating its time too…
• Again: T(n) is a linear function of n
• Constants are not important: what about on different computers/CPUs?
• Complexity in systems (caching, OS scheduling) makes it pointless to do such fine-grained analysis anyway!

Readings
• Chapter 1 of DPV