What is Computer Science?




















- Slides: 20

What is Computer Science? What is it that distinguishes it from the separate subjects with which it is related? What is the linking thread which gathers these disparate branches into a single discipline? My answer to these questions is simple --- it is the art of programming a computer. It is the art of designing efficient and elegant methods of getting a computer to solve problems, theoretical or practical, small or large, simple or complex.

C. A. R. (Tony) Hoare

CPS 100 1.1

Why is programming fun? What delights may its practitioner expect as a reward?

- First is the sheer joy of making things.
- Second is the pleasure of making things that are useful.
- Third is the fascination of fashioning complex puzzle-like objects of interlocking moving parts.
- Fourth is the joy of always learning.
- Finally, there is the delight of working in such a tractable medium. The programmer, like the poet, works only slightly removed from pure thought-stuff.

Fred Brooks

Efficient Programming

- Designing and building efficient programs efficiently requires knowledge and practice
  - Hopefully the programming language helps; it’s not intended to get in the way
  - Object-oriented concepts, and more general programming concepts, help in developing programs
  - Knowledge of data structures and algorithms helps
- Tools of the engineer/scientist/programmer/designer
  - A library or toolkit is essential: STL or wheel re-invention?
  - Programming: art, science, engineering? None or all?
  - Mathematics is a tool
  - Design patterns are a tool

Course Overview

- Lectures, Recitations, Quizzes, Programs
  - Recitation based on questions given out in previous week
    - Discuss answers, answer new questions, small quiz
    - More opportunities for questions to be answered
  - Lectures based on readings, questions, programs
    - Online quizzes used to motivate/ensure reading
    - In-class questions used to ensure understanding
  - Programs
    - Theory and practice of data structures and OO programming
    - Fun, practical, tiring, …
    - Weekly programs and longer programs
- Exams/Tests
  - Semester: closed book
  - Final: open book

Questions

- If you gotta ask, you’ll never know
  - Louis Armstrong: “What’s Jazz?”
- If you gotta ask, you ain’t got it
  - Fats Waller: “What’s rhythm?”
- What questions did you ask today?
  - Arno Penzias

Tradeoffs

- This course is about all kinds of tradeoffs: programming, structural, algorithmic
  - Programming: simple, elegant, quick to run/to program
    - Tension between simplicity and elegance?
  - Structural: how to structure data for efficiency
    - What issues in efficiency? Time, space, programmer-time
  - Algorithmic: similar to structural issues
- How do we decide which choice to make, what tradeoffs are important?

See readwords.cpp

- This reads words; how can we count different/unique words?

    tvector<string> list;
    string filename, word;
    cin >> filename;
    ifstream input(filename.c_str());
    CTimer timer;
    timer.Start();
    // read every whitespace-separated word, storing duplicates too
    while (input >> word) {
        list.push_back(word);
    }
    timer.Stop();
    // report the total number of words read and the elapsed time
    cout << "read " << list.size() << " words in ";
    cout << timer.ElapsedTime() << " seconds" << endl;

Tracking different/unique words

- We want to know how many times ‘the’ occurs
  - Do search engines do this? Does the number of occurrences of “basketball” on a page raise the priority of a webpage in some search engines?
    - Downside of this approach for search engines?
- Constraints on solving this problem
  - We must read every word in the file (or web page)
  - We must search to see if the word has been read before
  - We must process the word (bump a count, store the word)
  - Are there fundamental limits on any of these operations? Where should we look for data structure and algorithmic improvements?

Search: measuring performance

- How fast is fast enough?

    bool search(const tvector<string> & a, const string & key)
    // pre: a contains a.size() entries
    // post: return true if and only if key found in a
    {
        int k;
        int len = a.size();
        for(k=0; k < len; k++)
            if (a[k] == key) return true;
        return false;
    }

- C++ details: parameters? Return values? Vectors?
- How do we measure performance of code? Of algorithm?
  - Does processor make a difference? PIII, G4, ???
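One processor-independent way to measure the algorithm rather than the code is to count comparisons instead of seconds. The sketch below assumes `std::vector` in place of `tvector`; the function name `countingSearch` and its extra output parameter are hypothetical additions for illustration.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Linear search in the style of the slide's search function, instrumented
// to report how many == comparisons were made before returning.
// pre:  a contains a.size() entries
// post: returns true if and only if key is found in a;
//       comparisons holds the number of elements examined
bool countingSearch(const std::vector<std::string>& a,
                    const std::string& key,
                    std::size_t& comparisons) {
    comparisons = 0;
    for (std::size_t k = 0; k < a.size(); k++) {
        comparisons++;
        if (a[k] == key) return true;
    }
    return false;
}
```

A key in the first slot costs one comparison, while a missing key costs `a.size()` comparisons, whatever machine runs the code.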

Tradeoffs in reading and counting

- Read words, then sort, determine # unique words?
  - frog, rat, tiger, tiger
- If we look up words as we're reading them and bump a counter if we find the word, is this slower than the previous idea?
  - How do we look up a word, how do we add a word?
- Are there kinds of data that make one approach preferable?
  - What is best case, worst case, average case?
- What's one function spec & implementation to count # unique words in a list/vector of words?
  - See readwords3.cpp
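The "read, then sort" idea can be sketched as one function with a pre/post spec. This is a guess at the shape of such a function, not the contents of readwords3.cpp; the name `countUnique` is hypothetical and `std::vector` stands in for `tvector`.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// pre:  words holds zero or more strings (passed by value, so the
//       caller's vector keeps its original order)
// post: returns the number of distinct strings in words
std::size_t countUnique(std::vector<std::string> words) {
    std::sort(words.begin(), words.end());
    std::size_t unique = 0;
    for (std::size_t k = 0; k < words.size(); k++) {
        // after sorting, equal words are adjacent, so a word is new
        // exactly when it differs from its predecessor
        if (k == 0 || words[k] != words[k - 1]) {
            unique++;
        }
    }
    return unique;
}
```

For frog, rat, tiger, tiger the sorted order already groups the two copies of tiger, so the count is 3.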

Who is Alan Perlis?

- It is easier to write an incorrect program than to understand a correct one
- Simplicity does not precede complexity, but follows it
- If you have a procedure with ten parameters you probably missed some
- If a listener nods his head when you're explaining your program, wake him up
- Programming is an unnatural act
- Won first Turing award

http://www.cs.yale.edu/homes/perlis-alan/quotes.html

Review/Preview: Anagrams/Jumbles

- Brute-force approach to finding anagrams/solving Jumbles
  - Brute-force often thought of as “lack of thought”
  - What if the better way requires too much thought?
  - What if there’s nothing better?
- nelir, nelri, neilr, neirl, nerli, neril, nleir, nleri, nlier, nlire, nlrei, nlrie, nielr, nierl, niler, nilre, nirel, … lenir, lenri, leinr, leirn, lerni, lerin, liner
  - What’s the problem here?
  - Is there a better method?

Brute force? permana.cpp

    // find anagram of word in wordSource
    // list is a vector [0, 1, 2, …, n]
    Permuter p(list);
    int count = 0;
    string copy(word);   // makes copy the right length
    for(p.Init(); p.HasMore(); p.Next()) {
        p.Current(list);   // list now holds the current permutation of indexes
        for(int k=0; k < list.size(); k++) {
            copy[k] = word[list[k]];   // rearrange letters of word by index
        }
        if (wordSource.contains(copy)) {
            cout << "anagram of " << copy << endl;
            break;   // find first anagram only
        }
    }
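The same brute-force idea can be sketched without the `Permuter` class by swapping in `std::next_permutation` from the standard library. This is a substitute technique under stated assumptions, not the permana.cpp code: the function name `findAnagramBruteForce` is hypothetical, and a `std::set<std::string>` stands in for `wordSource`.

```cpp
#include <algorithm>
#include <set>
#include <string>

// Tries permutations of word's letters in lexicographic order and returns
// the first one found in dictionary, or an empty string if none is found.
std::string findAnagramBruteForce(const std::string& word,
                                  const std::set<std::string>& dictionary) {
    std::string candidate(word);
    // sorting first makes next_permutation start from the smallest
    // arrangement, so every distinct arrangement is visited once
    std::sort(candidate.begin(), candidate.end());
    do {
        if (dictionary.count(candidate) > 0) {
            return candidate;   // find first anagram only
        }
    } while (std::next_permutation(candidate.begin(), candidate.end()));
    return "";
}
```

With a dictionary containing "liner", the call `findAnagramBruteForce("nelir", dictionary)` returns "liner". In the worst case every arrangement of the letters is tested.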

Quantifying brute force for anagrams

- On one machine make/test a word takes 10^-5 seconds/word
  - 9! is 362,880: how long does this take?
  - What about a ten-letter word?
- We’re willing to do some pre-processing to make the time to find anagrams quicker
  - Often find that some initialization/up-front time or cost saves in the long run
  - We need a better method than trying all possible permutations
  - What properties do words share that are anagrams?

Toward a faster anagram finder

- Words that are anagrams have the same letters; use a letter fingerprint or signature/histogram to help find anagrams
  - Count how many times each letter occurs:

      letter     a b c d e f g h i j k l m n o p q r s t u v w x y z
      “teacher”  1 0 1 0 2 0 0 1 0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0 0 0
      “cheater”  1 0 1 0 2 0 0 1 0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0 0 0

- Store words, but use fingerprint for comparison when searching for an anagram
  - How to compare fingerprints using operator ==
  - How to compare fingerprints using operator <
- How do we make client programmers unaware of fingerprints? Should we do this?
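A fingerprint can be sketched as an array of 26 letter counts. The code below is one possible representation, assuming lowercase or uppercase English letters only; the alias `Fingerprint` and the function `fingerprint` are hypothetical names. `std::array` already supplies element-wise `operator==` and lexicographic `operator<`, which covers both comparison questions.

```cpp
#include <array>
#include <cctype>
#include <string>

// Hypothetical fingerprint type: one count per letter a through z.
using Fingerprint = std::array<int, 26>;

// Counts how many times each letter occurs in word, ignoring case and
// skipping any character outside a-z.
Fingerprint fingerprint(const std::string& word) {
    Fingerprint counts{};   // value-initialized: all 26 counts start at zero
    for (char ch : word) {
        int lower = std::tolower(static_cast<unsigned char>(ch));
        if (lower >= 'a' && lower <= 'z') {
            counts[lower - 'a']++;
        }
    }
    return counts;
}
```

Two words are anagrams exactly when `fingerprint(a) == fingerprint(b)`, and the built-in `<` gives an ordering that could be used to sort or binary-search stored words.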

Another anagram method

- Instead of fingerprint/histogram idea, use sorted form of word
  - “gable” and “bagel” both yield “abegl”
  - Anagrams share same sorted form
- Similarities/differences to histogram/fingerprint idea?
  - Both use canonical or normal/normalized form
  - Normalized form used for comparison, not for printing
  - When should this normal form be created?
- When is one method preferred over the other?
  - Big words, little words? Different alphabets? DNA vs English?
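The sorted-form idea is short enough to sketch in full. The function names `sortedForm` and `areAnagrams` are hypothetical; the only library call is `std::sort`.

```cpp
#include <algorithm>
#include <string>

// Returns the letters of word in sorted order, the canonical form that
// all anagrams of word share. The parameter is a copy, so the caller's
// string is left unchanged for printing.
std::string sortedForm(std::string word) {
    std::sort(word.begin(), word.end());
    return word;
}

// Two words are anagrams when their sorted forms are equal.
bool areAnagrams(const std::string& a, const std::string& b) {
    return sortedForm(a) == sortedForm(b);
}
```

Sorting costs more per word than counting letters as the word grows, but the normal form is only as long as the word itself rather than as long as the alphabet.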

OO and C++ features we’ll use

- We’ll use an adapter or wrapper class called Anaword instead of a string
  - Clients can treat Anaword objects like strings, but the objects are better suited for finding anagrams than strings
  - The Anaword for “bear” prints as “bear” but compares to other Anaword objects as 11001000000000000100000000 (letter counts for a through z)
- C++ allows us to overload operators to help, not necessary but good cosmetically
  - Relational operators == and <
    - What about other operators: >, <=, >=, and !=
  - Stream operator <<
- How should we implement overloaded operators?

Overloaded operators

- In C++ we can define what operator == and operator < mean for an object (and many other operators as well)
  - This is syntactically convenient when writing code
  - C++ details can be cumbersome (see Tapestry Howto E)
- In anaword.h there are four overloaded operators
  - What about > and >= ; what about != ; others?
  - What about printing, can we overload operator << ?
  - How do we access private data for printing? Comparing?
- Overloaded operators are not necessary, syntactic sugar.

Overloaded operators (continued)

- Typically operators need access to internal state of an object
  - Relational operators for Date, string, BigInt?
  - Where is “internal state”?
- For technical reasons sometimes operators should not be member functions:

      BigInt b = enterBigValue();
      if (b < 2) …
      if (2 > b) …

  - We’d like to use both if statements, only the first can be implemented using BigInt::operator < (why?)
- Use helper member functions: equals, less, toString
  - Implement overloaded operators using helpers

Anaword objects with options

- Can we use different canonical forms in different contexts?
  - Could have Anaword, FingerPrintAnaword, SortAnaword
  - What possible issues arise? What behavior is different in subclasses?
    - If there’s no difference in behavior, don’t have subclasses
- Alternative, make canonical/normalize method a class
  - Turn a function/idea into a class, then let the class vary to encapsulate different methods
  - Normalization done at construction time or later
  - Where is normalizer object created? When?
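Turning the normalize method into a class might look like the sketch below. All class names here (`Normalizer`, `SortNormalizer`, `FingerprintNormalizer`) are hypothetical, the fingerprint is encoded as a 26-character string for simplicity, and normalization is done at construction time, which is only one of the options the slide raises.

```cpp
#include <algorithm>
#include <string>

// Hypothetical interface: the idea "normalize a word" turned into a class.
class Normalizer {
public:
    virtual ~Normalizer() {}
    virtual std::string normalize(const std::string& word) const = 0;
};

// Canonical form is the word's letters in sorted order.
class SortNormalizer : public Normalizer {
public:
    std::string normalize(const std::string& word) const override {
        std::string sorted(word);
        std::sort(sorted.begin(), sorted.end());
        return sorted;
    }
};

// Canonical form is 26 characters, one count per letter a through z,
// starting from '0'. Assumes lowercase words; counts above 9 move past
// '9' in the character set but remain consistent for comparison.
class FingerprintNormalizer : public Normalizer {
public:
    std::string normalize(const std::string& word) const override {
        std::string counts(26, '0');
        for (char ch : word) {
            if (ch >= 'a' && ch <= 'z') {
                counts[ch - 'a']++;
            }
        }
        return counts;
    }
};

// One Anaword class; the normalizer passed in decides the canonical form.
class Anaword {
public:
    Anaword(const std::string& word, const Normalizer& normalizer)
        : myWord(word), myNormal(normalizer.normalize(word)) {}
    bool equals(const Anaword& rhs) const { return myNormal == rhs.myNormal; }
    std::string toString() const { return myWord; }

private:
    std::string myWord;     // form used for printing
    std::string myNormal;   // canonical form fixed at construction time
};
```

Because the variation lives in the normalizer, `Anaword` needs no subclasses; the client creates whichever normalizer suits the data and hands it to each `Anaword` it builds.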