Compsci 201, Collections, Hashing, Objects Owen Astrachan ola@cs.duke.edu http://bit.ly/201spring19 February 1, 2019 2/1/19 Compsci 201, Spring 2019: Collections, Hashing 1
G is for … • Git • Version control that's so au courant • Garbage Collection • Java recycles • Google • How to find Stack Overflow
PFFDoSMoY • Generic classes: ArrayList to HashSet • From ArrayList to HashSet to Collections to … • From Object.equals to Object.hashCode • Everything is an Object, what can an object do? • Using arrays and chars for APT problems • Toward understanding maps
WOTO Review http://bit.ly/201spring19-jan30-1 • Did this very quickly in class
Diyad ArrayList Review https://coursework.cs.duke.edu/201spring19/diyad201 • ArrayList is a wrapper class: array-like behavior but better, e.g., growth when needed • Internal state provides array characteristics • Methods allow for better functionality
Diyad ArrayList Growth • When internal array full? Create new, copy, use • Efficient add, get, set when done repeatedly • Not possible if resize with +1, +1000 • Is possible if resize with *2 or *1.25
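To make the growth claim concrete, here is a minimal sketch that counts how many element copies are made while appending n items under two resize policies. The class and method names are hypothetical and the starting capacity of 1 is an assumption; this is not the course's Diyad code.

```java
class GrowthDemo {
    // Copies made while appending n items when a full array doubles in size.
    static long copiesWithDoubling(int n) {
        long copies = 0;
        int capacity = 1;                 // assumed starting capacity
        for (int size = 0; size < n; size++) {
            if (size == capacity) {       // full: create new, copy, use
                copies += size;
                capacity *= 2;
            }
        }
        return copies;
    }

    // Copies made while appending n items when a full array grows by one slot.
    static long copiesWithPlusOne(int n) {
        long copies = 0;
        int capacity = 1;
        for (int size = 0; size < n; size++) {
            if (size == capacity) {
                copies += size;
                capacity += 1;
            }
        }
        return copies;
    }

    public static void main(String[] args) {
        int n = 1024;
        System.out.println("doubling: " + copiesWithDoubling(n));  // 1023
        System.out.println("plus one: " + copiesWithPlusOne(n));   // 523776
    }
}
```

With doubling the total number of copies stays below n, while growing by one slot makes it roughly n*n/2.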
Generic ArrayList • Rather than String, use generic type parameter • Can use E, T, Type, any identifier in <E> • Similar to code for GrowableStringArrayList • java.util.List • Interface
Can E be anything? String, Point, … • Needs a method .equals that works as expected for E! • Internal array myStorage contains Objects • ConformingArrayList<String> • Which .equals is called? Object or String? • Runtime decision, not compile time decision • What does elt actually reference?
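The runtime-decision point can be illustrated with a small sketch. It assumes a search loop over an Object[] similar in spirit to the slide's myStorage; the class name and the contains helper are hypothetical, not the course's ConformingArrayList.

```java
class DispatchDemo {
    // Searches an Object[] the way a simple contains method might.
    // The static type of elt is Object, but the .equals that runs is
    // chosen at runtime from the object elt actually references.
    static boolean contains(Object[] storage, Object target) {
        for (Object elt : storage) {
            if (elt != null && elt.equals(target)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Object[] myStorage = { new String("duke"), new Object() };
        // String.equals compares characters, so a different String object matches.
        System.out.println(contains(myStorage, new String("duke")));  // true
        // Object.equals compares references, so a fresh Object never matches.
        System.out.println(contains(myStorage, new Object()));        // false
    }
}
```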
Why Diyad? • Traditionally use ArrayList<E> -- client code • Understand methods via API • Problem solving in many contexts • Efficiency: a.get(1) as fast as a.get(1000) • Why efficient? Understanding by analysis • From the internal array which is efficient • From doubling on resize rather than adding one
Toward Understanding HashSet • Adding objects to HashSet<..>, avoid duplicates • We’ll see with Point class, doesn’t work • We’ll see with String class, does work • Just as we needed to add .equals() … • Need some knowledge of Object and internals of HashSet<..>, how does set.add(X) work? • Every object can convert itself to a number • Ask not what you can do to an object …
Modified Point Class • Why isn't output as expected? • Initially why didn't ArrayList work with Point? • https://coursework.cs.duke.edu/201spring19/classwork-spring19
Hollywood Principle: .equals • Why do program designers write Point.equals()? • To interact well with Java classes • We don't call .equals, other classes do • We called ArrayList.contains, it called .equals • Object has .equals and .toString • Override these for Point class • http://wiki.c2.com/?HollywoodPrinciple
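A minimal sketch of the "don't call us, we'll call you" idea: ArrayList.contains calls .equals on our behalf, so it only finds an equal point when the class overrides .equals. The two point classes below are hypothetical stand-ins, not the course's Point class.

```java
import java.util.ArrayList;
import java.util.List;

// No .equals override: inherits reference comparison from Object.
class PlainPoint {
    final int x, y;
    PlainPoint(int x, int y) { this.x = x; this.y = y; }
}

// Overrides .equals (and .hashCode, to keep the two consistent).
class EqPoint {
    final int x, y;
    EqPoint(int x, int y) { this.x = x; this.y = y; }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof EqPoint)) return false;
        EqPoint p = (EqPoint) o;
        return x == p.x && y == p.y;
    }

    @Override
    public int hashCode() { return 31 * x + y; }

    @Override
    public String toString() { return "(" + x + "," + y + ")"; }
}

class HollywoodDemo {
    static boolean plainFound() {
        List<PlainPoint> list = new ArrayList<>();
        list.add(new PlainPoint(3, 4));
        return list.contains(new PlainPoint(3, 4));   // uses Object.equals
    }

    static boolean eqFound() {
        List<EqPoint> list = new ArrayList<>();
        list.add(new EqPoint(3, 4));
        return list.contains(new EqPoint(3, 4));      // uses EqPoint.equals
    }

    public static void main(String[] args) {
        System.out.println(plainFound());  // false
        System.out.println(eqFound());     // true
    }
}
```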
Making .contains efficient • Why is ArrayList.contains(..) slow? • Search through entire list to find something • If list is sorted can we do better? • Think of a number between 1 and 1,024, I'll tell you high, low, correct: how many guesses needed? • How do you search for a book in the stacks? • That's not what you do in the stacks? • What about in ancient times …
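The guessing game can be sketched as a halving search. The code below counts guesses when each guess is the midpoint of the remaining range; the class and method names are hypothetical.

```java
class GuessDemo {
    // Number of guesses a midpoint strategy needs to find secret in [low, high].
    static int guessesNeeded(int low, int high, int secret) {
        int guesses = 0;
        while (low <= high) {
            int mid = low + (high - low) / 2;
            guesses++;
            if (mid == secret) {
                return guesses;
            }
            if (mid < secret) {
                low = mid + 1;     // told "too low"
            } else {
                high = mid - 1;    // told "too high"
            }
        }
        return guesses;            // only reached if secret is outside the range
    }

    // Largest number of guesses over every possible secret in [low, high].
    static int worstCase(int low, int high) {
        int worst = 0;
        for (int secret = low; secret <= high; secret++) {
            worst = Math.max(worst, guessesNeeded(low, high, secret));
        }
        return worst;
    }

    public static void main(String[] args) {
        System.out.println(worstCase(1, 1024));  // 11
    }
}
```

A linear scan over the same range could need 1,024 checks; halving needs at most 11.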
Finding an Object's number.. • Every object has a hashCode() method • Returns int value, used as “locker number” • Could return 39, 2, 57, … even -321 • Cannot guarantee different for every Object! • Use .equals in locker • Search items in same locker
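The locker metaphor can be sketched as a tiny set that uses hashCode() to pick a locker and .equals to search only inside that locker. This is an illustrative toy under assumed names (LockerSet, lockerFor), not how java.util.HashSet is actually implemented.

```java
import java.util.ArrayList;
import java.util.List;

class LockerSet {
    private final List<List<Object>> lockers = new ArrayList<>();

    LockerSet(int numLockers) {
        for (int i = 0; i < numLockers; i++) {
            lockers.add(new ArrayList<>());
        }
    }

    // floorMod keeps the index non-negative even for hash codes like -321.
    private List<Object> lockerFor(Object o) {
        return lockers.get(Math.floorMod(o.hashCode(), lockers.size()));
    }

    // Returns false if an equal item is already in the locker.
    boolean add(Object o) {
        List<Object> locker = lockerFor(o);
        if (locker.contains(o)) {      // .equals used only within one locker
            return false;
        }
        locker.add(o);
        return true;
    }

    boolean contains(Object o) {
        return lockerFor(o).contains(o);
    }

    public static void main(String[] args) {
        LockerSet set = new LockerSet(8);
        System.out.println(set.add("duke"));       // true
        System.out.println(set.add("duke"));       // false, duplicate
        System.out.println(set.contains("unc"));   // false
    }
}
```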
Ideal world? Real world!
Point.hashCode • Convert a Point to a number • Try to make every point a different number • That's not possible!! • For method below, what non-equal points have same .hashCode()?
Inefficient but Correct .hashCode • Suppose .hashCode() simply returns 5 • Every Point goes in the same locker • There are always collisions, but we try to minimize them. How are collisions resolved? • Can we modify PointDriver.java to stress test? • How many different points can be made?
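A small sketch of the "returns 5" idea: a HashSet still rejects duplicates correctly when every point has the same hash code, because .equals settles collisions; only speed suffers. FivePoint and the 10-by-10 grid are hypothetical choices for illustration.

```java
import java.util.HashSet;
import java.util.Set;

class FivePoint {
    final int x, y;
    FivePoint(int x, int y) { this.x = x; this.y = y; }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof FivePoint)) return false;
        FivePoint p = (FivePoint) o;
        return x == p.x && y == p.y;
    }

    @Override
    public int hashCode() { return 5; }   // legal, but every point collides
}

class ConstantHashDemo {
    // Adds each point of a 10 x 10 grid twice and reports the set size.
    static int distinctCount() {
        Set<FivePoint> set = new HashSet<>();
        for (int round = 0; round < 2; round++) {
            for (int x = 0; x < 10; x++) {
                for (int y = 0; y < 10; y++) {
                    set.add(new FivePoint(x, y));
                }
            }
        }
        return set.size();
    }

    public static void main(String[] args) {
        System.out.println(distinctCount());  // 100
    }
}
```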
The hashCode contract • Every object has a hashCode() method • Inherited from Object, but typically overridden • Use @Override and read online • Must respect .equals(): if a.equals(b), then • a.hashCode() == b.hashCode() • Converse not true! There will be collisions
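The contract and its failed converse can be checked with a short sketch. The hash formula 31 * x + y is an assumption chosen for illustration (the slides do not show the course's formula), and ContractPoint is a hypothetical class.

```java
import java.util.HashSet;
import java.util.Set;

class ContractPoint {
    final int x, y;
    ContractPoint(int x, int y) { this.x = x; this.y = y; }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof ContractPoint)) return false;
        ContractPoint p = (ContractPoint) o;
        return x == p.x && y == p.y;
    }

    // Assumed formula: equal points always get equal hash codes.
    @Override
    public int hashCode() { return 31 * x + y; }
}

class ContractDemo {
    static int setSize() {
        Set<ContractPoint> set = new HashSet<>();
        set.add(new ContractPoint(1, 2));
        set.add(new ContractPoint(1, 2));   // equal to the first, rejected
        set.add(new ContractPoint(0, 31));  // hash 31
        set.add(new ContractPoint(1, 0));   // hash 31 too, but not equal
        return set.size();
    }

    public static void main(String[] args) {
        ContractPoint a = new ContractPoint(0, 31);
        ContractPoint b = new ContractPoint(1, 0);
        System.out.println(a.hashCode() == b.hashCode());  // true: a collision
        System.out.println(a.equals(b));                   // false
        System.out.println(setSize());                     // 3
    }
}
```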
WOTO (correctness counts) http://bit.ly/201spring19-feb1-1
Maria Klawe • President of Harvey Mudd • Dean of Engineering at Princeton, ACM Fellow, College Dropout (and reenroller) I personally believe that the most important thing we have to do today is use technology to address societal problems, especially in developing regions. Coding is today's language of creativity. All our children deserve a chance to become creators instead of consumers of computer science.
Default: Object.equals, .hashCode • For Objects p and q: • p.equals(q) is the same as p == q • Do p and q reference/point to same object? • For Object p • p.hashCode() is derived from the object's identity (think: location in memory of object) • Thus: if p == q then • p.hashCode() == q.hashCode()
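The default behavior can be observed directly with a class that overrides nothing. BarePoint is a hypothetical class used only for this sketch.

```java
// Overrides neither .equals nor .hashCode, so Object's defaults apply.
class BarePoint {
    final int x, y;
    BarePoint(int x, int y) { this.x = x; this.y = y; }
}

class DefaultDemo {
    public static void main(String[] args) {
        BarePoint p = new BarePoint(1, 2);
        BarePoint q = p;                    // same object, second reference
        BarePoint r = new BarePoint(1, 2);  // same contents, different object

        System.out.println(p.equals(q));    // true, because p == q
        System.out.println(p.equals(r));    // false, because p != r
        System.out.println(p.hashCode() == q.hashCode());  // true
        // The default hashCode is the identity hash code of the object.
        System.out.println(p.hashCode() == System.identityHashCode(p));  // true
    }
}
```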
ArrayList and HashSet • Both have .add, .addAll, and more • Both iterable: for(Elt e : collection) • Both have .contains leveraging .equals • HashSet also uses .hashCode to reduce the collection iterated over: locker collisions • Object hygiene when developing your classes • .toString(), .equals(), .hashCode()
When Strings Collide https://www.youtube.com/watch?v=HeTShE2PiQI
When Strings Collide • Generate strings that will collide • Find such strings in the wild
String   hashCode     String          hashCode
ayay     3009136      buzzards        -931102253
aybZ     3009136      righto          -931102253
bZay     3009136      snitz           109586548
bZbZ     3009136      unprecludible   109586548
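Colliding strings can be generated by hand because String.hashCode is documented as a base-31 polynomial over the characters. The sketch below relies on "ay" and "bZ" having the same hash code, so any concatenation of those two blocks collides; the class and helper names are hypothetical.

```java
class CollideDemo {
    // Recomputes the documented String.hashCode formula:
    // s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1], with int overflow.
    static int hashByHand(String s) {
        int h = 0;
        for (char c : s.toCharArray()) {
            h = 31 * h + c;
        }
        return h;
    }

    public static void main(String[] args) {
        // 'a'*31 + 'y' == 'b'*31 + 'Z' == 3128, so the blocks are interchangeable.
        System.out.println("ay".hashCode() + " " + "bZ".hashCode());
        String[] generated = { "ayay", "aybZ", "bZay", "bZbZ" };
        for (String s : generated) {
            System.out.println(s + " " + s.hashCode());          // all 3009136
        }
        String[] wild = { "buzzards", "righto", "snitz", "unprecludible" };
        for (String s : wild) {
            System.out.println(s + " " + s.hashCode());
        }
    }
}
```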
Concept: Inheritance • In Java, every class extends Object • Gets methods by default: .toString, .hashCode, .equals, and more • Inherit method + implementation • Subclass can override base class methods • Make .equals work for Point class
Work in 201 • How important are APTs? • How important are APT quizzes? • How important are assignments? • Earlier assignments, later assignments? • How important: reading and WOTO in-class • How important are reading quizzes?
Alphabetical Order • Encryption? Maybe not • https://www2.cs.duke.edu/csed/newapt/encryption.html • Think about high-level algorithm • Apply your algorithm to: "pop", "array", "deeds" • What do we need to do to code algorithm? • Recall: 'b' + 1 == 'c' • Recall: array['h'] is allowed, 'h' can be index
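The two "Recall" facts can be combined in a short sketch. It assumes the task is to replace each distinct letter with the next unused letter of the alphabet, in order of first appearance; that reading of the problem is an assumption, since the slides do not reproduce the APT statement. The relabel method and the ASCII-only restriction are also hypothetical choices.

```java
class CharDemo {
    // Maps each distinct character to 'a', 'b', 'c', ... in order of first use.
    // Assumes ASCII input, so a char can index a 128-slot array directly.
    static String relabel(String s) {
        char[] map = new char[128];   // map['h'] is allowed; 0 means "not seen yet"
        char next = 'a';
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            if (map[c] == 0) {
                map[c] = next;
                next++;               // char arithmetic: 'b' + 1 == 'c'
            }
            sb.append(map[c]);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println('b' + 1 == 'c');    // true
        System.out.println(relabel("pop"));    // aba
        System.out.println(relabel("array"));  // abbac
        System.out.println(relabel("deeds"));  // abbac
    }
}
```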
Ransomware • Anonymous headline? • https://www2.cs.duke.edu/csed/newapt/anonymous.html • High level ideas … • Can we create "help me"? • Create counts['e'] -- # of e's • To use, what's needed
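The counts['e'] idea can be sketched as a letter budget: count every letter available in a headline, then spend one per letter of the message. Ignoring case and spaces is an assumption about the problem, and canCreate is a hypothetical helper name.

```java
class HeadlineDemo {
    // True if message can be assembled from the letters of headline.
    // Assumes case does not matter and spaces in the message are free.
    static boolean canCreate(String headline, String message) {
        int[] counts = new int[128];               // counts['e'] is the number of e's
        for (char c : headline.toLowerCase().toCharArray()) {
            if (c < 128) {
                counts[c]++;
            }
        }
        for (char c : message.toLowerCase().toCharArray()) {
            if (c == ' ') {
                continue;                          // spaces cost nothing
            }
            if (c >= 128 || counts[c] == 0) {
                return false;                      // letter not available
            }
            counts[c]--;                           // use up one letter
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(canCreate("Help Me Now", "help me"));  // true
        System.out.println(canCreate("hello world", "help me"));  // false, no 'p'
    }
}
```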
How often does a string occur? • Strings stored in ArrayList? • Call Collections.frequency(list, s) • If in array a rather than ArrayList? Collections.frequency(Arrays.asList(a), s) • Is this efficient? Does it matter? • Can create parallel arrays or use HashMap • Keep count[k] # occurrences of word[k]
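Both approaches can be compared in a short sketch: Collections.frequency rescans the whole list for each query, while a HashMap tallies every word in a single pass. The class name, the countAll helper, and the sample words are hypothetical.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

class FrequencyDemo {
    // One pass over the words, keeping a running count per distinct word.
    static Map<String, Integer> countAll(String[] words) {
        Map<String, Integer> counts = new HashMap<>();
        for (String w : words) {
            counts.put(w, counts.getOrDefault(w, 0) + 1);
        }
        return counts;
    }

    public static void main(String[] args) {
        String[] a = { "duke", "unc", "duke", "state", "duke" };
        // Scans the entire list each time it is called.
        System.out.println(Collections.frequency(Arrays.asList(a), "duke"));  // 3
        // After one pass, every lookup is a single map access.
        Map<String, Integer> counts = countAll(a);
        System.out.println(counts.get("duke"));  // 3
        System.out.println(counts.get("unc"));   // 1
    }
}
```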
WOTO (correctness counts) http://bit.ly/201spring19-feb1-2
Shafi Goldwasser • 2012 Turing Award Winner • RSA Professor of computer science at MIT • Twice Gödel Prize winner • Grace Murray Hopper Award • National Academy • Co-inventor of zero-knowledge proof protocols Work on what you like, what feels right, I know of no other way to end up doing creative work