- Slides: 92
Data Structures Fundamental Data Storage
Data Structures • For sizeable programs, one problem that can quickly arise is that of data storage. – What is the most efficient or effective way to organize and utilize information within a program? – Quick answer – it depends on the task.
Data Structures • For some tasks, it is helpful (at minimum) and possibly necessary to have sorted data. • For other tasks, it is not necessary to note where any given piece of data is stored within a storage data structure.
Data Structures • Note: while we have seen these in passing and as examples earlier in the course, we will now examine these a little more closely.
Arrays • Possibly the most basic non-trivial data storage structure is that of the array. – We’ve already seen the notion of a “vector” that dynamically resizes. [Figure: an array of ten cells, indexed 0 through 9]
Beyond Arrays • Note that the main structure being implemented by an array is effectively that of an ordered list. – Just like with an array, each element being stored has a specific location, which implies an ordering. [Figure: an array of ten cells, indexed 0 through 9]
Beyond Arrays • In Java, there is an ArrayList class in the java.util package. – This class internally uses an array and resizes it when necessary as new items are added to the conceptual underlying list. – This resizing is handled internally and automatically by the class.
Beyond Arrays • In C++, there is a vector class as part of the std namespace. – Likewise, this class internally uses an array and resizes it when necessary as new items are added to the conceptual underlying list. – This resizing is also handled internally and automatically by the class.
Beyond Arrays • However, arrays are not the only way to model a list. – Another such model is that of the linked list. (See the graphic below.)
Linked Lists • The linked list stores each data element separately and individually, allocating space for new elements as they are added to the list.
Linked Lists • Adding data to the end of a linked list is trivial, as it (usually) also is for an array.
Linked Lists • Adding data in the middle of the list, or at its beginning, is (relatively) very time-consuming for an array. • For a linked list, however, it is often a much simpler operation.
Adding Elements • Remember that for an array, elements are in fixed locations. • To insert an element into the middle of an array requires moving all elements at and after the point of insertion, e.g., insert 7 at index 3.
    index:   0   1   2   3   4   5   6   7   8   9
    before:  3   8   1   2   4  13  42   9   5   (empty)
Adding Elements
    index:   0   1   2   3   4   5   6   7   8   9
    before:  3   8   1   2   4  13  42   9   5   (empty)
    after:   3   8   1   7   2   4  13  42   9   5
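The shifting described on these slides can be sketched in a few lines of C++. This is only an illustration, assuming a std::vector<int> as the array-backed list; the function name insert_with_shift is hypothetical and not part of the course material.

```cpp
#include <cstddef>
#include <vector>

// Insert `value` at position `index` by shifting every later element one
// slot to the right, which is what an array-backed list has to do.
// Assumes index <= data.size().
void insert_with_shift(std::vector<int>& data, std::size_t index, int value) {
    data.push_back(0);  // grow by one slot at the end
    for (std::size_t i = data.size() - 1; i > index; --i) {
        data[i] = data[i - 1];  // move each element one slot to the right
    }
    data[index] = value;  // drop the new value into the gap
}
```

Calling insert_with_shift(v, 3, 7) on the slide's values moves six elements before 7 can be stored. The standard library's own v.insert(v.begin() + 3, 7) does the same kind of shifting internally.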
Adding Elements • For a linked list, however, each element’s storage space is distinct and separate from the others. • New storage may be placed directly in the middle of the chain.
Adding Elements
Linked Lists • Naturally, there is the question of what these “links of the chain” actually are, or more properly, how to represent them.
Linked Lists • In their most basic and simple form…
    template <typename T>
    class Node {
    public:
        T value;        // the data stored in this link
        Node<T>* next;  // pointer to the next link in the chain
    };
Linked Lists
    template <typename T>
    class Node {
    public:
        T value;
        Node<T>* next;
    };
    [Figure: a node drawn as two boxes labeled value and next]
Linked Lists Remember – next is a pointer, so a Node<T> doesn’t actually contain another Node<T>, just the address of the next one in line.
Linked Lists The end of the “linked list chain” is denoted by a null pointer (nullptr) in the last node. The “ground” symbol at the end denotes this.
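To make the chain concrete, here is a minimal sketch of inserting into and walking a singly linked list. It assumes a Node shaped like the one on the slides (declared as a struct here for brevity); the helper names insert_after, length, and destroy are hypothetical.

```cpp
#include <cstddef>

template <typename T>
struct Node {
    T value;
    Node<T>* next;
};

// Splice a new node holding `value` directly after `position`.
// No existing element moves; only two pointers change.
template <typename T>
Node<T>* insert_after(Node<T>* position, const T& value) {
    Node<T>* fresh = new Node<T>{value, position->next};
    position->next = fresh;
    return fresh;
}

// Walk the chain from `head` until the null pointer that marks the end.
template <typename T>
std::size_t length(const Node<T>* head) {
    std::size_t count = 0;
    for (const Node<T>* cur = head; cur != nullptr; cur = cur->next) {
        ++count;
    }
    return count;
}

// Free every node in the chain.
template <typename T>
void destroy(Node<T>* head) {
    while (head != nullptr) {
        Node<T>* next = head->next;
        delete head;
        head = next;
    }
}
```

Note that length has to visit every node, which is the traversal cost the slides mention for random access, while insert_after does a fixed amount of work no matter how long the list is.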
Lists • Note that we now have two different ways of storing data, each of which has its own pros and cons. – Arrays • Good for adding items to the end of lists and for random access to items within the list. • Bad for cases with many additions and removals at various places within the list.
Lists • Note that we now have two different ways of storing data, each of which has its own pros and cons. – Linked Lists • Better for adding and removing items at random locations within the list. • Bad at randomly accessing items from the list. – Note that to use a random item within the list, we must traverse the chain to find it.
Lists • Note that both of these objects fulfill the same end goal – to represent a group of objects with some implied ordering upon them. • While they meet this goal differently, their primary purpose is identical.
Templates • Templates are integral to generic programming in C++ – Template is like a blueprint – Blueprint is used to instantiate function when it is actually used in code – “Actual” types are substituted in for the “formal” types of the template
Why Templates? What is the difference between the following two functions?
    int compare(const string &v1, const string &v2) {
        if (v1 < v2) return -1;
        if (v2 < v1) return 1;
        return 0;
    }
    int compare(const double &v1, const double &v2) {
        if (v1 < v2) return -1;
        if (v2 < v1) return 1;
        return 0;
    }
Only the types!
Why Templates? What if we could write the function once for any type and have the compiler just use the right types?
    template <typename T>
    int compare(const T &v1, const T &v2) {
        if (v1 < v2) return -1;
        if (v2 < v1) return 1;
        return 0;
    }
Requires type T to have < operator
Exercise 1 • Implement the generic compare function • Implement a main() that compares two doubles, two ints, two chars, and two strings using the compare fcn. • Compile and see that it is good!
What is Going On? • When a function template is defined (usually in a header), the compiler sees only a blueprint; no code is generated yet • When a call to the function is seen, the compiler substitutes the types used in the invocation into the blueprint and generates the required code • Many errors can’t be caught until an invocation is seen
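As a sketch of that substitution step, the snippet below calls the generic compare with four different argument types. The helper compare_examples is a hypothetical name used only to group the calls; each call should cause the compiler to generate its own instantiation of the blueprint.

```cpp
#include <string>

// Generic three-way comparison; requires only operator< on T.
template <typename T>
int compare(const T& v1, const T& v2) {
    if (v1 < v2) return -1;
    if (v2 < v1) return 1;
    return 0;
}

// Each call below uses a different T, so four versions get generated.
inline int compare_examples() {
    int total = 0;
    total += compare(1.5, 2.5);  // T = double
    total += compare(7, 3);      // T = int
    total += compare('a', 'a');  // T = char
    total += compare(std::string("pear"), std::string("apple"));  // T = std::string
    return total;
}
```

A call such as compare(1, 2.5) would not compile, because the two arguments disagree about what T should be. That is an example of an error the compiler can only report once it sees the invocation.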
Abstracting Beyond Lists • We have this notion of a “list” structure, which maps its stored objects to indices. – What if we don’t actually need to have a lookup position for our stored objects? • But wait! How could we possibly iterate over the objects in a for loop?
The Iterator • Many programming languages provide objects called iterators for enumerating objects contained within data structures – C++ and Java are no exceptions – C++’s versions are defined in the <iterator> header file – (see 3.4 – 3.5)
The Iterator • This iterator may be used to get each contained object in order, one at a time, in a controllable manner. – It’s especially designed to work well with for loops.
The Iterator • Example code:
    vector<int> numbers;
    // omitted code initializing numbers.
    vector<int>::iterator iter;
    for (iter = numbers.begin(); iter != numbers.end(); iter++) {
        cout << *iter << ' ';
    }
The Iterator • In C++, iterators are designed to look like and act something like pointers. – The * and -> operators are overloaded to give pointer-like semantics, allowing users of the iterator object to “dereference” the object currently “referenced” by the iterator.
The Iterator • In C++, iterators are designed to look like and act something like pointers. – Note the use of operator ++ to increment the iterator to the next item • This is another way we can interact with pointers; it’s useful for iterating across an array while using pointer semantics… but keep a copy of the original around!
The Iterator
    vector<int> numbers;
    // omitted code initializing numbers.
    vector<int>::iterator iter;
    for (iter = numbers.begin(); iter != numbers.end(); iter++) {
        cout << *iter << ' ';
    }
The Iterator • C++11 and later standards also provide an alternate version of the for-loop which is designed to work with iterable structures and iterators • Looks like “foreach” in other languages
    vector<Person> structure;
    for (Person &p : structure) {
        // Code.
    }
The Iterator • Both the std::vector and std::list classes of C++ implement iterators. – begin() returns an iterator to the list’s first element – end() returns a special iterator “just after” the final element of the list, useful for checking when we’re done with iteration – Use != to check for termination
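Because both containers expose begin() and end(), one loop can serve either of them. The following is a small sketch of that idea; the function name join is hypothetical, and it assumes the elements are numeric so that std::to_string applies.

```cpp
#include <list>
#include <string>
#include <vector>

// Walk any container from begin() to end(), joining elements with spaces.
// Works for std::vector and std::list alike because both provide iterators.
template <typename Container>
std::string join(const Container& items) {
    std::string out;
    for (typename Container::const_iterator it = items.begin();
         it != items.end(); ++it) {
        if (!out.empty()) out += ' ';
        out += std::to_string(*it);  // *it dereferences like a pointer
    }
    return out;
}
```

The loop never asks for an index, only for "the next element", which is why it does not care whether the storage underneath is an array or a chain of nodes.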
Exercise 2 • Include <iterator> header • Use iterator to walk through an array you define and print out its contents • Compile and run • See that it is good
Abstracting Beyond Lists • There are many, many other techniques for storing data than the model of a list. – Such other data structures have different techniques for accessing stored data. – You have seen one in your lab exercises
Other Data Structures • Let’s move on from this idea of a “list” structure. • In particular, note how lists map their stored objects to indices (or can map an index to the stored object) – What if we don’t actually need to have a lookup position for our stored objects? – In particular, does it really need to be an integer?
Other Data Structures • There are many, many other techniques for storing data than the model of a list. – Such other data structures have different techniques for accessing and handling stored data. – These “different techniques” are often designed with a focus on different usage patterns.
Other Data Structures • A first example: arrays index their contained objects by integers. – Should integers be the only thing by which we can index an item within a collection-oriented data structure? – Think up some examples with neighbors [Figure: sample keys and values such as apple, bear, A113, 42, cake, blue, red, …]
Maps • The interface built on this idea within Java is the Map. • TreeMap and HashMap are the two prominent implementations. – The value is the object being stored within the map. – The key is the data element used as an index into the map for that value (i.e., how you “look up” the value) – A key here is like a key in a database; it is sometimes called a “tag” in associative memory
Maps • The classes built on this idea within C++ are map and unordered_map. • Sidenote – these are also not polymorphically related. – Map stores items in order of keys – Unordered map does not require keys to have order relation at all!
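The difference between the two classes can be seen in a short sketch. The helper names keys_in_order and lookup_or are hypothetical, and the string keys are arbitrary sample data.

```cpp
#include <map>
#include <string>
#include <unordered_map>
#include <vector>

// Collect the keys of a std::map in iteration order, which is sorted by key.
inline std::vector<std::string> keys_in_order(
        const std::map<std::string, int>& m) {
    std::vector<std::string> keys;
    for (const auto& entry : m) keys.push_back(entry.first);
    return keys;
}

// Look a key up in an unordered_map, returning `fallback` when it is absent.
inline int lookup_or(const std::unordered_map<std::string, int>& m,
                     const std::string& key, int fallback) {
    auto it = m.find(key);
    return it == m.end() ? fallback : it->second;
}
```

Iterating a std::map always visits keys in sorted order regardless of insertion order. An unordered_map makes no such promise, so code using it should rely on lookups by key rather than on iteration order.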
Maps • How would such a map work? – We could just use matching arrays for the keys and values. – However, this wouldn’t be the most efficient idea – better techniques are known.
Hash Maps • Hash maps work by converting the key to a unique integer, where possible, through a hashing function. – C++: hash maps are represented by unordered_map. – The selection of such a function is not a simple operation. • As such, unordered_map lets the caller supply a hashing function (a template parameter that defaults to std::hash<Key>), mapping each key to a nearly-unique integer.
Hash Maps • This “hash code” is then mapped into an array for storage. – Problem: the “hash code” can easily be larger than the storage array’s size. – Solution: modular arithmetic. Divide by the array’s size and use the remainder.
Hash Maps New input: (“Football”, “Will”) • hash(“Football”) = -2070369658 • -2070369658 mod 7 = 0 • Table (slots 0–6): slot 0 holds (“Football”, “Will”); all other slots are empty
Hash Maps New input: (“Basketball”, “Billy”) • hash(“Basketball”) = -2127646392 • -2127646392 mod 7 = -4 => 3 • Table (slots 0–6): slot 0 holds (“Football”, “Will”); slot 3 holds (“Basketball”, “Billy”); all other slots are empty
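Hash Maps New input: (“Gymnastics”, “Rhonda”) • hash(“Gymnastics”) = 2068792 • 2068792 mod 7 = 5 • Table (slots 0–6): slot 0 holds (“Football”, “Will”); slot 3 holds (“Basketball”, “Billy”); slot 5 holds (“Gymnastics”, “Rhonda”); all other slots are empty
Hash Maps New input: (“Soccer”, “Becky”) • hash(“Soccer”) = -2026118662 • -2026118662 mod 7 = -1 => 6 • Table (slots 0–6): slot 0 holds (“Football”, “Will”); slot 3 holds (“Basketball”, “Billy”); slot 5 holds (“Gymnastics”, “Rhonda”); slot 6 holds (“Soccer”, “Becky”); slots 1, 2, and 4 are empty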
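The "mod 7" step in this walkthrough, including the adjustment for negative remainders, can be sketched as a single function. The name bucket_index is hypothetical; the hash codes used to exercise it are simply the four values shown on the slides, whatever hashing function produced them.

```cpp
#include <cstdint>

// Map a (possibly negative) hash code onto a slot in [0, table_size).
// In C++, % keeps the sign of the dividend, so a negative remainder is
// shifted up by table_size, matching the "-4 => 3" step on the slides.
inline std::int64_t bucket_index(std::int64_t hash_code,
                                 std::int64_t table_size) {
    std::int64_t remainder = hash_code % table_size;
    return remainder < 0 ? remainder + table_size : remainder;
}
```

With a table size of 7, the four hash codes from the walkthrough land in slots 0, 3, 5, and 6, as in the tables above. Two different keys can still land in the same slot, which is why a real hash map also needs a strategy for collisions.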
Hash Maps • Pros: – direct, near-instant (constant time on average) lookup of values, regardless of the key’s type. • Cons: – does not support sorting – requires a hashing function for the key type that produces a distinct int for as many keys as possible; keys that land in the same slot (collisions) need extra handling.
Map Example
    #include <iostream>
    #include <iterator>
    #include <map>
    #include <string>
    using namespace std;

    int main() {
        map<string, size_t> word_count;
        string word;
        while (cin >> word) {
            ++word_count[word];  // use map to look up value
        }
        for (const auto &w : word_count) {  // iterator
            cout << w.first << " occurs " << w.second
                 << ((w.second > 1) ? " times" : " time") << endl;
        }
        return 0;
    }
Exercise 3 • Include <map> header • Use unordered map – to store >= four <key, value> pairs – your choice – Look up values based on keys and print – Or code up previous example • Compile and run • See that it is good
Maps • What if we want to have the entries sorted by their keys? – It is possible to build structures that efficiently keep their data permanently sorted by key!
Binary Tree • The binary tree is an example of one structure that can accomplish this. – Think of it as a linked list, but with two links per node instead of one.
Binary Tree • The corresponding Java structure is the TreeMap class. – It implements the SortedMap interface.
Binary Tree • The corresponding C++ structure, on the other hand, is the std::map class.
Binary Tree • The “first” node of the tree is called the root. – Any key smaller than the root’s key is in the left branch. – Any key larger than the root’s key is in the right branch.
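Binary Tree [Figure: a binary tree with root 13; the root’s left child 7 has children 2 and 9; the root’s right child 25 has children 17 and 42]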
Binary Tree • Binary trees require the ability to compare the keys – C++ assumes that operator< has been overloaded for custom data types
Binary Tree • Of particular note with binary trees – operations on them tend to be highly recursive due to their structure. – You’ve done this in lab – twice now!
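A minimal sketch of that recursive style follows, using integer keys only. The names TreeNode, insert, and in_order are hypothetical, this is an unbalanced tree for illustration, and it is not how std::map is required to be implemented.

```cpp
#include <memory>
#include <vector>

struct TreeNode {
    int key;
    std::unique_ptr<TreeNode> left;
    std::unique_ptr<TreeNode> right;
    explicit TreeNode(int k) : key(k) {}
};

// Recursive insert: smaller keys go left, larger keys go right.
// Duplicate keys are ignored in this sketch.
inline void insert(std::unique_ptr<TreeNode>& node, int key) {
    if (!node) {
        node.reset(new TreeNode(key));
    } else if (key < node->key) {
        insert(node->left, key);
    } else if (node->key < key) {
        insert(node->right, key);
    }
}

// Recursive in-order traversal: left subtree, then the node, then the right
// subtree. This visits the keys in ascending order.
inline void in_order(const std::unique_ptr<TreeNode>& node,
                     std::vector<int>& out) {
    if (!node) return;
    in_order(node->left, out);
    out.push_back(node->key);
    in_order(node->right, out);
}
```

Inserting 13, 7, 25, 2, 9, 17, 42 in that order produces the tree drawn on the earlier slide, and in_order then yields the keys sorted. Only operator< is used on the keys, in line with the comparison requirement noted above.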
Binary Tree • Pros: – the items are always in an established, sorted order! (By key) • Pro/Con: – accesses are slower than an unordered_map, but generally faster than a list.
Questions? • You have already implemented trees
Input/Output Modeling • Certain other structures exist to model specialized, restricted input and output behavior. – Consider the usual interaction someone might have with a stack of papers. – Another possibility: the usual behavior of a group of people waiting in line… in a queue waiting to be served.
Stacks • The data structure known as a stack is a “Last In, First Out” (LIFO) structure. – That is, the last input to the structure is the first output obtained from it. – Consider a stack of papers – when searching through it, one typically starts at the top and searches downward, from newest to oldest.
Stacks [Figure: successive snapshots of a stack as the items a, b, c, and d are pushed and popped]
Stacks • Stacks are a very good model for function calls. – When function A calls function B, B must complete before A resumes operation. • Similarly, if B calls C, C completes before B. – A may then call other methods before completing.
Stacks • Stacks are a very good model for function calls. – In fact, this is one reason why we’re examining it now. Stacks are the model of how recursion mechanically works. – In turn, recursion is necessary for operating upon many data structures.
Stacks • When debugging, the stack trace (or call stack) of a program at a given point of execution is exactly this – a description of the order of active method calls within the program. • The area of memory where function data lives is literally called the stack space.
Stacks + Math • Stacks have often been used in mathematical operations. – Some graphing calculators use what is called “Reverse Polish Notation” (RPN), which is based upon postfix operators. – Combined with a stack, this notation is much easier to program for than infix operations.
Stacks + Math • Let’s consider the following mathematical expression: 2 + 5 * 7 - 6 / 3 • In what order do we perform the operations? – Consider trying to code something that would be able to interpret this!
Stacks + Math • Using the standard order of operations, this becomes: 2 + (5 * 7) - (6 / 3) • The postfix notation for this: 2 5 7 * + 6 3 / - • With the grouping shown: ((2 (5 7 *) +) (6 3 /) -)
Stacks + Math 2 + (5 * 7) - (6 / 3) = 2 + (35) - (2) = 37 - 2 = 35
Stacks + Math 2 5 7 * + 6 3 / - • Let’s see how this facilitates getting the right answer.
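Stacks + Math 2 5 7 * + 6 3 / - • Stack contents (bottom to top) after each token: 2 → [2]; 5 → [2, 5]; 7 → [2, 5, 7]; * → [2, 35]; + → [37]; 6 → [37, 6]
Stacks + Math 2 + (5 * 7) - (6 / 3) = 2 + (35) - (6 / 3) = 37 - (6 / 3)
Stacks + Math 2 5 7 * + 6 3 / - • Continuing from [37]: 6 → [37, 6]; 3 → [37, 6, 3]; / → [37, 2]; - → [35]
Stacks + Math 2 + (5 * 7) - (6 / 3) = 2 + (35) - (6 / 3) = 37 - 2 = 35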
Stacks + Math • Math written in “standard” (i.e., infix) notation is typically first converted to postfix notation for actual computation. – A well-known algorithm for this conversion is the shunting-yard algorithm. It’s up on Wikipedia, so feel free to take a look.
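The stack-based evaluation traced on the preceding slides can be sketched as follows. This is a simplified illustration rather than a full calculator: the function name evaluate_postfix is hypothetical, tokens are assumed to be separated by spaces, operands are non-negative integers, and division is integer division with no check for a zero divisor.

```cpp
#include <cctype>
#include <sstream>
#include <stack>
#include <stdexcept>
#include <string>

// Evaluate a space-separated postfix (RPN) expression over integers.
inline long evaluate_postfix(const std::string& expression) {
    std::stack<long> operands;
    std::istringstream tokens(expression);
    std::string token;
    while (tokens >> token) {
        if (std::isdigit(static_cast<unsigned char>(token[0]))) {
            operands.push(std::stol(token));  // numbers go onto the stack
            continue;
        }
        // An operator consumes the two most recent operands.
        if (operands.size() < 2) throw std::runtime_error("too few operands");
        long right = operands.top(); operands.pop();
        long left = operands.top(); operands.pop();
        if (token == "+") operands.push(left + right);
        else if (token == "-") operands.push(left - right);
        else if (token == "*") operands.push(left * right);
        else if (token == "/") operands.push(left / right);
        else throw std::runtime_error("unknown operator: " + token);
    }
    if (operands.size() != 1) throw std::runtime_error("malformed expression");
    return operands.top();
}
```

Evaluating "2 5 7 * + 6 3 / -" passes through the same stack states as the slides and returns 35. No parentheses or precedence rules are needed, because the order of the tokens already encodes the order of operations.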
Stacks • C++ provides the std::stack class. – This implementation is something of a “wrapper class” that uses a vector, list, or deque internally, limiting it to stack-like behavior. • We’ll see deques in a moment. • Its push(), pop(), and top() methods forward to the underlying container’s push_back(), pop_back(), and back().
Questions? • Home exercise – implement and use a stack
Queues • The data structure known as a queue is a “First In, First Out” (FIFO) structure. – That is, the first input to the structure is the first output obtained from it. – Consider a line of people – the person in front has priority to whatever the line is waiting on… like buying tickets at the movies or gaining access to a sports event.
Queues • Queues are significantly like lists, except that we have additional restrictions placed on them. – Additions may only happen at the list’s end. – Removals may only happen at the list’s beginning. • As a result, standard array-based behavior may not be optimal.
Queues a a a b a b c c
Queues • In C++, the std::queue class is provided. – This implementation is also something of a “wrapper class” that uses a list or deque internally, limiting it to queue-like behavior. • list works well as a queue, as linked lists can easily be altered at both ends.
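As a small sketch of the FIFO behavior, the function below pushes arrivals into a queue backed by std::list and serves them in order. The name serve_in_order and the idea of modeling people as strings are assumptions made for the example.

```cpp
#include <list>
#include <queue>
#include <string>
#include <vector>

// Serve everyone in arrival order using a queue backed by std::list.
inline std::vector<std::string> serve_in_order(
        const std::vector<std::string>& arrivals) {
    std::queue<std::string, std::list<std::string>> line;
    for (const std::string& person : arrivals) {
        line.push(person);  // join at the back of the line
    }
    std::vector<std::string> served;
    while (!line.empty()) {
        served.push_back(line.front());  // the earliest arrival is in front
        line.pop();                      // leave from the front
    }
    return served;
}
```

Leaving out the second template argument gives a queue backed by std::deque instead, with the same push(), front(), and pop() interface.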
Stacks + Queues • The “deque”, or double-ended queue, combines the behaviors of stacks and queues into a single structure. – Items may be added or removed at either end of the structure. – This allows for either LIFO or FIFO behavior – it’s all in how you use the structure. • Mixed behavior is also possible, so beware!
Deques • C++ defines the deque class for such uses. – This is a full-fledged object in its own right, and is array-based. • It may use multiple arrays and modular arithmetic, to allow efficient additions at the front for example. – It is the default object used internally by both stack and queue.
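The "it's all in how you use it" point can be sketched with two hypothetical helpers, drain_lifo and drain_fifo, that empty a copy of the same std::deque from opposite ends.

```cpp
#include <deque>
#include <vector>

// Remove from the back: items added last with push_back come out first (LIFO).
inline std::vector<int> drain_lifo(std::deque<int> items) {
    std::vector<int> out;
    while (!items.empty()) {
        out.push_back(items.back());
        items.pop_back();
    }
    return out;
}

// Remove from the front: items added first with push_back come out first (FIFO).
inline std::vector<int> drain_fifo(std::deque<int> items) {
    std::vector<int> out;
    while (!items.empty()) {
        out.push_back(items.front());
        items.pop_front();
    }
    return out;
}
```

Both helpers take the deque by value, so the caller's copy is left untouched. Mixing pop_front() and pop_back() on one deque is legal, which is exactly the mixed behavior the earlier slide warns about.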
Questions? • Home exercise – implement and use a queue and a deque