Fundamentals of Python: From First Programs Through Data Structures Chapter 13 Collections, Arrays, and Linked Structures

Objectives
After completing this chapter, you will be able to:
• Recognize different categories of collections and the operations on them
• Understand the difference between an abstract data type and the concrete data structures used to implement it
• Perform basic operations on arrays, such as insertions and removals of items
• Resize an array when it becomes too small or too large
• Describe the space/time trade-offs for users of arrays
• Perform basic operations, such as traversals, insertions, and removals, on linked structures
• Explain the space/time trade-offs of arrays and linked structures in terms of the memory models that underlie these data structures

Overview of Collections
• Collection: A group of items that we want to treat as a conceptual unit
• Examples:
  – Lists, strings, stacks, queues, binary search trees, heaps, graphs, dictionaries, sets, and bags
• Can be homogeneous or heterogeneous
  – Lists are heterogeneous in Python
• Four main categories:
  – Linear, hierarchical, graph, and unordered

Linear Collections
• Ordered by position
• Everyday examples:
  – Grocery lists
  – Stacks of dinner plates
  – A line of customers waiting at a bank

Hierarchical Collections
• Structure reminiscent of an upside-down tree
  – D3’s parent is D1; its children are D4, D5, and D6
• Examples: a file directory system, a company’s organizational tree, a book’s table of contents

Graph Collections
• Graph: Collection in which each data item can have many predecessors and many successors
  – D3’s neighbors are its predecessors and successors
• Examples: Maps of airline routes between cities; electrical wiring diagrams for buildings

Unordered Collections
• Items are not in any particular order
  – One cannot meaningfully speak of an item’s predecessor or successor
• Example: Bag of marbles

Operations on Collections

Abstraction and Abstract Data Types
• To a user, a collection is an abstraction
• In CS, collections are abstract data types (ADTs)
  – Users of an ADT are concerned with learning its interface
  – Developers are concerned with implementing its behavior in the most efficient manner possible
• In Python, methods are the smallest unit of abstraction, classes are the next in size, and modules are the largest
• We will implement ADTs as classes or sets of related classes in modules

Data Structures for Implementing Collections: Arrays
• “Data structure” and “concrete data type” refer to the internal representation of an ADT’s data
• The two data structures most often used to implement collections in most programming languages are arrays and linked structures
  – Different approaches to storing and accessing data in the computer’s memory
  – Different space/time trade-offs in the algorithms that manipulate the collections

The Array Data Structure
• Array: Underlying data structure of a Python list
  – More restrictive than Python lists
• We’ll define an Array class

Random Access and Contiguous Memory
• Array indexing is a random access operation
• Address of an item: base address + offset
• The index operation has two steps: fetch the base address of the array, then add the offset (the index) to it

Static Memory and Dynamic Memory
• Arrays in older languages were static
• Modern languages support dynamic arrays
• To readjust the length of an array at run time:
  – Create an array with a reasonable default size at start-up
  – When it cannot hold more data, create a new, larger array and transfer the data items from the old array
  – When the array seems to be wasting memory, decrease its length in a similar manner
• These adjustments are automatic with Python lists

Physical Size and Logical Size
• The physical size of an array is its total number of array cells
• The logical size of an array is the number of items currently in it
• To avoid reading garbage, the programmer must track both sizes
• In general, the logical and physical sizes tell us important things about the state of the array:
  – If the logical size is 0, the array is empty
  – Otherwise, at any given time, the index of the last item in the array is the logical size minus 1
  – If the logical size equals the physical size, there is no more room for data in the array

Operations on Arrays
• We now discuss the implementation of several operations on arrays
• In our examples, we assume the following data settings:
• These operations would be used to define methods for collections that contain arrays

Increasing the Size of an Array
• The resizing process consists of three steps:
  – Create a new, larger array
  – Copy the data from the old array to the new array
  – Reset the old array variable to the new array object
• To achieve more reasonable time performance, double the array size each time you increase it
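
The three resizing steps can be sketched as follows. This is only an illustration: a fixed-length Python list stands in for the chapter's Array class, and the helper name grow is hypothetical.

```python
def grow(array):
    """Return a new array with twice the capacity, holding the same items.

    Assumption: a Python list of fixed length plays the role of an array,
    and None marks an unused cell.
    """
    # Step 1: create a new, larger array (at least one cell).
    temp = [None] * max(1, len(array) * 2)
    # Step 2: copy the data from the old array to the new array.
    for i in range(len(array)):
        temp[i] = array[i]
    # Step 3 happens at the call site, where the old variable is reset.
    return temp


a = [10, 20, 30, 40]   # physical size 4, completely full
a = grow(a)            # reset the old array variable to the new array
```

Doubling means that a long run of insertions triggers only a few copies, so the copying cost averages out to a constant per insertion.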

Decreasing the Size of an Array
• This operation occurs in Python’s list when a pop results in memory wasted beyond a threshold
• Steps:
  – Create a new, smaller array
  – Copy the data from the old array to the new array
  – Reset the old array variable to the new array object
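
A possible shape for the shrinking step is shown below. The threshold (one quarter full), the halving policy, the default capacity of 4, and the name shrink are all assumptions made for the sketch; the slides do not state which threshold is used.

```python
def shrink(array, logical_size, default_capacity=4):
    """Return a smaller array when too many cells are unused.

    Assumptions: the array is a fixed-length Python list, it is halved
    when at most a quarter of its cells hold items, and it never drops
    below default_capacity.
    """
    if len(array) <= default_capacity or logical_size > len(array) // 4:
        return array                      # not enough waste to act on
    # Create a new, smaller array.
    temp = [None] * max(default_capacity, len(array) // 2)
    # Copy the data from the old array to the new array.
    for i in range(logical_size):
        temp[i] = array[i]
    return temp


a = [1, 2] + [None] * 14   # physical size 16, logical size 2
a = shrink(a, 2)           # reset the old variable to the new array

b = [1, 2, 3, None, None, None, None, None]
same = shrink(b, 3)        # more than a quarter full, so unchanged
```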

Inserting an Item into an Array That Grows
• Programmer must do four things:
  – Check for available space and increase the physical size of the array, if necessary
  – Shift items from the logical end of the array to the target index position down by one
    • To open a hole for the new item at the target index
  – Assign the new item to the target index position
  – Increment the logical size by one
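
The four steps might look like this in code. The function name insert and its parameters are hypothetical, a fixed-length Python list stands in for the array, and the target index is assumed to satisfy 0 <= target_index <= logical_size.

```python
def insert(array, logical_size, target_index, new_item):
    """Insert new_item at target_index; return (array, new logical size)."""
    # 1. Check for available space; double the physical size if full.
    if logical_size == len(array):
        temp = [None] * max(1, len(array) * 2)
        for i in range(logical_size):
            temp[i] = array[i]
        array = temp
    # 2. Shift items down by one, working from the logical end back to
    #    the target index, to open a hole for the new item.
    for i in range(logical_size, target_index, -1):
        array[i] = array[i - 1]
    # 3. Assign the new item to the target index position.
    array[target_index] = new_item
    # 4. Increment the logical size by one.
    return array, logical_size + 1


a = ["A", "B", "D", "E"]          # full: logical size == physical size
a, size = insert(a, 4, 2, "C")
```

The shift starts at the logical end so that no item is overwritten before it has been copied.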

Removing an Item from an Array
• Steps:
  – Shift items from the target index position to the logical end of the array up by one
    • To close the hole left by the removed item at the target index
  – Decrement the logical size by one
  – Check for wasted space and decrease the physical size of the array, if necessary
• Time performance for shifting items is linear on average; time performance for removal is linear
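
A minimal sketch of the first two removal steps follows. The name remove is hypothetical, a fixed-length Python list stands in for the array, and the check for wasted space is left out to keep the example short.

```python
def remove(array, logical_size, target_index):
    """Remove the item at target_index; return the new logical size.

    Assumes 0 <= target_index < logical_size.
    """
    # Shift items up by one, from the cell after the target to the
    # logical end, closing the hole left by the removed item.
    for i in range(target_index, logical_size - 1):
        array[i] = array[i + 1]
    # Clear the cell that is no longer part of the logical array.
    array[logical_size - 1] = None
    # Decrement the logical size by one.
    return logical_size - 1


a = ["A", "B", "C", "D", None]    # physical size 5, logical size 4
size = remove(a, 4, 1)            # remove "B"
```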

Complexity Trade-Off: Time, Space, and Arrays
• Average-case use of memory is O(1)
• Memory cost of using an array is its load factor (the number of items stored divided by the array’s capacity)

Two-Dimensional Arrays (Grids)
• Sometimes, two-dimensional arrays or grids are more useful than one-dimensional arrays
• Suppose we call this grid table; to access an item, use a double subscript such as table[row][column]

Processing a Grid
• In addition to the double subscript, a grid must recognize two methods that return the number of rows and the number of columns
  – Examples: getHeight and getWidth
• The techniques for manipulating one-dimensional arrays are easily extended to grids

Creating and Initializing a Grid
• Let’s assume that there exists a Grid class whose constructor takes a height, a width, and an initial fill value
• After a grid has been created, we can reset its cells to any values

Defining a Grid Class
• The method __getitem__ is all that is needed to support the client’s use of the double subscript
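
One way such a Grid class could be written is sketched below. The constructor's parameter names and order, and the use of nested Python lists in place of arrays, are assumptions; getHeight, getWidth, and __getitem__ are the names the slides mention.

```python
class Grid:
    """A two-dimensional array. Rows are stored as Python lists here."""

    def __init__(self, rows, columns, fill_value=None):
        # Assumes rows >= 1 so that getWidth can inspect the first row.
        self._data = [[fill_value] * columns for _ in range(rows)]

    def getHeight(self):
        """Return the number of rows."""
        return len(self._data)

    def getWidth(self):
        """Return the number of columns."""
        return len(self._data[0])

    def __getitem__(self, index):
        # Returning the whole row lets the client apply a second
        # subscript: table[row][column].
        return self._data[index]


table = Grid(4, 5, 0)                       # 4 rows, 5 columns, all 0

# Reset every cell, then sum the grid with two nested loops.
for row in range(table.getHeight()):
    for column in range(table.getWidth()):
        table[row][column] = row * column

total = 0
for row in range(table.getHeight()):
    for column in range(table.getWidth()):
        total += table[row][column]
```

Because table[row] returns a mutable row, both reading and assignment through the double subscript work without a __setitem__ method on Grid.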

Ragged Grids and Multidimensional Arrays
• In a ragged grid, there is a fixed number of rows, but the number of columns in each row can vary
• An array of lists or arrays provides a suitable structure for implementing a ragged grid
• Dimensions can be added to the grid definition if needed; the only limit is the computer’s memory
  – A three-dimensional array can be visualized as a box filled with smaller boxes stacked in rows and columns
    • Depth, height, and width; three indexes; three loops
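
Both ideas can be illustrated with nested Python lists. The variable names below are made up for the example, and lists stand in for arrays.

```python
# A ragged grid: a fixed number of rows, each with its own width.
ragged = [
    [1, 2, 3],
    [4],
    [5, 6],
]

ragged_total = 0
for row in range(len(ragged)):
    for column in range(len(ragged[row])):   # width varies by row
        ragged_total += ragged[row][column]

# A three-dimensional array: three indexes, three loops.
depth, height, width = 2, 3, 4
box = [[[0] * width for _ in range(height)] for _ in range(depth)]

count = 0
for d in range(depth):
    for h in range(height):
        for w in range(width):
            box[d][h][w] = count              # number the cells in order
            count += 1
```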

Linked Structures
• After arrays, linked structures are probably the most commonly used data structures in programs
• Like an array, a linked structure is a concrete data type that is used to implement many types of collections, including lists
• We discuss in detail several characteristics that programmers must keep in mind when using linked structures to implement any type of collection

Singly Linked Structures and Doubly Linked Structures
• No random access; must traverse list
• No shifting of items needed for insertion/removal
• Resize at insertion/removal with no memory cost

Noncontiguous Memory and Nodes
• A linked structure decouples the logical sequence of items in the structure from any ordering in memory
  – Noncontiguous memory representation scheme
• The basic unit of representation in a linked structure is a node
• Depending on the language, you can set up nodes to use noncontiguous memory in several ways:
  – Using two parallel arrays
  – Using pointers (a null or nil represents the empty link as a pointer value)
    • Memory allocated from the object heap
  – Using references to objects (e.g., Python)
    • In Python, None can mean an empty link
    • Automatic garbage collection frees the programmer from managing the object heap
• In the discussion that follows, we use the terms link, pointer, and reference interchangeably

Defining a Singly Linked Node Class
• Node classes are fairly simple
• Flexibility and ease of use are critical
  – Node instance variables are usually referenced without method calls, and constructors allow the user to set a node’s link(s) when the node is created
• A singly linked node contains just a data item and a reference to the next node
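
A node class matching this description could be as small as the sketch below. The attribute names data and next follow the code fragments on later slides; the default value for next and the sample variables are assumptions.

```python
class Node:
    """A singly linked node: a data item plus a link to the next node."""

    def __init__(self, data, next=None):
        # Instance variables are public so that clients can read and
        # reset them directly, without method calls.
        self.data = data
        self.next = next


# The constructor lets the client set the link when the node is created.
node2 = Node("A", None)      # a node with an empty link
node3 = Node("B", node2)     # a node linked to node2
```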

Using the Singly Linked Node Class
• Node variables are initialized to None or a new Node object

Using the Singly Linked Node Class (continued)
• node1.next = node3 raises AttributeError when node1 is None
  – Solution: node1 = Node("C", node3), or
      node1 = Node("C", None)
      node1.next = node3
• To guard against exceptions:
    if nodeVariable is not None:
        <access a field in nodeVariable>
• Like arrays, linked structures are processed with loops

Using the Singly Linked Node Class (continued)
  – Note that when the data are displayed, they appear in the reverse order of their insertion
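
The reverse-order effect is easy to reproduce. The sketch below assumes a Node class with data and next attributes and builds the structure by repeatedly linking a new node in front of the current head; the loop bounds and the items list are choices made for the example.

```python
class Node:
    """Hypothetical singly linked node used for this sketch."""

    def __init__(self, data, next=None):
        self.data = data
        self.next = next


# Insert 1, 2, 3, 4, 5; each new node is placed in front of the head.
head = None
for count in range(1, 6):
    head = Node(count, head)

# Display the data; the items come out in reverse order of insertion.
items = []
probe = head
while probe is not None:
    print(probe.data)
    items.append(probe.data)
    probe = probe.next
```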

Operations on Singly Linked Structures
• Almost all of the operations on arrays are index based
  – Indexes are an integral part of the array structure
• Emulate index-based operations on a linked structure by manipulating links within the structure
• We explore how these manipulations are performed in common operations, such as:
  – Traversals
  – Insertions
  – Removals

Traversal
• Traversal: Visit each node without deleting it
  – Uses a temporary pointer variable
• Example:
    probe = head
    while probe is not None:
        <use or modify probe.data>
        probe = probe.next
  – None serves as a sentinel that stops the process
• Traversals are linear in time and require no extra memory
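
A runnable version of the traversal pattern is sketched here, assuming a Node class with data and next attributes. The work done at each node (scaling the data and counting the nodes) is chosen only for illustration.

```python
class Node:
    """Hypothetical singly linked node used for this sketch."""

    def __init__(self, data, next=None):
        self.data = data
        self.next = next


head = Node(1, Node(2, Node(3)))

# Visit each node with a temporary pointer; head itself never moves,
# so the structure is still reachable afterwards.
probe = head
count = 0
while probe is not None:          # None is the sentinel
    probe.data = probe.data * 10  # use or modify the node's data
    count += 1
    probe = probe.next
```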

Searching
• Resembles a traversal, but two possible sentinels:
  – Empty link
  – Data item that equals the target item
• Example:
    probe = head
    while probe is not None and targetItem != probe.data:
        probe = probe.next
    if probe is not None:
        <targetItem has been found>
    else:
        <targetItem is not in the linked structure>
• On average, it is linear for singly linked structures
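
Wrapped in a function, the search pattern might read as follows. The function name contains and the Node class are assumptions made for the sketch.

```python
class Node:
    """Hypothetical singly linked node used for this sketch."""

    def __init__(self, data, next=None):
        self.data = data
        self.next = next


def contains(head, target_item):
    """Return True if target_item is in the structure, else False."""
    probe = head
    # Stop at either sentinel: the empty link or a matching data item.
    while probe is not None and target_item != probe.data:
        probe = probe.next
    return probe is not None


head = Node("A", Node("B", Node("C")))
found = contains(head, "C")
missing = contains(head, "Z")
```

The order of the two tests in the loop condition matters: checking for None first prevents an attempt to read data from an empty link.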

Searching (continued)
• Unfortunately, accessing the ith item of a linked structure is also a sequential search operation
  – We start at the first node and count the number of links until the ith node is reached
• Linked structures do not support random access
  – Can’t use a binary search on a singly linked structure
  – Solution: Use other types of linked structures

Replacement
• Replacement operations employ the traversal pattern
• Replacing the ith item assumes 0 <= i < n
• Both replacement operations are linear on average
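
The two replacement operations, replacing a given item and replacing the ith item, can be sketched as below. The function names and the Node class are hypothetical.

```python
class Node:
    """Hypothetical singly linked node used for this sketch."""

    def __init__(self, data, next=None):
        self.data = data
        self.next = next


def replace_item(head, target_item, new_item):
    """Replace the first occurrence of target_item; report success."""
    probe = head
    while probe is not None and target_item != probe.data:
        probe = probe.next
    if probe is None:
        return False              # target is not in the structure
    probe.data = new_item
    return True


def replace_at(head, index, new_item):
    """Replace the item at position index. Assumes 0 <= index < n."""
    probe = head
    while index > 0:              # count links until the ith node
        probe = probe.next
        index -= 1
    probe.data = new_item


head = Node("A", Node("B", Node("C")))
replaced = replace_item(head, "B", "X")
not_replaced = replace_item(head, "Q", "Y")
replace_at(head, 2, "Z")
```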

Inserting at the Beginning
• Uses constant time and memory
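
Insertion at the beginning needs a single statement, as this sketch suggests. It assumes a Node class whose constructor accepts the next link; the same statement covers both the empty and the nonempty case.

```python
class Node:
    """Hypothetical singly linked node used for this sketch."""

    def __init__(self, data, next=None):
        self.data = data
        self.next = next


head = None                 # an empty structure
head = Node("B", head)      # new node's link is None
head = Node("A", head)      # new node's link is the old first node
```

No traversal is involved, which is why the cost is constant regardless of how many nodes the structure holds.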

Inserting at the End
• Inserting an item at the end of an array (append in a Python list) requires constant time and memory
  – Unless the array must be resized
• Inserting at the end of a singly linked structure must consider two cases
  – Linear in time and constant in memory
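
The sketch below assumes the two cases are an empty structure (the head becomes the new node) and a nonempty one (the last node is found and linked to the new node). The function name append and the Node class are hypothetical.

```python
class Node:
    """Hypothetical singly linked node used for this sketch."""

    def __init__(self, data, next=None):
        self.data = data
        self.next = next


def append(head, new_item):
    """Insert new_item at the end; return the (possibly new) head."""
    new_node = Node(new_item)
    if head is None:
        # Case 1: the structure is empty, so the new node is the head.
        return new_node
    # Case 2: walk to the last node, then aim its link at the new node.
    probe = head
    while probe.next is not None:
        probe = probe.next
    probe.next = new_node
    return head


head = None
for item in ["A", "B", "C"]:
    head = append(head, item)
```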

Removing at the Beginning
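
A minimal sketch of removal at the beginning follows, assuming the structure holds at least one node. The function name pop_first and the Node class are hypothetical.

```python
class Node:
    """Hypothetical singly linked node used for this sketch."""

    def __init__(self, data, next=None):
        self.data = data
        self.next = next


def pop_first(head):
    """Remove the first node; return (removed item, new head).

    Assumes head is not None.
    """
    removed_item = head.data
    # Moving head past the first node unlinks it; Python's garbage
    # collector reclaims the node once nothing refers to it.
    return removed_item, head.next


head = Node("A", Node("B"))
item, head = pop_first(head)
```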

Removing at the End
• Removing an item at the end of an array (pop in a Python list) requires constant time and memory
  – Unless the array must be resized
• Removing at the end of a singly linked structure must consider two cases
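
The two cases are assumed here to be a structure with exactly one node and a structure with at least two nodes. The function name pop_last and the Node class are hypothetical, and the structure is assumed to be nonempty.

```python
class Node:
    """Hypothetical singly linked node used for this sketch."""

    def __init__(self, data, next=None):
        self.data = data
        self.next = next


def pop_last(head):
    """Remove the last node; return (removed item, new head).

    Assumes head is not None.
    """
    if head.next is None:
        # Case 1: only one node, so the structure becomes empty.
        return head.data, None
    # Case 2: stop at the node just before the last one.
    probe = head
    while probe.next.next is not None:
        probe = probe.next
    removed_item = probe.next.data
    probe.next = None             # unlink the last node
    return removed_item, head


head = Node("A", Node("B", Node("C")))
first_removed, head = pop_last(head)

single = Node("X")
only_item, single = pop_last(single)
```

Unlike the array case, finding the second-to-last node takes a full traversal, so this operation is linear in time.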

Inserting at Any Position
• Insertion at the beginning uses the code presented earlier
• For any other position i, first find the node at position i - 1 (if i < n) or the node at position n - 1 (if i >= n)
• Linear time performance; constant use of memory
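
One way to express this rule in code is shown below. The names insert_at and to_list and the Node class are hypothetical; to_list exists only to make the result easy to inspect.

```python
class Node:
    """Hypothetical singly linked node used for this sketch."""

    def __init__(self, data, next=None):
        self.data = data
        self.next = next


def insert_at(head, index, new_item):
    """Insert new_item at position index; return the head."""
    if head is None or index <= 0:
        # Insertion at the beginning.
        return Node(new_item, head)
    # Find the node at position index - 1, or the last node if
    # index >= n.
    probe = head
    while index > 1 and probe.next is not None:
        probe = probe.next
        index -= 1
    # Link the new node in between probe and its successor.
    probe.next = Node(new_item, probe.next)
    return head


def to_list(head):
    """Collect the data items into a Python list for inspection."""
    items = []
    while head is not None:
        items.append(head.data)
        head = head.next
    return items


head = Node("A", Node("C"))
head = insert_at(head, 1, "B")     # middle
head = insert_at(head, 99, "D")    # index >= n: goes at the end
head = insert_at(head, 0, "Z")     # beginning
result = to_list(head)
```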

Removing at Any Position
• The removal of the ith item from a linked structure has three cases
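
The list of cases did not survive on this slide, so the sketch below assumes they mirror insertion: i <= 0 removes the first node, 0 < i < n removes the node after position i - 1, and i >= n removes the last node. The names remove_at and to_list and the Node class are hypothetical, and the structure is assumed to be nonempty.

```python
class Node:
    """Hypothetical singly linked node used for this sketch."""

    def __init__(self, data, next=None):
        self.data = data
        self.next = next


def remove_at(head, index):
    """Remove the item at position index; return (item, new head).

    Assumes head is not None.
    """
    if index <= 0 or head.next is None:
        # Assumed case 1: remove the first node.
        return head.data, head.next
    # Assumed cases 2 and 3: stop at the node before the target, or at
    # the second-to-last node if index >= n.
    probe = head
    while index > 1 and probe.next.next is not None:
        probe = probe.next
        index -= 1
    removed_item = probe.next.data
    probe.next = probe.next.next   # bypass the removed node
    return removed_item, head


def to_list(head):
    """Collect the data items into a Python list for inspection."""
    items = []
    while head is not None:
        items.append(head.data)
        head = head.next
    return items


head = Node("A", Node("B", Node("C", Node("D"))))
middle, head = remove_at(head, 2)    # removes "C"
last, head = remove_at(head, 99)     # index >= n removes "D"
first, head = remove_at(head, 0)     # removes "A"
remaining = to_list(head)
```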

Complexity Trade-Off: Time, Space, and Singly Linked Structures
• The main advantage of a singly linked structure over an array is not time performance but memory performance

Variations on a Link
• A Circular Linked Structure with a Dummy Header Node
• Doubly Linked Structures

A Circular Linked Structure with a Dummy Header Node
• A circular linked structure contains a link from the last node back to the first node in the structure
  – Dummy header node serves as a marker for the beginning and the end of the linked structure

A Circular Linked Structure with a Dummy Header Node (continued)
• To initialize the empty linked structure:
• To insert at the ith position:
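
Both operations can be sketched as follows, under the assumption that the dummy header holds no data and that the last node links back to it. The function name insert_at and the Node class are hypothetical.

```python
class Node:
    """Hypothetical singly linked node used for this sketch."""

    def __init__(self, data, next=None):
        self.data = data
        self.next = next


def insert_at(head, index, new_item):
    """Insert new_item at position index after the dummy header."""
    probe = head
    # Stop at the node before the target, or at the last node, whose
    # link points back to the header.
    while index > 0 and probe.next is not head:
        probe = probe.next
        index -= 1
    probe.next = Node(new_item, probe.next)


# Initialize the empty structure: the header links to itself.
head = Node(None, None)
head.next = head

insert_at(head, 0, "B")
insert_at(head, 0, "A")
insert_at(head, 99, "C")      # index >= n: goes at the end

# Traverse from the first real node until the header comes around.
items = []
probe = head.next
while probe is not head:
    items.append(probe.data)
    probe = probe.next
```

Because the header is always present, the insertion code has no special case for an empty structure or for the first position.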

Doubly Linked Structures
• To insert a new item at the end of the linked structure
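
Insertion at the end of a doubly linked structure might be written as below. The class name TwoWayNode, its previous attribute, and the use of a separate tail pointer are assumptions made for the sketch.

```python
class Node:
    """Hypothetical singly linked node used for this sketch."""

    def __init__(self, data, next=None):
        self.data = data
        self.next = next


class TwoWayNode(Node):
    """Hypothetical node with links in both directions."""

    def __init__(self, data, previous=None, next=None):
        super().__init__(data, next)
        self.previous = previous


# Start with one node; head and tail both refer to it.
head = TwoWayNode(1)
tail = head

# Insert each new item at the end: link it after the tail, with its
# previous link aimed back at the old tail, then advance the tail.
for data in range(2, 6):
    tail.next = TwoWayNode(data, tail)
    tail = tail.next

# The previous links allow a traversal from the tail to the head.
backward = []
probe = tail
while probe is not None:
    backward.append(probe.data)
    probe = probe.previous
```

With a tail pointer, insertion at the end takes constant time, since no traversal is needed to find the last node.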

Summary
• Collections are objects that hold zero or more other objects
  – Main categories: Linear, hierarchical, graph, and unordered
  – Collections are iterable
  – Collections are thus abstract data types
• A data structure is an object used to represent the data contained in a collection
• The array is a data structure that supports random access, in constant time, to an item by position
  – Can be two-dimensional (grid)
• A linked structure is a data structure that consists of zero or more nodes
  – A singly linked structure’s nodes contain a data item and a link to the next node
  – Insertions or removals in linked structures require no shifting of data elements
  – Using a header node can simplify some of the operations, such as adding or removing items