Intro to Data Structures and ADTs Chapter 2

Goal of Data Structures • Organize data • Facilitate efficient … – storage –

Simplicity Tradeoff • Simplicity of data organization versus • Simplicity/elegance of algorithms • Simple

Issues • Amount of data – phone book lookup (Hallsville vs. Dallas) – linear

Abstract Data Types (ADT) • Defn: collection of – related data items … together

Implementation of an ADT • Defn: storage (data) structures which – store the data

Implementation of an ADT • Successful software design uses data abstraction • We separate

Memory • 2 -state devices « bits 0 and 1 • Organized into –

Simple Data Types • The most basic form of data sequences of bits •

Simple Data Types • Boolean – values { false, true } – could be

Simple Data Types • Unsigned Integer data – non-negative unsigned integers – stored in

Sign-magnitude Representation • Save one bit (usually most significant) for sign (0 = +,

Two's Complement Representation • For nonnegative n: – Use ordinary base-two representation with leading

Two's Complement Representation • Example: – 88 1. 88 as a 16 -bit base-two

Two's Complement Representation • Works well for arithmetic computations 5 + – 6: 0000000101

Biased Representation • Add a constant bias to the number – typically 2 w-1

Biased Representation • Example -88 using w = 16 bits and bias of 215

Problems with Integer Representation • Limited Capacity — a finite number of bits •

Real Data • Types float and double in C++ • Use single precision (IEEE

Real Data • Example: 22. 625 = 10110. 1012 • Floating point form: 1.

Problems with Real Representation • Exponent overflow and underflow • Round off error –

Problems with Real Representation • Round off error may be compounded in a sequence

C-Style Data Structures Arrays • Single dimension • Multi dimension int num. List [30];

Arrays • Arrays as parameters Note you must specify number of elements used –

Problems with C-Style Arrays • Capacity cannot change. • Solution 1 (non-OOP) Later –

Problems with C-Style Arrays • Virtually no predefined operations for non-char arrays. • The

Basic Principle of OOP: • An object should be autonomous (self-contained) • Should carry

Aggregate Data Types • Predefined types not always adequate to model the problem –

Structures • Characteristics – has a fixed size – is ordered – elements may

FAQs about Structures • structs can be nested (can contain struct objects) • Access

A commercial for OOP: Two programming paradigms • Procedural: ( C, FORTRAN, and Pascal

Slides: 32

Download presentation

Intro to Data Structures and ADTs Chapter 2 1

Goal of Data Structures • Organize data • Facilitate efficient … – storage – retrieval – manipulation of data • Select and design appropriate data types • This is the real essence of OOP 2

Simplicity Tradeoff • Simplicity of data organization versus • Simplicity/elegance of algorithms • Simple (unsophisticated) data structure – may require much work for processing data. • More complex data organization – May yield nicer algorithms for the basic operations 3

Issues • Amount of data – phone book lookup (Hallsville vs. Dallas) – linear search? • Number of accesses and use required – compiler's lookup of an identifier's type in a symbol table – linear, binary, hash table? • Static vs. dynamic nature of the data – consider a text processor – array, vector? 4

Abstract Data Types (ADT) • Defn: collection of – related data items … together with – an associated set of operations • Why "abstract? " – Data, operations, and relations are studied independent of implementation. • What not how is the focus. 5

Implementation of an ADT • Defn: storage (data) structures which – store the data items … and – algorithms for the basic operations • These data structures are – provided in a language or – built from the language constructs (user defined) 6

Implementation of an ADT • Successful software design uses data abstraction • We separate the – definition of a data type – the implementation from 7

C++ Types 8

Memory • 2 -state devices « bits 0 and 1 • Organized into – bytes (8 bits) and – words (machine dependent — e. g. , 4 bytes). • Each byte (or word) has an address – to store and retrieve contents of any given memory location. 9

Simple Data Types • The most basic form of data sequences of bits • Simple data types – values are atomic — can't be subdivided – are ADTs. • Implementations have: – Storage structures: memory locations – Algorithms: system hardware/software to do basic operations. 10

Simple Data Types • Boolean – values { false, true } – could be stored in bits, usually use a byte – operations &&, || • Character – byte for ASCII, EBCDIC – 2 bytes for Unicode (Java) – operations ==, <, >, etc. using numeric code 11

Simple Data Types • Unsigned Integer data – non-negative unsigned integers – stored in base-two in a fixed number of bits • Signed integer – stored in a fixed number of bits • Representations – sign-magnitude – two's complement 12

Sign-magnitude Representation • Save one bit (usually most significant) for sign (0 = +, 1 = – ) • Use base-two representation in the other bits. ú 88 = 000001011000 ú -88 = 1 00001011000 Cumbersome for arithmetic computations 13

Two's Complement Representation • For nonnegative n: – Use ordinary base-two representation with leading (sign) bit 0 • For n < 0 1) Find w-bit base-2 representation of |n| 2) Complement each bit. 3) Add 1 14

Two's Complement Representation • Example: – 88 1. 88 as a 16 -bit base-two number 000001011000 2. Complement this bit string 111110100111 WHY? 3. Add 1 111110101000 15

Two's Complement Representation • Works well for arithmetic computations 5 + – 6: 0000000101 +1111111010 What gets done to 11111111 the bits to give this answer? 16

Biased Representation • Add a constant bias to the number – typically 2 w-1 (where w is number of bits) – then find its base-two representation • Example: 88 using w = 16 bits and bias of 215 = 32768 1. Add the bias to 88, giving 32856 2. Represent the result in base-two notation: 100001011000 17

Biased Representation • Example -88 using w = 16 bits and bias of 215 = 32768 1. Add the bias to -88, giving 32680 2. Represent the result in base-two notation: 011110101000 • Good for comparisons; so, it is commonly used for exponents in floating-point representation of reals. 18

Problems with Integer Representation • Limited Capacity — a finite number of bits • An operation can produce a value that requires more bits than maximum number allowed. This is called overflow. • None of these is a perfect representation of (mathematical) integers • Can only store a finite (sub)range of them. 19

Real Data • Types float and double in C++ • Use single precision (IEEE Floating-Point) • Store: ú sign of mantissa in leftmost bit (0 = +, 1 = – ) ú biased binary rep. of exponent in next 8 bits (bias = 127) ú bits b 2 b 3. . . b 24 mantissa in rightmost 23 bits. ú Need not store b 1 — know it's 1) 20

Real Data • Example: 22. 625 = 10110. 1012 • Floating point form: 1. 01101012 * 24 21

Problems with Real Representation • Exponent overflow and underflow • Round off error – Most reals do not have terminating binary representations. Example: 0. 7 = (0. 10110011001100. . . )2 22

Problems with Real Representation • Round off error may be compounded in a sequence of operations. – Recall the sums of calculated currency values • Be careful in comparing reals – with == and !=. – Instead use comparison for closeness if (abs (x – 12. 34) < 0. 001) … 23

C-Style Data Structures Arrays • Single dimension • Multi dimension int num. List [30]; float real. List [10]; int num. Table [3][4][5]; • All elements of same type • Elements accessed by – name and [ ] operator num. List[5] – name, offset, and dereference *(numlist + 5) • Name of the array is a pointer constant 24

Arrays • Arrays as parameters Note you must specify number of elements used – Formal parameter void do. It (int list[ ], int count); / or void to. It (int *list, int count); – Actual parameter doit (num. List, num. Used); Same call for either style of parameter list declaration 25

Problems with C-Style Arrays • Capacity cannot change. • Solution 1 (non-OOP) Later – Use a "run-time array" – Construct B to have required capacity – Copy elements of A into B – Deallocate A Solution 2 (OOP) Use vector 26

Problems with C-Style Arrays • Virtually no predefined operations for non-char arrays. • The Deeper Problem: – C-style arrays aren't self-contained. 27

Basic Principle of OOP: • An object should be autonomous (self-contained) • Should carry within itself all of the information needed to describe and operate upon itself. 28

Aggregate Data Types • Predefined types not always adequate to model the problem – When objects have multiple attributes – When objects have collections of heterogeneous elements • C++ provides structs and classes – Create new types with multiple attributes 29

Structures • Characteristics – has a fixed size – is ordered – elements may be of different size – direct access of elements by name (not index) struct Date { int month, day, year; char day. Of. Week [12]; }; 30

FAQs about Structures • structs can be nested (can contain struct objects) • Access members with – name of struct object – dot (member selector operator). – name of struct member Date today = { 3, 4, 2005, "Tuesday"); cout << today. month; 31

A commercial for OOP: Two programming paradigms • Procedural: ( C, FORTRAN, and Pascal ) • Object-oriented: ( C++, Java, and Smalltalk) – Action-oriented — concentrates on the verbs – Focuses on the nouns of problem specification – Programmers: • Identify basic tasks to solve problem • Implement actions to do tasks as subprograms (procedures/functions/ subroutines) • Group subprograms into programs/modules/libraries, • together make up a complete system for solving the problem – Programmers: • Determine objects needed for problem • Determine how they should work together to solve the problem. • Create types called classes made up of – data members – function members to operate on the data. • Instances of a type (class) called objects. 32