Data Abstraction
CS 201J: Engineering Software
Nathanael Paul
University of Virginia, Computer Science
nate@virginia.edu

Overview
• Data abstraction
• Specification/Design of Abstract Data Types (ADTs)
• Implementation of ADTs

The Problem
• Programs are complex.
  – Windows XP: ~45 million lines of code
  – Mathematica: over 1.5 million lines of code
• Abstraction helps
  – Many-to-one
  – “Forget the details”
  – Must separate “what” from “how”

Information Hiding
• Modularity
• Procedural abstraction
  – By specification
    • Locality
    • Modifiability
  – By parameterization
• Data abstraction
  – What you can do with the data is separated from how it is represented

Software development cycle
• Specifications – What do you want to do?
• Design – How will you do what you want?
• Implement – Code it.
• Test – Check if it works.
• Maintain – School projects don’t usually make it this far.
Bugs are cheaper to fix earlier in the cycle!

Database Implementation
• Database on library web-server stores information on users: userID, name, email, etc.
• You are responsible for implementing the interface between the web-server and database
  – What happens when we ask for the email address for a specific user?

Client asks for email address
[Diagram: Client, Server, Database. The client asks the server: “What is email address of nate?”]

Client/Server/Database Interaction
[Diagram: Client, Server, Database. The server tells the database: “I need Nate’s email.”]
The interaction between the server and database is your part.

Client/Server/Database Interaction
[Diagram: Client, Server, Database. The answer nate@virginia.edu is returned.]

Example: Database System
• Need a new data type
• Abstract Data Types (ADTs)
  – Help separate what from how
  – Client will use the specifications for interaction with data
  – Client of the web database should not know the “guts” of the implementation

Data abstraction in Java
• An ADT is defined by a class
  – The ADT in the web/database application will be a User
  – A private instance variable hides the class internals
  – public String getEmail();
• What is private in the implementation?
• OVERVIEW, EFFECTS, MODIFIES
  – A class does not provide data abstraction by itself

Accessibility

    class User {
        // OVERVIEW: A mutable object where the User
        // is a library member.
        public String email;
        …
    }

    /* Client code using a User object, myUser */
    String nateEmail = myUser.email;
    sendEmail(nateEmail);

/* The client’s code can only see what is made public in the User class. The user’s email data is public in the User class. This is BAD. */

Program Maintenance
• Suppose storage space is at a premium
  – Everyone in the database is userid@virginia.edu, so we can drop the virginia.edu
    nate@virginia.edu → nate
  – What kind of problems will occur with the code just seen?

Program Maintenance
• Suppose storage space is at a premium
  – Everyone in the database is userid@virginia.edu, so we can drop the virginia.edu
    nate@virginia.edu → nate
  – What kind of problems could occur had the client code been able to access the email address directly? Email was public in User class.

    String nateEmail = myUser.email;
    sendEmail(nateEmail);   // ***ERROR!!!***

Accessibility (fixed)

    class User {
        // OVERVIEW: A mutable object where
        // User is a library member.
        private String email;
        …
        public String getEmail() {
            // EFFECTS: returns user’s primary email
            return email;
        }
    }

    // Client code using a User object, myUser
    String nateEmail = myUser.getEmail();
    sendEmail(nateEmail);

/* This code properly uses data abstraction when returning the full email address. */

Accessibility (fixed)

    class User {
        // OVERVIEW: A mutable object where
        // User is a library member.
        private String email;
        …
        public String getEmail() {
            // EFFECTS: returns user’s primary email
            return email + "@virginia.edu";
        }
    }

    // Client code using a User object, myUser
    String nateEmail = myUser.getEmail();
    sendEmail(nateEmail);

/* The database dropped the @virginia.edu, and only one line of code needed changing. */
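The two versions of User above can be put together as a small runnable sketch. It assumes, as the slide does, that every address has the form userid@virginia.edu. The constructor, the DOMAIN constant, the UserDemo class and the body of sendEmail are hypothetical additions for illustration only.

```java
// Sketch of the User ADT after the storage change: the representation keeps
// only the part before the '@', while getEmail() still returns the full address.
class User {
    // Assumption: all users share this domain, as stated on the slide.
    private static final String DOMAIN = "@virginia.edu";

    // Representation: "nate", not "nate@virginia.edu".
    private String email;

    // Hypothetical constructor; the slides elide how a User is created.
    User(String userID) {
        this.email = userID;
    }

    public String getEmail() {
        // EFFECTS: returns user's primary email
        return email + DOMAIN;
    }
}

class UserDemo {
    // Hypothetical stand-in for the client's sendEmail call.
    static String sendEmail(String address) {
        return "Sending mail to " + address;
    }

    public static void main(String[] args) {
        User myUser = new User("nate");
        // Client code is identical before and after the representation change.
        String nateEmail = myUser.getEmail();
        System.out.println(sendEmail(nateEmail)); // prints "Sending mail to nate@virginia.edu"
    }
}
```

Because the client only ever calls getEmail(), switching the stored value from the full address to the bare user ID touches the User class and nothing else.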

Advantages/Disadvantages of Data Abstraction?
- More code to write and maintain initially
- Overhead of calling a method
- Greater initial time investment
+ Client doesn’t need to know about representation
+ Maintenance is easier.
+ Increases locality and modifiability

Specifying ADTs

Bad Users at the Library
• The library now wants to crack down on bad Users with overdue books, so the code will need to work with a group of Users.
• What should be used to represent the group? What data structures do we know about? How should we integrate this code with what we have?
• What operations should be supported?
  – deleteUser(String userID);
  – isInGroup(String userID);

Library keeping track of “bad” people
• You need to write some code that will manipulate a group of Users that are on the “bad” list.
• The implementation below uses an array

    class GroupUsers {
        // OVERVIEW: Operations provided to manage a mutable group of users
        private User[] latePeople;
        …
        public String toString() {
            // EFFECTS: Returns the user names as a string, ready to be
            // printed to standard output
            …
        }
    }

Array implementation initialization for GroupUsers

    class GroupUsers {
        // OVERVIEW: Unbounded, mutable group of Users
        private User[] latePeople;
        …
        // A constructor has no return type, so there is no "void" here.
        public GroupUsers(String[] userIDs) {
            // EFFECTS: Initializes the group from userIDs
            // Leave 10 spare slots for users added later.
            latePeople = new User[userIDs.length + 10];
            for (int i = 0; i < userIDs.length; i++) {
                latePeople[i] = new User(userIDs[i]);
            }
        }
    }

ADT design
• Mutable/Immutable ADTs
  – Mutable – object’s fields or values change
  – Immutable – object’s fields permanently set at creation
  – Is this being modified?
• Tradeoffs
  – Immutability is simpler and safer
  – Immutability can be slower (creation/deletion of objects)
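The difference between the two styles can be sketched as follows. MutableUser, ImmutableUser, withEmail and the address new@virginia.edu are hypothetical names chosen for this illustration; they do not come from the course code.

```java
// Mutable version: the same object changes over time.
class MutableUser {
    private String email;

    MutableUser(String email) {
        this.email = email;
    }

    public void setEmail(String email) {
        // MODIFIES: this
        this.email = email;
    }

    public String getEmail() {
        return email;
    }
}

// Immutable version: the field is final and set once, at creation.
final class ImmutableUser {
    private final String email;

    ImmutableUser(String email) {
        this.email = email;
    }

    public ImmutableUser withEmail(String newEmail) {
        // EFFECTS: returns a new user with newEmail; this is unchanged
        return new ImmutableUser(newEmail);
    }

    public String getEmail() {
        return email;
    }
}

class MutabilityDemo {
    public static void main(String[] args) {
        MutableUser m = new MutableUser("nate@virginia.edu");
        MutableUser alias = m;              // second reference to the same object
        m.setEmail("new@virginia.edu");
        System.out.println(alias.getEmail()); // the alias sees the change

        ImmutableUser a = new ImmutableUser("nate@virginia.edu");
        ImmutableUser b = a.withEmail("new@virginia.edu");
        System.out.println(a.getEmail());     // a is unchanged
        System.out.println(b.getEmail());     // b is a separate, new object
    }
}
```

The alias in the mutable case is why immutability is considered safer: no other holder of the reference can be surprised by a change. The extra object created by withEmail is the cost the slide refers to.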

Classification of ADT operations
• Creator (constructor)
  – GroupUsers(String[] userIDs)
• Producer
  – addUser(String userID)
• Mutator
  – setUserEmail(String email)
• Observer
  – isMember(String userID)
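A minimal sketch of the four categories on one class, assuming a group that stores plain user ID strings in a java.util.ArrayList (a simplification; the course code stores User objects). To keep the producer distinct from an in-place add, the producer here is given the hypothetical name withUser and returns a new group. An addUser that modifies this, as in the linked-list code later in the deck, would count as a mutator instead.

```java
import java.util.ArrayList;
import java.util.List;

class GroupUsers {
    private final List<String> userIDs = new ArrayList<>();

    // Creator: builds a group from scratch.
    GroupUsers(String[] userIDs) {
        for (String id : userIDs) {
            this.userIDs.add(id);
        }
    }

    // Producer (hypothetical name): returns a NEW group; this is unchanged.
    public GroupUsers withUser(String userID) {
        GroupUsers copy = new GroupUsers(userIDs.toArray(new String[0]));
        copy.userIDs.add(userID);
        return copy;
    }

    // Mutator: changes this group in place.
    public void deleteUser(String userID) {
        // MODIFIES: this
        userIDs.remove(userID);
    }

    // Observer: reports on the state without changing it.
    public boolean isMember(String userID) {
        return userIDs.contains(userID);
    }
}

class OperationsDemo {
    public static void main(String[] args) {
        GroupUsers group = new GroupUsers(new String[] {"nate"}); // creator
        GroupUsers bigger = group.withUser("alice");              // producer
        System.out.println(group.isMember("alice"));  // false: original untouched
        System.out.println(bigger.isMember("alice")); // true
        bigger.deleteUser("nate");                    // mutator
        System.out.println(bigger.isMember("nate"));  // false
    }
}
```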

Implementing ADTs

A bad implementation
• Most common characteristics
  – Modifying implementation forces other code to be changed (violates modifiability)
  – Must understand more code than necessary to reason about code (violates locality)
  – Maintenance is difficult

A good implementation
• The User class needed a way to store the state of a user, so operations are built around the stored state.
• Methods should (procedural abstraction):
  – Be as easy to code as possible
  – Be efficient
  – Exhibit locality
  – Enable better testing and maintenance

Changing the group implementation
• The “guts” of the implementation are subject to change.
• What happens in GroupUsers’ deleteUser(String userID)?

deleteUser(String userID)
• The array must shift down an average of n/2 items when deleting an element
[Diagram: array of <user> slots, with the deleted slot marked X]
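The shifting cost can be seen in a minimal sketch of the array version. The numUsers counter, the size() method and the stripped-down User class are assumptions added so the example is self-contained; the array slides do not show how the used portion of the array is tracked.

```java
// Minimal stand-in for the course's User class (assumption).
class User {
    private final String userID;

    User(String userID) {
        this.userID = userID;
    }

    public String getUserID() {
        return userID;
    }
}

class GroupUsers {
    private User[] latePeople;
    private int numUsers; // assumption: number of slots currently in use

    GroupUsers(String[] userIDs) {
        latePeople = new User[userIDs.length + 10];
        for (int i = 0; i < userIDs.length; i++) {
            latePeople[i] = new User(userIDs[i]);
        }
        numUsers = userIDs.length;
    }

    public void deleteUser(String userID) {
        // MODIFIES: this
        // EFFECTS: removes the first user whose ID equals userID, if any
        for (int i = 0; i < numUsers; i++) {
            if (latePeople[i].getUserID().equals(userID)) {
                // Shift every later element down one slot to close the gap.
                for (int j = i; j < numUsers - 1; j++) {
                    latePeople[j] = latePeople[j + 1];
                }
                latePeople[numUsers - 1] = null; // clear the now-unused last slot
                numUsers--;
                return;
            }
        }
    }

    public boolean isInGroup(String userID) {
        for (int i = 0; i < numUsers; i++) {
            if (latePeople[i].getUserID().equals(userID)) {
                return true;
            }
        }
        return false;
    }

    public int size() {
        return numUsers;
    }
}
```

Deleting the first element moves all remaining n - 1 elements and deleting the last moves none, which is where the average of about n/2 comes from.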

Linked Lists
A new data structure
Each User has its own representation, but we store the collection in a list. In the following implementation, each user object is contained in a Node object.
[Diagram: Head → User 1 → User 2 → User 3 → X]

List-node implementation

    class Node {
        // OVERVIEW: Mutable node used in a linked list of users
        private User theUser;
        private Node next;   // next points to the next “bad” user
        …
    }

[Diagram: latePeople → User 1 → User 2 → …]

List implementation

    class GroupUsers {
        // OVERVIEW: Mutable, unbounded group of users
        private Node latePeople;   /* head of list */
        private int numUsers;
        …
    }

/* Nodes are users with an additional member field called next. The Node class was added, so the User class would not need modification. */

Adding a user into GroupUsers

    /* in GroupUsers.java */
    public void addUser(User newUser) {
        // MODIFIES: this
        // EFFECTS: this_post = this_pre U { (Node)newUser }
        latePeople.add(new Node(newUser));
        numUsers++;
    }

Adding a node into a group of nodes (Node.java)

    public void add(Node n) {
        // MODIFIES: this
        // EFFECTS: n is inserted just after this in the list

        // first user in list?
        if (this.next == null) {
            this.next = n;
        } else {
            n.next = this.next;
            this.next = n;
        }
    }

deleteUser(String userID) cont.
[Diagram: the linked list Head → User 1 → User 2 → User 3 → X, shown before and after a deletion]

deleteUser(String userID)
Node.java

    public void delete(String userID) {
        // MODIFIES: this
        // EFFECTS: this_post = this_pre – node
        //          where node.userID = userID
        Node currNode;
        Node prevNode;

        if (this.next == null) return;
        prevNode = this;
        currNode = this.next;
        // continued on next slide

deleteUser(String userID) cont.

        while (currNode.next != null) {
            if (userID.equals(currNode.getUserID())) {
                // unlink currNode from the list
                prevNode.next = currNode.next;
                break;
            }
            currNode = currNode.next;
            prevNode = prevNode.next;
        }

        // user at end of list?
        if (currNode.next == null && userID.equals(currNode.getUserID())) {
            prevNode.next = null;
        }
    }
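For comparison, here is a self-contained sketch of a linked-list GroupUsers. It is a variant, not the slides' code: the slide version searches from the node after this, so the node it is called on is never removed, whereas this sketch keeps the search in GroupUsers, treats latePeople as a plain head reference that is null for an empty group, and inserts new users at the front. The minimal User class, isInGroup and size are assumptions added for the example.

```java
// Minimal stand-in for the course's User class (assumption).
class User {
    private final String userID;

    User(String userID) {
        this.userID = userID;
    }

    public String getUserID() {
        return userID;
    }
}

class Node {
    final User theUser;
    Node next; // next "bad" user, or null at the end of the list

    Node(User theUser) {
        this.theUser = theUser;
    }
}

class GroupUsers {
    private Node latePeople; // head of list; null when the group is empty
    private int numUsers;

    public void addUser(User newUser) {
        // MODIFIES: this
        // EFFECTS: adds newUser at the front of the list
        Node n = new Node(newUser);
        n.next = latePeople;
        latePeople = n;
        numUsers++;
    }

    public void deleteUser(String userID) {
        // MODIFIES: this
        // EFFECTS: removes the first node whose user has this ID, if any
        Node prev = null;
        Node curr = latePeople;
        while (curr != null) {
            if (userID.equals(curr.theUser.getUserID())) {
                if (prev == null) {
                    latePeople = curr.next; // deleting the head
                } else {
                    prev.next = curr.next;  // unlink a middle or last node
                }
                numUsers--;
                return;
            }
            prev = curr;
            curr = curr.next;
        }
    }

    public boolean isInGroup(String userID) {
        for (Node curr = latePeople; curr != null; curr = curr.next) {
            if (userID.equals(curr.theUser.getUserID())) {
                return true;
            }
        }
        return false;
    }

    public int size() {
        return numUsers;
    }
}
```

Unlike the array version, a deletion here rewires one reference and moves no other elements, although finding the node still requires walking the list.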

Linked List vs. Array
• Array is better for:
  – Accessing an arbitrary element by index (random access)
• Linked list is better at:
  – Inserting
  – Deleting
  – Dynamic resizing
• Users of your implementation may need to use a list or an array for efficiency, so you need an implementation that can be changed easily.
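One way to keep the representation easy to change is sketched below, assuming the group stores user ID strings and uses the standard java.util collections rather than the hand-written array and Node classes from the earlier slides. The concrete data structure is named on exactly one line.

```java
import java.util.LinkedList;
import java.util.List;

class GroupUsers {
    // Only this line names the concrete data structure. Replacing
    // LinkedList with java.util.ArrayList changes the performance
    // profile, not the behavior seen by clients.
    private final List<String> latePeople = new LinkedList<>();

    public void addUser(String userID) {
        // MODIFIES: this
        latePeople.add(userID);
    }

    public void deleteUser(String userID) {
        // MODIFIES: this
        latePeople.remove(userID);
    }

    public boolean isInGroup(String userID) {
        return latePeople.contains(userID);
    }
}

class SwapDemo {
    public static void main(String[] args) {
        GroupUsers group = new GroupUsers();
        group.addUser("nate");
        System.out.println(group.isInGroup("nate")); // true
        group.deleteUser("nate");
        System.out.println(group.isInGroup("nate")); // false
    }
}
```

Client code such as SwapDemo depends only on addUser, deleteUser and isInGroup, so it does not need to change when the field's implementation does.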

Questions?