COEN 171 Data Abstraction and OOP Data Abstraction

COEN 171 - Data Abstraction and OOP (11.1)
· Data Abstraction
  – Problems with subprogram abstraction
  – Encapsulation
  – Data abstraction
  – Language issues for ADTs
  – Examples
    » Ada
    » C++
    » Java
  – Parameterized ADTs

COEN 171 - Data Abstraction and OOP (11.2)
· Object-oriented programming
  – Components of object-oriented programming languages
  – Fundamental properties of the object-oriented model
  – Relation to data abstraction
  – Design issues for OOPL
  – Examples
    » Smalltalk 80
    » C++
    » Ada 95
    » Java
  – Comparisons
    » C++ and Smalltalk
    » C++ and Ada 95
    » C++ and Java
  – Implementation issues

Subprogram Problems (11.3)
· No way to selectively provide visibility for subprograms
· No convenient way to collect subprograms together to perform a set of services
· Program that uses subprogram (client program) must know details of all data structures used by subprogram
  – client can “work around” services provided by subprogram
  – hard to make client independent of implementation techniques for data structures
    » discourages reuse
· Difficult to build on and modify the services provided by subprogram
· Many languages don’t provide for separately compiled subprograms

Encapsulation (11.4)
· One solution
  – a grouping of logically related subprograms that can be separately compiled
  – called encapsulations
· Examples of encapsulation mechanisms
  – nested subprograms in some ALGOL-like languages
    » Pascal
  – FORTRAN 77 and C
    » files containing one or more subprograms can be independently compiled
  – FORTRAN 90, Modula-2, Modula-3, C++, Ada (and other contemporary languages)
    » separately compilable modules

Data Abstraction (11.5)
· A better solution than just encapsulation
· Can write programs that depend on abstract properties of a type, rather than implementation
· Informally, an Abstract Data Type (ADT) is a [collection of] data structures and operations on those data structures
  – example is floating point number
    » can define variables of that type
    » operations are predefined
    » representation is hidden and can’t be manipulated except through built-in operations
· ADT
  – isolates programs from the representation
  – maintains integrity of data structure by preventing direct manipulation

Data Abstraction (continued) (11.6)
· Formally, an ADT is a user-defined data type where
  – the representation of and operations on objects of the type are defined in a single syntactic unit; also, other units can create objects of the type
  – the representation of objects of the type is hidden from the program units that use these objects, so the only operations possible are those provided in the type's definition
· Advantages of first restriction are same as those for encapsulation
  – program organization
  – modifiability (everything associated with a data structure is together)
  – separate compilation

Data Abstraction (continued) (11.7)
· Advantage of second restriction is reliability
  – by hiding the data representations, user code cannot directly access objects of the type
  – user code cannot depend on the representation, allowing the representation to be changed without affecting user code
· By this definition, built-in types are ADTs
  – e.g., int type in C
    » the representation is hidden
    » operations are all built-in
    » user programs can define objects of int type
· User-defined abstract data types must have the same characteristics as built-in abstract data types

Data Abstraction (continued) (11.8)
· ADTs provide mechanisms to limit visibility
  – public part indicates what can be seen (and used) from outside
    » what is exported
  – private part describes what will be hidden from clients
    » made available to allow compiler to determine needed information
    » C++ allows specified program units access to the private information
      • friend functions and classes
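The public/private split and the friend mechanism can be sketched in C++ roughly as follows. This is only an illustration: the type `IntSet` and the names `insert`, `contains`, `size`, and `same_size` are hypothetical and do not come from the slides.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical ADT used only for illustration.
class IntSet {
public:                                // exported part: what clients may see and use
    void insert(int value) {
        if (!contains(value)) items_.push_back(value);
    }
    bool contains(int value) const {
        for (int item : items_)
            if (item == value) return true;
        return false;
    }
    std::size_t size() const { return items_.size(); }

    // A friend is not a member, yet it is granted access to the private part.
    friend bool same_size(const IntSet& a, const IntSet& b);

private:                               // hidden part: the representation
    std::vector<int> items_;
};

bool same_size(const IntSet& a, const IntSet& b) {
    return a.items_.size() == b.items_.size();   // reads private data directly
}
```

Because clients can reach `items_` only through the public operations, the vector could later be swapped for another representation without changing client code.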

Language Issues for ADTs (11.9)
· Language requirements for data abstraction
  – a syntactic unit in which to encapsulate the type definition
  – a method of making type names and subprogram headers visible to clients, while hiding actual definitions
    » public/private
  – some primitive operations must be built into the language processor (usually just assignment and comparisons for equality and inequality)
    » some operations are commonly needed, but must be defined by the type designer
    » e.g., iterators, constructors, destructors
· Can put ADTs in PL
  – as a type definition extended to include operations (C++)
    » use directly to declare variables
  – as a collection of objects and operations (Ada)
    » may need to be instantiated before declaring variables

Language Issues for ADTs (continued) (11.10)
· Language design issues
  – encapsulate a single type, or something more?
  – what types can be abstract?
  – can abstract types be parameterized?
  – how are imported types and operations qualified?
· Simula-67 was first language to address this issue
  – classes provided encapsulation, but no information hiding

Data Abstraction in Ada (11.11)
· Abstraction mechanism is the package
· Each package has two pieces (can be in same or separate files)
  – specification
    » public part
    » private part
  – body
    » implementation of all operations exported in public part
    » may include other procedures, functions, type and variable declarations, which are hidden from clients
      • all variables are static
    » may provide initialization section
      • executed when declaration involving package is elaborated
· Any type can be exported
· Operations on exported types may be restricted
  – private (:=, =, /=, plus operations exported)
  – limited private (only operations exported)

Data Abstraction in Ada (continued) (11.12)
· Evaluation
  – exporting any type as private is good
    » cost is recompilation of clients when the representation is changed
  – can’t import specific entities from other packages
  – good facilities for separate compilation

Data Abstraction in C++ (11.13)
· Based on C struct type and Simula 67 classes
· Class is the encapsulation device
  – all instances of a class share a single copy of the member functions
  – each instance of a class has its own copy of the class data members
  – instances can be static, semidynamic, or explicit dynamic
· Information hiding
  – private clause for hidden entities
  – public clause for interface entities
  – protected clause for inheritance

Data Abstraction in C++ (continued) (11.14)
· Constructors
  – functions to initialize the data members of instances
  – may also allocate storage if part of the object is heap-dynamic
  – can include parameters to provide parameterization of the objects
  – implicitly called when an instance is created
    » can be explicitly called
  – name is the same as the class name
· Destructors
  – functions to clean up after an instance is destroyed; usually just to reclaim heap storage
  – implicitly called when the object’s lifetime ends
    » can be explicitly called
  – name is the class name, preceded by a tilde (~)
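A minimal sketch of these constructor and destructor rules, assuming a hypothetical class `Buffer` with a `live_count` counter added purely so that the implicit calls can be observed; neither name appears in the slides.

```cpp
// Hypothetical class used only for illustration.
class Buffer {
public:
    static int live_count;              // number of Buffer objects currently alive

    explicit Buffer(int size)           // constructor: same name as the class
        : data_(new int[size]), size_(size) {   // allocates the heap-dynamic part
        for (int i = 0; i < size_; ++i) data_[i] = 0;
        ++live_count;
    }
    ~Buffer() {                         // destructor: class name preceded by ~
        delete[] data_;                 // reclaims the heap storage
        --live_count;
    }
    Buffer(const Buffer&) = delete;     // copying disabled to keep the sketch short
    Buffer& operator=(const Buffer&) = delete;

    int size() const { return size_; }

private:
    int* data_;
    int size_;
};

int Buffer::live_count = 0;

// Returns the live count observed while a local Buffer exists.
int count_inside_scope() {
    Buffer local(16);                   // constructor is called implicitly here
    return Buffer::live_count;
}                                       // destructor is called implicitly here
```

For an object created with `new`, the destructor runs when `delete` is applied to the pointer rather than at the end of a scope.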

Data Abstraction in C++ (continued) (11.15)
· Friend functions
  – grant some unrelated units or functions access to private members
· Evaluation
  – classes are similar to Ada packages for providing abstract data types
  – difference is packages are encapsulations, whereas classes are types

Data Abstraction in Java (11.16)
· Similar to C++ except
  – all user-defined types are classes
  – all objects are allocated from the heap and accessed through reference variables
  – individual entities in classes (methods and variables) have access control modifiers (public or private), rather than C++ clauses
  – functions can only be defined in classes
  – Java has a second scoping mechanism, package scope, that is used instead of friends
    » all entities in all classes in a package that don’t have access control modifiers are visible throughout the package

Parameterized ADTs (11.17)
· Ada generic packages may be parameterized with
  – type of element stored in data structure
  – operators among those elements
· Must be instantiated before declaring variables
  – instantiation of generic behaves like text substitution
  – package BST_Integer is new binary_search_tree(INTEGER);
    » like text of generic package substituted here, with parameters substituted
  – EXCEPT references to non-local variables, etc. are resolved as if they occurred at the point where the generic was declared
· With multiple instantiations, need to disambiguate when declaring exported types
  – package BST_Real is new binary_search_tree(REAL);
  – tree1 : BST_Integer.bst;
  – tree2 : BST_Real.bst;

Parameterized ADTs (continued) (11.18)
· C++
  – classes can be somewhat generic by writing parameterized constructor functions
      stack (int size) {
        stk_ptr = new int [size];
        max_len = size - 1;
        top = -1;
      }
      stack stk(100);
  – class itself may be parameterized as a templated class
· Java
  – did not support generic abstract data types until generics were added in Java 5.0
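The second C++ option, a templated class, might look like the sketch below. The data members reuse the slide's `stk_ptr`, `max_len`, and `top`; the class is spelled `Stack` to avoid confusion with `std::stack`, and the operations `push`, `pop`, and `empty` are assumptions added for illustration.

```cpp
#include <stdexcept>

// Sketch of a stack parameterized by element type (template parameter T)
// and by capacity (constructor parameter size).
template <typename T>
class Stack {
public:
    explicit Stack(int size)
        : stk_ptr(new T[size]), max_len(size - 1), top(-1) {}
    ~Stack() { delete[] stk_ptr; }
    Stack(const Stack&) = delete;            // copying disabled to keep the sketch short
    Stack& operator=(const Stack&) = delete;

    void push(const T& value) {
        if (top == max_len) throw std::overflow_error("stack is full");
        stk_ptr[++top] = value;
    }
    T pop() {
        if (top == -1) throw std::underflow_error("stack is empty");
        return stk_ptr[top--];
    }
    bool empty() const { return top == -1; }

private:
    T* stk_ptr;    // heap-dynamic storage for the elements
    int max_len;   // index of the last usable slot
    int top;       // index of the current top element, -1 when empty
};
```

A client would then write `Stack<int> stk(100);` or `Stack<double> readings(10);`, with the compiler generating a separate class for each element type.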

Object-Oriented Programming (11.19)
· The problem with Abstract Data Types is that they are static
  – can’t modify types or operations
    » except for generics/templates
  – means extra work to modify existing ADTs
· Object-oriented programming (OOP) languages extend data abstraction ideas to
  – allow hierarchies of abstractions
  – make modifying existing abstractions for other uses very easy
· Leads to new approach to programming
  – identify real world objects of problem domain and processing required of them
  – create simulations of those objects, processes, and the communication between them by modifying existing objects whenever possible

Object-Oriented Programming (continued) (11.20)
· Two approaches to designing OOPL
  – start from scratch (Smalltalk 1972!!)
    » allows cleaner design
    » better integration of object features
    » no installed base
  – modify an existing PL (C++, Ada 95)
    » can build on body of existing code
    » OO features usually not as smoothly integrated
    » backward compatibility carries over warts from the initial language design

OOPL Components (11.21)
· Object: encapsulated operations plus local variables that define an object’s state
  – state is retained between executions
  – objects send and receive messages
· Messages: requests from sender to receiver to perform work
  – can be parameterized
  – in pure OOL are also objects
  – return results
· Methods: descriptions of operations to be done when a message is received
· Classes: templates for objects, with methods and state variables
  – objects are instantiations of classes
  – classes are also objects (have instantiation method to create new objects)

Fundamental Properties of OO Model (11.22)
· Abstract Data Types
  – encapsulation into a single syntactic unit that includes operations and variables
  – also information hiding capabilities
· Inheritance
  – fundamental defining characteristic of OOPL
  – classes are hierarchical
    » subclass/superclass or parent/derived
    » classes lower in the structure inherit variables and methods of ancestor classes
    » can redefine those, add new ones, or eliminate some
  – single inheritance (tree structure) or multiple inheritance (acyclic graph)
    » with single inheritance can talk about a root class

Fundamental Properties of OO Model (continued) (11.23)
· Polymorphism
  – special kind of dynamic binding
    » message to method
  – same message can be sent to different objects, and the object will respond properly
  – similar to function overloading except
    » overloading is static (known at compile time)
    » polymorphism is dynamic (class of object known at run time)
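The contrast between overloading and polymorphism can be illustrated in C++ with the small sketch below. The classes `Shape` and `Circle` and the functions `name` and `describe` are hypothetical names chosen for the example.

```cpp
#include <string>

struct Shape {
    virtual ~Shape() = default;
    // Virtual member: the method run depends on the object's class at run time.
    virtual std::string name() const { return "shape"; }
};

struct Circle : Shape {
    std::string name() const override { return "circle"; }
};

// Overloaded functions: the compiler picks one from the static (declared)
// type of the argument, before the program runs.
std::string describe(const Shape&)  { return "overload for Shape"; }
std::string describe(const Circle&) { return "overload for Circle"; }
```

Given `Circle c; const Shape& s = c;`, the call `describe(s)` selects the `Shape` overload because only the declared type of `s` is consulted, while `s.name()` runs the `Circle` method because the binding is deferred to run time.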

Comparison with Data Abstraction (11.24)
· Class == generic package
· Object == instantiation of generic
  – actually, closer to instance of exported type
· Messages == calls to operations exported by ADT
· Methods == bodies (code) for operations exported by ADT
· EXCEPT
  – data abstraction mechanism allows only one level of generic/instantiation
  – OO model allows multiple levels of inheritance
  – no dynamic binding of method invocation in ADTs

OOP Language Design Issues (11.25)
· Exclusivity of objects
  – everything is an object
    » elegant and pure, but slow for primitive types
  – add objects to complete typing system
    » fast for primitive types, but confusing
  – include an imperative style typing system for primitive types, but everything else is an object
    » relatively fast, and less confusion
· Are subclasses subtypes?
  – does an “is a” relationship hold between parent and child classes?

OOP Language Design Issues (continued) (11.26)
· Interface or implementation inheritance?
  – if only interface of parent class is visible to subclass, interface inheritance
    » may be inefficient
  – if interface and implementation visible to subclass, implementation inheritance
· Type checking and polymorphism
  – if overriding methods must have the same parameter types and return type, checking may be static
  – otherwise need dynamic type checking, which is slow and delays error detection
· Single or multiple inheritance
  – multiple is extremely convenient
  – multiple also makes the language and implementation more complex, and is less efficient

OOP Language Design Issues (continued) (11.27)
· Allocation and deallocation of objects
  – if all objects are allocated from heap, references to them are uniform (as in Java)
  – is deallocation explicit (heap-dynamic objects in C++) or implicit (Java)?
· Should all binding of messages to methods be dynamic?
  – if yes, inefficient
  – if none are, great loss of flexibility

Smalltalk 80 (11.28)
· Smalltalk is the prototypical pure OOPL
· All entities in a program are objects
  – referenced by pointers
· All computation is done by sending messages (perhaps parameterized by object names) to objects
  – message invokes a method
  – reply returns result to sender, or notifies that action has been done
· Also incorporates graphical programming environment
  – program editor
  – compiler
  – class library browser
    » with associated classes
  – also written in Smalltalk
    » can be modified

Smalltalk 80 (continued) (11.29)
· Messages
  – object to receive message
  – message
    » method to invoke
    » possibly parameters
· Unary messages
  – specify only object and method
  – firstAngle sin
    » invokes sin method of firstAngle object
· Binary messages
  – infix order
  – total / 100
    » sends message / 100 to object total
    » which invokes / method of total with parameter 100

Smalltalk 80 (continued) (11.30)
· Keyword messages
  – indicate parameter values by specifying keywords
  – keywords also identify the method
  – firstArray at: 1 put: 5
    » invokes at:put: method of firstArray with parameters 1 and 5
· Message expressions
  – messages may be combined in expressions
    » unary have highest precedence, then binary, then keyword
    » associate left to right
    » order may be specified by parentheses
  – messages may be cascaded
    » ourPen home; up; goto: 500@500
    » equivalent to
        ourPen home.
        ourPen up.
        ourPen goto: 500@500

Smalltalk 80 (continued) (11.31)
· Assignment
  – object <- object
  – index <- index + 5
· Blocks
  – unnamed objects specified by [ <expressions> ]
    » expressions are separated by periods (.)
  – evaluated when they are sent the value message
    » always in the context of their definition
  – may be assigned to variables
    » foo <- [ ... ]
· Logical loops
  – blocks may contain conditions
  – all blocks have whileTrue: methods
    » sends value to condition block
    » evaluates body block if result is true
        [ <logical condition> ] whileTrue: [ <body of loop> ]

Smalltalk 80 (continued) (11.32)
· Iterative loops
  – all integer objects have a timesRepeat: method
        12 timesRepeat: [ ... ]
  – also have
    » to:do:
    » to:by:do:
  – a block is the loop body
        6 to: 10 do: [ ... ]
· Selection
  – true and false are also objects
  – each has ifTrue:, ifFalse:, ifTrue:ifFalse:, and ifFalse:ifTrue: methods
        total = 0            "returns true or false object"
          ifTrue: [ ... ]    "true object executes this; false ignores"
          ifFalse: [ ... ]   "false object executes this; true ignores"

Smalltalk 80 (continued) (11.33)
· Dynamic binding
  – when a message arrives at an object, the class of which the object is an instance is searched for a corresponding method
  – if not there, search superclass, etc.
· Only single inheritance
  – every class is an offspring of the root class Object
· Evaluation
  – simple, consistent syntax
  – relatively slow
    » message passing overhead for all control constructs
    » dynamic binding of message to method
  – dynamic binding allows type errors to be detected only at runtime

C++ (11.34)
· Essentially all variable declarations, types, and control structures are those of C
· C++ classes represent an addition to type structure of C
· Inheritance
  – multiple inheritance allowed
  – classes may be stand-alone
  – three information hiding modes
    » public: everyone may access
    » private: no one else may access
    » protected: class and subclasses may access
  – when deriving a class from a base class, specify a protection mode
    » public mode: public and protected members keep their access level in the subclass; private members stay inaccessible
    » private mode: everything inherited from the base class is private in the subclass
      • may reexport public members of base class
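The two derivation modes can be compared with a short sketch. All class and member names here (`Base`, `PublicChild`, `PrivateChild`, `pub`, `prot`, `priv`, `sum`) are hypothetical, and the `using` declaration is one way to express the re-export the slide mentions.

```cpp
#include <type_traits>

class Base {
public:
    int pub() const { return 1; }
protected:
    int prot() const { return 2; }
private:
    int priv() const { return 3; }      // never accessible in derived classes
};

class PublicChild : public Base {       // public mode: access levels are retained
public:
    int sum() const { return pub() + prot(); }   // priv() would not compile here
};

class PrivateChild : private Base {     // private mode: inherited members become private
public:
    using Base::pub;                    // re-export one public member of the base
    int sum() const { return pub() + prot(); }
};

// Outside code may treat a PublicChild as a Base, but not a PrivateChild.
static_assert(std::is_convertible<PublicChild*, Base*>::value,
              "public derivation keeps the is-a relationship visible");
static_assert(!std::is_convertible<PrivateChild*, Base*>::value,
              "private derivation hides the base class from clients");
```

The second assertion bears on the question of whether subclasses are subtypes: under private derivation, clients cannot use the derived class where the base class is expected.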

C++ (continued) (11.35)
· Dynamic binding
  – C++ member functions are statically bound unless the function definition is identified as virtual
  – if virtual function name is called with a pointer or reference variable with the base class type, which member function to execute must be determined at run-time
  – pure virtual functions are set to 0 in class header
    » must be redefined in derived classes
  – classes containing a pure virtual function can never be instantiated directly
    » must be derived
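As a rough illustration of these binding rules, the sketch below pairs a non-virtual member with a pure virtual one. The classes `Account` and `Savings` and their members are hypothetical names, not taken from the slides.

```cpp
#include <string>
#include <type_traits>

class Account {
public:
    virtual ~Account() = default;
    // Non-virtual: bound statically from the declared type of the pointer.
    std::string kind() const { return "account"; }
    // Pure virtual: "set to 0" in the class header, no body here.
    virtual double fee() const = 0;
};

class Savings : public Account {
public:
    std::string kind() const { return "savings"; }   // hides kind(), does not override it
    double fee() const override { return 1.5; }      // required redefinition
};

// A class with a pure virtual function cannot be instantiated directly.
static_assert(std::is_abstract<Account>::value, "Account is abstract");
static_assert(!std::is_abstract<Savings>::value, "Savings is concrete");
```

Through an `Account*` that points at a `Savings` object, `kind()` still runs the `Account` version, whereas `fee()` is resolved at run time and runs the `Savings` version.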

Java (11.36)
· General characteristics
  – all data are objects except the primitive types
  – all primitive types have wrapper classes that store one data value
  – all objects are heap-dynamic, referenced through reference variables, and most are explicitly allocated
· Inheritance
  – single inheritance only
    » but implementing interface can provide some of the benefits of multiple inheritance
    » an interface can include only method declarations and named constants
        public class Clock extends Applet implements Runnable
  – methods can be final (can’t be overridden)

Java (continued) (11.37)
· Dynamic binding is the default
  – except for final methods
· Package provides additional encapsulation mechanism
  – packages are a container for related classes
  – entities defined without an access modifier (private, protected, public) have package scope
    » visible throughout package but not outside
  – similarly, protected entities are visible throughout package

Ada 95 (11.38)
· Type extension builds on derived types with tagged types
  – tag associated with type identifies particular type
· Classes are packages with tagged types
      package Object_Package is
        type Object is tagged private;
        procedure Draw (O: in out Object);
      private
        type Object is tagged record
          X_Coord, Y_Coord: Real;
        end record;
      end Object_Package;

Ada 95 (continued) (11.39)
· Then may derive a new class by using new reserved word and modifying tagged type exported
· Overloading defines new methods
      with Object_Package; use Object_Package;
      package Circle_Package is
        type Circle is new Object with record
          radius: Real;
        end record;
        procedure Draw (C: in out Circle);
      end Circle_Package;

Ada 95 (continued) (11.40)
· Derived packages form tree of classes
· Can refer to type and all types beneath it in tree by type'class
  – Object'class
  – Square'class
· Then use these as parameters to procedures to provide dynamic binding of procedure invocation
      procedure foo (OC: Object'class) is
      begin
        Area(OC);  -- which Area is determined at run time
      end foo;
[Figure: class tree with nodes Object, Circle, Square, Rectangle, Ellipse]

Ada 95 (continued) (11.41)
· Pure abstract base types are defined using the word abstract in type and subprogram definitions
      package World is
        type Thing is abstract tagged null record;
        function Area(T: in Thing) return Real is abstract;
      end World;
      with World; use World;
      package My_World is
        type Object is new Thing with record ... end record;
        function Area(O: in Object) return Real is ... end Area;
        type Circle is new Object with record ... end record;
        function Area(C: in Circle) return Real is ... end Area;
      end My_World;

Comparing C++ and Smalltalk (11.42)
· Inheritance
  – C++ provides greater flexibility of access control
  – C++ provides multiple inheritance
    » good or bad?
· Dynamic vs. static binding
  – Smalltalk full dynamic binding with great flexibility
  – C++ allows programmer to control binding time
    » virtual functions, which all must return same type
· Control
  – Smalltalk does everything through message passing
  – C++ provides conventional control structures

Comparing C++ and Smalltalk (continued) (11.43)
· Classes as types
  – C++ classes are types
    » all instances of a class are the same type, and one can legally access the instance variables of another
  – Smalltalk classes are not types, and the language is essentially typeless
  – C++ provides static type checking, Smalltalk does not
· Efficiency
  – C++ substantially more efficient with run-time CPU and memory requirements
· Elegance
  – Smalltalk is consistent, fundamentally object-oriented
  – C++ is a hybrid language in which compatibility with C was an essential design consideration

Comparing C++ and Ada 95 (11.44)
· Ada 95 has more consistent type mechanism
  – C++ has C type structure, plus classes
· C++ provides cleaner multiple inheritance
· C++ must make dynamic/static function invocation decision at time root class is defined
  – must be virtual function
  – Ada 95 allows that decision to be made at time derived class is defined
· C++ allows dynamic binding only for pointers and reference types
· Ada 95 doesn’t provide constructor and destructor functions
  – initialization and cleanup subprograms must be explicitly invoked

Comparing C++ and Java (11.45)
· Java more consistent with OO model
  – all classes must descend from Object
· No friend mechanism in Java
  – packages provide cleaner alternative
· In Java, dynamic binding is the “normal” way of binding messages to methods
· Java allows single inheritance only
  – but interfaces provide some of the same capability as multiple inheritance

Implementing OO Constructs (11.46)
· Store state of an object in a class instance record (CIR)
  – template known at compile time
  – access instance variables by offset
  – subclass instantiates CIR from parent before populating local instance variables
· CIR also provides a mechanism for accessing code for dynamically bound methods
  – CIR points to table (virtual method table) which contains pointers to code for each dynamically bound method
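The CIR and virtual method table can be imitated by hand with plain structs and function pointers. This is only a sketch of the idea, not how any particular compiler lays out objects, and every name in it (`PointCIR`, `RectCIR`, `VTable`, `call_area`, and so on) is hypothetical.

```cpp
#include <cstddef>

struct PointCIR;                               // forward declaration

// Virtual method table: one slot per dynamically bound method.
struct VTable {
    int (*area)(const PointCIR*);              // slot 0
    int (*perimeter)(const PointCIR*);         // slot 1
};

struct PointCIR {        // parent CIR: table pointer, then instance variables
    const VTable* vtable;
    int x;
    int y;
};

struct RectCIR {         // subclass CIR: parent's layout first, then its own variables
    PointCIR parent;
    int width;
    int height;
};

// The parent part sits at offset 0, so a RectCIR can be viewed as a PointCIR.
static_assert(offsetof(RectCIR, parent) == 0, "parent part comes first");

int point_area(const PointCIR*) { return 0; }
int point_perimeter(const PointCIR*) { return 0; }

int rect_area(const PointCIR* self) {
    const RectCIR* r = reinterpret_cast<const RectCIR*>(self);
    return r->width * r->height;
}
int rect_perimeter(const PointCIR* self) {
    const RectCIR* r = reinterpret_cast<const RectCIR*>(self);
    return 2 * (r->width + r->height);
}

const VTable point_vtable = {point_area, point_perimeter};
const VTable rect_vtable  = {rect_area, rect_perimeter};

PointCIR make_point(int x, int y) { return PointCIR{&point_vtable, x, y}; }
RectCIR make_rect(int x, int y, int w, int h) {
    return RectCIR{PointCIR{&rect_vtable, x, y}, w, h};   // parent part filled first
}

// Dynamic dispatch: follow the CIR's table pointer, then call through the slot.
int call_area(const PointCIR* obj) { return obj->vtable->area(obj); }
int call_perimeter(const PointCIR* obj) { return obj->vtable->perimeter(obj); }
```

The caller of `call_area` never names `rect_area`; the choice is made at run time by whichever table the object's CIR points to.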