Chapter 26 Object-Oriented DBMSs – Concepts and Design Transparencies
© Pearson Education Limited 1995, 2005

Chapter 26 - Objectives
• Framework for an OODM.
• Basics of the FDM.
• Basics of persistent programming languages.
• Main points of OODBMS Manifesto.
• Main strategies for developing an OODBMS.
• Single-level v. two-level storage models.
• Pointer swizzling.
• How an OODBMS accesses records.
• Persistent schemes.
• Advantages and disadvantages of orthogonal persistence.
• Issues underlying OODBMSs.
• Advantages and disadvantages of OODBMSs.

Object-Oriented Data Model
• There is no single agreed object data model. One set of definitions:
  – Object-Oriented Data Model (OODM): data model that captures semantics of objects supported in object-oriented programming.
  – Object-Oriented Database (OODB): persistent and sharable collection of objects defined by an OODM.
  – Object-Oriented DBMS (OODBMS): manager of an OODB.
• Zdonik and Maier present a threshold model that an OODBMS must, at a minimum, satisfy:
  – It must provide database functionality.
  – It must support object identity.
  – It must provide encapsulation.
  – It must support objects with complex state.
• Khoshafian and Abnous define OODBMS as:
  – OO = ADTs + Inheritance + Object identity
  – OODBMS = OO + Database capabilities.
• Parsaye et al. give:
  – (1) High-level query language with query optimization.
  – (2) Support for persistence, atomic transactions, and concurrency and recovery control.
  – (3) Support for complex object storage, indexes, and access methods.
  – OODBMS = OO system + (1), (2), and (3).

Commercial OODBMSs
• GemStone from Gemstone Systems Inc.
• Objectivity/DB from Objectivity Inc.
• ObjectStore from Progress Software Corp.
• Ontos from Ontos Inc.
• FastObjects from Poet Software Corp.
• Jasmine from Computer Associates/Fujitsu.
• Versant from Versant Corp.

[Figure: Origins of the Object-Oriented Data Model]

Functional Data Model (FDM)
• Interesting because it shares certain ideas with the object approach, including object identity, inheritance, overloading, and navigational access.
• In FDM, any data retrieval task can be viewed as the process of evaluating and returning the result of a function with zero, one, or more arguments.
• Resulting data model is conceptually simple but very expressive.
• In the FDM, the main modeling primitives are entities and functional relationships.

FDM - Entities
• Decomposed into (abstract) entity types and printable entity types.
• Entity types correspond to classes of ‘real world’ objects and are declared as functions with 0 arguments that return type ENTITY.
• For example:
    Staff() → ENTITY
    PropertyForRent() → ENTITY

FDM – Printable Entity Types and Attributes
• Printable entity types are analogous to base types in a programming language.
• Include: INTEGER, CHARACTER, STRING, REAL, and DATE.
• An attribute is a functional relationship, taking the entity type as an argument and returning a printable entity type.
• For example:
    staffNo(Staff) → STRING
    sex(Staff) → CHAR
    salary(Staff) → REAL

FDM – Composite Attributes
    Name() → ENTITY
    Name(Staff) → NAME
    fName(Name) → STRING
    lName(Name) → STRING

FDM – Relationships
• Functions with arguments also model relationships between entity types.
• Thus, FDM makes no distinction between attributes and relationships.
• Each relationship may have an inverse relationship defined.
• For example (↠ denotes a multi-valued function):
    Manages(Staff) ↠ PropertyForRent
    ManagedBy(PropertyForRent) → Staff INVERSE OF Manages
• Can also model *:* relationships:
    Views(Client) ↠ PropertyForRent
    ViewedBy(PropertyForRent) ↠ Client INVERSE OF Views
  and attributes on relationships:
    viewDate(Client, PropertyForRent) → DATE

FDM – Inheritance and Path Expressions
• Inheritance supported through entity types.
• Principle of substitutability also supported.
    Staff() → ENTITY
    Supervisor() → ENTITY
    IS-A-STAFF(Supervisor) → Staff
• Derived functions can be defined from composition of multiple functions (note overloading):
    fName(Staff) → fName(Name(Staff))
    fName(Supervisor) → fName(IS-A-STAFF(Supervisor))
• Composition is a path expression (cf. dot notation):
    Supervisor.IS-A-STAFF.Name.fName

[Figure: FDM – Declaration of FDM Schema]

[Figure: FDM – Diagrammatic Representation of Schema]

FDM – Functional Query Languages
• Path expressions also used within a functional query.
• For example:
    RETRIEVE lName(Name(ViewedBy(Manages(Staff))))
    WHERE staffNo(Staff) = ‘SG14’
• or in dot notation:
    RETRIEVE Staff.Manages.ViewedBy.Name.lName
    WHERE Staff.staffNo = ‘SG14’
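
The nested query above reads as ordinary function composition. The following is a minimal Python sketch of that reading, not FDM syntax: each function is a dictionary, and every entity identifier and client name is made-up sample data (only the staff number SG14 comes from the slides). The helper names `apply_multi` and `retrieve_last_names` are hypothetical.

```python
# Each FDM function is modelled as a dictionary from argument to result.
# All entity identifiers and names below are made-up sample data;
# only the staff number 'SG14' appears in the slides.
staff_no = {"s1": "SG14", "s2": "SG37"}                    # staffNo(Staff) -> STRING
manages = {"s1": ["p1", "p2"], "s2": ["p3"]}               # Manages(Staff), multi-valued
viewed_by = {"p1": ["c1"], "p2": ["c1", "c2"], "p3": []}   # ViewedBy(PropertyForRent), multi-valued
name = {"c1": "n1", "c2": "n2"}                            # Name(Client) -> NAME
l_name = {"n1": "Kay", "n2": "Ritchie"}                    # lName(Name) -> STRING


def apply_multi(function, entities):
    """Apply a multi-valued function to each entity, dropping duplicates."""
    results = []
    for entity in entities:
        for value in function[entity]:
            if value not in results:
                results.append(value)
    return results


def retrieve_last_names(target_staff_no):
    """Evaluate lName(Name(ViewedBy(Manages(Staff)))) for the matching staff."""
    staff = [s for s, number in staff_no.items() if number == target_staff_no]
    properties = apply_multi(manages, staff)
    clients = apply_multi(viewed_by, properties)
    return [l_name[name[c]] for c in clients]


last_names = retrieve_last_names("SG14")
```

Each step of the path expression becomes one function application, which is why the nested form and the dot notation describe the same traversal.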

FDM – Advantages
• Support for some object-oriented concepts.
• Support for referential integrity.
• Irreducibility.
• Easy extensibility.
• Suitability for schema integration.
• Declarative query language.

Persistent Programming Languages (PPLs)
Language that provides users with ability to (transparently) preserve data across successive executions of a program, and even allows such data to be used by many different programs.
• In contrast, a database programming language (e.g. SQL) differs by its incorporation of features beyond persistence, such as transaction management, concurrency control, and recovery.
• PPLs eliminate impedance mismatch by extending programming language with database capabilities.
  – In a PPL, the language’s type system provides the data model, containing rich structuring mechanisms.
• In some PPLs procedures are ‘first class’ objects and are treated like any other object in the language.
  – Procedures are assignable, may be result of expressions, other procedures or blocks, and may be elements of constructor types.
  – Procedures can be used to implement ADTs.
• PPL also maintains same data representation in memory as in persistent store.
  – Overcomes difficulty and overhead of mapping between the two representations.
• Addition of (transparent) persistence into a PPL is an important enhancement to an IDE, and integration of the two paradigms provides more functionality and semantics.

OODBMS Manifesto
• Complex objects must be supported.
• Object identity must be supported.
• Encapsulation must be supported.
• Types or Classes must be able to inherit from their ancestors.
• Dynamic binding must be supported.
• The DML must be computationally complete.
• The set of data types must be extensible.
• Data persistence must be provided.
• The DBMS must be capable of managing very large databases.
• The DBMS must support concurrent users.
• DBMS must be able to recover from hardware/software failures.
• DBMS must provide a simple way of querying data.
• The manifesto proposes the following optional features:
  – Multiple inheritance, type checking and type inferencing, distribution across a network, design transactions and versions.
• No direct mention of support for security, integrity, views or even a declarative query language.

Alternative Strategies for Developing an OODBMS
• Extend existing object-oriented programming language.
  – GemStone extended Smalltalk.
• Provide extensible OODBMS library.
  – Approach taken by Ontos, Versant, and ObjectStore.
• Embed OODB language constructs in a conventional host language.
  – Approach taken by O2, which has extensions for C.
• Extend existing database language with object-oriented capabilities.
  – Approach being pursued by RDBMS and OODBMS vendors.
  – Ontos and Versant provide a version of OSQL.
• Develop a novel database data model/language.

Single-Level v. Two-Level Storage Model
• Traditional programming languages lack built-in support for many database features.
• Increasing number of applications now require functionality from both database systems and programming languages.
• Such applications need to store and retrieve large amounts of shared, structured data.
• With a traditional DBMS, programmer has to:
  – Decide when to read and update objects.
  – Write code to translate between application’s object model and the data model of the DBMS.
  – Perform additional type-checking when object is read back from database, to guarantee object will conform to its original type.
• Difficulties occur because conventional DBMSs have a two-level storage model: storage model in memory, and database storage model on disk.
• In contrast, OODBMS gives illusion of a single-level storage model, with similar representation both in memory and in database stored on disk.
  – Requires clever management of representation of objects in memory and on disk (called “pointer swizzling”).
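
To make the programmer's burden under a two-level storage model concrete, here is a small sketch using Python's standard `sqlite3` module with an in-memory database. The `Staff` class, the `staff` table, and the salary figure are illustrative assumptions, not part of the slides. The point is that `save` and `load` are translation code the application has to write, and `load` re-checks types on the way back.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class Staff:
    """Application-side object (hypothetical example class)."""
    staff_no: str
    salary: float


def save(conn, staff):
    # Translate the object into a row: the programmer decides when to write.
    conn.execute(
        "INSERT INTO staff (staff_no, salary) VALUES (?, ?)",
        (staff.staff_no, staff.salary),
    )


def load(conn, staff_no):
    # Translate a row back into an object, re-checking the types by hand.
    row = conn.execute(
        "SELECT staff_no, salary FROM staff WHERE staff_no = ?", (staff_no,)
    ).fetchone()
    if row is None:
        return None
    number, salary = row
    if not isinstance(number, str) or not isinstance(salary, float):
        raise TypeError("row does not match the Staff type")
    return Staff(number, salary)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE staff (staff_no TEXT PRIMARY KEY, salary REAL)")
save(conn, Staff("SG14", 18000.0))
loaded = load(conn, "SG14")
```

With a single-level model the application would simply follow a reference to the `Staff` object; neither `save` nor `load` would appear in application code.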

[Figure: Two-Level Storage Model for RDBMS]

[Figure: Single-Level Storage Model for OODBMS]

Pointer Swizzling Techniques
The action of converting object identifiers (OIDs) to main memory pointers.
• Aim is to optimize access to objects.
• Should be able to locate any referenced objects on secondary storage using their OIDs.
• Once objects have been read into cache, want to record that objects are now in memory to prevent them from being retrieved again.
• Could hold lookup table that maps OIDs to memory pointers (e.g. using hashing).
• Pointer swizzling attempts to provide a more efficient strategy by storing memory pointers in the place of referenced OIDs, and vice versa when the object is written back to disk.

No Swizzling
• Easiest implementation is not to do any swizzling.
• Objects are faulted into memory, and a handle containing the object’s OID is passed to the application.
• OID is used every time the object is accessed.
• System must maintain some type of lookup table, the Resident Object Table (ROT), so that object’s virtual memory pointer can be located and then used to access object.
• Inefficient if same objects are accessed repeatedly.
• Acceptable if objects only accessed once.
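
The no-swizzling approach can be sketched as a table lookup on every access. In this illustrative Python sketch the `disk` dictionary stands in for secondary storage and the class and attribute names are assumptions; references between objects stay as OIDs, so each dereference goes through the table.

```python
class ResidentObjectTable:
    """Maps OIDs to in-memory objects; `disk` is a hypothetical stand-in
    for the persistent store."""

    def __init__(self, disk):
        self.disk = disk
        self.table = {}    # OID -> resident object
        self.faults = 0    # number of objects read from "disk"

    def deref(self, oid):
        # Every access pays for this lookup, even when the object is resident.
        obj = self.table.get(oid)
        if obj is None:
            obj = dict(self.disk[oid])   # fault the object into memory
            self.table[oid] = obj
            self.faults += 1
        return obj


disk = {
    101: {"label": "part A", "next": 102},
    102: {"label": "part B", "next": None},
}
rot = ResidentObjectTable(disk)
first = rot.deref(101)
second = rot.deref(first["next"])   # the reference is still an OID
again = rot.deref(101)              # resident now, but still looked up
```

The table prevents a second read from disk, but the lookup cost recurs on every access, which is the inefficiency the slide points out for repeatedly accessed objects.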

[Figure: Resident Object Table (ROT)]

Object Referencing
• Need to distinguish between resident and non-resident objects.
• Most techniques are variations of edge marking or node marking.
• Edge marking marks every object pointer with a tag bit:
  – if bit set, reference is to memory pointer;
  – else, still pointing to OID and needs to be swizzled when object it refers to is faulted into memory.
• Node marking requires that all object references are immediately converted to virtual memory pointers when object is faulted into memory.
• Edge marking is a software-based technique, but node marking can be implemented using software or hardware-based techniques.
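
Edge marking can be illustrated with a reference object that carries its own tag. This is a minimal sketch under assumed names (`Ref`, `Cache`, and the `disk` layout are hypothetical): the `swizzled` flag plays the role of the tag bit, and a reference is converted the first time it is followed.

```python
class Ref:
    """An edge-marked reference; `swizzled` plays the role of the tag bit."""

    def __init__(self, oid):
        self.oid = oid
        self.swizzled = False
        self.target = None    # in-memory object once swizzled


class Cache:
    """Hypothetical object cache in front of a dictionary acting as disk."""

    def __init__(self, disk):
        self.disk = disk
        self.resident = {}

    def fault_in(self, oid):
        if oid not in self.resident:
            raw = self.disk[oid]
            # References arrive unswizzled: they still hold OIDs.
            self.resident[oid] = {
                "value": raw["value"],
                "refs": [Ref(o) for o in raw["refs"]],
            }
        return self.resident[oid]

    def follow(self, ref):
        if not ref.swizzled:                     # tag bit clear: still an OID
            ref.target = self.fault_in(ref.oid)
            ref.swizzled = True                  # set the tag bit
        return ref.target                        # direct access from now on


disk = {1: {"value": "a", "refs": [2]}, 2: {"value": "b", "refs": []}}
cache = Cache(disk)
root = cache.fault_in(1)
edge = root["refs"][0]
was_swizzled_before = edge.swizzled
target = cache.follow(edge)
```

After the first `follow`, later calls return `edge.target` without consulting the cache's table, at the cost of testing the tag on each dereference.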

Hardware-Based Schemes
• Use virtual memory access protection violations to detect accesses of non-resident objects.
• Use standard virtual memory hardware to trigger transfer of persistent data from disk to memory.
• Once page has been faulted in, objects are accessed via normal virtual memory pointers and no further object residency checking is required.
• Avoids overhead of residency checks incurred by software approaches.

Pointer Swizzling - Other Issues
• Three other issues that affect swizzling techniques:
  – Copy versus In-Place Swizzling.
  – Eager versus Lazy Swizzling.
  – Direct versus Indirect Swizzling.

Copy versus In-Place Swizzling
• When faulting objects in, data can either be copied into application’s local object cache or accessed in-place within object manager’s database cache.
• Copy swizzling may be more efficient as, in the worst case, only modified objects have to be swizzled back to their OIDs.
• In-place may have to unswizzle entire page of objects if one object on page is modified.

Eager versus Lazy Swizzling
• Moss defines eager swizzling as swizzling all OIDs for persistent objects on all data pages used by application, before any object can be accessed.
• More relaxed definition restricts swizzling to all persistent OIDs within object the application wishes to access.
• Lazy swizzling only swizzles pointers as they are accessed or discovered.

Direct versus Indirect Swizzling
• Only an issue when swizzled pointer can refer to object that is no longer in virtual memory.
• With direct swizzling, virtual memory pointer of referenced object is placed directly in swizzled pointer.
• With indirect swizzling, virtual memory pointer is placed in an intermediate object, which acts as a placeholder for the actual object.
  – Allows objects to be uncached without requiring swizzled pointers to be unswizzled.
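
A short sketch may help show why the intermediate object matters. In the illustrative Python below (class names and the `disk` dictionary are assumptions), every swizzled pointer refers to a shared `Placeholder`; uncaching clears one slot in the placeholder and leaves all pointers to it untouched.

```python
class Placeholder:
    """Intermediate object that swizzled pointers refer to."""

    def __init__(self, oid):
        self.oid = oid
        self.obj = None    # the actual object while it is cached


class IndirectCache:
    """Hypothetical cache using indirect swizzling over a dictionary 'disk'."""

    def __init__(self, disk):
        self.disk = disk
        self.placeholders = {}

    def swizzle(self, oid):
        # All pointers to the same OID share one placeholder.
        return self.placeholders.setdefault(oid, Placeholder(oid))

    def deref(self, placeholder):
        if placeholder.obj is None:
            placeholder.obj = dict(self.disk[placeholder.oid])
        return placeholder.obj

    def uncache(self, oid):
        # Evict the object; pointers to the placeholder stay valid.
        placeholder = self.placeholders.get(oid)
        if placeholder is not None:
            placeholder.obj = None


disk = {7: {"label": "part G"}}
cache = IndirectCache(disk)
pointer_a = cache.swizzle(7)
pointer_b = cache.swizzle(7)
before = cache.deref(pointer_a)
cache.uncache(7)
after = cache.deref(pointer_b)   # reloaded through the same placeholder
```

With direct swizzling, `uncache` would instead have to find and unswizzle `pointer_a` and `pointer_b`; the price of the indirect form is one extra hop per access.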

[Figure: Accessing an Object with an RDBMS]

[Figure: Accessing an Object with an OODBMS]

Persistent Schemes
• Consider three persistent schemes:
  – Checkpointing.
  – Serialization.
  – Explicit Paging.
• Note, persistence can also be applied to (object) code and to the program execution state.

Checkpointing
• Copy all or part of program’s address space to secondary storage.
• If complete address space saved, program can restart from checkpoint.
• In other cases, only program’s heap saved.
• Two main drawbacks:
  – Can only be used by program that created it.
  – May contain large amount of data that is of no use in subsequent executions.

Serialization
• Copy closure of a data structure to disk.
• Write on a data value may involve traversal of graph of objects reachable from the value, and writing of flattened version of structure to disk.
• Reading back flattened data structure produces new copy of original data structure.
• Sometimes called serialization, pickling, or in a distributed computing context, marshaling.
• Two inherent problems:
  – Does not preserve object identity.
  – Not incremental, so saving small changes to a large data structure is not efficient.
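
The loss of object identity is easy to observe with Python's standard `pickle` module. In this sketch the record contents are made up (only SG14 appears elsewhere in the slides): two staff records share one branch object, each is serialized separately, and the copies read back no longer share it.

```python
import pickle

# Two records that share a single branch object (sample data).
branch = {"branchNo": "B003"}
staff_a = {"staffNo": "SG14", "branch": branch}
staff_b = {"staffNo": "SG37", "branch": branch}
shared_before = staff_a["branch"] is staff_b["branch"]

# Each dump writes the whole closure reachable from the value.
copy_a = pickle.loads(pickle.dumps(staff_a))
copy_b = pickle.loads(pickle.dumps(staff_b))

same_value = copy_a["branch"] == copy_b["branch"]    # equal contents
shared_after = copy_a["branch"] is copy_b["branch"]  # but separate objects

# Reading back also yields a new copy of the original structure.
is_new_copy = copy_a is not staff_a
```

An update to `copy_a["branch"]` would not be seen through `copy_b`, which is the identity problem the slide refers to. The second problem follows from the same mechanism: changing one field still means writing the full closure again.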

Explicit Paging
• Explicitly ‘page’ objects between application heap and persistent store.
• Usually requires conversion of object pointers from disk-based scheme to memory-based scheme.
• Two common methods for creating/updating persistent objects:
  – Reachability-based.
  – Allocation-based.

Explicit Paging - Reachability-Based Persistence
• Object will persist if it is reachable from a persistent root object.
• Programmer does not need to decide at object creation time whether object should be persistent.
• Object can become persistent by adding it to the reachability tree.
• Maps well onto language that contains garbage collection mechanism (e.g. Smalltalk or Java).
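
Reachability-based persistence amounts to a graph traversal from the persistent root, much like the mark phase of a garbage collector. The sketch below is illustrative only; the `Node` class and the `reachable` function are assumed names, and a real system would write the reachable objects to the store rather than return their labels.

```python
class Node:
    """Hypothetical application object holding references to other objects."""

    def __init__(self, label):
        self.label = label
        self.refs = []


def reachable(root):
    """Return the labels of all objects reachable from `root`."""
    seen = {}
    stack = [root]
    while stack:
        node = stack.pop()
        if id(node) in seen:       # already visited (handles cycles)
            continue
        seen[id(node)] = node
        stack.extend(node.refs)
    return {node.label for node in seen.values()}


root, a, b, orphan = Node("root"), Node("a"), Node("b"), Node("orphan")
root.refs = [a]
a.refs = [b, root]                 # a cycle back to the root

persistent_before = reachable(root)
root.refs.append(orphan)           # linking an object in makes it persistent
persistent_after = reachable(root)
```

No object is declared persistent when it is created; `orphan` becomes persistent only because a reference to it was added under the root.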

Explicit Paging - Allocation-Based Persistence
• Object only made persistent if it is explicitly declared as such within the application program.
• Can be achieved in several ways:
  – By class.
  – By explicit call.
• By class
  – Class is statically declared to be persistent and all instances made persistent when they are created.
  – Class may be subclass of system-supplied persistent class.
• By explicit call
  – Object may be specified as persistent when it is created or dynamically at runtime.
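
The two allocation-based styles can be contrasted in a few lines. This is a sketch under assumptions: `STORE`, `Persistent`, `make_persistent`, and the example classes are hypothetical names, and the "store" is just an in-memory dictionary standing in for a real persistent store.

```python
STORE = {}   # hypothetical persistent store, keyed by id(obj)


def make_persistent(obj):
    """By explicit call: mark one object as persistent at runtime."""
    STORE[id(obj)] = obj
    return obj


def is_persistent(obj):
    return id(obj) in STORE


class Persistent:
    """By class: stand-in for a system-supplied persistent base class."""

    def __init__(self):
        make_persistent(self)    # every instance is persistent on creation


class Staff(Persistent):
    def __init__(self, staff_no):
        super().__init__()
        self.staff_no = staff_no


class Note:
    """An ordinary class; instances are transient unless marked explicitly."""

    def __init__(self, text):
        self.text = text


staff = Staff("SG14")                   # persistent because of its class
scratch = Note("draft")                 # transient
kept = make_persistent(Note("keep"))    # persistent by explicit call
```

In both styles the decision is made in application code, in contrast to the reachability-based method, where it follows from how objects are linked.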

Orthogonal Persistence
• Three fundamental principles:
  – Persistence independence.
  – Data type orthogonality.
  – Transitive persistence (originally referred to as ‘persistence identification’ but ODMG term ‘transitive persistence’ used here).

Persistence Independence
• Persistence of object is independent of how program manipulates that object.
• Conversely, code fragment is independent of persistence of data it manipulates.
• Should be possible to call a function with parameters that are sometimes objects with long-term persistence and sometimes only transient.
• Programmer does not need to control movement of data between long-term and short-term storage.

Data Type Orthogonality
• All data objects should be allowed full range of persistence irrespective of their type.
• No special cases where object is not allowed to be long-lived or is not allowed to be transient.
• In some PPLs, persistence is a quality attributable to only a subset of language data types.

Transitive Persistence
• Choice of how to identify and provide persistent objects at language level is independent of the choice of data types in the language.
• Technique that is now widely used for identification is reachability-based.

Orthogonal Persistence - Advantages
• Improved programmer productivity from simpler semantics.
• Improved maintenance.
• Consistent protection mechanisms over whole environment.
• Support for incremental evolution.
• Automatic referential integrity.

Orthogonal Persistence - Disadvantages
• Some runtime expense in a system where every pointer reference might be addressing persistent object.
  – System required to test if object must be loaded in from disk-resident database.
• Although orthogonal persistence promotes transparency, system with support for sharing among concurrent processes cannot be fully transparent.

Versions
• Allows changes to properties of objects to be managed so that object references always point to correct object version.
• Itasca identifies 3 types of versions:
  – Transient Versions.
  – Working Versions.
  – Released Versions.

[Figure: Versions and Configurations]

Schema Evolution
• Some applications require considerable flexibility in dynamically defining and modifying database schema.
• Typical schema changes:
  (1) Changes to class definition:
      (a) Modifying Attributes.
      (b) Modifying Methods.
  (2) Changes to inheritance hierarchy:
      (a) Making a class S superclass of a class C.
      (b) Removing S from list of superclasses of C.
      (c) Modifying order of superclasses of C.
  (3) Changes to set of classes, such as creating and deleting classes and modifying class names.
• Changes must not leave schema inconsistent.

Schema Consistency
1. Resolution of conflicts caused by multiple inheritance and redefinition of attributes and methods in a subclass.
   1.1 Rule of precedence of subclasses over superclasses.
   1.2 Rule of precedence between superclasses of a different origin.
   1.3 Rule of precedence between superclasses of the same origin.
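
As a loose analogy only, Python's own method resolution shows what such precedence rules look like in practice: a redefinition in a subclass wins over the superclass, and among several superclasses the order in which they are listed decides. The classes below are hypothetical, and Python's resolution order is not claimed to match any particular OODBMS's rules.

```python
class Staff:
    def role(self):
        return "staff"


class Client:
    def role(self):
        return "client"


class Manager(Staff):
    # Redefinition in the subclass takes precedence over the superclass.
    def role(self):
        return "manager"


class StaffClient(Staff, Client):
    # Both superclasses define role(); the first one listed is used.
    pass


class ClientStaff(Client, Staff):
    # Same superclasses in the other order give the other result.
    pass


roles = (Manager().role(), StaffClient().role(), ClientStaff().role())
```

The last two classes also illustrate why modifying the order of superclasses counts as a schema change: it can alter which definition a class inherits.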

Schema Consistency
2. Propagation of modifications to subclasses.
   2.1 Rule for propagation of modifications.
   2.2 Rule for propagation of modifications in the event of conflicts.
   2.3 Rule for modification of domains.

Schema Consistency
3. Aggregation and deletion of inheritance relationships between classes and creation and removal of classes.
   3.1 Rule for inserting superclasses.
   3.2 Rule for removing superclasses.
   3.3 Rule for inserting a class into a schema.
   3.4 Rule for removing a class from a schema.

[Figure: Schema Consistency]

Client-Server Architecture
• Three basic architectures:
  – Object Server.
  – Page Server.
  – Database Server.

Object Server
• Distribute processing between the two components, client and server.
• Typically, client is responsible for transaction management and interfacing to programming language.
• Server responsible for other DBMS functions.
• Best for cooperative, object-to-object processing in an open, distributed environment.

Page and Database Server
Page Server
• Most database processing is performed by client.
• Server responsible for secondary storage and providing pages at client’s request.
Database Server
• Most database processing performed by server.
• Client simply passes requests to server, receives results and passes them to application.
• Approach taken by many RDBMSs.

[Figure: Client-Server Architecture]

Architecture - Storing and Executing Methods
• Two approaches:
  – Store methods in external files.
  – Store methods in database.
• Benefits of latter approach:
  – Eliminates redundant code.
  – Simplifies modifications.
  – Methods are more secure.
  – Methods can be shared concurrently.
  – Improved integrity.
• However, storing methods in the database is more difficult to implement.

[Figure: Architecture - Storing and Executing Methods]

Benchmarking - Wisconsin benchmark
• Developed to allow comparison of particular DBMS features.
• Consists of set of tests as a single user covering:
  – updates/deletes involving key and non-key attributes;
  – projections involving different degrees of duplication in the attributes and selections with different selectivities on indexed, non-indexed, and clustered attributes;
  – joins with different selectivities;
  – aggregate functions.
• Original benchmark had 3 relations: one called Onektup with 1,000 tuples, and two others called Tenktup1/Tenktup2 with 10,000 tuples.
• Generally useful, although it does not cater for highly skewed attribute distributions and the join queries used are relatively simplistic.
• Consortium of manufacturers formed Transaction Processing Performance Council (TPC) in 1988 to create series of transaction-based test suites to measure database/TP environments.

TPC Benchmarks
• TPC-A and TPC-B for OLTP (now obsolete).
• TPC-C replaced TPC-A/B and based on order entry application.
• TPC-H for ad hoc, decision support environments.
• TPC-R for business reporting within decision support environments.
• TPC-W, a transactional Web benchmark for eCommerce.

Object Operations Version 1 (OO1) Benchmark
• Intended as generic measure of OODBMS performance. Designed to reproduce operations common in advanced engineering applications, such as finding all parts connected to a random part, all parts connected to one of those parts, and so on, to a depth of seven levels.
• About 1990, benchmark was run on the OODBMSs GemStone, Ontos, ObjectStore, Objectivity/DB, and Versant, and on the RDBMSs INGRES and Sybase. Results showed an average 30-fold performance improvement for OODBMSs over RDBMSs.
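
The traversal described above can be sketched as a depth-limited walk over part connections. This Python sketch is illustrative: the database size, the fan-out of three connections per part, and the random wiring are assumptions not stated on the slide, and the function names are hypothetical. Parts reached more than once are counted each time.

```python
import random


def build_parts(count, fan_out=3, seed=0):
    """Build a made-up parts database: each part connects to `fan_out` parts."""
    rng = random.Random(seed)
    return {
        part: [rng.randrange(count) for _ in range(fan_out)]
        for part in range(count)
    }


def traverse(connections, part, depth):
    """Count parts visited from `part` down to `depth` levels (duplicates included)."""
    visits = 1
    if depth > 0:
        for connected in connections[part]:
            visits += traverse(connections, connected, depth - 1)
    return visits


connections = build_parts(1000)
start = random.Random(1).randrange(1000)    # the "random part"
visited = traverse(connections, start, 7)   # depth of seven levels
```

With a fan-out of three the walk touches 1 + 3 + 9 + ... + 3^7 = 3280 parts regardless of the starting part, so the measurement is dominated by how cheaply the system follows a reference from one object to the next.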

OO7 Benchmark
• More comprehensive set of tests and a more complex database based on parts hierarchy.
• Designed for detailed comparisons of OODBMS products.
• Simulates CAD/CAM environment and tests system performance in area of object-to-object navigation over cached data, disk-resident data, and both sparse and dense traversals.
• Also tests indexed and non-indexed updates of objects, repeated updates, and the creation and deletion of objects.

Advantages of OODBMSs
• Enriched Modeling Capabilities.
• Extensibility.
• Removal of Impedance Mismatch.
• More Expressive Query Language.
• Support for Schema Evolution.
• Support for Long Duration Transactions.
• Applicability to Advanced Database Applications.
• Improved Performance.

Disadvantages of OODBMSs
• Lack of Universal Data Model.
• Lack of Experience.
• Lack of Standards.
• Query Optimization compromises Encapsulation.
• Object Level Locking may impact Performance.
• Complexity.
• Lack of Support for Views.
• Lack of Support for Security.