Object Oriented Database Management Outline Motivation Embedding SQL
Object Oriented Database Management
Outline • • • Motivation Embedding SQL in host language Object Data Model Persistent Programming Languages Object Query Language Object-orientation in SQL
Motivation of ODBMSs Application data structures Relational representation Copy and translation Transparent ODBMS data transfer RDBMS • Complex objects in emerging DBMS applications cannot be effectively represented as records in relational model. • Representing information in RDBMSs requires complex and inefficient conversion into and from the relational model to the application programming language • ODBMSs provide a direct representation of objects to DBMSs overcoming the impedance mismatch problem
Embedded SQL • Access to database from a general purpose programming language required since: – Not all queries can be expressed in SQL --e. g. , recursive queries cannot be written in SQL. – Non declarative actions -- e. g. , printing reports cannot be done from SQL. • General purpose language in which SQL is embedded called host language. • SQL structures permitted in host language called embedded SQL. C compiler SQL+ C SQL library calls + C precompiler . o file loader SQL library object code Embedded SQL Compilation
Embedded SQL • SQL commands embedded in the host programming language • Data exchanged between host language and DBMS using cursors • SQL query passed from host language to DBMS which computes the answer set • A cursor can be viewed as a pointer into the answer set • DBMS returns the cursor to the programming language • Programming language can use the cursor to get a record at a time access to materialized answer.
Example of Embedded SQL : dname = “toy”; raise = 0. 1; EXEC SQL SELECT dnum into : dnum FROM Department WHERE dname= : dname; EXEC SQL DECLARE Emp CURSOR FOR SELECT * FROM Employee WHERE dno = : dnum FOR UPDATE EXEC SQL OPEN Emp; EXEC SQL FETCH Emp INTO : E. ssn, : E. dno, : E. name, : E. sal; while (SQLCODE == 0) { EXEC SQL UPDATE WHERE CURRENT OF CURSOR SET sal = sal * (1 + : : raise); EXEC SQL FETCH Emp INTO : E. ssn, : E. dno, : E. name, : E. sal; } EXEC SQL CLOSE CURSOR Emp /* SQL embedded in C to read the list of employees who work for the toy department and give them a 10 percent raise */
Object Oriented Database Management • Object Oriented databases have evolved along two different paths: • Persistent Object Oriented Programming Languages: (pure ODBMSs) – Start with an OO language (e. g. , C++, Java, SMALLTALK) which has a rich type system – Add persistence to the objects in programming language where persistent objects stored in databases • Object Relational Database Management Systems (SQL 3 Systems) – Extend relational DBMSs with the rich type system and user-defined functions. – Provide a convenient path for users of relational DBMSs to migrate to OO technology – All major vendors (e. g. , Informix, Oracle) will/are supporting features of SQL 3.
Object Database Management Group (ODMG) • Special interest group to develop standards that allow ODBMS customers to write portable applications • Standards include: – Object Model – Object Specification Languages • Object Definition Language (ODL) for schema definition • Object Interchange Format (OIF) to exchange objects between databases – Object Query Language • declarative language to query and update database objects – Language Bindings (C++, Java, Smalltalk) • Object manipulation language • Mechanisms to invoke OQL from language • Procedures for operation on databases and transactions
Object Model • Object: – observable entity in the world being modeled – similar to concept to entity in the E/R model • An object consists of: – attributes: properties built in from primitive types – relationships: properties whose type is a reference to some other object or a collection of references – methods: functions that may be applied to the object.
Class • Similar objects with the same set of properties and describing similar real-world concepts are collected into a class. • Class definition: interface Employee { attribute string name; attribute integer salary; attribute date-of-birth; attribute integer empid; relationship Projects works-for inverse Projects: : team; age-type age(); } Interface Projects{ attribute string name; attribute integer projid; relationship Employee team inverse Emplolyee works-for; int number-of-employees(); }
Class Extents • For each ODL class, an extent may be declared. • Extent is the current set of objects belonging to the class. – Similar notion to the relation in the relational model. – Queries in OQL refer to the extent of a class and not the class directly. interface Employee (extent Emp-set) { attribute string name; attribute integer salary; attribute date-of-birth; attribute integer empid; relationship Projects works-for inverse Projects: : team; age-type age(); }
Subclasses and Inheritance • A class can be declared to be a subclass of another class. • Subclasses inherit all the properties – attributes – relationships – methods from the superclass. Interface Married-Employee: Employees { string spouse-name; } • Substitutability: any method of superclass can be invoked over objects of any subclass (code reuse)
Class Hierarchy person employee staff student grad student assistant faculty RA TA undergrad
Multiple Inheritance • A class may have more than one superclass. • A class inherits properties fromeach of its superclasses. • There is a potential of ambiguity -- variable with same name inherited from two superclasses: – flag and error – rename variable – choose one
Object Identity • Each object has an identity which it maintains even if some or all of its attributes change. • Object identity is a stronger notion of identity than in relational DBMSs. • Identity in relational DBMSs is value based (primary key). • Identity in ODBMSs built into data model – no user specified identifier is required • OID is a similar notion as pointer in programming language • Object identifier (OID) can be stored as attribute in object to refer to another object. • References to other objects via their OIDs can result in a containment hierarchy • Note: containment hierarchy different from class hierarchy
Containment Hierarchy bicycle wheel tire rim spoke brake lever gear frame pad Links in containment hierarchy should be read as is-part-of instead of is-a
Persistence • Objects created may have different lifetimes: – transient: allocated memory managed by the programming language run-time system. • E. g. , local variables in procedures have a lifetime of a procedure execution • global variables have a lifetime of a program execution – persistent: allocated memory and stored managed by ODBMS runtime system. • Classes are declared to be persistence-capable or transient. • Different languages have different mechanisms to make objects persistent: – creation time: Object declared persistent at creation time (e. g. , in C++ binding) (class must be persistent-capable) – persistence by reachability: object is persistent if it can be reached from a persistent object (e. g. , in Java binding) (class must be persistentcapable).
Persistent Object-Oriented Programming Languages • Persistent objects are stored in the database and accessed from the programming language. • Classes declared in ODL mapped to the programming language type system (ODL binding). • Single programming language for applications as well as data management. – Avoid having to translate data to and from application programming language and DBMS • efficient implementation • less code – Programmer does not need to write explicit code to fetch data to and from database • persistent objects to programmer looks exactly the same as transient objects. • System automatically brings the objects to and from memory to storage device. (pointer swizzling).
Disadvantages of ODBMS Approach • Low protection – since persistent objects manipulated from applications directly, more changes that errors in applications can violate data integrity. • Non-declarative interface: – difficult to optimize queries – difficult to express queries • But …. . – Most ODBMSs offer a declarative query language OQL to overcome the problem. – OQL is very similar to SQL and can be optimized effectively. – OQL can be invoked from inside ODBMS programming language. – Objects can be manipulated both within OQL and programming language without explicitly transferring values between the two languages. – OQL embedding maintains simplicity of ODBMS programming language interface and yet provides declarative access.
OQL Example interface Employee { attribute string name; relationship setof(Projects) works-for inverse Projects: : team; Interface Projects{ attribute string name; relationship setof(Employee) team inverse Emplolyee works-for; int number-of-employees(); } } Select number-of-employees() From Employee e, e. works-for where name = “sharad” Find number of employees working on each project “sharad” works on
Migration of RDBMSs towards OO Technologies • SQL 3 standard incorporates OO concepts in the relational model. • A row in a table considered as an object • SQL 3 allows a type to be declared for tuples (similar to class in ODBMSs) • Relations are collection of tuples of a row type (similar to extent in ODBMSs) • Rows in a relation can refer to each other using a reference type (similar to object identity in ODBMSs) • A reference can be dereferenced to navigate among tables • Attributes in a relation can belong to abstract data types • Methods and functions (expressed in SQL as well as host programming language) can be associated with abstract data types
SQL-3 Example CREATE ROW TYPE Employee-type { name CHAR(30) works-for REF(Projects-type) } CREATE ROW TYPE Projects-type { name CHAR(30) team setof(REF(Employee-type)) } CREATE TABLE Emp OF TYPE Employee-type CREATE TABLE Project of TYPE Project-type Select works-for --> name From Emp Where name = ‘sharad’ Return name of the project sharad works for
OQL CMSC-461 Database Management Systems
OQL -- Motivation • Relational languages suffer from impedance mismatch when we try to connect them to conventional languages like C or C++. – The data models of C and SQL are radically different, e. g. C does not have relations, sets, or bags as primitive types; C is tuple-at-a-time, SQL is relation-at-a-time.
OQL -- Motivation (II) • OQL is an attempt by the OO community to extend languages like C++ with SQL-like, relation-at-a-time dictions. • OQL is query language paired with schema-definition language ODL.
OQL Types • Basic types: strings, ints, reals, etc. , plus class names. • Type constructors: – – Struct for structures. Collection types: set, bag, list, array. • Like ODL, but no limit on the number of times we can apply a type constructor. • Set(Struct()) and Bag(Struct()) play special roles akin to relations.
OQL Uses ODL as its Schema-Definition Portion • For every class we can declare an extent = name for the current set of objects of the class. – Remember to refer to the extent, not the class name, in queries.
Example • interface Bar (extent Bars) { attribute string name; attribute string addr; relationship Set<Sell> beers. Sold inverse Sell: : bar; }
Example (II) • interface Beer (extent Beers) { attribute string name; attribute string manf; relationship Set<Sell> sold. By inverse Sell: : beer; }
Example (III) • interface Sell (extent Sells) { attribute float price; relationship Bar bar inverse Bar: : beers. Sold; relationship Beer beer inverse Beer: : sold. By; }
Path Expressions • Let x be an object of class C. • If a is an attribute of C, then x. a = the value of a in the x object. • If r is a relationship of C, then x. r = the value to which x is connected by r. – Could be an object or a collection of objects, depending on the type of r. • If m is a method of C , then x. m (. . . ) is the result of applying m to x.
Examples • Let s be a variable whose type is Sell. • s. price = the price in the object s. • s. bar. addr = the address of the bar mentioned in s. – Note: cascade of dots OK because s. bar is an object, not a collection.
Example of Illegal Use of Dot • b. beers. Sold. price, where b is a Bar object. • Why illegal? Because b. beers. Sold is a set of objects, not a single object.
OQL Select-From-Where • SELECT < list of values > FROM < list of collections and typical members > WHERE < condition >
OQL Select-From-Where (II) • Collections in FROM can be: 1. Extents. 2. Expressions that evaluate to a collection. • Following a collection is a name for a typical member, optionally preceded by AS.
Example • Get the menu at Joe's. SELECT s. beer. name, s. price FROM Sells s WHERE s. bar. name = "Joe's Bar" • Notice double-quoted strings in OQL. • Result is of type Bag(Struct(name: string, price: float))
Example • • Another way to get Joe's menu, this time focusing on the Bar objects. SELECT s. beer. name, s. price FROM Bars b, b. beers. Sold s WHERE b. name = "Joe's Bar" Notice that the typical object b in the first collection of FROM is used to help define the second collection. – Typical usage: if x. a is an object, you can extend the path expression; if x. a is a collection, you use it in the FROM list.
Tailoring the Type of the Result • Default: bag of structs, field names taken from the ends of path names in SELECT clause. • Example SELECT s. beer. name, s. price FROM Bars b, b. beers. Sold s WHERE b. name = "Joe's Bar" has result type: Bag(Struct( name: string, price: real))
Rename Fields • Prefix the path with the desired name and a colon. • Example SELECT beer: s. beer. name, s. price FROM Bars b, b. beers. Sold s WHERE b. name = "Joe's Bar"
Change the Collection Type • Use SELECT DISTINCT to get a set of structs.
Example • SELECT DISTINCT s. beer. name, s. price FROM Bars b, b. beers. Sold s WHERE b. name = "Joe's Bar" • Use ORDER BY clause to get a list of structs.
Example • • • joe. Menu = SELECT s. beer. name, s. price FROM Bars b, b. beers. Sold s WHERE b. name = "Joe's Bar" ORDER BY s. price ASC = ascending (default); DESC = descending. We can extract from a list as if it were an array, e. g. cheapest = joe. Menu[1]. name;
Subqueries • Used mainly in FROM clauses and with quantifiers EXISTS and FORALL.
Example: Subquery in FROM • Find the manufacturers of the beers served at Joe's. SELECT b. manf FROM ( SELECT s. beer FROM Sells s WHERE s. bar. name = "Joe's Bar" )b
Quantifiers • Boolean-valued expressions for use in WHERE-clauses. FOR ALL x IN < collection > : < condition > EXISTS x IN < collection > : < condition > • The expression has value TRUE if the condition is true for all (resp. at least one) elements of the collection.
Example • Find all bars that sell some beer for more than $5. SELECT b. name FROM Bars b WHERE EXISTS s IN b. beers. Sold : s. price > 5. 00 • Problem How would you find the bars that only sold beers for more than $5?
Example • Find the bars such that the only beers they sell for more than $5 are manufactured by Pete's. SELECT b. name FROM Bars b WHERE FOR ALL be IN ( SELECT s. beer FROM b. beers. Sold s WHERE s. price > 5. 00 ): be. manf = "Pete's"
Extraction of Collection Elements • a) A collection with a single member: Extract the member with ELEMENT.
Example • Find the price Joe charges for Bud and put the result in a variable p. • p = ELEMENT( SELECT s. price FROM Sells s WHERE s. bar. name = "Joe's Bar" AND s. beer. name = "Bud" )
Extraction of Collection Elements (II) • b) Extracting all elements of a collection, one at a time: – – 1. Turn the collection into a list. 2. Extract elements of a list with <list name>[i].
Example • Print Joe's menu, in order of price, with beers of the same price listed alphabetically.
Example (II) • L= SELECT s. beer. name, s. price FROM Sells s WHERE s. bar. name = "Joe's Bar" ORDER BY s. price, s. beer. name; printf("Beert. Pricenn"); for(I = 1; I <= COUNT(L); i++) printf("%st%fn", L[i]. name, L[i]. price );
Aggregation • The five operators avg, min, max, sum, count apply to any collection, as long as the operators make sense for the element type.
Example • • • Find the average price of beer at Joe's. x = AVG( SELECT s. price FROM Sells s WHERE s. bar. name = "Joe's Bar" ); Note coercion: result of SELECT is technically a bag of 1 -field structs, which is identified with the bag of the values of that field.
Grouping • Recall SQL grouping, for example: SELECT bar, AVG(price) FROM Sells GROUP BY bar; • Is the bar value the "name" of the group, or the common value for the bar component of all tuples in the group?
Grouping (II) • In SQL it doesn't matter, but in OQL, you can create groups from the values of any function(s), not just attributes. – – Thus, groups are identified by common values, not name. " Example: group by first letter of bar names (method needed).
Outline of OQL Group-By Collection Defined by FROM, WHERE Group by values of function(s) Collection with function values and partition Terms from SELECT clause Output collection
Example l Find the average price of beer at each bar. SELECT bar. Name, avg. Price: AVG( SELECT p. s. price FROM partition p ) FROM Sells s GROUP BY bar. Name: s. bar. name
Example (II) l 1. Initial collection = Sells. » But technically, it is a bag of structs of the form Struct(s: s 1) Where s 1 is a Sells object. Note, the lone field is named s; in general, there are fields for all of the tuple variables in the FROM clause.
Example (II) l 2. Intermediate collection: » One function: s. bar. name maps Sells objects s to the value of the name of the bar referred to by s. » Collection is a set of structs of type: Struct{bar. Name: string, partition: Set< Struct{s: Sell} > }
Example (III) » For example: Struct(bar. Name = "Joe's Bar", partition = {s 1, …, sn}) where s 1, …, sn are all the structs with one field, named s, whose value is one of the Sells objects that represent Joe's Bar selling some beer.
Example (IV) l 3. Output collection: consists of beeraverage price pairs, one for each struct in the intermediate collection. » Type of structures in the output: Struct{bar. Name: string, avg. Price: real}
Example (V) » Note that in the subquery of the SELECT clause: SELECT bar. Name, avg. Price: AVG( SELECT p. s. price FROM partition p ) We let p range over all structs in partition. Each of these structs contains a single field named s and has a Sells object as its value. Thus, p. s. price extracts the price from one of the Sells tuples. » Typical output struct: Struct(bar. Name = "Joe's Bar", avg. Price = 2. 83)
Another, Less Typical Example Find, for each beer, the number of bars that charge a "low" price ( 2. 00) and a "high" price ( 4. 00) for that beer. l Strategy: group by three things: l » 1. The beer name, » 2. A boolean function that is true iff the price is low. » 3. A boolean function that is true iff the price is high.
The Query l SELECT beer. Name, low, high, count: COUNT(partition) FROM Beers b, b. sold. By s GROUP BY beer. Name: b. name, low: s. price <= 2. 00, high: s. price >= 4. 00
The Query (II) l 1. Initial collection: Pairs (b; s), where b is a beer, and s is a Sells object representing the sale of that beer at some bar. » Type of collection members: Struct{b: Beer, s: Sell}
2. Intermediate collection Quadruples consisting of a beer name, booleans telling whether this group is for high, low, or neither prices for that beer, and the partition for that group. l The partition is a set of structs of the type: l Struct{b: Beer, s: Sell} A typical value: Struct(b: "Bud" object, s: a Sells object involving Bud)
2. Intermediate collection (II) » Type of quadruples in the intermediate collection: Struct{ beer. Name: string, low: boolean, high: boolean, partition: Set<Struct{ b: Beer, s: Sell }> }
2. Intermediate collection (III) » Beer. Name low » Bud Bud high partition TRUE FALSE Slow FALSE TRUE Shigh FALSE Smid » where Slow Shigh, and Smid are the sets of beer-sells pairs (b; s) where the beer is Bud and s has, respectively, a low ( 2: 00), high ( 4: 00) and medium (between 2. 00 and 4. 00) price. » Note the partition with low = high = TRUE must be empty and will not appear.
3. Output collection: l The first three components of each group's struct are copied to the output, and the last (partition) is counted. The result: beer. Name low high count Bud Bud TRUE FALSE 27 14 36
SQL 3 Objects
Objects in SQL 3 • OQL extends C++ with database concepts, while SQL 3 extends SQL with OO concepts.
Objects in SQL 3 (II) • Ullman's personal opinion: the relation is so fundamental to data manipulation that retaining it as the core, as SQL 3 does, is "right. " • Systems using the SQL 3 philosophy are called object-relational.
Objects in SQL 3 (III) – All the major relational vendors have something of this kind, allowing any class to become the type of a column. • • Informix Data Blades Oracle Cartridges Sybase Plug-Ins IBM/DB 2 Extenders
Two Levels of SQL 3 Objects • 1. For tuples of relations = "row types. " • 2. For columns of relations = "types. " – But row types can also be used as column types.
References • Row types can have references. • If T is a row type, then REF(T) is the type of a reference to a T object. • Unlike OO systems, refs are values that can be seen by queries.
Example of Row Types • CREATE ROW TYPE Bar. Type ( name CHAR(20) UNIQUE, addr CHAR(20) ); • CREATE ROW TYPE Beer. Type ( name CHAR(20) UNIQUE, manf CHAR(20) );
Example of Row Types (II) • CREATE ROW TYPE Menu. Type ( bar REF(Bar. Type), beer REF(Beer. Type), price FLOAT );
Creating Tables • Row-type declarations do not create tables. – They are used in place of element lists in CREATE TABLE statements. • Example – – – CREATE TABLE Bars OF TYPE Bar. Type CREATE TABLE Beers OF TYPE Beer. Type CREATE TABLE Sells OF TYPE Menu. Type
Dereferencing • A B = the B attribute of the object referred to by reference A. • Example – Find the beers served by Joe. SELECT beer -> name FROM Sells WHERE bar -> name = 'Joe''s Bar';
OID's as Values • A row type can have a reference to itself. – Serves as the OID for tuples of that type. • Example CREATE ROW TYPE Bar. Type ( name CHAR(20), addr CHAR(20), bar. ID REF(Bar. Type) ); CREATE TABLE Bars OF TYPE Bar. Type VALUES FOR bar. ID ARE SYSTEM GENERATED
OID's as Values (II) • VALUES. . . clause forces the bar. ID of each tuple to refer to the tuple itself. Name addr Joe's Maple St. bar. ID
Example: Using References as Values • Find the menu at Joe's. SELECT Sells. beer->name, Sells. price FROM Bars, Sells WHERE Bars. name = 'Joe''s Bar' AND Bars. bar. ID = Sells. bar;
ADT's in SQL 3 • Allows a column of a relation to have a type that is a "class, " including methods. • Intended application: data that doesn't fit relational model well, e. g. , locations, signals, images, etc. • The type itself is usually a multi-attribute tuple.
ADT's in SQL 3 (II) • Type declaration: CREATE TYPE <name> ( attributes method declarations or definitions ); • Methods defined in a PL/SQL-like language.
Example CREATE TYPE Beer. ADT ( name CHAR(20), manf CHAR(20), FUNCTION new. Beer( : n CHAR(20), : m CHAR(20)) RETURNS Beer. ADT; : b Beer. ADT; /* local decl. */ BEGIN : b : = Beer. ADT(); /* built-in constructor */ : b. name : = : n; : b. manf : = : m; RETURN : b; END; FUNCTION get. Min. Price(: b Beer. ADT) RETURNS FLOAT; );
Example (II) • get. Min. Price is declaration only; new. Beer is definition. • get. Min. Price must be defined somewhere relation Sells is available.
Example (III) • FUNCTION get. Min. Price(: b Beer. ADT) RETURNS FLOAT; : p FLOAT; BEGIN SELECT MIN(price) INTO : p FROM Sells WHERE beer->name = : b. name; RETURN : p; END;
Built-In Comparison Functions • We can define for each ADT two functions EQUAL and LESSTHAN that allow values of this ADT to participate in WHERE clauses involving =, <=, etc.
Example: A "Point" ADT • CREATE TYPE Point ( x FLOAT, y FLOAT, FUNCTION EQUALS( : p Point, : q Point ) RETURNS BOOLEAN; BEGIN IF : p. x = : q. x AND : p. y = : q. y THEN RETURN TRUE ELSE RETURN FALSE; END;
Example: A "Point" ADT (II) • FUNCTION LESSTHAN( : p Point, : q Point ) RETURNS BOOLEAN; BEGIN IF : p. x > : q. x THEN RETURN FALSE ELSIF : p. x < : q. x THEN IF : p. y <= : q. y THEN RETURN TRUE ELSE RETURN FALSE ELSE /* : p. x = : q. x IF : p. y < : q. y THEN RETURN TRUE ELSE RETURN FALSE END; );
Using the Comparison Functions • Here is a query that computes the lower convex hull of a set of points. • Assumes My. Points(p) is a relation with a single column p of type Point. – SELECT p FROM My. Points WHERE NOT p > ANY My. Points;
Using the Comparison Functions (II)
- Slides: 93