Data models Relational object and semistructured Types of
Data models Relational, object, and semistructured
Types of database applications Queries No queries Payroll Word processing Simple data GIS CAD/CAM Complex data
An example schema COMPANY head_office PERSON address ceo departments ADDRESS ISA office boss EMPLOYEE DEPARTMENT EMPLOYEEs
A relational schema z COMPANY(CID, Name, Street, City, CEO) z DEPARTMENT(CID, Dept. Name, Street, City, Boss) z PERSON(SS#, Name, Street, City) z EMPLOYEE(SS#, Salary) z EMPLOYMENT(CID, Dept. Name, Emp) COMPANY. CEO << EMPLOYEE. SS# DEPARTMENT. CID << COMPANY. CID EMPLOYEE. SS# << PERSON. SS# EMPLOYMENT. (CID, Dept. Name) << DEPARTMENT. (CID, Dept. Name) EMPLOYMENT. Emp << EMPLOYEE. SS#
Problems with the schema z Composite attributes do not exist Head office cannot be described directly but is decomposed into Street, City z Set valued attributes do not exist The fact that people are employed at departments is described in an extra table
Problems with the schema z Generalisation does not exist The generalisation relationship between EMPLOYEE and PERSON is not explicit z Artificial identifiers required CID is an artificial identifier for COMPANY
An SQL query Are there employees who live at the same address, work at departments in London and earn more than 20000? SELECT A. SS#, B. SS# FROM EMPLOYEE AS A, EMPLOYEE AS B, EMPLOYMENT AS EMPL 1, EMPLOYMENT AS EMPL 2, PERSON AS P 1, PERSON AS P 2 DEPARTMENT AS D 1, DEPARTMENT AS D 2 WHERE A. SS# = P 1. SS# AND B. SS# = P 2. SS# AND P 1. Street = P 2. Street AND P 1. City = P 2. City AND A. Salary > 20000 AND B. Salary > 20000 AND EMPL 1. SS# = A. SS# AND EMPL 2. SS# = B. SS# AND EMPL 1. CID = D 1. CID AND EMPL 2. CID = D 2. CID AND EMPL 1. Dept. Name = D 1. Dept. Name AND EMPL 2. Dept. Name = D 2. Dept. Name AND D 1. City = London AND D 2. City = London
An object database schema COMPANY Name: String Head office: Address Departments: {Department} CEO: Employee DEPARTMENT Name: String Office: Address Boss: Employees: {Employee} ADDRESS Street: City: String PERSON SS#: Name: Address: String Address EMPLOYEE ISA PERSON Salary: Integer Boss: Employee Departments: {Department}
An OO query Are there employees who live at the same address, work at departments in London and earn more than 20000? SELECT e 1. ss#, e 2. ss# FROM e 1, e 2 in EMPLOYEE; d 1, d 2 in DEPARTMENT WHERE e 1 in d 1. Employees AND e 2 in d 2. Employees AND e 1. Address = e 2. Address AND e 1. Salary > 20000 AND e 2. Salary > 20000 AND f 1. Office. City = London AND f 2. Office. City = London
OODB models z. Complex types z. Types and classes z. Object identity z. Inheritance
Complex types z Base types String Integer Boolean z Constructors Tuple Set Bag List Array
Complex types - an example z COORDINATE The type X: Integer constructors Y: Integer can be Z: Integer applied to any type - cf. z PARKING SPOT the Position: COORDINATE relational Occupied: Boolean model z CAR PARK Spots: ARRAY OF PARKING SPOT Personnel: SET OF EMPLOYEE
Relation schemes and relations PERSON Relation scheme Relation SS# Name Age 650101 -2288 750203 -3133 500107 -5532 800515 -0044 Eva Svensson 33 Per Jonsson 23 Sven Olsson 47 Pia Eriksson 17 Salary 25000 20000 25000 18000
Types and classes z A type is a time independent description of a set of (base or complex) values z A class has a time dependent extension that is a set of object identifiers
Object identity z. Every object is identified by an object identifier z. The object identifier does not change during the life span of the object z. Two different objects with different object identifiers may have the same values OID: 1088732 Name: Simba Age: 11 Weight: 260 OID: 1293301 Name: Simba Age: 11 Weight: 260
Object identity The relational model Lion(Name, Age, Weight) (Simba, 11, 260) OO-model LION Name: String Age: Integer Weight: Integer 1088732 OID: 1088732 Name: Simba Age: 11 Weight: 260
Object identity Why object identifiers are better than keys: • Key attributes not stable • Key values not stable • Object sharing Object identifiers can be used in relational databases but must be maintained by the users
Inheritance Employees are persons The relational model OO model PERSON(SS#, Name, City) ANSTÄLLD(SS#, Salary) PERSON SS#: Nam. E: CITY: EMPLOYEE. SS 3 << PERSON. SS# String EMPLOYEE ISA PERSON Salary: Integer
Inheritance in the relational model PERSON(SS#, Name, City) EMPLOYEE(Emp#, SS#, Salary) PERSON(SS#, Name, City) EMPLOYEE(SS#, Spouse#, Salary) EMPLOYEE. SS# << PERSON. SS# EMPLOYEE. Spouse# << PERSON. SS# PERSON ISA EMPLOYEE PERSON spouse EMPLOYEE
Inheritance in OO databases PERSON SS#: Name: City: String EMPLOYEE ISA PERSON Emp#: String Salary: Integer PERSON ISA EMPLOYEE PERSON SS#: Name: City: String EMPLOYEE Emp#: Salary: Spouse: String Integer PERSON spouse EMPLOYEE
Extended base types create table slides ( id int, date, caption document, picture photo_CD_image); create table landmarks ( name varchar(30), location point); Find sunsets within a 20 kilometers radius around Sacramento. select id from slides P, landmarks L S where sunset (P. picture) and contains (P. caption, L. name) and L. location |20| S. location and S. name = 'Sacramento';
Types of DBMSs Queries No queries Relational OO/OR File systems OO Simple data Complex data
Why do we like types? z. Types facilitate understanding z. Types enable compact representations z. Types enable query optimisation z. Types facilitate consistency enforcement
Background assumptions for typed data z. Data stable over time z. Organisational body to control data z. Exercise: Give an example of a context where these assumptions do not hold
Semistructured data is schemaless and self describing The data and the description of the data are integrated
Label-value pairs z {name: “John”, tel: 112233, email: “john@123. edu”} Label Value name “John” tel email 112233 “john@123. edu”
Nested label-value pairs {name: {first: “John”, last: “Smith”}, tel: 112233, email: “john@123. edu”} name tel email 112233 “john@123. edu” first last “John” “Smith”
Duplicate labels {name: {first: “John”, last: “Smith”}, tel: 112233, tel: 445566, tel: 778899, email: “john@123. edu”}
Representing variations {person: {name: “John”, tel: 112233, email: “john@123. edu”}, age: 33, email: “john@123. edu”}, tel: 112233, tel: 332211}}
Representing relational data CAR PERSON Regno Manuf Weight AA 11 CC 33 EE 55 Volvo Saab 1100 900 1000 Ssno 1122 2233 3344 City London Paris Berlin Weight 80 70 80 {CAR: {row: {Regno: “AA 11”, Manuf: “Volvo”, Weight: 1100}, {row: {Regno: “CC 33”, Manuf: “Volvo”, Weight: 900}, {row: {Regno: “EE 55”, Manuf: “Saab”, Weight: 1000}}, PERSON: {row: {Ssno: “ 1122”, City: “London”, Weight: 80}, {row: {Ssno: “ 2233”, City: “Paris”, Weight: 70}, {row: {Ssno: “ 3344”, City: “Berlin”, Weight: 80}}}
An object graph person child &o 1 name “Eva” &o 2 age 40 name “Abel” age 20 This graph represents two people, Eva and Abel, where Abel is the child of Eva. &o 1 and &o 2 are object identifiers denoting the people Eva and Abel.
Representing objects person child &o 1 name “Eva” &o 2 age 40 name “Abel” {person: &o 1{name: “Eva”, age: 40, child: &o 2}, person: &o 2{name: “Abel”, age: 20}} age 20 An object identifier, such as &o 1, before a structure, binds the object identifier to the identity of that structure. The object identifier can then be used to refer to the structure.
An exercise a a &o 1 &o 2 b c &o 3 d “hello” a a &o 1 &o 2 b c &o 3 &o 4 d d “hello”
Course goals z. The course will familiarise the student with: · database models and query languages with respect to expressiveness and usability · theory and principles of object data bases · semistructured data and its applications · application areas for object and relational databases · interactive and embedded query languages · non-traditional data types and their management in databases
Lectures 1. Introduction to data models 2. Query languages for relational databases 3. Models and query languages for object databases 4. Models and query languages for semistructured data, XML 5. Embedded query languages
- Slides: 35