BIS 4435 Data Management for Decision Support Lecture

BIS 4435 – Data Management for Decision Support Lecture 5 Lecture 3: Normalization and Database Query Dr. Nawaz Khan School of Computing Science E-mail: n. x. khan@mdx. ac. uk Dr. Nawaz Khan, School of Computing Science E-mail: n. x. khan@mdx. ac. uk 1

Unit Reading Assignment Unit 5: Entity Relationship Modelling: A Traditional vs. Object Oriented Approach Lecture 5 Reading Suggestion: Connolly, T. M. , and Begg, C. E. , Database Systems: A Practical Approach to Design, Implementation and Management, Addison-Wesley, 4 th Edition, ISBN: 0321210255(chapters 6) Global campus materials on OASIS: http: //oasis. mdx. ac. uk/ (unit 5) More Reading: Fundamentals of Database Systems. R. Elmasri and S. B. Navathe, 4 th Edition, 2004, Addison-Wesley, ISBN 0 -321 -12226 -7 (Chapters 10) Dr. Nawaz Khan, School of Computing Science E-mail: n. x. khan@mdx. ac. uk 2

Unit Learning Outcome Lecture 5 At the end of this unit you should be able to: Ø Describe the process, strengths and weaknesses of Data Normalisation, and so demonstrate an understanding of when and to what extent the technique should be applied in practice Ø Explain and apply the concepts of Functional Dependency and Determinant through the understanding and construction of Determinacy Diagrams Ø Describe and apply understanding of three Normal Forms for relations: § Convert Un-normalised data into First Normal Form relations, so that data items contain only single, simple values § derive Second Normal Form relations by eliminating part-key dependencies § derive Third Normal Form relations by removing transitive dependencies Ø Convert a set of relations to Boyce-Codd Normal form Dr. Nawaz Khan, School of Computing Science E-mail: n. x. khan@mdx. ac. uk 3

Functional Dependencies and Normalization for Relational Databases Ø Outline § Introduction § Functional dependency Lecture 5 ü Direct dependency ü Indirect dependency ü Partial dependency § Normalization ü 1 NF and dependency problems ü 2 NF – solves partial dependency ü 3 NF – solves indirect dependency ü BCNF – well-normalized relations § Notes and suggestions § Summary Dr. Nawaz Khan, School of Computing Science E-mail: n. x. khan@mdx. ac. uk 4

Functional Dependencies and Normalization for Relational Databases Introduction Ø Each relation schema consists of a number of attributes and the relational database schema consists of a number of relation schemas Lecture 5 Ø Attributes are grouped to form a relation schema Ø Need some formal measure of why one grouping of attributes into a relation schema may be better than another Dr. Nawaz Khan, School of Computing Science E-mail: n. x. khan@mdx. ac. uk 5

Functional Dependencies and Normalization for Relational Databases Introduction Ø Top-down database design: Requirements Lecture 5 Mini-world E 1 ? R Conceptual schema Relation schemas Dr. Nawaz Khan, School of Computing Science E-mail: n. x. khan@mdx. ac. uk E 2 6

Functional Dependencies and Normalization for Relational Databases Introduction Ø “Goodness” measures: Lecture 5 § § Redundant information in tuples Update anomalies: modification, deletion, insertion Reducing the NULL values in tuples Disallowing the possibility of generating spurious tuples Dr. Nawaz Khan, School of Computing Science E-mail: n. x. khan@mdx. ac. uk 7

Functional Dependencies and Normalization for Relational Databases Introduction Ø Redundant information in tuples § In EMP_DEPT, the attribute values pertaining to a particular department (DNUMBER, DNAME, DMGRSSN) are repeated for every employee who works for that department Lecture 5 Ø Update anomalies: modification, deletion, insertion § Modification ü As the manager of a dept. changes we have to update many values according to employees working for that dept. ü Easy to make the DB inconsistent Dr. Nawaz Khan, School of Computing Science E-mail: n. x. khan@mdx. ac. uk 8

Functional Dependencies and Normalization for Relational Databases Introduction Ø Update anomalies: modification, deletion, insertion (cont. ) § Deletion: ü If Borg James E. leaves, we delete his tuple and lose: Lecture 5 – the existing of dept. 1 – the name of dept. 1 – who is the manager of dept. 1 § Insertion: ü How can we create a department before any employees are assigned to it ? ? Dr. Nawaz Khan, School of Computing Science E-mail: n. x. khan@mdx. ac. uk 9

Functional Dependencies and Normalization for Relational Databases Introduction Ø Reducing the NULL values in tuples § Employees not assigned to any dept. : waste the storage space § Other difficulties: aggregation operations (e. g. , COUNT) and joins Ø Disallowing the possibility of generating spurious tuples Lecture 5 EMP_PROJ(SSN, PNUMBER, HOURS, ENAME, PLOCATION) EMP_LOCS(ENAME, PLOCATION) EMP_PROJ 1(SSN, PNUMBER, HOURS, PNAME, PLOCATION) Ø Generation of invalid and spurious data during JOINS: PLOCATION is the attribute that relates EMP_LOCS and EMP_PROJ 1, and PLOCATION is neither a primary key nor a foreign key in either EMP_LOCS or EMP_PROJ 1 (cf. chapter 10 [1] for more details) Dr. Nawaz Khan, School of Computing Science E-mail: n. x. khan@mdx. ac. uk 10

Functional Dependencies and Normalization for Relational Databases Introduction Ø “Goodness” measures: § § Redundant information in tuples Update anomalies: modification, deletion, insertion Reducing the NULL values in tuples Disallowing the possibility of generating spurious tuples Lecture 5 Normalization Ø It helps DB designers determine the best relation schemas § § A formal framework for analyzing relation schemas based on their keys and on the functional dependencies among their attributes A series of normal form tests that can be carried out on individual relation schemas so that the relational database can be normalized to any desired degree Ø It is based on the concept of normal form 1 NF, 2 NF, 3 NF, BCNF, 4 NF, 5 NF Ø It is a process which ensures that the data is structured in such a way that attributes are grouped with the PK. Attributes that do not directly depend on PK may be extracted to form a new relation Dr. Nawaz Khan, School of Computing Science E-mail: n. x. khan@mdx. ac. uk 11

Functional Dependencies and Normalization for Relational Databases Functional dependency E 2 ? Lecture 5 Ø Dependencies between different ATTRIBUTES determinant Ø Direct dependency E 1 R Ø Indirect dependency Conceptual schema Ø Partial dependency Relation schemas Dr. Nawaz Khan, School of Computing Science E-mail: n. x. khan@mdx. ac. uk 12

Functional Dependencies and Normalization for Relational Databases Functional dependency Ø Determinant: Lecture 5 § X is a determinant of Y if each value of X has precisely one (possibly NULL) associated value of Y § Another way: X is a determinant of Y if and only if whenever two tuples agree on their X value they agree on their Y value § X determines Y or Y is functionally dependent on X § Formal definition: ü A functional dependency, denoted by X -> Y, between two sets of attributes X and Y that are subsets of R specifies a constraint on the possible tuples that can form a relation state r of R. The constraint is that, for any two tuples t 1 and t 2 in r that have t 1[X] = t 2[X], we must also have t 1[Y] = t 2[Y] § Functional determinancy diagram shows the dependencies in a visual format X Dr. Nawaz Khan, School of Computing Science E-mail: n. x. khan@mdx. ac. uk Y 13

Functional Dependencies and Normalization for Relational Databases Functional dependency Ø An example: Performer-id Performer-name Lecture 5 Performer-type Performerlocation Ø Direct dependency (fully functional dependency) § All attributes in a R must be fully functionally dependent on the primary key (or the PK is a determinant of all attributes in R) Dr. Nawaz Khan, School of Computing Science E-mail: n. x. khan@mdx. ac. uk 14

Functional Dependencies and Normalization for Relational Databases Functional dependency Ø Indirect dependency (transitive dependency) § Value of an attribute is not determined directly by the primary key Lecture 5 Performerid Performername Performertype Fee Performerlocation Dr. Nawaz Khan, School of Computing Science E-mail: n. x. khan@mdx. ac. uk 15

Functional Dependencies and Normalization for Relational Databases Functional dependency Ø Partial dependency § Composite determinant - more than one value is required to determine the value of another attribute, the combination of values is called a composite determinant Lecture 5 EMP_PROJ(SSN, PNUMBER, HOURS, ENAME, PLOCATION) {SSN, PNUMBER} -> HOURS § Partial dependency - if the value of an attribute does not depend on an entire composite determinant, but only part of it, the relationship is known as the partial dependency SSN -> ENAME PNUMBER -> {PNAME, PLOCATION} Dr. Nawaz Khan, School of Computing Science E-mail: n. x. khan@mdx. ac. uk 16

Functional Dependencies and Normalization for Relational Databases Functional dependency Ø Partial dependency § Another example using functional determinancy diagram: Performer-id Performer-name Lecture 5 Performer-type Performer-location Fee Agent-id Agent-name Agent-location Dr. Nawaz Khan, School of Computing Science E-mail: n. x. khan@mdx. ac. uk 17

Functional Dependencies and Normalization for Relational Databases Ø Outline § Introduction § Functional dependency Lecture 5 ü Direct dependency ü Indirect dependency ü Partial dependency § Normalization ü 1 NF and dependency problems ü 2 NF – solves partial dependency ü 3 NF – solves indirect dependency ü BCNF – well-normalized relations § Notes and suggestions § Summary Dr. Nawaz Khan, School of Computing Science E-mail: n. x. khan@mdx. ac. uk BIS 4229 – Industrial Data Management Technologies 18

Functional Dependencies and Normalization for Relational Databases Normalization – 1 NF Lecture 5 Ø First normal form (1 NF) - there is only one value at the intersection of each row and column of a relation - no set valued attributes in 1 NF Ø To be part of the formal definition of a relation in the basic (flat) relational model Ø Problem with 1 NF - no primary key for the relation Dr. Nawaz Khan, School of Computing Science E-mail: n. x. khan@mdx. ac. uk 19

Functional Dependencies and Normalization for Relational Databases Normalization – 1 NF Performer-name Performer-id Performer-location Performer-type Lecture 5 Agent-id Fee Agent-name Agent-location Venue-name 1 NF determinancy diagram Venue-id Venue-location Event-id Event-name Event-type Booking-date Dr. Nawaz Khan, School of Computing Science E-mail: n. x. khan@mdx. ac. uk 20

Functional Dependencies and Normalization for Relational Databases Normalization – 2 NF Lecture 5 Ø Second normal form (2 NF) - all attributes must be fully functionally dependent on the primary key Ø 2 NF solves partial dependency problem in 1 NF Ø Method: identify primary keys and group attributes that relate to the key together to form separate new relations Ø Benefit of 2 NF Ø Problem with 2 NF Dr. Nawaz Khan, School of Computing Science E-mail: n. x. khan@mdx. ac. uk 21

Functional Dependencies and Normalization for Relational Databases Normalization – 2 NF Performer-id Performername Performertype Lecture 5 Performerlocation 2 NF determinancy diagram Dr. Nawaz Khan, School of Computing Science E-mail: n. x. khan@mdx. ac. uk Fee Relation in 2 NF P-id P-name P- type Fee P-loc’n 101 105 108 Baron Steed Jones Singer Dancer Actor 75 60 85 York Berlin Bombay 112 118 126 129 134 138 140 141 143 Eagles Markov Stokes Chong Brass Ng Strong Gomez Tan Actor Dancer Comedian Actor Singer Magician Musician Singer 85 60 90 85 75 75 72 92 75 Leeds Moscow Athens Beijing London Penang Rome Lisbon Chicago 147 Qureshi Actor 85 London 149 Tan Actor 85 Taipei 150 Pointer Magician 72 Paris 152 Peel Dancer 60 London 22

Functional Dependencies and Normalization for Relational Databases Normalization – 2 NF determinancy diagram Agent-id Agent-name Agent-location Lecture 5 Venue-id Venue-name Venue-location Dr. Nawaz Khan, School of Computing Science E-mail: n. x. khan@mdx. ac. uk A-id A-name A-loc’n 1295 1435 Burton Nunn Lonton Boston 1504 Lee Taipei 1682 Tsang Beijing 1460 1522 1509 1478 1377 1478 1190 1802 1076 Stritch Ellis Patel Burns Webb Burns Patel Chapel Eccles Rome Madrid York Leeds Sydney Leeds Hue Bristol Oxford 1409 Arkley York 1428 Vemon Cairo V-id V-name V-loc’n 59 35 54 17 Atlas Polis Nation Silbury Tokyo Athens Lisbon Tunis 46 Royale Cairo 75 79 28 84 82 92 62 Vostok Festive Gratton State Tower Palace Shaw Kiev Rome Boston Kiev Lima Milan Oxford Relation in 2 NF 23

Functional Dependencies and Normalization for Relational Databases Normalization – 2 NF Relation in 2 NF determinancy diagram Lecture 5 Event-type Event-id Event-name Dr. Nawaz Khan, School of Computing Science E-mail: n. x. khan@mdx. ac. uk E-id E-name E-type 959 907 921 942 901 913 926 952 934 945 938 981 957 963 941 964 927 971 988 978 Show Time Elgar 1 Silver Shoe White Lace The Dark What Now Next Year Gold Days Angels Trick-Treat New Dawn Birdsong Quicktime Vanish Mahler 1 The Friends Chanson Card Trick Secret Tape Swift Step Musical Concert Ballet Drama Opera Variety show Drama Musical Magic show Concert Drama Opera Magic show Drama Dance 24

Functional Dependencies and Normalization for Relational Databases Normalization – 2 NF determinancy diagram Lecture 5 Performerid Agent-id Booking-date Venue-id Event-id Dr. Nawaz Khan, School of Computing Science E-mail: n. x. khan@mdx. ac. uk Relation in 2 NF P-id A-id V-id E-id Booking-date 101 105 108 112 112 126 129 134 138 140 141 143 1295 1435 1504 1682 1460 1522 1504 1509 1478 1504 59 35 54 79 17 46 75 59 79 28 84 17 84 79 959 921 942 901 926 952 945 926 981 957 963 941 927 25 -Nov-99 07 -Jan-02 10 -Feb-02 29 -Jul-03 13 -Aug-00 05 -May-99 16 -Mar-99 02 -Sept-01 22 -Jun-00 18 -Sept-01 18 -Aug-99 21 -Jul-00 21 -Nov-02 147 1076 17 952 30 -Apr-00 147 1409 79 988 17 -Apr-00 152 1428 59 978 01 -Oct-01 25

Functional Dependencies and Normalization for Relational Databases Normalization – 2 NF Performer-id Performername Performertype Lecture 5 Performerlocation 2 NF determinancy diagram ØProblem with 2 NF: - Insertion - Modification - Deletion Dr. Nawaz Khan, School of Computing Science E-mail: n. x. khan@mdx. ac. uk Fee Relation in 2 NF P-id P-name P- type Fee P-loc’n 101 105 108 Baron Steed Jones Singer Dancer Actor 75 60 85 York Berlin Bombay 112 118 126 129 134 138 140 141 143 Eagles Markov Stokes Chong Brass Ng Strong Gomez Tan Actor Dancer Comedian Actor Singer Magician Musician Singer 85 60 90 85 75 75 72 92 75 Leeds Moscow Athens Beijing London Penang Rome Lisbon Chicago 147 Qureshi Actor 85 London 149 Tan Actor 85 Taipei 150 Pointer Magician 72 Paris 152 Peel Dancer 60 London 26

Functional Dependencies and Normalization for Relational Databases Normalization – 3 NF Lecture 5 Ø Third normal form (3 NF) - all attributes must be directly dependent on primary key Ø 3 NF solves indirect (transitive) dependencies problem in 1 NF and 2 NF Ø Method: identify all transitive dependencies and each transitive dependency will form a new relation, with non-prime attributes participating in the transitive dependency and the attribute which determines others as the attributes for the new relation Dr. Nawaz Khan, School of Computing Science E-mail: n. x. khan@mdx. ac. uk 27

Functional Dependencies and Normalization for Relational Databases Normalization – 3 NF determinancy diagram Performer-id Performername Lecture 5 Performertype Performerlocation Relation in 3 NF P-id P-name P- type P-loc’n 101 105 108 112 118 126 129 134 138 140 141 143 147 149 150 152 Baron Steed Jones Eagles Markov Stokes Chong Brass Ng Strong Gomez Tan Qureshi Tan Pointer Peel York Berlin Bombay Leeds Moscow Athens Beijing London Penang Rome Lisbon Chicago London Taipei Paris London Singer Dancer Actor Dancer Comedian Actor Singer Magician Musician Singer Actor Magician Dancer P- type Fee Performertype Fee Dr. Nawaz Khan, School of Computing Science E-mail: n. x. khan@mdx. ac. uk Singer Dancer Actor Comedian Magician Musician 75 60 85 90 72 92 28

Functional Dependencies and Normalization for Relational Databases Normalization 3 NF determinancy diagram Agent-id Agent-name Event-type Event-id Lecture 5 Agent-location Agent- Event-name location Venue-id Venue-name Performerid Agent-id Venue-location Booking-date Venue-id Event-id Dr. Nawaz Khan, School of Computing Science E-mail: n. x. khan@mdx. ac. uk 29

Functional Dependencies and Normalization for Relational Databases Normalization - BCNF Lecture 5 Ø Boyce-Codd normal form (BCNF) was proposed as a simpler form of 3 NF, but it was found to be stricter than 3 NF, because every relation in BCNF is also in 3 NF; however, a relation in 3 NF is not necessarily in BCNF Ø BCNF: well-normalized relations Ø The formal definition of 3 NF: A relation schema R is in third normal form (3 NF) if, whenever a nontrivial functional dependency X -> A holds in R, either (a) X is a superkey of R, or (b) A is a prime attribute of R Ø The formal definition of BCNF differs slightly from the definition of 3 NF: A relation schema R is in BCNF if whenever a nontrivial functional dependency X -> A holds in R, then X is a superkey of R Ø More details: chapter 10 [1] Dr. Nawaz Khan, School of Computing Science E-mail: n. x. khan@mdx. ac. uk 30

Activity Lecture 5 Dr. Nawaz Khan, School of Computing Science E-mail: n. x. khan@mdx. ac. uk BIS 4229 – Industrial Data Management Technologies 31

Solution Ø Proj_num proj_name Ø Emp_num emp_name job_class Lecture 5 Ø Emp_num proj_num assign_hours Ø Job_class chg_hours Dr. Nawaz Khan, School of Computing Science E-mail: n. x. khan@mdx. ac. uk 32

Lecture 4: Introduction to SQL and OQL Lecture 5 Dr. Nawaz Khan School of Computing Science E-mail: n. x. khan@mdx. ac. uk Dr. Nawaz Khan, School of Computing Science E-mail: n. x. khan@mdx. ac. uk 33

Unit Reading Assignment Unit 4: Introduction to SQL and OQL: Relational and Object oriented Approach Lecture 5 Ø Reading Suggestion: § Global campus materials on OASIS: http: //oasis. mdx. ac. uk/ (unit 4) § Connolly, T. M. , and Begg, C. E. , Database Systems: A Practical Approach to Design, Implementation and Management, Addison-Wesley, 4 th Edition, ISBN: 0321210255(chapters 5 and 6) Dr. Nawaz Khan, School of Computing Science E-mail: n. x. khan@mdx. ac. uk 34

Unit Learning Outcomes Lecture 5 Ø Apply SQL statements in Oracle to perform database query transactions and various calculations Ø Apply SQL commands in Oracle (CREATE) to define table structure in a relational database and to manipulate rows in a table Ø Apply SQL statements in Oracle (GRANT and REVOKE) to control data access Ø Apply Object Definition Language (ODL) to create data objects for data abstraction and nested tables Ø Apply OQL statements in Oracle to execute queries on data objects Dr. Nawaz Khan, School of Computing Science E-mail: n. x. khan@mdx. ac. uk 35

Introduction to SQL and OQL Ø Outline § Introduction to SQL § Data Definition Language (DDL) Lecture 5 ü Create table ü Alter table ü Drop table § Data Manipulation Language (DML) ü Retrieval – SELECT-FROM-WHERE (the S-F-W structure) – Aggregate functions and grouping ü Update operations: insert, delete, update § More complex features of SQL queries/commands § Object Query in Oracle Dr. Nawaz Khan, School of Computing Science E-mail: n. x. khan@mdx. ac. uk 36

Structured Query Language (SQL) Introduction to SQL Ø Two formal languages for the relational model: § The relational algebra Lecture 5 ü Provides a formal foundation for relational model operations ü Is used as a basis for implementing and optimizing queries in RDBMSs (relational database management systems) ü Some of its concepts are incorporated into the SQL standard query language for RDBMSs ü The user must specify HOW- that is, in what order-to execute the query operations § The relational calculus: ü Provides a higher-level declarative notation for specifying relational queries ü In relational calculus, we write one declarative expression to specify a retrieval request, and hence there is no description of how to evaluate a query ü A calculus expression specifies WHAT is to be retrieved rather than HOW to retrieve it -> nonprocedural language Dr. Nawaz Khan, School of Computing Science E-mail: n. x. khan@mdx. ac. uk 37

Structured Query Language (SQL) Introduction to SQL Lecture 5 Ø SQL: although it includes some features from relational algebra, it is based to a greater extent on the (tuple) relational calculus. However the SQL syntax is more userfriendly than either of the two formal languages Ø SQL: Structured Query Language § § The standard language for commercial relational DBMSs ANSI 1986: SQL 1, or SQL-86 SQL 2, or SQL-92 (revised version of SQL 1) SQL 3, or SQL-99: core specification + optional specialized packages (for specific database applications such as data mining, spatial data, temporal data, data warehousing, OLAP, multimedia data, etc. ) Dr. Nawaz Khan, School of Computing Science E-mail: n. x. khan@mdx. ac. uk 38

Structured Query Language (SQL) Introduction to SQL Lecture 5 Ø SQL is very good for querying and manipulating data in database systems. Ø SQL lacks some features commonly found in general programming languages, such as LOOP and IF-THENELSE statements Ø SQL consists of three major components: § § § DDL (data definition language) to define the way in which data is stored DML (data manipulation language) to allow retrieval, insertion of data etc. (this is sometimes called the 'query ' language) DCL (data control language) to define activities that do not fit with the other two activities Dr. Nawaz Khan, School of Computing Science E-mail: n. x. khan@mdx. ac. uk 39

Structured Query Language (SQL) Introduction to SQL Ø SQL consists of three major components (cont. ): § Lecture 5 § § The Data Manipulation Language (DML) of SQL allows the retrieval, insertion, update and removal of rows stored in relational tables. As mentioned above, numbers of rows can be altered in any one statement, and so DML is a very powerful tool The Data Definition Language (DDL) is used to create, change the structure or remove whole tables and other relational structures. So whereas you would use the INSERT statement of the DML to insert new rows into an existing table, you would use the DDL CREATE TABLE statement to establish a new table in the first place The Data Control Language (DCL) define activities that are not in the categories of those for the DDL and DML, such as granting privileges to users, and defining when proposed changes to a databases should be irrevocably made Dr. Nawaz Khan, School of Computing Science E-mail: n. x. khan@mdx. ac. uk 40

Structured Query Language (SQL) Introduction to SQL Ø Our concerns during this lecture: § § § The Data Manipulation Language (DML) The Data Definition Language (DDL) The Data Control Language (DCL) Lecture 5 Ø Objective: How SQL is capable of: § § Creating, changing the structure or removing whole tables and other relational structures Allowing the retrieval, insertion, update and removal of rows/tuples stored in relational tables Dr. Nawaz Khan, School of Computing Science E-mail: n. x. khan@mdx. ac. uk 41

Structured Query Language (SQL) Introduction to SQL Ø The following example is used throughout this lecture to illustrate SQL operations: § Suppose we have a university database containing the following relations: Lecture 5 STUDENT (SNO, SNAME, SLEVEL) COURSE (CNO, TITLE, LNAME) REPORT (SNO, CNO, MARK) LECTURER (LNAME, ROOM_NO) Ø Remember: § § A relation schema may be viewed as a table. An instance of a relation may be viewed as a set consisting of tuples Dr. Nawaz Khan, School of Computing Science E-mail: n. x. khan@mdx. ac. uk 42

Structured Query Language (SQL) Data Definition Language (DDL) Ø Create table Lecture 5 § In terms of SQL statements, a relation is implemented as a table. Subsequently, a tuple is a row in a table and an attribute is a column of a table § The CREATE TABLE command is used to set up a new table (i. e. to implement a relation) in the database. The table must be given a name (corresponding to the name of the relation) and its attributes must be named and given data types § Example: To create the table (i. e. relation) STUDENT with the attributes SNO of type INTEGER, SNAME of type VARCHAR, and SLEVEL of type INTEGER, the following SQL statement is used: CREATE TABLE STUDENT (SNO INTEGER, SNAME VARCHAR(20), SLEVEL INTEGER); § Note: The data types used in the above example are those allowed in Oracle SQL. Data types in different implementations of SQL may vary slightly in that they may use different names for a particular type Dr. Nawaz Khan, School of Computing Science E-mail: n. x. khan@mdx. ac. uk 43

Structured Query Language (SQL) Data Definition Language (DDL) Ø A number of important constraints can be defined on attributes during table creation: § NOT NULL: which means that the value of the attribute must not be NULL. This can be specified for any attribute as required § DEFAULT: which is used to specify a default value for the attribute. If there is no default value specified, the system default value is NULL Lecture 5 Ø Example: In the above example, if assume SNO is a key attribute whose value cannot be null, and a default value for SLEVEL is 0, we have the following SQL statement: CREATE TABLE STUDENT (SNO INTEGER NOT NULL, SNAME VARCHAR(20), SLEVEL INTEGER DEFAULT 0); Ø Note: If no default value is specified and NULL value is not allowed, then a value must be supplied by the user. Otherwise, an error occurs Ø Similarly, write SQL commands to create COURSE, REPORT, and LECTURER tables Dr. Nawaz Khan, School of Computing Science E-mail: n. x. khan@mdx. ac. uk 44

Structured Query Language (SQL) Data Definition Language (DDL) Ø Create table (Implementation of Primary Key and Foreign Key) Lecture 5 CREATE TABLE ARTIST (ARTIST_ID NUMBER(2), ARTIST_NAME TEXT(30), COUNTRY_OF_ORIGIN TEXT(25), DATE_OF_BIRTH DATE, PRIMARY KEY (ARTIST_ID)); CREATE TABLE MUSIC_COLLECTION (ITEM_ID NUMBER(4), TITLE VARCHAR(40), ARTIST_ID NUMBER(2), ITEM_TYPE TEXT(1), DATE_PURCHASED DATE, PRIMARY KEY (ITEM_ID), CONSTRAINT FK_ARTIST FOREIGN KEY (ARTIST_ID) REFERENCES ARTIST (ARTIST_ID)); Ø ON DELETE SET NULL and ON DELETE CASCADE to retain data integrity Dr. Nawaz Khan, School of Computing Science E-mail: n. x. khan@mdx. ac. uk 45

Structured Query Language (SQL) Data Definition Language (DDL) Ø Alter table Lecture 5 § The ALTER statement in SQL, like the CREATE statement, can be used to change a number of different types of data object, including tables, access privileges, constraints etc. Here we shall concentrate on its use to change the structure of tables. § You can use the ALTER TABLE statement to modify a table's definition. This statement changes the structure of a table, not its contents. You can use the ALTER TABLE statement to: ü Add a new column to an existing table: For example, to add a department head attribute to the DEPT table we could specify: ALTER TABLE DEPT ADD DEPT_HEAD NUMBER(4); ü Increase or decrease the width of an existing column: to change an EMP table so that the DEPTNO attribute can contain 3 digit values: ALTER TABLE EMP MODIFY DEPTNO NUMBER(3); ü Change an existing column from mandatory to optional (i. e. specify that it may contain nulls), etc. (homework !!) Dr. Nawaz Khan, School of Computing Science E-mail: n. x. khan@mdx. ac. uk 46

Structured Query Language (SQL) Data Definition Language (DDL) Ø Drop table § To remove a table, the DDL statement is: DROP TABLE <TABLE NAME> Lecture 5 § It is deceptively easy to issue this command, and unlike most systems one encounters today, there is no prompt at all about whether you wish to proceed with the process. Dropping a table involves the removal of all the data and constraints on the table, and finally removal of the table structure itself § Example to remove our copy of the EMP table: DROP TABLE EMP; § Using the DROP TABLE statement when creating tables Dr. Nawaz Khan, School of Computing Science E-mail: n. x. khan@mdx. ac. uk 47

Structured Query Language (SQL) Data Manipulation Language (DML) Lecture 5 Ø Basic SQL queries: The basic SQL query statement takes the so -called S-F-W form: SELECT <attribute list> FROM <table list> WHERE <condition>; where: <attribute list> is a list of attribute names whose values are to be retrieved by the query. <table list> is a list of the relation names from which the attributes to be retrieved. <condition> is a Boolean expression to specify the condition that the tuples to be retrieved have to satisfy Dr. Nawaz Khan, School of Computing Science E-mail: n. x. khan@mdx. ac. uk 48

Structured Query Language (SQL) Data Manipulation Language (DML) STUDENT (SNO, SNAME, SLEVEL) COURSE (CNO, TITLE, LNAME) REPORT (SNO, CNO, MARK) LECTURER (LNAME, ROOM_NO) Lecture 5 Ø Query 1: Retrieve the numbers and names of all students at level 1: SELECT SNO, SNAME FROM STUDENT WHERE SLEVEL = 1; Ø Note: The result of this query is a "relation" containing the numbers and names of the level 1 students Dr. Nawaz Khan, School of Computing Science E-mail: n. x. khan@mdx. ac. uk 49

Structured Query Language (SQL) Data Manipulation Language (DML) Lecture 5 Ø Qualifying attribute names: If two or more tables (i. e. relations) are involved in the query, the attribute names must be qualified by prefixing the relation names to the attribute names Ø Query 2: Retrieve the numbers and names of students who take course 101. SELECT STUDENT. SNO, STUDENT. SNAME FROM STUDENT, REPORT WHERE STUDENT. SNO = REPORT. SNO AND REPORT. CNO = 101; Ø Note: Query 2 performs a join operation between STUDENT and REPORT on SNO Dr. Nawaz Khan, School of Computing Science E-mail: n. x. khan@mdx. ac. uk 50

Structured Query Language (SQL) Data Manipulation Language (DML) Lecture 5 Ø Query 3: Retrieve the numbers and names of students who take at least one of the courses given by the lecturer named John SELECT STUDENT. SNO, STUDENT. SNAME FROM STUDENT, REPORT, COURSE WHERE STUDENT. SNO = REPORT. SNO AND REPORT. CNO = COURSE. CNO AND COURSE. LNAME = “John"; Ø Note that we have the following tables: STUDENT (SNO, SNAME, SLEVEL) COURSE (CNO, TITLE, LNAME) REPORT (SNO, CNO, MARK) LECTURER (LNAME, ROOM_NO) Dr. Nawaz Khan, School of Computing Science E-mail: n. x. khan@mdx. ac. uk 51

Structured Query Language (SQL) Data Manipulation Language (DML) Ø Retrieval § SELECT-FROM-WHERE § Aggregate functions and grouping Ø Update operations Lecture 5 Dr. Nawaz Khan, School of Computing Science E-mail: n. x. khan@mdx. ac. uk 52

Structured Query Language (SQL) Data Manipulation Language (DML) Ø Unspecified WHERE clause and the use of “*” § Lecture 5 Unconditional queries: If the WHERE clause is not specified, no condition is imposed on the selection of tuples. As a result, all tuples of the relation specified in the FROM clause are selected. If there are two or more relations in the FROM clause, then all possible tuple combinations of the relations are returned as the query results Ø Query 4: Retrieve all student numbers. SELECT SNO FROM STUDENT; Ø Query 5: Retrieve all student numbers and course numbers. SELECT STUDENT. SNO, COURSE. CNO FROM STUDENT, COURSE; Dr. Nawaz Khan, School of Computing Science E-mail: n. x. khan@mdx. ac. uk 53

Structured Query Language (SQL) Data Manipulation Language (DML) Ø The “ * ” notation: The asterisk (*) indicates that all the attributes of the relation(s) specified in the FROM clause are to be retrieved Lecture 5 STUDENT (SNO, SNAME, SLEVEL) LECTURER (LNAME, ROOM_NO) Ø Query 6: Retrieve all the details of lecturers SELECT * FROM LECTURER; Ø Query 7: Retrieve all the details of level 1 students SELECT * FROM STUDENT WHERE SLEVEL = 1; Dr. Nawaz Khan, School of Computing Science E-mail: n. x. khan@mdx. ac. uk 54

Structured Query Language (SQL) Data Manipulation Language (DML) Lecture 5 Ø The query result is a “Bag” of tuples: The query result in SQL is also a "relation". However, such a relation is not a set of tuples. Instead it is a bag of tuples. The reason is that duplicate tuples are allowed in the result of a query Ø Note: If the key of a relation is not retrieved by a query, then it is possible that the query results contain duplicate tuples SELECT SNAME FROM STUDENT WHERE SLEVEL = 1; Ø In SQL, the duplicates are not automatically removed from the resulting relation. However, the key word DISTINCT can be used to eliminate any duplicate tuples Ø Query 8: Retrieve all the different names of the students at level 1 SELECT DISTINCT SNAME FROM STUDENT WHERE SLEVEL = 1; Dr. Nawaz Khan, School of Computing Science E-mail: n. x. khan@mdx. ac. uk 55

Structured Query Language (SQL) Data Manipulation Language (DML) Lecture 5 Ø Ordering tuples in query results: The ORDER BY clause can be used to order tuples in the query result. The order can either be ascending by using the key word ASC, or descending by using the key word DESC. If no key word is specified, the default order is ascending Ø Query 11: Retrieve the numbers and names of all level 1 students ordered in an ascending alphabetical order according to their names. SELECT SNO, SNAME FROM STUDENT WHERE SLEVEL = 1 ORDER BY SNAME ASC; Ø Query 12: Retrieve the numbers and names of students, and the numbers of the courses that the students are taking, ordered by the course number, and related to each course by students' names. SELECT STUDENT. SNO, STUDENT. SNAME, COURSE. CNO FROM STUDENT, COURSE WHERE STUDENT. SNO = COURSE. SNO ORDER BY COURSE. CNO DESC, STUDENT. SNAME ASC; Dr. Nawaz Khan, School of Computing Science E-mail: n. x. khan@mdx. ac. uk 56

Structured Query Language (SQL) Data Manipulation Language (DML) Ø Retrieval § SELECT-FROM-WHERE § Aggregate functions and grouping Ø Update operations Lecture 5 Dr. Nawaz Khan, School of Computing Science E-mail: n. x. khan@mdx. ac. uk 57

Structured Query Language (SQL) Data Manipulation Language (DML) Ø Aggregate functions and grouping § Five built-in functions: Lecture 5 ü COUNT: counts the number of values in a specified column (i. e. an attribute) ü SUM: totals the values in a specified column with a numeric type ü AVG: calculates the average of the values in a specified column with a numeric type ü MIN: returns the minimum value in a specified column ü MAX: returns the maximum value in a specified column § Note: The above built-in functions can operate on the values of a single attribute of a permanent relation or the resulting relation of a query Dr. Nawaz Khan, School of Computing Science E-mail: n. x. khan@mdx. ac. uk 58

Structured Query Language (SQL) Data Manipulation Language (DML) Lecture 5 Ø Query 13: Find the highest, the lowest, and the average marks in course 101 SELECT MAX(MARK), MIN(MARK), AVG(MARK) FROM REPORT WHERE CNO = 101; Ø Query 14: Find out how many courses the lecturer John is teaching SELECT COUNT(*) FROM COURSE WHERE LNAME = “John"; Ø Note: COUNT(*) returns the number of rows in the resulting table (i. e. relation) of the query. “*” refers to all the tuples (i. e. rows) Ø Query 15: Find out how many different names are used by level 1 students SELECT COUNT(DISTINCT SNAME) FROM STUDENT WHERE SLEVEL = 1; Dr. Nawaz Khan, School of Computing Science E-mail: n. x. khan@mdx. ac. uk 59

Structured Query Language (SQL) Data Manipulation Language (DML) Ø Retrieval § SELECT-FROM-WHERE § Aggregate functions and grouping Lecture 5 Ø Update operations: in SQL, there are three commands to modify data in a database: § INSERT: to insert a new tuple into a table that has already been created § DELETE: to remove an existing tuple(s) from a table § UPDATE: to change the value of a specified data item in an existing tuple Dr. Nawaz Khan, School of Computing Science E-mail: n. x. khan@mdx. ac. uk 60

Structured Query Language (SQL) Data Manipulation Language (DML) Ø Insert: § Example: Insert into the relation STUDENT (previously created) a new tuple with following attribute values: Lecture 5 SNO = 2020, SNAME = "James Bond", SLEVEL = 0 § Two forms of the INSERT statement: (1) INSERT INTO STUDENT VALUES (2020, "James Bond", 0); (2) INSERT INTO STUDENT(SNO, SNAME) VALUES (2020, "James Bond"); § Note: ü In the first case, the values listed in the VALUES clause must be corresponding to the attributes specified in the table creation operation. The order of values is crucial: STUDENT (SNO, SNAME, SLEVEL) ü In the second case, the values in the VALUES clause are corresponding to the attributes specified in the INSERT INTO clause. Those attributes that are not included in the INSERT INTO clause will take either the default values (if there any) or the NULL value Dr. Nawaz Khan, School of Computing Science E-mail: n. x. khan@mdx. ac. uk 61

Structured Query Language (SQL) Data Manipulation Language (DML) Ø Delete: § Example: Delete all level 3 students. DELETE FROM STUDENT WHERE SLEVEL = 3; Lecture 5 § Note: ü The DELETE command operates on only one table at a time ü If the condition specified in the WHERE clause is FALSE, no tuple is deleted ü If the condition evaluates to TRUE, one or more tuples may be removed ü If no condition is specified (i. e. no WHERE clause), all tuples in the table are removed ü Even if all tuples of a table are deleted, the table still exists in the database. The DROP TABLE command may be used to remove a table completely from the database Dr. Nawaz Khan, School of Computing Science E-mail: n. x. khan@mdx. ac. uk 62

Structured Query Language (SQL) Data Manipulation Language (DML) Ø Update: § Example: Change the lecturer on course 251 from “John” (or whoever) to Anderson. Lecture 5 UPDATE COURSE SET LNAME = “Anderson” WHERE CNO = 251; Dr. Nawaz Khan, School of Computing Science E-mail: n. x. khan@mdx. ac. uk BIS 4229 – Industrial Data Management Technologies 63

Structured Query Language (SQL) Data Manipulation Language (DML) Ø Retrieval § SELECT-FROM-WHERE § Aggregate functions and grouping Lecture 5 Ø Update operations: in SQL, there are three commands to modify data in a database: § INSERT: to insert a new tuple into a table that has already been created § DELETE: to remove an existing tuple(s) from a table § UPDATE: to change the value of a specified data item in an existing tuple Dr. Nawaz Khan, School of Computing Science E-mail: n. x. khan@mdx. ac. uk 64

Structured Query Language (SQL) Ø Outline § Introduction to SQL § Data Definition Language (DDL) Lecture 5 ü Create table ü Alter table ü Drop table § Data Manipulation Language (DML) ü Retrieval – SELECT-FROM-WHERE (the S-F-W structure) – Aggregate functions and grouping ü Update operations: insert, delete, update § More complex features of SQL queries/commands § Summary Dr. Nawaz Khan, School of Computing Science E-mail: n. x. khan@mdx. ac. uk 65

Structured Query Language (SQL) More complex features of SQL queries/commands Ø Nested queries and set membership testing § Lecture 5 § Nested queries: A database query may consist of a number of subqueries when the result of a sub-query form the condition (or part of the condition) of another query. Such a complex query is called nested query. A special operator IN is introduced to examine whether a tuple is a member of the resulting relation of a sub-query Query 9: Retrieve the numbers and names of students who take course 101 (same as query 2) SELECT SNO, SNAME FROM STUDENT WHERE SNO IN (SELECT SNO FROM REPORT WHERE CNO = 101); Dr. Nawaz Khan, School of Computing Science E-mail: n. x. khan@mdx. ac. uk 66

Structured Query Language (SQL) More complex features of SQL queries/commands Ø Nested queries and set membership testing (cont. ) § Query 10: Retrieve the numbers and names of students who take at least one of the courses given by lecturer John (same as query 3) Lecture 5 SELECT SNO, SNAME FROM STUDENT WHERE SNO IN (SELECT SNO FROM REPORT WHERE CNO IN (SELECT CNO FROM COURSE WHERE LNAME = “John")); Ø Note: § § Generally a nested query using the IN operator can always be expressed as a simple (i. e. unnested) query. For example, query 9 and 2, and query 10 and 3 Theoretically a query can be nested to any depth Dr. Nawaz Khan, School of Computing Science E-mail: n. x. khan@mdx. ac. uk 67

Structured Query Language (SQL) More complex features of SQL queries/commands Ø Other clauses: Lecture 5 § § § Group By Having Union Minus Intersect etc. Group By: SELECT DEPTNO, COUNT(EMPNO) FROM EMP GROUP BY DEPTNO; This gives the following results: DEPTNO COUNT(EMPNO) -----------10 3 20 5 30 6 Dr. Nawaz Khan, School of Computing Science E-mail: n. x. khan@mdx. ac. uk Having Clause: SELECT DEPTNO, COUNT(EMPNO) FROM EMP GROUP BY DEPTNO HAVING COUNT(EMPNO) > 4; Giving the results: DEPTNO COUNT(EMPNO) -----------20 5 30 6 68

Object Query in Oracle Lecture 5 Ø Oracle provides the facilities to create objects. The classes are implemented in oracle using extended version of DDL. These classes in oracle are called object types. The object types are also known as Abstract Data Types (ADT). There are three different kinds of objects types that can be implemented by using DDL. These are: § Column type § Collection type Dr. Nawaz Khan, School of Computing Science E-mail: n. x. khan@mdx. ac. uk 69

Column Type: Example Lecture 5 Create or replace type Emp. Address as object ( Street Varchar 2(30), Town Varchar 2 (30), City Varchar 2(30), Post. Code Varchar 2(7)); Create or replace type Employee as object ( Employee. ID number(2), Name varchar 2 (30), DOB date, Address Emp. Address, Member Function Emp. Age Return Number ); Create or replace type Employee as object Create or replace type body Employee as ( Employee. ID number(2), Member Function Emp. Age Return Number Name varchar 2 (30), Is Begin DOB date, Return Trunc(((SYSDATE – Address Emp. Address); Self. DOB)/365), 1); End; Dr. Nawaz Khan, School of Computing Science E-mail: n. x. khan@mdx. ac. uk 70

Collection Type Oracle supports for storing of multiple values in a single row. This can be implemented in oracle by applying the following approaches: Lecture 5 § Varying arrays or VARRAY § Nested tables Dr. Nawaz Khan, School of Computing Science E-mail: n. x. khan@mdx. ac. uk 71

Collection type: VARRAY Lecture 5 Create or replace type Skill_VA as varray(5) of varchar 2(20); Creates a varray, named Skill_VA, which can store maximum 5 values. The newly created varray is of datatype varchar 2. The varray Skill_VA can also be used as a datatype in other tables. Create or replace type Employee as object ( Employee. ID number(2), Name varchar 2 (30), DOB date, Address Emp. Address, Skills Skill_VA ); Insert into Employee Values(12, ‘Amer Jahed’, ’ 12 -JAN-1972’, Emp. Address(’ 12 Big Lane’, ‘Croydon’, ‘London’, ‘CR 1 2 TD’), Skill_VA(‘Leadership’, ‘Communication’, ‘Computing’, ‘language’, ‘Management’)); Dr. Nawaz Khan, School of Computing Science E-mail: n. x. khan@mdx. ac. uk 72

Inserting values in VARRAY Lecture 5 Insert into Employee Values(12, ‘Amer Jahed’, ’ 12 -JAN-1972’, Emp. Address(’ 12 Big Lane’, ‘Croydon’, ‘London’, ‘CR 1 2 TD’), Skill_VA(‘Leadership’, ‘Communication’, ‘Computing’, ‘language’, ‘Management’)); Dr. Nawaz Khan, School of Computing Science E-mail: n. x. khan@mdx. ac. uk 73

Collection type: Nested Table Lecture 5 Ø Step 1: Create a type for dependants Create or replace type Emp. Dependants as object (Dependant. Name varchar 2(20), Dependant. DOB Date); Ø Step 2: Create a table of type Emp. Dependants datatype Create or replace type Emp. Dependants_NT as table of Emp. Dependants; Ø Step 3: Create a table Employee that uses dependant nested table Create table Employee ( Employee. ID number(2), Name varchar 2 (30), DOB date, Dependants Emp. Dependants_NT ) Nested table Dependants store as Emp. Dependants_NT_TAB; Dr. Nawaz Khan, School of Computing Science E-mail: n. x. khan@mdx. ac. uk 74
- Slides: 74