Database Management System Contents Relational Database Design RDBMS

Database Management System Contents Relational Database Design RDBMS Vs. DBMS Components of RDBMS Keys Integrity Rules Views Database Languages Codd’s Rules. Dependencies Normalization of Database

Database Management System Relational Database Management System ◦ Relational Database design, proposed by Dr. E. F. Codd in 1970 s, stands for database organized in the form of tables and generally knowns as Relational Database Management System (RDBMS). ◦ Relational Database Design provides guidelines to define tables, columns and establish relationships among tables. ◦ Tables can be connected to each via common attributes. ◦ In RDBMS simple database can be spread across several tables. ◦ RDBMS is concerned with three aspects of data: Relational Data Structure: A uniform structure i. e. table Relational Data Integrity rules: set of rules for data consistency Relational Data Manipulation: set of operations to transform relations into some other relations

Database Management System Relational Database Management System ◦ Components of RDBMS: Table Relation Row Record, Tuple Column Attribute, Field Domain Set of Legal Values Cardinality Number of Rows Degree Number of Columns Primary Key Unique Identifier Foreign Key Reference Identifier used to relate two relations

Database Management System Relational Database Management System ◦ General Example of RDBMS:

Database Management System Relational Database Management System ◦ A RDBMS generally performs following operations

Database Management System Relational Database Management System Components of DBMS: ◦ Relation: everything in RDBMS is stored in the form of a table called relation. Properties of a relation are: Each relation in the database has a unique name. A relation does not have a duplicate Tuple so Tuples should be uniquely identified by an attribute or composite attributes. Order of attributes and tuples is irrelevant. Values must be atomic. Relational schema is made up of table and attribute names. Number of rows are not bound as data can be insert or remove any time. Ex: Table EMPLOYEE can come in the database only once.

Database Management System Relational Database Management System Components of DBMS: ◦ Attribute: An attribute of a relation is the characteristic of the relation. Properties of the attribute are: Every attribute of the table must have a unique name. Order and change in order of attribute is irrelevant Data type for all the values of a single attribute must be same. Attribute that uniquely identifies the table is called Primary Key Constraints can be applied on the table attributes. Null and default values are permitted for the attributes. EMP_ID EMP_NAME EMP_DESIG EMP_SALARY

Database Management System Relational Database Management System Components of DBMS: ◦ Tuple: The row of data in a table is called tuple or record. Properties of a tuple are: no two tuples are identical in a relation. Order of tuple is irrelevant. All tuples of the relation have the same format and same number of entries. ◦ Data Item: A value in the single cell of a relation is data item. Properties of Data item are: Data must be atomic. The data type of the same attribute must be same. The data item for an attribute should be drawn from the same domain.

Database Management System Relational Database Management System Components of DBMS: EMP_ID EMP_NAME EMP_DESIG EMP_SALARY 101 Amit Khanna Manager 25000 102 Sunil Kumar Clerk 15000 103 Sunil Kumar Clerk 15000 Degree: the total number of columns or attributes in a relation. If there is one attribute then it is unary relation, if two the binary relation, if three then ternary relation and if n then n-ary relation. Cardinality: the total number of tuples in the relation. Domain: Possible value each attribute can contain. Domain ca be limited by applying set of constraints on them. Domain depend upon data type so data type must be chosen before deciding domain.

Database Management System RDBMS Key: A key is an attribute of the a relation that helps to identify each row uniquely. Roll_No S_Name Class Marks 101 Amit B. A. -1 750 102 Rohan B. A. -1 622 103 Amit B. A. -1 750 104 Aman B. A. -1 678 A key must have two important properties: Unique Identification: No duplicate value Irreducible: the attribute that is being used as key cannot be further decomposed Null values are not allowed in the key attribute.

Database Management System RDBMS Keys can be of following types: ◦ Candidate Key ◦ Super Key ◦ Primary Key ◦ Alternate Key ◦ Composite Key ◦ Foreign Key ◦ Artificial Key Roll_No S_Name Class Marks 1012 Amit M. B. A. -1 750 1702 Rohan B. C. A. -11 622 803 Amit B. B. A. -111 750 1104 Aman M. A. -11 678

Database Management System RDBMS Candidate Key: All those attributes that uniquely identify the row and cannot be decomposed anymore are Candidate key of the relation. Properties of Candidate Key are: ◦ A relation can have more than one candidate key ◦ A candidate key cannot have any null values ◦ It should be unique and irreducible. ◦ A relation must have at least one candidate key. ◦ In some rare cases all the attributes of the relations can be candidate key. E_ID Ename EPAN EDesign Salar

Database Management System RDBMS Super Key: Any attribute or set of attribute that uniquely identifies each row but may or may not hold irreducible property is super key of the relation. Properties are: ◦ There should be at least one super key. ◦ A candidate key is a subset of a super key that is irreducible. ◦ A super key can be a combination of all or any number of attributes that can uniquely identify each row of the table.

Database Management System RDBMS Primary Key: Attribute or group of attributes that is chosen from the candidate key to uniquely identify the relation is called Primary key of the relation. This key is used in each operation on the database to identify, locate or refer to a one particular tuple in the relation. The selection of the primary key among the candidate key depends upon following conditions: ◦ Minimal: it should have minimum number of attribute in it through which each row can be identified uniquely. ◦ Accessible: it must be accessible to each and every user. ◦ Not Null Value: the attributes in Primary Key cannot have null values. ◦ Time Invariant: The values of the primary key must not be changeable in different scenario. ◦ Unique: the attribute in Primary key must not have duplicate values.

Database Management System RDBMS Primary Key: Properties of Primary Key are: ◦ A relation can contain only one Primary Key. ◦ A Primary key can be a single attribute or a composite key containing multiple attributes. ◦ It is the minimum super key. ◦ The data values in primary key cannot be null. ◦ The attribute which creates primary key are primary attributes of the relation. ◦ Primary key is always chosen from the possible candidate keys. ◦ If the primary key contain more than one attribute then it must be irreducible. ◦ Primary key cannot contain null values as they are not unique.

Database Management System RDBMS Composite Key: If a primary key consists more than one attributes then it is a known as composite key. Properties are: ◦ A composite key cannot have null values. ◦ It must ne part of candidate key. ◦ It must be irreducible. E_ID D_ID Ename Dname 102 10 Arjun Sales 102 20 Arjun Marketing 103 30 Rajan IT 103 40 Rajan Planning 104 50 Harman HR

Database Management System RDBMS Alternate Key: among all the candidate keys one is chosen as Primary key rest all are the alternate keys of the table. Properties are: ◦ The total number of alternate keys are total number of Candidate keys minus Primary Key. ◦ It may exist or may not exist i. e. if there is one candidate key then relation does not have any alternate key. Alternate key must hold all the properties of primary key. Foreign Key: two relations are related to each other via some common attribute. This common attribute is Primary Key of first table and Foreign Key in second table. Properties are: ◦ For each Foreign key there must be a corresponding primary key. ◦ Foreign keys are usually defined after populating the relation ◦ There can be only one foreign key but it is not necessary to be a part of the relationship. ◦ It must match the values of the primary key or may contain null value.

Database Management System RDBMS Foreign Key: ◦ Both primary key and foreign key may exist in the same relation. ◦ Usually represented with italic characters. ◦ Foreign key relation is called child relation of the parent relation that contains the corresponding primary key E_ID ENAME Salary D_NO DNAME E 0001 Rahul 15000 D 1 Accounts E 0002 Pankaj 18000 D 3 D 2 Marketing E 003 Aman 19755 D 2 D 3 HR E 004 Rajan 45777 NULL

Database Management System RDBMS Artificial Key: ◦ If no attribute in the relation can be used as the primary key then an artificial attribute can be generated and used as primary key. The said attribute will be called artificial key. ◦ The data value of artificial key are usually a sequential order. ◦ Artificial key can also be used when Primary key very large and complex. Fruit name Price per Kg S. NO. Fruit name Price per Kg Mango 150 Apple 200 1 Mango 150 Strawberry 300 2 Apple 200 Orange 150 3 Strawberr y 300 4 Orange 150

Database Management System RDBMS Database Languages: Database languages are the software instruction package that enables the users to access, modify and maintain the database. Database languages provides instructions to create, update, insert, delete and retrieve the database. Every Database Language package includes three types of languages: ◦ Data Description/Definition Language (DDL) ◦ Data Manipulation Language (DML) ◦ Data Control Language (DCL) Database Languages DDL DML DCL

Database Management System RDBMS Database Languages: ◦ Data Description/Definition Language (DDL): DDL commands are used to create, modify and remove the structure of the database objects. All the integrity constraints are specified using these commands. The output of these commands is placed in the data dictionary. Commands are CREATE, ALTER, DROP, RENAME, TRUNCATE, COMMENT etc. ◦ Data Manipulation Language: DML commands enable the user to manipulate data in the database i. e. insert, delete, modify and retrieve the data in the database. DML commands are INSERT, UPDATE, SELECT, DELETE. ◦ Data Control Languages: DCL is used to control access to the data stored in a database. These commands grant and revoke the user privileges. DCL commands are GRANT, REVOKE

Database Management System RDBMS Integrity Rules: ◦ Integrity rules refers to correctness and accuracy of data in the database. In a shared multi-user database data integrity is the most important factor. ◦ Two general integrity rules are: Entity Integrity Rule: it concerns the primary key in the relation. Referential Integrity Rule: this rule states that if a foreign key exists in a relation then its value must either match the corresponding primary key or null. Views: A view is a virtual relation derived from an actual relation. Views enable users to analyze the data from different aspects.

Database Management System RDBMS Views: Roll_No Sname DOB Class Marks Address 101 Raman 01 -4 -1987 B. A-1 1870 Ldh 102 Pankaj 05 -121988 B. B. A. -2 1250 Delhi 103 Rohan 18 -091988 M. B. A. 1 1350 Chd 104 Rajan 20 -121990 M. C. A. -1 1450 Mumbai 14 -081989 B. C. A. -3 Marks 1687 Mohali 105 Roll_No Amit Roll_No Name Address Class Marks

Database Management System RDBMS Codd’s Rules: A set of rules to determine whether the database is a relational database or not. ◦ Information Rule: Information in DBMS must be in the form of tables. ◦ Guaranteed Access: Data must be guaranteed to access via table name, primary key, column name. ◦ Special Treatment of Null Values: Null Values means Missing information or inapplicable information not 0. ◦ Database Description Rule: at the logical level database description is same as the ordinary level. So users must be able to manipulate it same as the ordinary level. ◦ Comprehensive Data Sublanguage: RDBMS can support many languages but at least one of them must support: Definition of tables and view, Query and updating of the data, set integrity constraints, set authorization and define transactions. ◦ View Updating: any change in the data view must reflect back in the database.

Database Management System RDBMS ◦ High Level Insert, Update and Delete: RDBMS must be capable of insertion, updating and deletion of the data as well as retrieval of the data from the database. ◦ Physical Data Independence: the execution of the user level requests do not effect the physical data access and storage methods. ◦ Logical Data Independence: Logical changes i. e. adding/deleting fields or fields length does not effect user level requests. ◦ Integrity Independence: integrity constraints are part of data dictionary and can be changed without effecting user requests. ◦ Distribution Independence: User need not to know that whether database is distributed or not. Application program will work even if the programs and data are moved on different sites. ◦ Non-Subversion Rule: if system supports low-level access to the data then it must not bypass the integrity rules of the database.

Database Management System RDBMS Dependencies: A dependency occurs in a database when information stored in the same database table uniquely determines other information stored in the same table. You can also describe this as a relationship where knowing the value of one attribute (or a set of attributes) is enough to tell you the value of another attribute (or set of attributes) in the same table. Functional Dependency: a dependency when one attribute in one relation is completely dependent on another attribute in the relation. Fully Functional Dependency: if the primary key is composite key then the other attributes must be determined by all the attributes in primary key. Multi-valued Dependency: Two attributes (or columns) in a table are independent of one another, but both depend on a third attribute

Database Management System RDBMS Normalization: ◦Normalization is the process of splitting relations into well structured relations that allow users to insert, delete, and update tuples without introducing database inconsistencies. ◦Without normalization many problems can occur when trying to load an integrated conceptual model into the DBMS. These problems arise from relations that are generated directly from user views are called anomalies. ◦There are 5 NF: - First Normal Form (1 NF), Second Normal Form (2 NF), Third Normal Form (3 NF), Boyce-Codd Normal Form BCNF, Forth Normal Form (4 NF), Fifth Normal Form (5 NF). ◦Usually a relation is normalized up to 3 NF. ◦Each normalized relation is called to be in normal form NF.

Database Management System RDBMS Normalization: ◦Update Anomalies: Anomalies are the problems that can occur in a poorly planned, un-organized database where the whole data is stored in one table. Roll. NO S_Name DOB Class Address There are three types of anomalies: update, deletion and insertion anomalies. Insertion Anomaly: When a User can not insert new data due to the lack of the data is called insertion anomaly. Update Anomaly: An Update anomaly is the inconsistency of the data due to update command. Deletion Anomaly: A deletion anomaly is the unintended loss of data due to deletion of other data.

Database Management System RDBMS Normalization: • First Normal Form (1 NF): Any table that has atomic value for each attribute and every row is unique i. e. Primary key is defined, is called in its 1 NF. St_id 2407 2480 S_Name Aman Ronit • This table is not a normalized relation. Record Subject Marks Compute r 55 Maths 66 English 78 Compute r 45 Maths 78 English 89

Database Management System RDBMS Normalization: • To turn it in to 1 NF we can decompose it further. St_Id S_Name 2407 Aman 2480 Ronit St_Id Subject Marks 2407 Computer 55 2407 Maths 66 2407 English 78 2480 Computer 45 2480 Maths 78 2480 English 89

Database Management System RDBMS Normalization: • Second Normal Form (2 NF): A relation is said to be in 2 NF if it is already in 1 NF and each every non-key attribute fully depends on the primary key of the relation. • If any attributes is not dependent on primary attribute then it is not in 2 NF. Roll_NO S_Name T_Id T_Name Class 102 Amit 21 Raichand BBA 103 Rohan 24 Singla BCA 104 Rajan 24 Singla BA Roll_NO S_name Class T_Id T_Name 102 Amit BBA 21 Raichand 103 Rohan BCA 24 Singla 104 Rajan BA

Database Management System RDBMS Normalization: ◦ Third Normal Form (3 NF): A relation is said to be in 3 NF if it is in 2 NF and There is no transitive functional dependency. ◦ By transitive functional dependency, we mean we have the following relationships in the table: A is functionally dependent on B, and B is functionally dependent on C. In this case, C is transitively dependent on A via B.

Database Management System RDBMS Normalization: ◦ Boyce-Codd Normal Form(BCNF): BCNF is a stronger version of 3 NF. BCNF is required if: There are multiple composite candidate keys. Multiple candidate keys are overlapped i. e. candidate keys share a common attribute. Author Nationality Book_Title Genre No of pages Shakespeare English Hamlet Tragedy 100 Markus Winand Austrian SQL Textbook 200 M. Mano American CSM Textbook 300 Book_Title Genre No of Pages Author Book_Title Author Nationality

Database Management System RDBMS Normalization: ◦ Forth Normal Form(4 NF): A relation is in 4 NF it is in 3 NF and BCNF. It contains no multi-valued dependencies (two attributes (or columns) in a table are independent of one another, but both depend on a third attribute). Name Department Salary Name Salary ◦ Fifth Normal Form (5 NF): A relation is in 5 NF if it is in 4 NF and cannot have further lossless decomposition.

Chapter End
- Slides: 35