Department of Computer and Information Science School of

  • Slides: 9
Download presentation
Department of Computer and Information Science, School of Science, IUPUI Data Modeling Introduction and

Department of Computer and Information Science, School of Science, IUPUI Data Modeling Introduction and Normalization Dale Roberts, Lecturer Computer Science, IUPUI E-mail: droberts@cs. iupui. edu Dale Roberts 1

Sample Table Student. Name Advisor. Name Course. ID Course. Description 1 Course. Instructor Course.

Sample Table Student. Name Advisor. Name Course. ID Course. Description 1 Course. Instructor Course. ID Course. Description 2 1 Name 1 2 Al Gore Bill Clinton VB 1 Intro to Visual Basic Bruce Mc. Kinney DAO 1 Intro to DAO Programming Dan Quayle George Bush DAO 1 Intro to DAO Joe Garrick VBSQL 1 Client/Server Programming with VBSQL George Bush Ronald API 1 API Programming Dan Appleman OOP 1 Object Oriented Reagan with VB Programming in VB Walter Jimmy Carter VB 1 Intro to Visual Basic Bruce Mc. Kinney API 1 API Programming Mondale with VB Dale Roberts Course. Instructor Name 2 Joe Garrick William Vaughn Deborah Kurata Dan Appleman 2

Sample Table Problems Repeating Groups The course ID, description, and instructor are repeated for

Sample Table Problems Repeating Groups The course ID, description, and instructor are repeated for each class. If a student needs a third class, you need to go back and modify the table design in order to record it. While you could add Course. ID 3, Course. ID 4, Course. ID 5, etc. , along with the associated description and instructor fields, no matter how far you take it there may one day be someone who wants one more class. Additionally, adding all those fields when most students would never use them is a waste of storage. Update Anomalies Let's say that after entering these rows, you discover that Bruce Mc. Kinney's course is actually titled "Intro to Advanced Visual Basic". In order to reflect this change, you would need to examine all the rows and change each individually. This introduces the potential for errors if one of the changes is omitted or done incorrectly. Delete Anomalies If you no longer wished to offer Joe Garrick's Intro to DAO class, you would need to null out all references to the course. Once you remove all references to the course, you have lost all information about the course such as Course Id, Course Name, and Instructor. Should not offering the course mean that the course no longer exists? Insert Anomalies Perhaps the department head wishes to add a new class - let's call it "Advanced DAO Programming" - but hasn't yet set up a schedule or even an instructor. What would you enter for the student, advisor, and instructor names? Dale Roberts 3

Normalization Defined • 1 st Definition. . . the process of taking a wide

Normalization Defined • 1 st Definition. . . the process of taking a wide table with lots of columns but few rows and redesigning it as several narrow tables with fewer columns but more rows. A properly normalized design allows you to use storage space efficiently, eliminate redundant data, reduce or eliminate inconsistent data, and ease the data maintenance burden. . . Jose’s Database Programming Corner www. citilink. com/~jgarrick/vbasic/database • 2 nd Definition Procedure to ensure that a data model conforms to some useful standards. . to minimize the duplication of data, to provide the flexibility necessary to support different functional requirements, and to enable the data modeler to verify the business requirements. Oracle 7. 3 Developer’s Guide, p. 466 • 3 rd Definition Normalization is the process of putting things right, making them normal. . . separating elements of data into affinity groups, and defining the normal, or “right”, relationships between them. Oracle: The Complete Reference Dale Roberts

Forms of Normalization • First Normal Form (1 NF) no repeating groups. • Second

Forms of Normalization • First Normal Form (1 NF) no repeating groups. • Second Normal Form (2 NF) 1 NF + no nonkey attributes depend on a portion of the primary key. • Third Normal Form (3 NF) 2 NF + no attributes depend on other nonkey attributes. “Each column depends on the key, the whole key, and nothing but the key, so help me Codd. ” Dale Roberts

Database Management: Normalization • First Normal Form (1 NF) – Each column contains values

Database Management: Normalization • First Normal Form (1 NF) – Each column contains values about the same attribute, and each table cell value must be a single value. – Each column has a distince name, order of columns is immaterial. – Each row is distinct, rows cannot be duplicate for the same key – The sequence of rows is immaterial. • Second Normal Form (2 NF) – All non-key attributes must be fully dependent on the whole key. • Third Normal Form (3 NF) – Each nonkey attribute should be dependent only on the relation’s key, not on any other nonkey. Dale Roberts 6

The Dangers in a Relational Database How to reduce the confusion… • Normalization 1

The Dangers in a Relational Database How to reduce the confusion… • Normalization 1 NF: 2 NF: 3 NF: • • English names for tables and columns English code names Dale Roberts 7

The Dangers in a Relational Database Dale Roberts 8

The Dangers in a Relational Database Dale Roberts 8

Acknowledgements Mc. Fadden and Hoffer. Database Management Loney, Kevin. Oracle Database 10 g The

Acknowledgements Mc. Fadden and Hoffer. Database Management Loney, Kevin. Oracle Database 10 g The Complete Reference Ullman, Jeff. Database Systems The Complete Book. Database. Answers. com Dale Roberts 9