Slides 4 The Relational Database Model The Relational
Slides 4 The Relational Database Model:
The Relational Database Model • Based on theory of relational math (set theory) • It is an “automatic transmission” database (with embedded relationships between tables) which replaces the “standard transmission” database (which employs flat-file techniques with explicit pointers between files and records) • Flat-files (collections of similar records) are being replaced by collections of interrelated files • Allows data to be broken down into logical, smaller, more manageable units - simplifies the organization of complex sets of data
Why A Relational Model? • Duplicate data reduced - less input, maintenance, storage, and improved data integrity • Data independence: Data can be thought of as being stored in tables regardless of how physically stored. • Application independence: Databases defined independently from the systems and programs that will use them - allows users to create ad hoc queries, rather than only receive pre-specified reports • A change in the database does not require rewriting all the application program codes. Ability to share same data across multiple applications and systems. • It has the ability to maintain several tables of related information that can be accessed by several different users in many different ways - a single query can retrieve
Some Definitions. . . Data: Raw facts about the organization and its business transactions that are of interest to the end user Database: A computer structure that houses a collection of data Relational database: Stores information about instances of entities (a specific sales event, salesperson), attributes of those entities (invoice no. , salesperson ID) , and the relationships among these entities (each sale can only have one salesperson) - perceived by user to be a collection of two-dimensional tables RDBMS: Software that manages a relational database, controls access, and allows users to retrieve requested data through a standard data-access language, SQL.
Entity-type: Something of significance about which you want to store data in a database, e. g. , customers, employees, suppliers, inventory items (note: this is a data modeling term – an entity becomes a table in a RDBMS) Table: An entity-type (e. g. , customer) and its attributes Attribute: A property or characteristic of an entity. A column in a relational database table, e. g. , customer name, reference #, address, zip ((note: this is a data modeling term – an attribute becomes a column in a RDBMS Row (tuple, record): A record of data in a database table a single occurrence or entity instance Value: Data in a “cell” – the intersection between row and
Types of Attributes Key (identifier in data modeling): Attribute, or combination of attributes, that determines the values of other attributes in each row Composite Key: Multiple-attribute keys; may be further subdivided, e. g. , phone may be area code and number - can be a primary key Candidate Key (CK): Attribute (or a minimum combination of attributes) that uniquely identifies each row in a given table - there can be more than one CK (employee entity type: SSN; assigned ID#) Primary Key (PK) ( a unique identifier in data modeling): A CK selected to uniquely identify all other attributes in a given row; cannot be Null Foreign Key (FK): ( a relationship in data modeling): Attribute (combination of attributes) whose value(s) must match the Primary Key in another table in the same database, or whose value(s) must be Null Non-key Attribute: Attribute that is not part of a key
Attributes With A Null Value • Null Value: An unknown attribute value (e. g. , salesperson not yet allocated to a customer) - it is not a zero. It is an optional attribute. • Inclusion of nulls in a table is important - they provide a consistent way to distinguish between valid data such as a 0 and missing data, e. g. , an account payable with 0 is good to see; one with an unknown balance can indicate a significant problem • In most cases, nulls appear as blanks on a query’s result table on a screen
Relationships • Data modeling term that indicates an association between tables: How the things of significance are related (A FK must match to an existing PK, or else be NULL) • This controlled redundancy allows linking of tables (hence “relational”) • Entity-Relationship Diagram (ERD): A data model (at the conceptual level) that shows the relationships enforcing business rules between entities (tables) in a database environment (Fig. 5. 4)
Business Rules • Narrative descriptions of policies, procedures, or principles in an organization • Examples: – A pilot cannot be on duty for more than 10 hours in a 24 -hour period – A professor must teach at least three classes in a semester – A class may not have fewer than 10 enrollments
Normalization • Process of taking a “raw” database and breaking it into logical units called tables, by following theoretical rules called normal forms • The intent is to create a degree of controlled redundancy that allows two or more tables to be joined, by matching a FK in one table to a PK in another table • Referential integrity (constraint created upon table creation) is enforced when every non-null FK value must match an existing PK value (if there is a FK, there has to be a PK for that FK in another table) • Normalization has six nested normal forms • Generally a well-formed business database will be normalized through 3 rd normal form (3 NF)
Benefits of Normalization • • • Greater overall database organization Minimize data redundancies Data consistency within the database A more flexible database design Data can be used more productively A better handle on database security Disadvantage of Normalization • Reduced database performance because database must locate requested tables and join data - requires additional processing logic
Normal Forms —Normalization through a series of stages called NORMAL FORMS – Each NF depends on normalization steps taken in the previous NF • First Normal Form - 1 NF • Second Normal Form - 2 NF • Third Normal Form - 3 NF
1 NF First normal form rules: • All key attributes must be defined • There must be no repeating groups (values), i. e. , each row/column intersection can have only one value • All attributes must be functionally dependent on the PK, or part of the PK - e. g. , SSN determines DOB, but DOB cannot determine SSN Hint: Put all attributes in a two-dimensional flat table, with no repeating values
General Journal Entry: Traditional View - Unnormalized Assume that the transaction # will reset to 1 at the beginning of the next fiscal year
GJ: First Normal Form
2 NF – Second Normal Form Rules: • Table is in 1 NF; and • Table includes no partial dependencies; that is, no attribute is dependent on only portion of the primary key – must be dependent on entire PK Hint: Examine non-key attributes to determine whether any are dependent on only portion of a composite PK - this would violate 2 NF If a table only has one attribute as a PK, then it is in 2 NF.
Chart of Accounts Table
Transaction Listing Table
Transaction Detail Table (Base Table)
3 NF – Third Normal Form Rules: • Table is in 2 NF and • There are no transitive dependencies Hint: You will violate 3 NF if you can deduce the value of a non-key attribute by knowing the value of another non-key attribute
Normalized Transaction Detail (Base) Table
- Slides: 21