Pearson BTEC Level 5 HND Diploma in Computing
Pearson BTEC Level 5 HND Diploma in Computing and Systems Development (QCF) Unit 33: Data Analysis and Design UNIT CODE : D/601/1991 QC F LE VEL 5: BTEC HIGHER NAT IONAL CREDIT VA LU E 15 MICRONET INTERNATIONAL COLLEGE | DATA ANALYSIS AND DESIGN 1
Aim To provide learners with the knowledge and skills needed to understand, design, query and implement database systems. MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 2
Learning Outcomes On successful completion of this unit a learner will: 1. Understand data models and database technologies 2. Be able to design and implement relational database systems 3. Be able to use manipulation and querying tools 4. Be able to test and document relational database systems. MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 3
L 01: Understand data models and database technologies Assessment Criteria: 1. 1 Critically compare different data models and schemas 1. 2 Critically discuss the benefits and limitations of different database technologies 1. 3 Analyze different approaches to database design MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 4
Data Models L 01: UNDERSTAND DATA MODELS AND DATABASE TECHNOLOGIES MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 5
DATA MODELS MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 6
Data Model A Data Model is an abstract model that organizes elements of data and standardizes how they relate to one another and to properties of the real world. For instance, a data model may specify that a data element representing a car comprise a number of other elements which in turn represent the color, size and owner of the car. MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 7
Data Model Data modeling in software engineering is the process of creating a data model for an information system by applying formal data modeling techniques. MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 8
Data Model Data Models are fundamental entities to introduce abstraction in a DBMS. Data models define how data is connected to each other and how they are processed and stored inside the system. The very first data model could be flat datamodels, where all the data used are to be kept in the same plane. MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 9
Data Model Approaches 1. Conceptual Data Modeling 2. Enterprise Data Modeling 3. Logical Data Modeling 4. Physical Data Modeling MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 10
Data Model Approaches Conceptual Data Modeling Identifies the highest-level relationships between different entities. MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 11
Data Model Approaches Enterprise Data Modeling Similar to conceptual data modeling, but addresses the unique requirements of a specific business. MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 12
Data Model Approaches Logical Data Modeling Illustrates the specific entities, attributes and relationships involved in a business function. Serves as the basis for the creation of the physical data model. MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 13
Data Model Approaches Physical Data Modeling Represents an application and database-specific implementation of a logical data model. MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 14
Hierarchical Data Model A hierarchical database model is a data model in which the data is organized into a tree-like structure. The data is stored as records which are connected to one another through links. MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 15
Hierarchical Data Model A record is a collection of fields, with each field containing only one value. The entity type of a record defines which fields the record contains. MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 16
Network Data Model The network model is a database model conceived as a flexible way of representing objects and their relationships. Its distinguishing feature is that the schema, viewed as a graph in which object types are nodes and relationship types are arcs, is not restricted to being a hierarchy or lattice. MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 17
Network Data Model MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 18
Relational Data Model Relational data model is the primary data model, which is used widely around the world for data storage and processing. This model is simple and it has all the properties and capabilities required to process data with storage efficiency. MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 19
Data Manipulation Languages The Data Manipulation Language (DML) is used to retrieve, insert and modify database information. These commands will be used by all database users during the routine operation of the database. MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 20
Basic DML Commands 1. Insert 2. Select 3. Update 4. Delete MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 21
Basic DML Commands INSERT The INSERT command in SQL is used to add records to an existing table. MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 22
Basic DML Commands SELECT The SELECT command is the most commonly used command in SQL. It allows database users to retrieve the specific information they desire from an operational database. MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 23
Basic DML Commands UPDATE The UPDATE command can be used to modify information contained within a table, either in bulk or individually. MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 24
Basic DML Commands DELETE Finally, let's take a look at the DELETE command. You'll find that the syntax of this command is similar to that of the other DML commands. MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 25
Data Independence Data independence is the type of data transparency that matters for a centralized DBMS. It refers to the immunity of user applications to changes made in the definition and organization of data. MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 26
Data Independence Data independence can be explained as follows: Each higher level of the data architecture is immune to changes of the next lower level of the architecture. MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 27
Data Independence The logical scheme stays unchanged even though the storage space or type of some data is changed for reasons of optimization or reorganization. In this external schema does not change. In this internal schema changes may be required due to some physical schema were reorganized here. MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 28
Data Independence Physical data independence is present in most databases and file environment in which hardware storage of encoding, exact location of data on disk, merging of records, so on this are hidden from user. MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 29
Data Independence One of the biggest advantage of databases is data independence. It means we can change the conceptual schema at one level without affecting the data at another level. It also means we can change the structure of a database without affecting the data required by users and programs. This feature was not available in the file oriented approach. MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 30
Data Independence MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 31
Logical Data Independence Logical data is data about database, that is, it stores information about how data is managed inside. For example, a table (relation) stored in the database and all its constraints, applied on that relation. MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 32
Logical Data Independence Logical data independence is a kind of mechanism, which liberalizes itself from actual data stored on the disk. If we do some changes on table format, it should not change the data residing on the disk. MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 33
Physical Data Independence All the schemas are logical, and the actual data is stored in bit format on the disk. Physical data independence is the power to change the physical data without impacting the schema or logical data. MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 34
Data Redundancy Issues Data redundancy is a condition created within a database or data storage technology in which the same piece of data is held in two separate places. MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 35
Data Redundancy Issues This can mean two different fields within a single database, or two different spots in multiple software environments or platforms. Whenever data is repeated, this basically constitutes data redundancy. This can occur by accident, but is also done deliberately for backup and recovery purposes. MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 36
Data Redundancy Issues Data Integrity Issues 1. Data Integrity 2. Entity Integrity 3. Referential Integrity MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 37
Data Redundancy Issues Data Integrity Data integrity refers to the overall completeness, accuracy and consistency of data. This can be indicated by the absence of alteration between two instances or between two updates of a data record, meaning data is intact and unchanged. Data integrity is usually imposed during the database design phase through the use of standard procedures and rules. Data integrity can be maintained through the use of various error checking methods and validation procedures. MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 38
Data Redundancy Issues Entity Integrity This is concerned with the concept of primary keys. The rule states that every table must have its own primary key and that each has to be unique and not null. MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 39
Data Redundancy Issues Referential Integrity This is the concept of foreign keys. The rule states that the foreign key value can be in two states. The first state is that the foreign key value would refer to a primary key value of another table, or it can be null. Being null could simply mean that there are no relationships, or that the relationship is unknown. MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 40
Data Redundancy Issues Domain Integrity This states that all columns in a relational database are in a defined domain. MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 41
Database Schema A database schema is the skeleton structure that represents the logical view of the entire database. It defines how the data is organized and how the relations among them are associated. It formulates all the constraints that are to be applied on the data. MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 42
Database Schema A database schema defines its entities and the relationship among them. It contains a descriptive detail of the database, which can be depicted by means of schema diagrams. It’s the database designers who design the schema to help programmers understand the database and make it useful. MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 43
Database Schema A database schema defines its entities and the relationship among them. It contains a descriptive detail of the database, which can be depicted by means of schema diagrams. It’s the database designers who design the schema to help programmers understand the database and make it useful. MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 44
Database Schema MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 45
Database Schema A database schema can be divided broadly into two categories: 1. Physical Database Schema 2. Logical Database Schema MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 46
Database Schema Physical Database Schema This schema pertains to the actual storage of data and its form of storage like files, indices, etc. It defines how the data will be stored in a secondary storage. MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 47
Database Schema Logical Database Schema This schema defines all the logical constraints that need to be applied on the data stored. It defines tables, views, and integrity constraints. MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 48
Relational Data Model Relational data model is the primary data model, which is used widely around the world for data storage and processing. This model is simple and it has all the properties and capabilities required to process data with storage efficiency. MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 49
Relational Data Model Concepts: 1. Tables 2. Tuples 3. Relation Instance 4. Relation Schema 5. Relation Key 6. Attribute Domain MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 50
Relational Data Model Tables In relational data model, relations are saved in the format of Tables. This format stores the relation among entities. A table has rows and columns, where rows represents records and columns represent the attributes. MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 51
Relational Data Model Tuples A single row of a table, which contains a single record for that relation is called a tuple. MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 52
Relational Data Model Relation Instance A finite set of tuples in the relational database system represents relation instance. Relation instances do not have duplicate tuples. MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 53
Relational Data Model Relation Schema A relation schema describes the relation name (table name), attributes, and their names. MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 54
Relational Data Model Relation Key Each row has one or more attributes, known as relation key, which can identify the row in the relation (table) uniquely. MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 55
Relational Data Model Attribute Domain Every attribute has some pre-defined value scope, known as attribute domain. MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 56
Relational Data Model Constraints Every relation has some conditions that must hold for it to be a valid relation. These conditions are called Relational Integrity Constraints. MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 57
Relational Data Model There are three main integrity constraints 1. Key constraints 2. Domain constraints 3. Referential integrity constraints MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 58
Relational Data Model Key constraints There must be at least one minimal subset of attributes in the relation, which can identify a tuple uniquely. This minimal subset of attributes is called key for that relation. MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 59
Relational Data Model If there are more than one such minimal subsets, these are called candidate keys. Key constraints force that: • In a relation with a key attribute, no two tuples can have identical values for key attributes. • A key attribute can not have NULL values. Key constraints are also referred to as Entity Constraints. MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 60
Relational Data Model Domain Constraints Attributes have specific values in real-world scenario. For example, age can only be a positive integer. The same constraints have been tried to employ on the attributes of a relation. Every attribute is bound to have a specific range of values. For example, age cannot be less than zero and telephone numbers cannot contain a digit outside 0 -9. MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 61
Relational Data Model Referential integrity Constraints Referential integrity constraints work on the concept of Foreign Keys. A foreign key is a key attribute of a relation that can be referred in other relation. MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 62
Relational Data Model Referential Integrity Constraint States that if a relation refers to a key attribute of a different or same relation, then that key element must exist. MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 63
Table Fields Relationship The association among entities is called a relationship. For example, an employee works_at a department, a student enrolls in a course. Here, Works_at and Enrolls are called relationships. MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 64
Table Fields Relationship Set A set of relationships of similar type is called a relationship set. Like entities, a relationship too can have attributes. These attributes are called descriptive attributes. MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 65
Table Fields Relationship Degree of Relationship The number of participating entities in a relationship defines the degree of the relationship. 1. Binary = degree 2 2. Ternary = degree 3 3. N-ary = degree MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 66
Table Fields Relationship Mapping Cardinalities Cardinality defines the number of entities in one entity set, which can be associated with the number of entities of other set via relationship set. MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 67
Table Fields Relationship Mapping Cardinalities 1. One-to-One 2. One-to-Many 3. Many-to-One 4. Many-to-Many MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 68
Table Fields Relationship One-to-One entity from entity set A can be associated with at most one entity of entity set B and vice versa. MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 69
Table Fields Relationship One-to-Many One entity from entity set A can be associated with more than one entities of entity set B however an entity from entity set B, can be associated with at most one entity. MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 70
Table Fields Relationship Many-to-One More than one entities from entity set A can be associated with at most one entity of entity set B, however an entity from entity set B can be associated with more than one entity from entity set A. MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 71
Table Fields Relationship Many-to-Many One entity from A can be associated with more than one entity from B and vice versa. MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 72
Data Dictionary Data dictionary is the centralized collection of information about data. It stores meaning and origin of data, its relationship with other data, data format for usage etc. Data dictionary has rigorous definitions of all names in order to facilitate user and software designers. MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 73
Data Dictionary Data dictionary is often referenced as meta-data (data about data) repository. It is created along with DFD (Data Flow Diagram) model of software program and is expected to be updated whenever DFD is changed or updated. MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 74
Data Dictionary Database management systems, a file that defines the basic organization of a database. A data dictionary contains a list of all files in the database, the number of records in each file, and the names and types of each field. MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 75
Data Dictionary MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 76
Data Dictionary Data dictionary should contain information about the following: 1. Data Flow 2. Data Structure 3. Data Elements 4. Data Stores 5. Data Processing MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 77
END OF THE LESSON MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 78
Approaches L 01: UNDERSTAND DATA MODELS AND DATABASE TECHNOLOGIES MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 79
APPROACHES MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 80
Top Down and Bottom Up Top-down and bottom-up are both strategies of information processing and knowledge ordering, used in a variety of fields including software, humanistic and scientific theories and management and organization. In practice, they can be seen as a style of thinking, teaching, or leadership. MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 81
Top Down and Bottom Up Top Down The top down approach spins off data marts for specific groups of users after the complete data warehouse has been created. MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 82
Top Down and Bottom Up The bottom up approach builds the data marts first and then combines them into a single, allencompassing data warehouse. MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 83
Entity Analysis Entity Entities are represented by means of rectangles. Rectangles are named with the entity set they represent. MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 84
Entity Analysis Attributes are the properties of entities. Attributes are represented by means of ellipses. Every ellipse represents one attribute and is directly connected to its entity (rectangle). MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 85
Entity Analysis Attributes MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 86
Entity Analysis If the attributes are composite, they are further divided in a tree like structure. Every node is then connected to its attribute. MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 87
Entity Analysis That is, composite attributes are represented by ellipses that are connected with an ellipse. MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 88
Entity Analysis Multivalued attributes are depicted by double ellipse. MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 89
Entity Analysis Derived attributes are depicted by dashed ellipse. MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 90
Entity Analysis Relationships are represented by diamond-shaped box. Name of the relationship is written inside the diamond-box. All the entities (rectangles) participating in a relationship, are connected to it by a line. MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 91
Entity Analysis Binary Relationship and Cardinality A relationship where two entities are participating is called a binary relationship. MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 92
Entity Analysis Binary Relationship and Cardinality is the number of instance of an entity from a relation that can be associated with the relation. MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 93
Entity Analysis Binary Relationship and Cardinality 1. One-to-one 2. One-to-many 3. Many-to-one 4. Many-to-many MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 94
Entity Analysis Binary Relationship and Cardinality One-to-one When only one instance of an entity is associated with the relationship, it is marked as '1: 1'. MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 95
Entity Analysis The following image reflects that only one instance of each entity should be associated with the relationship. It depicts one-to-one relationship. MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 96
Entity Analysis Binary Relationship and Cardinality One-to-many When more than one instance of an entity is associated with a relationship, it is marked as '1: N'. MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 97
Entity Analysis The following image reflects that only one instance of entity on the left and more than one instance of an entity on the right can be associated with the relationship. It depicts one-to-many relationship. MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 98
Entity Analysis Binary Relationship and Cardinality Many-to-one When more than one instance of entity is associated with the relationship, it is marked as 'N: 1'. MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 99
Entity Analysis The following image reflects that more than one instance of an entity on the left and only one instance of an entity on the right can be associated with the relationship. It depicts many-to-one relationship. MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 100
Entity Analysis Binary Relationship and Cardinality Many-to-many The following image reflects that more than one instance of an entity on the left and more than one instance of an entity on the right can be associated with the relationship. It depicts many-to-many relationship. MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 101
Entity Analysis Participation Constraints 1. Total Participation 2. Partial participation MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 102
Entity Analysis Participation Constraints Total Participation Each entity is involved in the relationship. Total participation is represented by double lines. MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 103
Entity Analysis Participation Constraints Partial participation Not all entities are involved in the relationship. Partial participation is represented by single lines. MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 104
Entity Analysis Participation Constraints MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 105
Entity-Relationship Diagram (ERDs) ER Model, when conceptualized into diagrams, gives a good overview of entity-relationship, which is easier to understand. ER diagrams can be mapped to relational schema, that is, it is possible to create relational schema using ER diagram. We cannot import all the ER constraints into relational model, but an approximate schema can be generated. MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 106
Entity-Relationship Diagram (ERDs) ER diagrams mainly comprise of : 1. Entity and its attributes 2. Relationship, which is association among entities. MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 107
Mapping Entity An entity is a real-world object with some attributes. MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 108
Mapping Entity Mapping Process (Algorithm) 1. Create table for each entity. 2. Entity's attributes should become fields of tables with their respective data types. 3. Declare primary key. MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 109
Mapping Relationship A relationship is an association among entities. MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 110
Mapping Relationship Mapping Process: 1. Create table for a relationship. 2. Add the primary keys of all participating Entities as fields of table with their respective data types. 3. If relationship has any attribute, add each attribute as field of table. 4. Declare a primary key composing all the primary keys of participating entities. 5. Declare all foreign key constraints. MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 111
Mapping Weak Entity Sets A weak entity set is one which does not have any primary key associated with it. MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 112
Mapping Weak Entity Sets Mapping Process 1. Create table for weak entity set. 2. Add all its attributes to table as field. 3. Add the primary key of identifying entity set. 4. Declare all foreign key constraints. MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 113
Mapping Hierarchical Entities ER specialization or generalization comes in the form of hierarchical entity sets. MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 114
Mapping Hierarchical Entities Mapping Process 1. Create tables for all higher-level entities. 2. Create tables for lower-level entities. 3. Add primary keys of higher-level entities in the table of lower-level entities. 4. In lower-level tables, add all other attributes of lower-level entities. 5. Declare primary key of higher-level table and the primary key for lower-level table. 6. Declare foreign key constraints. MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 115
Determinacy Diagrams A determinacy diagram, sometimes known as a dependency diagram, is a diagram which documents the determinacy or dependency between a set of data items. Determinacy diagrams are particularly used as an aid to database normalization. MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 116
Determinacy Diagrams A set of Functional Dependencies for a data model can be documented in a Functional Dependency Diagram In a Functional Dependency Diagram each attribute is shown in a rectangle with an arrow indicating the direction of the dependency. MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 117
Determinacy Diagrams The figure below illustrates the functional dependency Prod# > Product. MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 118
Determinacy Diagrams A Functional Dependency with Multiple Attributes is shown below, for the functional dependency Order#, Prod# > Quantity. MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 119
Determinacy Diagrams A derived Functional Dependency involving Partial Key Dependency is shown in the figure below. The arrow connected to the outer rectangle, which represents Order#, Prod# > Product can be deleted without loss of information. MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 120
Determinacy Diagrams A derived Functional Dependency involving Transitive Dependency is shown in the figure below. The arrow which represents Order# > Supplier can be deleted without loss of information. MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 121
Determinacy Diagrams Rules for Functional Dependency Diagrams 1. Each attribute appears only once on the Functional Dependency Diagram 2. All the attributes of interest appear on the Functional Dependency Diagram 3. No partial key dependencies appear on the Functional Dependency Diagram 4. No transitive dependencies appear on the Functional Dependency Diagram MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 122
Determinacy Diagrams The complete Functional Dependency Diagram for the Purchase Order data model is shown below: MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 123
Data Flow Diagrams Data flow diagram is graphical representation of flow of data in an information system. It is capable of depicting incoming data flow, outgoing data flow and stored data. The DFD does not mention anything about how data flows through the system. MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 124
Data Flow Diagrams There is a prominent difference between DFD and Flowchart. The flowchart depicts flow of control in program modules. DFDs depict flow of data in the system at various levels. DFD does not contain any control or branch elements. MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 125
Data Flow Diagrams Types of Data Flow Diagram 1. Logical DFD 2. Physical DFD MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 126
Data Flow Diagrams Logical DFD This type of DFD concentrates on the system process, and flow of data in the system. For example in a Banking software system, how data is moved between different entities. MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 127
Data Flow Diagrams Physical DFD This type of DFD shows how the data flow is actually implemented in the system. It is more specific and close to the implementation. MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 128
Data Flow Diagrams DFD Components DFD can represent Source, destination, storage and flow of data using the following set of components: MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 129
Data Flow Diagrams DFD Components: 1. Entities 2. Process 3. Data Storage 4. Data Flow MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 130
Data Flow Diagrams DFD Components: Entities are source and destination of information data. Entities are represented by a rectangles with their respective names. MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 131
Data Flow Diagrams DFD Components: Process Activities and action taken on the data are represented by Circle or Round-edged rectangles. MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 132
Data Flow Diagrams DFD Components: Data Storage There are two variants of data storage - it can either be represented as a rectangle with absence of both smaller sides or as an open-sided rectangle with only one side missing. MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 133
Data Flow Diagrams DFD Components: Data Flow Movement of data is shown by pointed arrows. Data movement is shown from the base of arrow as its source towards head of the arrow as destination. MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 134
Data Flow Diagrams Levels of DFD : Level 0 Highest abstraction level DFD is known as Level 0 DFD, which depicts the entire information system as one diagram concealing all the underlying details. Level 0 DFDs are also known as context level DFDs. MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 135
Data Flow Diagrams Levels of DFD : Level 0 MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 136
Data Flow Diagrams Levels of DFD : Level 1 The Level 0 DFD is broken down into more specific, Level 1 DFD depicts basic modules in the system and flow of data among various modules. Level 1 DFD also mentions basic processes and sources of information. MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 137
Data Flow Diagrams Levels of DFD : Level 1 MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 138
Data Flow Diagrams Levels of DFD : Level 2 At this level, DFD shows how data flows inside the modules mentioned in Level 1. Higher level DFDs can be transformed into more specific lower level DFDs with deeper level of understanding unless the desired level of specification is achieved. MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 139
Functional Dependency Functional dependency (FD) is a set of constraints between two attributes in a relation. Functional dependency says that if two tuples have same values for attributes A 1, A 2, . . . , An, then those two tuples must have to have same values for attributes B 1, B 2, . . . , Bn. MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 140
Functional Dependency Functional dependency is represented by an arrow sign (→) that is, X→Y, where X functionally determines Y. The left-hand side attributes determine the values of attributes on the right-hand side. MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 141
Functional Dependency Armstrong's Axioms If F is a set of functional dependencies then the closure of F, denoted as F+, is the set of all functional dependencies logically implied by F. Armstrong's Axioms are a set of rules, that when applied repeatedly, generates a closure of functional dependencies. MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 142
Functional Dependency Armstrong's Axioms • Reflexive rule • Augmentation rule • Transitivity rule MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 143
Functional Dependency Armstrong's Axioms Reflexive rule If alpha is a set of attributes and beta is_subset_of alpha, then alpha holds beta. MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 144
Functional Dependency Armstrong's Axioms Augmentation rule If a → b holds and y is attribute set, then ay → by also holds. That is adding attributes in dependencies, does not change the basic dependencies. MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 145
Functional Dependency Armstrong's Axioms Transitivity rule Same as transitive rule in algebra, if a → b holds and b → c holds, then a → c also holds. a → b is called as a functionally that determines b. MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 146
Functional Dependency Trivial Functional Dependency 1. Trivial 2. Non-trivial 3. Completely non-trivial MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 147
Functional Dependency Trivial Functional Dependency Trivial If a functional dependency (FD) X → Y holds, where Y is a subset of X, then it is called a trivial FD. Trivial FDs always hold. MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 148
Functional Dependency Trivial Functional Dependency Non-trivial If an FD X → Y holds, where Y is not a subset of X, then it is called a non-trivial FD. MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 149
Functional Dependency Trivial Functional Dependency Completely non-trivial If an FD X → Y holds, where x intersect Y = Φ, it is said to be a completely non-trivial FD. MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 150
Normalization If a database design is not perfect, it may contain anomalies, which are like a bad dream for any database administrator. Managing a database with anomalies is next to impossible. MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 151
Normalization is a method to remove all these anomalies and bring the database to a consistent state. 1. Update anomalies 2. Deletion anomalies 3. Insert anomalies MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 152
Normalization Update anomalies If data items are scattered and are not linked to each other properly, then it could lead to strange situations. For example, when we try to update one data item having its copies scattered over several places, a few instances get updated properly while a few others are left with old values. Such instances leave the database in an inconsistent state. MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 153
Normalization Deletion anomalies We tried to delete a record, but parts of it was left undeleted because of unawareness, the data is also saved somewhere else. MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 154
Normalization Insert anomalies We tried to insert data in a record that does not exist at all. MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 155
Normalization: First Normal Form (1 NF) First Normal Form is defined in the definition of relations (tables) itself. This rule defines that all the attributes in a relation must have atomic domains. The values in an atomic domain are indivisible units. MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 156
Normalization: First Normal Form (1 NF) We re-arrange the relation (table) as below, to convert it to First Normal Form. Each attribute must contain only a single value from its pre-defined domain. MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 157
Normalization: Second Normal Form (2 NF) Before we learn about the second normal form, we need to understand the following: 1. Prime attribute 2. Non-prime attribute MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 158
Normalization: Second Normal Form (2 NF) Prime attribute An attribute, which is a part of the prime-key, is known as a prime attribute. MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 159
Normalization: Second Normal Form (2 NF) Non-prime attribute An attribute, which is not a part of the prime-key, is said to be a non-prime attribute. MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 160
Normalization: Second Normal Form (2 NF) If we follow second normal form, then every nonprime attribute should be fully functionally dependent on prime key attribute. That is, if X → A holds, then there should not be any proper subset Y of X, for which Y → A also holds true. MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 161
Normalization: Second Normal Form (2 NF) We see here in Student_Project relation that the prime key attributes are Stu_ID and Proj_ID. According to the rule, non-key attributes, i. e. Stu_Name and Proj_Name must be dependent upon both and not on any of the prime key attribute individually. But we find that Stu_Name can be identified by Stu_ID and Proj_Name can be identified by Proj_ID independently. This is called partial dependency, which is not allowed in Second Normal Form. MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 162
Normalization: Second Normal Form (2 NF) We broke the relation in two as depicted in the above picture. So there exists no partial dependency. MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 163
Normalization: Third Normal Form (3 NF) For a relation to be in Third Normal Form, it must be in Second Normal form and the following must satisfy: • No non-prime attribute is transitively dependent on prime key attribute. • For any non-trivial functional dependency, X → A, then either − ◦ X is a superkey or, ◦ A is prime attribute. MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 164
Normalization: Third Normal Form (3 NF) We find that in the above Student_detail relation, Stu_ID is the key and only prime key attribute. We find that City can be identified by Stu_ID as well as Zip itself. Neither Zip is a superkey nor is City a prime attribute. Additionally, Stu_ID → Zip → City, so there exists transitive dependency. MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 165
Normalization: Second Normal Form (2 NF) To bring this relation into third normal form, we break the relation into two relations as follows: MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 166
END OF THE LESSON MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 167
New Developments L 01: UNDERSTAND DATA MODELS AND DATABASE TECHNOLOGIES MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 168
NEW DEVELOPMENTS MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 169
Dynamic Storage A dynamic database management system (dynamic DBMS) is a database in which objects have a valuebased relationship, which is specified at retrieval time. In a dynamic DBMS, the locations of logical file databases and relational data based are valuebased. MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 170
Data Mining Data mining is an interdisciplinary subfield of computer science. It is the computational process of discovering patterns in large data sets involving methods at the intersection of artificial intelligence, machine learning, statistics, and database systems. MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 171
Data Mining The overall goal of the data mining process is to extract information from a data set and transform it into an understandable structure for further use. MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 172
Data Warehousing Data warehousing emphasizes the capture of data from diverse sources for useful analysis and access, but does not generally start from the point-of-view of the end user who may need access to specialized, sometimes local databases. The latter idea is known as the data mart. MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 173
Data Warehousing There are two approaches to data warehousing, top down and bottom up. 1. The top down approach spins off data marts for specific groups of users after the complete data warehouse has been created. 2. The bottom up approach builds the data marts first and then combines them into a single, allencompassing data warehouse. MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 174
Data Warehousing Data Warehouse Features 1. Subject Oriented 2. Integrated 3. Time Variant MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 175
Data Warehousing Subject Oriented A data warehouse is subject oriented because it provides information around a subject rather than the organization's ongoing operations. These subjects can be product, customers, suppliers, sales, revenue, etc. A data warehouse does not focus on the ongoing operations, rather it focuses on modelling and analysis of data for decision making. MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 176
Data Warehousing Integrated A data warehouse is constructed by integrating data from heterogeneous sources such as relational databases, flat files, etc. This integration enhances the effective analysis of data. MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 177
Data Warehousing Time Variant The data collected in a data warehouse is identified with a particular time period. The data in a data warehouse provides information from the historical point of view. MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 178
Data Warehousing Types of Data Warehouse 1. Information Processing 2. Analytical Processing 3. Data Mining 4. Non-volatile MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 179
Data Warehousing Information Processing A data warehouse allows to process the data stored in it. The data can be processed by means of querying, basic statistical analysis, reporting using crosstabs, tables, charts, or graphs. MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 180
Data Warehousing Analytical Processing A data warehouse supports analytical processing of the information stored in it. The data can be analyzed by means of basic OLAP operations, including slice-and-dice, drill down, drill up, and pivoting. MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 181
Data Warehousing Data Mining Data mining supports knowledge discovery by finding hidden patterns and associations, constructing analytical models, performing classification and prediction. These mining results can be presented using the visualization tools. MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 182
Data Warehousing Non-volatile means the previous data is not erased when new data is added to it. A data warehouse is kept separate from the operational database and therefore frequent changes in operational database is not reflected in the data warehouse. MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 183
Web Enabled Database Application Considering the widespread use of the Internet and Web Browser technology to deliver marketing messages, promotional and informational material, it's surprising that we still see very little use of what the Web is best at - providing interactive access to business information, making queries, placing orders and updating records via Net and browser services. MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 184
Web Enabled Database Application In other words, standard database facilities but accessed remotely. That's what we call a webenabled database. MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 185
Web Enabled Database Application The basic components of a web-enabled database like the one in the diagram are: 1. A permanent link to the Internet 2. A webserver 3. A firewall 4. Web pages and software to deliver the active application MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 186
Multimedia Databases A Multimedia database (MMDB) is a collection of related multimedia data. The multimedia data include one or more primary media data types such as text, images, graphic objects (including drawings, sketches and illustrations) animation sequences, audio and video. MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 187
Multimedia Databases A Multimedia Database Management System (MMDBMS) is a framework that manages different types of data potentially represented in a wide diversity of formats on a wide array of media sources. It provides support for multimedia data types, and facilitate for creation, storage, access, query and control of a multimedia database. MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 188
Document Management Systems A document management system (DMS) is a system (based on computer programs in the case of the management of digital documents) used to track, manage and store documents and reduce paper. Most are capable of keeping a record of the various versions created and modified by different users (history tracking). MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 189
Digital Libraries A digital library is a special library with a focused collection of digital objects that can include text, visual material, audio material, video material, stored as electronic media formats (as opposed to print, microform, or other media), along with means for organizing, storing, and retrieving the files and media contained in the library collection. MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 190
Digital Libraries Digital libraries can vary immensely in size and scope, and can be maintained by individuals, organizations, or affiliated with established physical library buildings or institutions, or with academic institutions. The digital content may be stored locally, or accessed remotely via computer networks. An electronic library is a type of information retrieval system. MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 191
END OF THE LESSON MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 192
References Codd, E. F (1969), Derivability, Redundancy, and Consistency of Relations Stored in Large Data Banks, Research Report, IBM Data Integration Glossary (PDF), US: Department of Transportation, August 2001 Codd, E. F (1990), The Relational Model for Database Management, Addison-Wesley, pp. 371– 388, ISBN 0 -201 -14192 -2 MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 193
Further Reading C. J. ; Darwen, Hugh (2000). Foundation for future database systems: the third manifesto; a detailed study of the impact of type theory on the relational model of data, including a comprehensive model of type inheritance (2 ed. ). Reading, MA: Addison-Wesley. ISBN 0 -201 -70928 -7. Darwin, Hugh (2007). An Introduction to Database Systems (8 ed. ). Boston: Pearson Education. ISBN 0 -321 -19784 -4 MICRONET INTERNATIONAL COLLEGE | NETWORK SECURITY 194
- Slides: 194