Department of Informatics University of Rijeka Radmile Mateji

  • Slides: 17
Download presentation
Department of Informatics, University of Rijeka Radmile Matejčić 2, 51000 Rijeka, Croatia http: //www.

Department of Informatics, University of Rijeka Radmile Matejčić 2, 51000 Rijeka, Croatia http: //www. inf. uniri. hr Data Warehouse design models in higher education courses Patrizia Poščić, Associate Professor patrizia@inf. uniri. hr Danijela Subotić, Teaching Assistant dsubotic@inf. uniri. hr

Overview • Introduction • DW architecture • Modeling practices – Entity-relationship model – Data

Overview • Introduction • DW architecture • Modeling practices – Entity-relationship model – Data Vault model – Dimensional model • Conclusion 2

Introduction • Selected Topics in Databases • Graduate study, 1 st year • Data

Introduction • Selected Topics in Databases • Graduate study, 1 st year • Data warehouse (DW) design as a topic • Integrating several data modeling practices for complete DW design • Practical assignment at the end of the semester 3

DW architecture 4

DW architecture 4

Modeling practices • Modeling of existing database (DB) sources – Entity-relationship model – Relational

Modeling practices • Modeling of existing database (DB) sources – Entity-relationship model – Relational model • Modeling enterprise data warehouse (EDW) as system of records – Data Vault model • Modeling data marts (DM) – Dimensional model 5

Business case • We use a business case which deals with a DW for

Business case • We use a business case which deals with a DW for the outdoor and adventure equipment sales company • All data model examples (which are shown on following slides) are made in Erwin 9. 5 and are based on IDEF 1 X 6

Entity-Relationship (ER) model Sales DB Marketing DB 7

Entity-Relationship (ER) model Sales DB Marketing DB 7

Data Vault model • A data modeling method that supports design of data warehouses

Data Vault model • A data modeling method that supports design of data warehouses for long-term storage of historical data collected from various data sources • Based on the assumption that the DW environment is in constant change • It highlights the need for tracking the origin of data contained in the database, through empirically defined set of metadata • Enables tracking the value back to the source and tracking the history of changes 8

Data Vault model • There is no difference between good and bad data -

Data Vault model • There is no difference between good and bad data - all the data is stored at all times, regardless of whether they are adaptable to business rules - avoiding the loss of information • The structural data are explicitly separated from descriptive attributes, regardless of whether they come from the same source • Model flexible to changes in business environment • Allows for a gap analysis and trend projections 9

Data Vault model • Any change is implemented in the model as an independent

Data Vault model • Any change is implemented in the model as an independent extension of the existing model: – the changes do not affect current applications – all versions of the application can be based on the same, developing DB – all versions of the model are a subset of the DV model • Enables fast parallel loading which reduces the overall costs • Aiming at flexibility and performance 10

Data Vault model • Hub • Link • Satellite 11

Data Vault model • Hub • Link • Satellite 11

Data Vault model 12

Data Vault model 12

Data Vault model (main advantages) • Inserts, deletes, or updates of rows are implemented

Data Vault model (main advantages) • Inserts, deletes, or updates of rows are implemented only as additions (nothing ever get lost/overwritten) • Structural changes of and in data sources results in model expansion, principally by new links and without structural reconstruction of existing DW elements (architectural stability) • Enables rapid parallel data loads 13

Dimensional model • Practically universally used for DM design presentation • Distinguished by star

Dimensional model • Practically universally used for DM design presentation • Distinguished by star schema design – centralized fact table, which contains a multi-layered keys and one or more numerical business measures – fact (set of measurement) needs to be tracked for a lowest granularity of data – fact is surrounded with a rich context of dimensions – dimension tables are denormalized, they have a simple key and they store business attributes in the form of textual information 14

Dimensional model 15

Dimensional model 15

Conclusion • We presented a set of complementary data warehouse design models which may

Conclusion • We presented a set of complementary data warehouse design models which may enable well integrated DW solutions for relational DB implementations • Models based on a common notation (IDEF 1 X) and in a single design tool (Er. Win) • Our goal is to present students with a compact set of modelling knowledge in the field of DB and DW • Upgrade and further develop theoretical knowledge and practical modelling skills through the educational process 16

Thank You for your attention!

Thank You for your attention!