Data Warehouse Characteristics 1 Subject Oriented Data is
Data Warehouse Characteristics 1. Subject – Oriented Data is categorized and stored by business subject rather than by application OLTP Applications Savings Credit/Loans Consumer Time Deposit Banking Data Warehouse Subject Customer/Account Information
Data Warehouse Characteristics 2. Integrated Integrate the data into a single set of meaningful data for analysis. Savings Account Tips: 1. Data Consistency 2. Data Non-Redundancy Customer Loans
Data Warehouse Characteristics 3. Time Variant Data warehouse is by nature historical; data is retained for a long time. Case Practice: 2 – 10 years Tips: 1. Data summarization 2. Strategy for data purging/backup 3. Strategy for data cleansing
Data Warehouse Characteristics 4. Nonvolatile The changes of data in data warehouse are least (less volatile than operational data. Data warehouse operations typically involve: • Loading the initial data (first-time load) • Refreshing the data regularly Data Warehouse Load Insert, Update, Delete, Read
Data Warehouse Features A Data Warehouse: • Is a repository • Improves access to integrated data • Ensures integrity and quality • Provide an historical perspective • Records results • Is used by abroad spectrum of end users for a variety of purposes • Reduces the reporting and analysis impact on operational systems • Requires a major systems integration efforts
Basic Element Of Data Warehouse Source Data Staging Area Target • Source system is an operational system of record whose function it is to capture the transactions of the business • Data staging area is a storage area and set of processes that clean, transform, combine, de-duplicate, archive. And prepare source data for use in the data warehouse • Target is data warehouse
Complete Element of Data Warehouse Source Data Staging Systems Area Data Warehouse End User Server Data Access Ad-Hoc Query Data Mart 1 Storage: Flat files (fastest) RDBMS Other extract Processing: Clean Combine Remove duplicates Standardize Archive Export to data marts Report Writers populate Data Mart 2 feed Model: Forecasting Data Mining Data Mart 3
Desirable DWH Functionality Design Scalability Manageability Availability Extensibility Load Data Access Architecture Metadata Flexibility Tuning and Optimization Operation Management and Revision And Integrated Accessibility Reliability Capacity Planning and Sizing Management
Data Warehouse Process Extraction Transformation/Load Flat Files ETL B 2 B Enterprise Model Click. Stream External Server Log Files Metadata Repository portal B 2 C Access layer(s) Transformation ETL Staging Area(s) Operational RDBMS Publish
- Slides: 24