Data Warehouse Characteristics 1 Subject Oriented Data is


Data Warehouse Characteristics 1. Subject – Oriented Data is categorized and stored by business subject rather than by application OLTP Applications Savings Credit/Loans Consumer Time Deposit Banking Data Warehouse Subject Customer/Account Information



Data Warehouse Characteristics 2. Integrated Integrate the data into a single set of meaningful data for analysis. Savings Account Tips: 1. Data Consistency 2. Data Non-Redundancy Customer Loans


Data Warehouse Characteristics 3. Time Variant Data warehouse is by nature historical; data is retained for a long time. Case Practice: 2 – 10 years Tips: 1. Data summarization 2. Strategy for data purging/backup 3. Strategy for data cleansing


Data Warehouse Characteristics 4. Nonvolatile The changes of data in data warehouse are least (less volatile than operational data. Data warehouse operations typically involve: • Loading the initial data (first-time load) • Refreshing the data regularly Data Warehouse Load Insert, Update, Delete, Read










Data Warehouse Features A Data Warehouse: • Is a repository • Improves access to integrated data • Ensures integrity and quality • Provide an historical perspective • Records results • Is used by abroad spectrum of end users for a variety of purposes • Reduces the reporting and analysis impact on operational systems • Requires a major systems integration efforts

Basic Element Of Data Warehouse Source Data Staging Area Target • Source system is an operational system of record whose function it is to capture the transactions of the business • Data staging area is a storage area and set of processes that clean, transform, combine, de-duplicate, archive. And prepare source data for use in the data warehouse • Target is data warehouse

Complete Element of Data Warehouse Source Data Staging Systems Area Data Warehouse End User Server Data Access Ad-Hoc Query Data Mart 1 Storage: Flat files (fastest) RDBMS Other extract Processing: Clean Combine Remove duplicates Standardize Archive Export to data marts Report Writers populate Data Mart 2 feed Model: Forecasting Data Mining Data Mart 3

Desirable DWH Functionality Design Scalability Manageability Availability Extensibility Load Data Access Architecture Metadata Flexibility Tuning and Optimization Operation Management and Revision And Integrated Accessibility Reliability Capacity Planning and Sizing Management

Data Warehouse Process Extraction Transformation/Load Flat Files ETL B 2 B Enterprise Model Click. Stream External Server Log Files Metadata Repository portal B 2 C Access layer(s) Transformation ETL Staging Area(s) Operational RDBMS Publish

- Slides: 24