Data Warehousing Databases support Transaction Processing Systems operational

Data Warehousing

Databases support: • Transaction Processing Systems – operational level decision – recording of transactions • Decision Support Systems – tactical and strategic decision making – analysis of historical records

Can one database support both? DSS RDBMS TPS

Can one database support both? DSS RDBMS TPS • low concurrency • high concurrency • large reads • small transactions • significant aggregation • limited aggregation Yes… but at a cost in performance.

The Solution… TPS DSS Production Database (OLTP) Data Warehouse Extract, Transport & Transformation Load

OLTP vs DW Characteristics OLTP Database Data Warehouse High Read/Write Concurrency Primarily Read Only Highly Normalized Highly Denormalized Limited Transaction History Massive Transaction History Very Detailed Data Detailed and Summarized Data Limited External Data Significant External Data

Data Marts (3 -tier approach) External Data Sources Production Database (OLTP) Data Mart A Data Warehouse ETL Transformation & Limitation Data Mart B Data Mart C DSS DSS

Data Marts (bottom-up approach) External Data Sources Data Mart A ETL Production Database (OLTP) ETL External Data Sources Data Mart B Data Mart C DSS DSS

Multi-dimensional (Sales) Data 25 40 90 50 30 70 55 60 35 Lime Soda 60 Orange Soda Arizona 110 Diet Soda Utah 80 Soda California March 3 March 2 March 1

Cube Operations • • • Cube (group by option) Slice (implement in Oracle with where clause) Drill Down (implemented in report writers) Roll-up (group by option) Pivot (not implemented by Oracle (but by Access))

Cube Data Example Create table sales ( Item varchar 2(20), State varchar 2(20), Amount number(6), Day date); Insert into Sales values('Soda', 'California', 80, '01 -Mar-2004'); Insert into Sales values('Diet Soda', 'California', 110, '01 -Mar-2004'); …

Examine these queries Select * from sales; Select Item, State, sum(amount) from sales group by Item, State; Select Item, State, sum(amount) from sales group by Rollup(Item, State); Select State, Item, sum(amount) from sales group by Rollup(State, Item); Select State, Item, sum(amount) from sales group by Cube(State, Item);

Materialized Views We looked at Materialized views earlier this year. Materialized views are one of the primary tools for Data Warehousing in Oracle. Recall the materialized views are schema objects that can be used to summarize, precompute, replicate, and distribute data. Unlike, regular views, they are not constructed when requested (like a query) but are actually materialized in secondary storage making them much faster. In data warehouses, materialized views are used to precompute and store aggregated data such as sums and averages. Materialized views in these environments are typically referred to as summaries because they store summarized data.

RDBMS Star Schema A star schema presents a set of tables that are centered around an individual event of interest. This makes it easy for users of mining and reporting applications to analyze the event across many dimensions. Item Store Item. ID Store. ID Name Unit. Price Sales Brand Sales. NO Category Sales. Units Sales. Dollars Manager Street City Zip Sales. Cost Item. ID Customer Cust. ID Day Cust. ID Store. ID Day. ID Name Day. ID Day. Of. Month Phone Month Street Year City Day. Of. Week
- Slides: 14