DW Components • Data migration tools – Tools that help extract, transform and load data into the data warehouse – Three main categories • Data copying and replication • Data transformation • Data cleansing • Metadata usage – Administrative and end-user
DW Components • Warehouse data stores – Structures used to actually store the data – Typically relational DB (but not always) – Multi-dimensional DB becoming more popular • Data retrieval, formatting and analysis – Query tools – Analysis tools – Data mining • Management tools – Access control, performance monitoring, usage monitoring
Steps in Data Warehousing 1. 2. 3. 4. Identify all the sources of data Design the data warehouse Extract/Transform/Load process Decision making from DW
DW Design: Star Schema Lots of records, but each record is “thin” “Fact” Want to be able to “see” each Sale by Product, Time, Store “Dimensions” Fewer records, but each record is “fat” (lots of big columns)
E/T/L Process • Extract – Data must be extracted from source systems • Transform – Cleansing • Identify and eliminate data inconsistencies • Can be very complex, expensive, time consuming – Aggregation • Load – Must be repeated periodically – How often? – Identifying changed data