IST 722 Data Warehousing SSIS Demo Michael A
- Slides: 14
IST 722 Data Warehousing SSIS Demo Michael A. Fudge, Jr.
Recall: Kimball Lifecycle
SSIS Demo… • Using the Fudgemart Employee Timesheets Dimensional Model from before
High-Level Source to Target Map SOURCE STAGE Employee_timesheets DW stg. Fudgemart. Employee. Timesheets Fudgemart Fact. Employee. Timesheets Employees Supervisors stg. Fudgemart. Employees Dim. Employees Stage External Sources stg. Date Dimension STAGING Stage_Employee. Time. Sheets. dtsx “Truncate and Load Patten” DW Dim. Date LOAD TO DW DW_Employee. Timesheets. dtsx “Type 1/2 SCD Pattern”
The ETL Packages 1. Date. Dimension. Import. dtsx • Imports the Date Dimension. (One time deal) • One package to go from source to stage to target. 2. Stage_Employee. Timesheets. dtsx • Stage Dimension and Fact Data as-is using the truncate and load pattern. 3. DW_Employee. Timesheets. dtsx • Transform staged data into the required Dimensions and Facts. • Load with Type 2 or 1 SCD pattern, as to not re-process the same data. 4. Package. dtsx • Combine steps 1 -3 into one package.
1. Date. Dimension. Import. dtsx • We will walk through how it works. • We’ll skip making it because it’s covered in the lab!
2. Stage_Employee. Timesheets. dtsx Staging Process for Truncate and load: 1. Data Flow – From Source to Stage 2 1 1. Source – Use an SQL Command to match target attributes 2. Target – Create new staged table and import data as-is. 1. 1 2. Include an SQL task to truncate the table before import 3. Repeat for each Source to Stage 1. 2 3
Demo: Stage Fudgemart Timesheet Data
3. DW_Employee. Timesheets. dtsx Type 2 SCD Processing of Dimensions 1. Data Flow from Stage To Dimension 1. Load Data Source from Stage 2. Transform Data from Source to Match Target 3. Lookup Surrogate Key Pipeline 4. Process changes using SCD Type 2 • Repeat these steps for each dimension • Steps 1. 2 and 1. 3 vary based on the dimension 1. 1 1. 2 1. 3 1. 4 1
Demo: DW Fudgemart Timesheet Data Dimension Processing
3. DW_Employee. Timesheets. dtsx Type 1 SCD Processing of Facts. 2. Data Flow from Stage To Fact 1. Load Data source from stage 2. Transform Data from Source to Match Target – Calculate facts. 3. Lookup Surrogate Key Pipeline 4. Process changes using SCD Type 2 • All steps are required for most fact tables. 2 2. 1 2. 2 2. 3 2. 4
4. Package. dtsx • Once package to execute the others. • This package would get scheduled to execute on a routine basis. • Production Changes: • No Date Dimension • Do not stage all data, but stage based on last processed.
Demo: Create Main Package
IST 722 Data Warehousing SSIS Demo Michael A. Fudge, Jr.
- Ssis-722
- Sumyslice
- A kingdom divided
- Grammer 722
- Grammar
- What is data mining and data warehousing
- Hive
- Datamart olap
- Olap data warehouse
- Best practices data warehousing
- Introduction to data warehousing and data mining
- Informational data store in data warehouse
- Greenplum data warehousing
- Data warehouse component
- Data warehouse project plan