Virtual University of Pakistan Data Warehousing Lecture 3 Introduction
- Slides: 12
Virtual University of Pakistan
Data Warehousing, Lecture-3: Introduction and Background
Ahsan Abdullah, Assoc. Prof. & Head, Center for Agro-Informatics Research
www.nu.edu.pk/cairindex.asp
FAST National University of Computers & Emerging Sciences, Islamabad
DWH-Ahsan Abdullah
What is a Data Warehouse?
It is a blend of many technologies, the basic concept being:
- Take all data from different operational systems.
- If necessary, add relevant data from industry.
- Transform all data and bring it into a uniform format.
- Integrate all data as a single entity.
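
The steps above (take data from different operational systems, transform it into a uniform format, integrate it as a single entity) can be sketched in a few lines of Python. This is only an illustration: the two source systems, their field names, and the sample rows are hypothetical and do not come from the lecture.

```python
from datetime import date

# Hypothetical extracts from two operational systems with different layouts.
BILLING_ROWS = [{"cust_id": "17", "amount": "120.50", "txn_date": "2004-03-01"}]
BRANCH_ROWS = [{"CustomerNo": 17, "Amt": 80, "Date": "01/04/2004"}]  # DD/MM/YYYY


def from_billing(row):
    """Map a billing-system row to the uniform warehouse format."""
    return {
        "customer_id": int(row["cust_id"]),
        "amount": float(row["amount"]),
        "txn_date": date.fromisoformat(row["txn_date"]),
        "source": "billing",
    }


def from_branch(row):
    """Map a branch-system row (different names, date format) to the same format."""
    day, month, year = (int(part) for part in row["Date"].split("/"))
    return {
        "customer_id": int(row["CustomerNo"]),
        "amount": float(row["Amt"]),
        "txn_date": date(year, month, day),
        "source": "branch",
    }


def integrate(billing_rows, branch_rows):
    """Combine both sources into one list, ordered by customer and date."""
    unified = [from_billing(r) for r in billing_rows]
    unified += [from_branch(r) for r in branch_rows]
    return sorted(unified, key=lambda r: (r["customer_id"], r["txn_date"]))


warehouse = integrate(BILLING_ROWS, BRANCH_ROWS)
```

After integration every row has the same keys and types regardless of which system produced it, which is what lets later queries treat the data as a single entity.
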
What is a Data Warehouse? (Cont…)
It is a blend of many technologies, the basic concept being:
- Store data in a format supporting easy access for decision support.
- Create performance-enhancing indices.
- Implement performance-enhancing joins.
- Run ad-hoc queries with low selectivity.
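
As a rough illustration of these points, the sketch below uses Python's built-in sqlite3 module to store data in a query-friendly layout, create an index, and run an ad-hoc query that joins two tables and touches every row. The table names, columns, and generated data are assumptions made for the example, and "low selectivity" is read here as a query that reads a large share of the table instead of a single record.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical decision-support tables: one table of sales, one of stores.
conn.execute(
    "CREATE TABLE sales (sale_id INTEGER PRIMARY KEY, store_id INTEGER, "
    "product_id INTEGER, amount REAL)"
)
conn.execute("CREATE TABLE store (store_id INTEGER PRIMARY KEY, region TEXT)")

conn.executemany("INSERT INTO store VALUES (?, ?)", [(1, "North"), (2, "South")])
rows = [(i, 1 + i % 2, i % 5, 10.0) for i in range(1, 101)]
conn.executemany("INSERT INTO sales VALUES (?, ?, ?, ?)", rows)

# An index on the join column, created to speed up joins and grouping.
conn.execute("CREATE INDEX idx_sales_store ON sales (store_id)")

# Ad-hoc query: aggregates over all sales rows, joined to the store table.
totals = conn.execute(
    "SELECT st.region, COUNT(*), SUM(s.amount) "
    "FROM sales s JOIN store st ON st.store_id = s.store_id "
    "GROUP BY st.region ORDER BY st.region"
).fetchall()
```

An operational query would typically fetch one sale by its key; the query here summarizes all sales by region, which is the kind of question a decision-support user asks.
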
How is it Different?
- Fundamentally different.
- [Figure: cycle of report requests between the business user and IT. Diagram labels: business user needs info; user requests IT people; IT people do system analysis and design; IT people create reports; IT people send reports to business user; business user may get answers; answers result in more questions.]
How is it Different?
- Different patterns of hardware utilization.
- [Figure: hardware utilization (0% to 100%), operational systems vs. DWH; analogy: bus service vs. train.]
How is it Different?
- Combines operational and historical data.
- Data entry is not done in a DWH; OLTP or ERP systems are the source systems.
- OLTP systems don't keep history; you cannot get a balance statement more than a year old.
- A DWH keeps historical data, even of bygone customers. Why?
  - In the context of a bank, we want to know why the customer left.
  - What were the events that led to his/her leaving? Why?
  - Customer retention.
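
The contrast between an operational system that keeps only the current state and a warehouse that keeps history can be sketched as follows. The account number, dates, and balances are invented for illustration, and the two in-memory structures merely stand in for an OLTP table and a DWH table.

```python
from datetime import date

oltp_balance = {}  # operational view: customer_id -> current balance only
dwh_history = []   # warehouse view: append-only (customer_id, as_of, balance)


def post_balance(customer_id, as_of, balance):
    """Overwrite the operational balance and append a snapshot to the history."""
    oltp_balance[customer_id] = balance
    dwh_history.append((customer_id, as_of, balance))


def close_account(customer_id):
    """The operational system forgets the customer; the history is untouched."""
    oltp_balance.pop(customer_id, None)


def history_for(customer_id):
    """Return the (date, balance) snapshots kept for one customer."""
    return [(as_of, bal) for cid, as_of, bal in dwh_history if cid == customer_id]


# Hypothetical customer 42 draws the account down and then leaves the bank.
post_balance(42, date(2003, 1, 31), 5000.0)
post_balance(42, date(2003, 2, 28), 1200.0)
post_balance(42, date(2003, 3, 31), 0.0)
close_account(42)
```

After the account is closed the operational store has no trace of customer 42, while the history still shows the falling balance that preceded the departure, which is the raw material for a customer-retention analysis.
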
How much history?
- Depends on:
  - Industry.
  - Cost of storing historical data.
  - Economic value of historical data.
How much history?
- Industries and history:
  - Telecom: call volumes are much higher than bank transaction volumes: 18 months.
  - Retailers are interested in analyzing yearly seasonal patterns: 65 weeks.
  - Insurance companies perform actuarial analysis, using historical data to predict risk: 7 years.
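
The retention periods listed above can be turned into a cutoff date, that is, the oldest date whose data would still be kept. The sketch below is one way to do that arithmetic; the dictionary keys, function names, and the day-of-month clamp are assumptions of this example, not part of the lecture.

```python
from datetime import date, timedelta

# Retention periods from the slide, expressed as (unit, length).
RETENTION = {
    "telecom": ("months", 18),
    "retail": ("weeks", 65),
    "insurance": ("years", 7),
}


def months_back(d, months):
    """Step back a whole number of months (day clamped to 28 for simplicity)."""
    total = d.year * 12 + (d.month - 1) - months
    year, month_index = divmod(total, 12)
    return date(year, month_index + 1, min(d.day, 28))


def retention_cutoff(industry, as_of):
    """Return the oldest date still inside the retention window."""
    unit, length = RETENTION[industry]
    if unit == "weeks":
        return as_of - timedelta(weeks=length)
    if unit == "months":
        return months_back(as_of, length)
    return months_back(as_of, 12 * length)  # years
```

For example, calling `retention_cutoff("retail", date(2005, 6, 15))` steps back 65 weeks, a little over one year, so that a full yearly seasonal cycle plus some overlap remains available for comparison.
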
How much history?
- Economic value of data vs. storage cost.
- Is a data warehouse a complete repository of data?
How is it Different?
- Usually (but not always) periodic or batch updates rather than real-time.
- The boundary is blurring for active data warehousing.
- For an ATM, if the update is not in real time, there is a lot of real trouble.
- A DWH is for strategic decision making based on historical data. It won't hurt if the transactions of the last hour or day are absent.
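
A minimal sketch of periodic (batch) loading, as opposed to real-time updates, is shown below. The class name, method names, and transaction strings are hypothetical; the point is only that new transactions wait in a staging area and become visible in the warehouse when the next batch runs.

```python
class BatchLoader:
    """Collects operational transactions and loads them into the warehouse in batches."""

    def __init__(self):
        self.staged = []     # transactions seen since the last load
        self.warehouse = []  # data visible to decision-support queries

    def record(self, txn):
        """Called as transactions happen; nothing reaches the warehouse yet."""
        self.staged.append(txn)

    def load_batch(self):
        """Run periodically (for example nightly): move staged rows in one go."""
        loaded = len(self.staged)
        self.warehouse.extend(self.staged)
        self.staged.clear()
        return loaded


loader = BatchLoader()
loader.record("withdrawal 500")
loader.record("deposit 2000")
visible_before_load = len(loader.warehouse)  # recent transactions are absent
loaded_count = loader.load_batch()
```

Between loads the warehouse lags the operational system, which is acceptable for strategic analysis of historical data but would not be acceptable for something like an ATM balance.
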
How is it Different?
- Rate of update depends on:
  - volume of data,
  - nature of business,
  - cost of keeping historical data,
  - benefit of keeping historical data.