Virtual University of Pakistan Data Warehousing Lecture-3: Introduction and Background

Virtual University of Pakistan
Data Warehousing
Lecture-3: Introduction and Background
Ahsan Abdullah
Assoc. Prof. & Head, Center for Agro-Informatics Research
www.nu.edu.pk/cairindex.asp
FAST National University of Computers & Emerging Sciences, Islamabad
DWH-Ahsan Abdullah

Introduction and Background

What is a Data Warehouse?
It is a blend of many technologies, the basic concept being:
  • Take all data from different operational systems.
  • If necessary, add relevant data from industry.
  • Transform all data and bring it into a uniform format.
  • Integrate all data as a single entity.
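The steps above (take data from several operational systems, transform it into a uniform format, integrate it as a single entity) can be illustrated with a minimal Python sketch. The two source systems, their field names, and the common layout are all hypothetical and are not taken from the lecture.

```python
from datetime import date

# Hypothetical extracts from two operational systems with different layouts.
billing_rows = [{"cust_id": "17", "amount": "250.00", "dt": "2004-03-01"}]
crm_rows = [{"CustomerNo": 17, "Since": "01/02/2001"}]  # dd/mm/yyyy


def from_billing(row):
    """Map a billing record onto the common (uniform) layout."""
    return {
        "customer_id": int(row["cust_id"]),
        "amount": float(row["amount"]),
        "event_date": date.fromisoformat(row["dt"]),
        "source": "billing",
    }


def from_crm(row):
    """Map a CRM record onto the same layout; it carries no amount."""
    day, month, year = (int(part) for part in row["Since"].split("/"))
    return {
        "customer_id": int(row["CustomerNo"]),
        "amount": None,
        "event_date": date(year, month, day),
        "source": "crm",
    }


def integrate(billing, crm):
    """Combine both sources into a single list ordered by customer and date."""
    unified = [from_billing(r) for r in billing] + [from_crm(r) for r in crm]
    return sorted(unified, key=lambda r: (r["customer_id"], r["event_date"]))


warehouse_rows = integrate(billing_rows, crm_rows)
```

After integration, every row has the same keys and types regardless of which system it came from, which is what lets later queries treat the data as one entity.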

What is a Data Warehouse? (Cont…)
It is a blend of many technologies, the basic concept being:
  • Store data in a format supporting easy access for decision support.
  • Create performance-enhancing indices.
  • Implement performance-enhancing joins.
  • Run ad-hoc queries with low selectivity.
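As a rough illustration of the points above, the sketch below uses Python's built-in `sqlite3` module to create an index, join two tables, and run an ad-hoc aggregate query. The table names, column names, and data are hypothetical, and a real warehouse would use a dedicated database engine rather than SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical schema: one descriptive table and one table of sales facts.
conn.executescript("""
    CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, region TEXT);
    CREATE TABLE sales (customer_id INTEGER, sale_year INTEGER, amount REAL);
    CREATE INDEX idx_sales_customer ON sales (customer_id);
""")
conn.executemany("INSERT INTO customer VALUES (?, ?)",
                 [(1, "North"), (2, "South")])
conn.executemany("INSERT INTO sales VALUES (?, ?, ?)",
                 [(1, 2003, 100.0), (1, 2004, 150.0), (2, 2004, 80.0)])

# Ad-hoc decision-support query: total sales per region via a join.
totals = conn.execute("""
    SELECT c.region, SUM(s.amount)
    FROM sales AS s
    JOIN customer AS c ON c.customer_id = s.customer_id
    GROUP BY c.region
    ORDER BY c.region
""").fetchall()
```

The index on `sales.customer_id` is the kind of structure a warehouse designer adds up front so that joins like this one stay fast as the data grows.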

How is it Different?
  • Fundamentally different.
[Figure: cycle diagram. Business user needs info → User requests IT people → IT people do system analysis and design → IT people create reports → IT people send reports to business user → Business user may get answers → Answers result in more questions]

How is it Different?
  • Different patterns of hardware utilization.
[Figure: hardware utilization, 0% to 100%, Operational vs. DWH]
  • Bus Service vs. Train.

How is it Different?
  • Combines operational and historical data.
  • Data entry is not done into a DWH; OLTP or ERP systems are the source systems.
  • OLTP systems don't keep history; you can't get a balance statement more than a year old.
  • A DWH keeps historical data, even of bygone customers. Why?
  • In the context of a bank, we want to know why the customer left.
  • What were the events that led to his/her leaving? Why?
  • Customer retention.
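The contrast between an operational system that holds only the current state and a warehouse that keeps history, even for customers who have left, can be sketched as follows. This is a toy model under stated assumptions: the function names, the customer number, and the balances are all hypothetical.

```python
from datetime import date

oltp_balance = {}   # operational view: current value only, overwritten in place
dwh_history = []    # warehouse view: one appended row per change, never deleted


def post_balance(customer_id, balance, as_of):
    """Update the operational record and append a snapshot to the history."""
    oltp_balance[customer_id] = balance
    dwh_history.append((customer_id, as_of, balance))


def close_account(customer_id, as_of):
    """Remove the customer operationally but keep the trail in the warehouse."""
    oltp_balance.pop(customer_id, None)
    dwh_history.append((customer_id, as_of, None))  # None marks the closure


post_balance(42, 5000.0, date(2003, 1, 31))
post_balance(42, 1200.0, date(2003, 6, 30))
close_account(42, date(2003, 7, 15))

# The operational system has forgotten customer 42; the warehouse has not.
history_for_42 = [row for row in dwh_history if row[0] == 42]
```

With the history retained, an analyst can look at the sequence of events before the closure, which is the starting point for the customer-retention questions raised on the slide.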

How much history?
Depends on:
  • Industry.
  • Cost of storing historical data.
  • Economic value of historical data.

How much history?
Industries and history:
  • Telecom: calls are far more numerous than bank transactions, so history is kept for 18 months.
  • Retailers are interested in analyzing yearly seasonal patterns: 65 weeks.
  • Insurance companies want to do actuarial analysis, using the historical data to predict risk: 7 years.
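The retention periods quoted above could be encoded as a simple lookup that decides whether a record is still inside its industry's history window. This is only a sketch: the function name is hypothetical, and approximating a month as 30 days and a year as 365 days is an assumption made here for simplicity.

```python
from datetime import date, timedelta

# Retention windows from the slide, converted to days (approximate).
RETENTION_DAYS = {
    "telecom": 18 * 30,     # 18 months, assuming 30-day months
    "retail": 65 * 7,       # 65 weeks
    "insurance": 7 * 365,   # 7 years, ignoring leap days
}


def within_retention(industry, record_date, today):
    """Return True if the record is recent enough to keep for this industry."""
    return today - record_date <= timedelta(days=RETENTION_DAYS[industry])
```

For example, with `today = date(2005, 1, 1)`, a retail record dated `date(2004, 1, 1)` is still inside the 65-week window, while one dated `date(2003, 1, 1)` is not.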

How much history?
  • Economic value of data vs. storage cost.
  • Is a Data Warehouse a complete repository of data?

How is it Different?
  • Updates are usually (but not always) periodic or batch rather than real-time.
  • The boundary is blurring with active data warehousing.
  • For an ATM, an update that is not in real time causes a lot of real trouble.
  • A DWH is for strategic decision making based on historical data; it won't hurt if the transactions of the last hour or day are absent.
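A periodic batch update, as opposed to a real-time one, can be pictured with the small sketch below. The class and method names are hypothetical; the point is only that recent transactions wait in a staging area and become visible to warehouse queries after the next batch run.

```python
class BatchLoader:
    """Toy model of batch loading: stage transactions, then load them together."""

    def __init__(self):
        self.pending = []     # transactions captured since the last batch run
        self.warehouse = []   # data visible to decision-support queries

    def record(self, txn):
        """Capture a transaction; it is not yet visible in the warehouse."""
        self.pending.append(txn)

    def run_batch(self):
        """Move all staged transactions into the warehouse; return the count."""
        loaded = len(self.pending)
        self.warehouse.extend(self.pending)
        self.pending.clear()
        return loaded


loader = BatchLoader()
loader.record({"account": 1, "amount": -500.0})
loader.record({"account": 2, "amount": 120.0})
visible_before = len(loader.warehouse)   # nothing loaded yet
loaded_count = loader.run_batch()        # e.g. run nightly
visible_after = len(loader.warehouse)
```

Between batch runs the warehouse lags the operational system, which is acceptable for strategic analysis but would not be for an ATM.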

How is it Different?
Rate of update depends on:
  • volume of data,
  • nature of business,
  • cost of keeping historical data,
  • benefit of keeping historical data.