OLTP DW OLAP DW Data Marts Sources OLTP

  • Slides: 119
Download presentation

OLTP – DW - OLAP DW Data Marts Sources OLTP συστήματα OLAP εργαλεία 13

OLTP – DW - OLAP DW Data Marts Sources OLTP συστήματα OLAP εργαλεία 13

Εννοιολογική εναρμόνιση Source 1: Personnel (Cobol) EMP ID Name 110 Kostas 1/1/72 … …

Εννοιολογική εναρμόνιση Source 1: Personnel (Cobol) EMP ID Name 110 Kostas 1/1/72 … … Source 2: Accounting (DB 2) EMP ID Name 110 Kostas 30 120 Mitsos 48 130 Roula Salary Total Dept. ID Income 1500 1200 132 … EMP ID IL_ID Amount 110 10 1500 110 30 300 EMP INCOME IL_ID Descr Age 29 Do. B EMP 10 Μισθός 20 Επίδομα Τέκνων 30 Φόρος . . . Income Lookup 15

Παράδειγμα Αποθήκης Δεδομένων – Telecom Co. Inventory Personnel Suppliers DW Accounting Sales Marketing Churn

Παράδειγμα Αποθήκης Δεδομένων – Telecom Co. Inventory Personnel Suppliers DW Accounting Sales Marketing Churn www. stocks. gr 22

Αρχιτεκτονική της Αποθήκης Δεδομένων Reporting / OLAP tools Quality Issues Metadata Repository ETL Sources

Αρχιτεκτονική της Αποθήκης Δεδομένων Reporting / OLAP tools Quality Issues Metadata Repository ETL Sources DSA Data Marts DW Quality Issues Administrator End User Administrator Designer 23

Αποθήκη Δεδομένων Reporting / OLAP tools Quality Issues Metadata Repository ETL Sources DSA Data

Αποθήκη Δεδομένων Reporting / OLAP tools Quality Issues Metadata Repository ETL Sources DSA Data Marts DW Quality Issues Administrator End User Administrator Designer 26

Servers & Τεχνολογικές λύσεις DW: Σχεσιακά και επεκτεταμένα σχεσιακά DBMS OLAP: Relational OLAP (ROLAP)

Servers & Τεχνολογικές λύσεις DW: Σχεσιακά και επεκτεταμένα σχεσιακά DBMS OLAP: Relational OLAP (ROLAP) Multidimensional OLAP (MOLAP) 28

Πλάνο και στατιστικά από ένα ROLAP εργαλείο select a 3. EKSAM_FOIT_CODE, max(a 3. DESCR)

Πλάνο και στατιστικά από ένα ROLAP εργαλείο select a 3. EKSAM_FOIT_CODE, max(a 3. DESCR) DESCR, a 2. SEX, (SUM(a 1. FOO 1)) M 0000000 from FACT 1 a 1, FOITITIS a 2, EKSAM_FOIT a 3 where a 2. FOITITIS_CODE = a 1. FOITITIS_CODE and a 1. EKSAM_FOIT_CODE = a 3. EKSAM_FOIT_CODE and (((((((a 2. SEX = '1')) and ((EXISTS (select * from EKSAM_FOIT m 1 where m 1. EKSAM_FOIT_CODE = a 3. EKSAM_FOIT_CODE and m 1. CATEGORY = 'ΕΑΡΙΝΟ')))) or (((a 2. SEX = '2')) and ((EXISTS (select * from EKSAM_FOIT m 1 where m 1. EKSAM_FOIT_CODE = a 3. EKSAM_FOIT_CODE and m 1. CATEGORY = 'ΕΑΡΙΝΟ'))))) or (((a 2. SEX = '1')) and ((EXISTS (select * from EKSAM_FOIT m 1 where m 1. EKSAM_FOIT_CODE = a 3. EKSAM_FOIT_CODE and m 1. CATEGORY = 'ΧΕΙΜΕΡΙΝΟ'))))) or (((a 2. SEX = '2')) and ((EXISTS (select * from EKSAM_FOIT m 1 where m 1. EKSAM_FOIT_CODE = a 3. EKSAM_FOIT_CODE and m 1. CATEGORY = 'ΧΕΙΜΕΡΙΝΟ')))))) group by a 3. EKSAM_FOIT_CODE, a 2. SEX PERFORMANCE METRICS (Seconds) Loading Parameters: 0, 0 SQL Generation: 0, 4 Executing Query: 0, 3 Results Processing: 0, 8 --------------------------------------Total Machine Time: 1, 5 Rows returned from Database : 24 31

R EG IO N Πολυδιάστατοι πίνακες PRODUCT N S Juice 10 Cola 13 W

R EG IO N Πολυδιάστατοι πίνακες PRODUCT N S Juice 10 Cola 13 W Soap Jan MONTH 33

Πολυδιάστατο μοντέλο δεδομένων Sales volume gio n Διαστάσεις: Product, Region, Date Product Re Ιεραρχίες

Πολυδιάστατο μοντέλο δεδομένων Sales volume gio n Διαστάσεις: Product, Region, Date Product Re Ιεραρχίες διαστάσεων: Industry Country Year Category Region Quarter Month Product City Store Week Month Day 38

Παράδειγμα σχήματος στο Μοντέλο Διαστάσεων R. Kimball, A Dimensional Modeling Manifesto, DBMS 40 1997

Παράδειγμα σχήματος στο Μοντέλο Διαστάσεων R. Kimball, A Dimensional Modeling Manifesto, DBMS 40 1997 Magazine, Aug.

Παράδειγμα του αντίστοιχου ER σχήματος R. Kimball, A Dimensional Modeling Manifesto, DBMS 41 1997

Παράδειγμα του αντίστοιχου ER σχήματος R. Kimball, A Dimensional Modeling Manifesto, DBMS 41 1997 Magazine, Aug.

Star Schema Time Code Quarter Name Month Code Month Name Date Account Code Key.

Star Schema Time Code Quarter Name Month Code Month Name Date Account Code Key. Account. Name Account Type Account Market Geography SALES Geography Code Time Code Account Code Product Code Dollar Amount Units Stanford Technology Group, Inc. , 1996 Geography Code Region Manager State Code City Code. . . Product Code Product Name Brand Code Brand Name Prod. Line Code Prod. Line Name 44

Account attributes Account Code Account. Name Key. Account Key. Acc Code Key. Acc Name

Account attributes Account Code Account. Name Key. Account Key. Acc Code Key. Acc Name Snowflake Schema Account Code Key. Account Code Region Mgr State Code State Name Geography Postal Code Region Code State Code City Code City Name Stanford Technology Group, Inc. , 1996 SALES Postal Code Time Code Account Code Product Code Dollar Amount Units Time Code Quarter Code Month Code Quarter Name Month Code Month Name Product Code Prod Line Code Brand Code Product. Name Brand Code Brand Name Prod. Line. Code Prod. Line. Name 48

Λειτουργικές Διαδικασίες - Data Warehouse Back-End Reporting / OLAP tools Quality Issues Metadata Repository

Λειτουργικές Διαδικασίες - Data Warehouse Back-End Reporting / OLAP tools Quality Issues Metadata Repository ETL Sources DSA Data Marts DW Quality Issues Administrator End User Administrator Designer 57

Αποκανονικοποίηση - Denormalization IL_ID Descr EMP ID IL_ID Amount 110 10 1500 110 30

Αποκανονικοποίηση - Denormalization IL_ID Descr EMP ID IL_ID Amount 110 10 1500 110 30 300 EMP ID Name 110 Kostas 30 120 Mitsos 48 130 Roula EMP INCOME Age 10 Μισθός 20 Επίδομα Τέκνων 30 Φόρος . . . Income Lookup EMP DW 29 DW. EMP ? 65

Καθαρισμός των δεδομένων – Ομογενοποίηση κλειδιών ID Descr 10 Coca 20 Pepsi Descr 10

Καθαρισμός των δεδομένων – Ομογενοποίηση κλειδιών ID Descr 10 Coca 20 Pepsi Descr 10 Pepsi 20 HBH Descr 100 Coca 110 Pepsi R 1 ID ID DW. R + + R 2 Lookup 120 HBH Source Sour Surrogate ID ce Key 10 R 1 100 20 R 1 110 10 R 2 110 20 R 2 120 70

DW = Αποθηκευμένες όψεις? PKEY, DAY MIN(COST) DW. PARTSU PP S 1_PARTSU PP U

DW = Αποθηκευμένες όψεις? PKEY, DAY MIN(COST) DW. PARTSU PP S 1_PARTSU PP U S 2_PARTSU PP Sources Aggregate 1 DW. PARTSUPP. DATE, DAY TIME V 1 PKEY, MONTH AVG(COST) Aggregate 2 DW 76 V 2

DW ≠ Αποθηκευμένες όψεις! DS. PS_NEW 1. PKEY, DS. PS_OLD 1. PKEY DIFF 1

DW ≠ Αποθηκευμένες όψεις! DS. PS_NEW 1. PKEY, DS. PS_OLD 1. PKEY DIFF 1 SUPPKEY=1 DS. PS 1 Add_SPK 1 DS. PS 1. PKEY, LOOKUP_PS. SKEY, SUPPKEY SK 1 DS. PS_NEW 2. PKEY, DS. PS_OLD 2. PKEY DIFF 2 SUPPKEY=2 DS. PS 2 Add_SPK 2 A 2 EDate rejected Log DS. PS 2. PKEY, LOOKUP_PS. SKEY, SUPPKEY COST DATE=SYSDATE Not. NULL SK 2 Log U rejected Log rejected DS. PS_OLD 2 DATE $2€ rejected DS. PS_OLD 1 DS. PS_NEW 2 COST Add. Date QTY>0 Check. QTY rejected Log DSA PKEY, DAY MIN(COST) S 1_PARTSU PP FTP 1 DW. PARTSU PP Aggregate 1 DW. PARTSUPP. DATE, DAY S 2_PARTSU PP Sources FTP 2 TIME V 1 PKEY, MONTH AVG(COST) Aggregate 2 DW 77 V 2

Συστήματα Επερώτησης - Data Warehouse Front-End Reporting / OLAP tools Quality Issues Metadata Repository

Συστήματα Επερώτησης - Data Warehouse Front-End Reporting / OLAP tools Quality Issues Metadata Repository ETL Sources DSA Data Marts DW Quality Issues Administrator End User Administrator Designer 78

Εργαλεία για υποστήριξη αποφάσεων Ad hoc ερωτήσεις και αναφορές Π. χ. , : MS

Εργαλεία για υποστήριξη αποφάσεων Ad hoc ερωτήσεις και αναφορές Π. χ. , : MS Excel, Oracle Forms, … OLAP pivot tables, drill down, roll up, slice, dice Data Mining 79

R EG IO N Κύβοι για OLAP PRODUCT N S Juice 10 Cola 13

R EG IO N Κύβοι για OLAP PRODUCT N S Juice 10 Cola 13 W Soap Jan MONTH 81

Κύβοι & ιεραρχίες διαστάσεων για OLAP Sales volume gio n Διαστάσεις: Product, Region, Date

Κύβοι & ιεραρχίες διαστάσεων για OLAP Sales volume gio n Διαστάσεις: Product, Region, Date Product Re Ιεραρχίες διαστάσεων: Industry Country Year Category Region Quarter Month Product City Store Week Month Day 86

Roll up Industry Country Year Category Region Quarter Product City Sales volume Electronics Toys

Roll up Industry Country Year Category Region Quarter Product City Sales volume Electronics Toys Clothing Cosmetics Store Day Store 1 Store 2 $5, 2 $1, 9 $2, 3 $1, 1 $8, 9 $0, 75 $4, 6 $1, 5 $5, 6 $1, 4 $2, 6 $1, 1 $7, 2 $0, 4 $4, 6 $0, 5 Χρόνος: Επίπεδο Quarter Sales volume Products Year 1996 Q 2 Q 1 Products Month Week Electronics Toys Clothing Cosmetics Store 1 Store 2 $14, 1 $2, 65 $6, 9 $2, 6 $12, 8 $1, 8 $7, 2 $1, 6 Χρόνος: Επίπεδο Year SUM(Sales volumes) 90

Drill down Industry Country Year Category Region Quarter Product City Store Electronics Store 1

Drill down Industry Country Year Category Region Quarter Product City Store Electronics Store 1 Store 2 $5, 2 $1, 9 $2, 3 $1, 1 $8, 9 $0, 75 $4, 6 $1, 5 VCR Camcorder TV CD player $5, 6 $1, 4 $2, 6 $1, 1 $7, 2 $0, 4 $4, 6 $0, 5 Item: Επίπεδο Industry Q 1 Store 2 Q 1 Q 2 Electronics Toys Clothing Cosmetics Day Sales volume Products Month Week $1, 4 $0, 6 $2, 0 $1, 2 $2, 4 $3, 3 $2, 2 $1, 0 $1, 4 $0, 6 $2, 4 $1, 2 $2, 4 $1, 3 $2, 5 $1, 0 Item: Επίπεδο Category 92

Pivot Sales volume $5, 2 $1, 9 $2, 3 $1, 1 $8, 9 $0,

Pivot Sales volume $5, 2 $1, 9 $2, 3 $1, 1 $8, 9 $0, 75 $4, 6 $1, 5 $5, 6 $1, 4 $2, 6 $1, 1 $7, 2 $0, 4 $4, 6 $0, 5 Store 1 Electronics Toys Clothing Cosmetics Store 1 Store 2 Q 2 Q 1 Products Q 1 Q 2 Electronics Toys Clothing Cosmetics $5, 2 $1, 9 $2, 3 $1, 1 $5, 6 $1, 4 $2, 6 $1, 1 $8, 9 $0, 75 $4, 6 $1, 5 $7, 2 $0, 4 $4, 6 $0, 5 Εναλλαγή γραμμών και στηλών 94

Slice and Dice Sales volume $5, 2 $1, 9 $2, 3 $1, 1 $8,

Slice and Dice Sales volume $5, 2 $1, 9 $2, 3 $1, 1 $8, 9 $0, 75 $4, 6 $1, 5 $5, 6 $1, 4 $2, 6 $1, 1 $7, 2 $0, 4 $4, 6 $0, 5 Products Store 1 Q 1 Electronics Toys Clothing Cosmetics Sales volume Store 1 Store 2 Electronics Toys $5, 2 $1, 9 Q 2 Q 1 Products Electronics Toys $8, 9 $0, 75 Διώξε το Store 2 και τις βιομηχανίες Clothing & Cosmetics 96

Τελεστές Rollup και Cube select color, make, year, sum(units) from car_sales where make in

Τελεστές Rollup και Cube select color, make, year, sum(units) from car_sales where make in {“chevy”, “ford”} and year between 1990 and 1994 group by cube color, make, year having sum(units) > 0; Aggregate Group By (with total) Sum By Color RED WHITE BLUE Cross Tab Chevy Ford Sum By Color RED WHITE BLUE CH By Make FO EV Sum Y By Year RD 0 1991991 2 199 993 1 By Make & Year RED WHITE BLUE Jim Gray Adam Bosworth Andrew Layman Microsoft Hamid Pirahesh IBM By Color & Year Sum By Make & Color By Color 98

Top N Queries SELECT P. pid, P. pname, S. sales FROM Sales S, Products

Top N Queries SELECT P. pid, P. pname, S. sales FROM Sales S, Products P WHERE S. pid=P. pid AND S. locid=1 AND ORDER BY S. sales DESC OPTIMIZE FOR 10 ROWS SELECT P. pid, P. pname, S. sales FROM Sales S, Products P WHERE S. pid=P. pid AND S. locid=1 AND S. sales > c ORDER BY S. sales DESC S. timeid=3

Online Aggregation 103

Online Aggregation 103

Ευρετήρια για OLAP ερωτήσεις Rating = 1 2 3 4 5 sex F M

Ευρετήρια για OLAP ερωτήσεις Rating = 1 2 3 4 5 sex F M custid name sex rating

Ευρετήρια για OLAP ερωτήσεις TIME timeid date week month pid timeid quarter locid year

Ευρετήρια για OLAP ερωτήσεις TIME timeid date week month pid timeid quarter locid year holiday_flag sales SALES (Fact table) PRODUCTS pid pname category price locid city LOCATIONS state country Έστω το εν λόγω σχήμα με τρεις πίνακες διαστάσεων και ένα fact table 107

Λεξικό Μεταπληροφορίας – Metadata Repository Reporting / OLAP tools Quality Issues Metadata Repository ETL

Λεξικό Μεταπληροφορίας – Metadata Repository Reporting / OLAP tools Quality Issues Metadata Repository ETL Sources DSA Data Marts DW Quality Issues Administrator End User Administrator Designer 111

The DWQ Approach 114

The DWQ Approach 114

Στάνταρτ για τη μεταπληροφορία Metadata Coalition Meta. Data Interchange Specification (MDIS) Open Information Model

Στάνταρτ για τη μεταπληροφορία Metadata Coalition Meta. Data Interchange Specification (MDIS) Open Information Model (OIM) OMG Common Warehouse Model (CWM) Microsoft Repository 115

Το χρήμα. . . Estimated sales in millions of dollars [Sh. Ty 98] (*estimates

Το χρήμα. . . Estimated sales in millions of dollars [Sh. Ty 98] (*estimates are from [Pend 00]). 117