40166 Oracle Database 10 g SQL Model Clause
40166 Oracle Database 10 g SQL Model Clause Andy Witkowski, Architect Thomas Kyte, VP Oracle Corporation
What’s now in SQL for Modeling Ÿ Aggregation Enhancements – Cube, Rollup, Grouping Sets – New aggregates: Inverse Distribution, FIRST/LAST, etc Ÿ Analytic Functions – Window Functions: Rank, Moving, Cumulative – Statistical Functions: Correlation, Linear Regression, etc Ÿ Old tools still have more modeling power than SQL Model enhances SQL with modeling power – Spreadsheets, MOLAP engines
Case Study – Modeling with Excel Ÿ Excel fits well at the personal scale – – – UI and Formatting Calculations (build-in functions, formulas) What-If analysis Ÿ Excel fits poorly at corporate scale for modeling Cryptic row-column addressing – No metadata, No standards, No mathematical model – 100 s of spreadsheets and consolidation by hand – Does not scale (1000’s formulas, TB of data) Replace Excel Modeling with SQL Modeling – Perpetual data exchange: databases->Excel –
Modeling with SQL Model Ÿ Language: Spreadsheet-like calculations in SQL – – – – Inter-row calculation. Treats relations as an N-Dim array Symbolic references to cells and their ranges Multiple Formulas over N-Dim arrays Automatic Formula Ordering Recursive Model Solving Model is a relation & can be processed further in SQL Multiple arrays with different dimensionality in one query Ÿ Performance – – – Parallel Processing in partitioning & formulas Multiple-self joins with one data access structure Multiple UNIONs with one data access structure Ÿ Why Better? – – – Automatic Consolidation (models as views – combine using SQL) Self Adjusting (as database changes no need to re-define) One version of truth (calc directly over data base, no exchange)
SQL Model Concepts
prod time vcr dvd 2001 Define Relation as Array s 9 0 Relation time Array SELECT prod, time, s FROM sales 1 2 3 4 1999 5 6 7 8 2000 9 0 1 2 2001 dvd tv pc vcr prod DIMENSION BY (prod, time) MEASURES (s)
prod time vcr dvd 2001 Define Business Rules s 9 0 Relation time Array SELECT prod, time, s FROM sales 1 2 3 4 1999 5 6 7 8 2000 9 0 1 2 2001 dvd tv pc vcr DIMENSION BY (prod, time) MEASURES (s) RULES UPSERT ( Sales in 2000 2 x of previous year s[ANY, 2000] = s[CV(prod), CV(time) - 1] * 2, s[vcr, 2002] = s[vcr, 2001] + s[vcr, 2000], Predict vcr sales in 2002 s[dvd, 2002] =AVG(s) [CV(prod), time<2001] Predict dvd sales in 2002 ) prod
prod time vcr dvd 2001 Evaluate Formulas – 1 st s 9 0 Relation Array SELECT prod, time, s FROM sales 1 2 3 4 1999 2 4 6 8 2000 9 0 1 2 2001 dvd tv pc vcr DIMENSION BY (prod, time) MEASURES (s) RULES UPSERT ( Sales in 2000 2 x of previous year s[ANY, 2000] = s[CV(prod), CV(time) - 1] * 2, s[vcr, 2002] = s[vcr, 2001] + s[vcr, 2000], Predict vcr sales in 2002 s[dvd, 2002] = AVG(s) [CV(prod), time<2001] Predict dvd sales in 2002 )
prod time vcr dvd 2001 Evaluate Formulas – 2 nd s 9 0 Relation SELECT prod, time, s FROM sales 1 2 3 4 1999 2 4 6 8 2000 9 0 1 2 2001 DIMENSION BY (prod, time) MEASURES (s) 2002 11 RULES UPSERT ( Sales in 2000 2 x of previous year s[ANY, 2000] = s[CV(prod), CV(time) - 1] * 2, s[vcr, 2002] = s[vcr, 2001] + s[vcr, 2000], Predict vcr sales in 2002 s[dvd, 2002] = AVG(s) [CV(prod), time<2001] Predict dvd sales in 2002 ) vcr dvd tv pc
prod time vcr dvd 2001 Evaluate Formulas – 3 rd s 9 0 Relation SELECT prod, time, s FROM sales 1 2 3 4 1999 2 4 6 8 2000 9 0 1 2 2001 11 3 DIMENSION BY (prod, time) MEASURES (s) 2002 RULES UPSERT ( Sales in 2000 2 x of previous year s[ANY, 2000] = s[CV(prod), CV(time) - 1] * 2, s[vcr, 2002] = s[vcr, 2001] + s[vcr, 2000], Predict vcr sales in 2002 s[dvd, 2002] = AVG(s) [CV(prod), time<2001] Predict dvd sales in 2002 ) vcr dvd tv pc
prod time vcr dvd 2001 Return as Relation s 9 0 Relation SELECT prod, time, s FROM sales 1 2 3 4 1999 2 4 6 8 2000 9 0 1 2 2001 2002 11 3 vcr dvd DIMENSION BY (prod, time) MEASURES (s) tv pc Self-join + UNION RULES UPSERT ( s[ANY, 2000] = s[CV(prod), CV(time) - 1] * 2, s[vcr, 2002] = s[vcr, 2001] + s[vcr, 2000], s[dvd, 2002] = AVG(s) [CV(prod), time<2001] ) Relation again vcr dvd 2001 9 0 vcr dvd 2002 11 3 Rows updated & inserted by the Model clause
Model Clause – Components Model clause Partitioning SELECT region, prod, time, s Dims of array FROM sales GROUP BY region, prod, time MODEL PARTITION BY (region) DIMENSION BY (prod, time) MEASURES (sum(sales) s, count(sales) c) RULES ITERATE ( ) UNTIL ( ) Model options ( s[ANY, 2000] = s[CV(prod), CV(time) - 1], s[dvd, 2003] = s[dvd, 2002] + s[dvd, 2001], Formulas UPSERT s[vcr, 2003] = AVG(s) [vcr, time < 2001] ) ORDER BY region, product, time, s; Formula Options
Key Concepts (1) Ÿ New SQL Model Clause: – – – Data as N-dim arrays with DIMENSIONS & MEASURES Data can be PARTITION-ed - creates an array per partition Formulas defined over the arrays express a (business) model Ÿ Formulas within a Model: – – – Use symbolic addressing using familiar array notation Can be ordered automatically based on dependency between cells Can be recursive with a convergence condition – recursive models Can UPDATE or UPSERT cells Support most SQL functions including aggregates
Key Concepts (2) Ÿ Result of a SQL Model is a relation – – Can participate further in processing via joins, etc. Can define views containing Model computations Ÿ SQL Model is the last query clause – – Executed after joins, aggregation, window functions Before ORDER BY Ÿ Main Model and Reference Models – Can relate models of different dimensionality
Formula Fundamentals (1) Ÿ Formulas: SQL expressions over cells with aggs, functions, etc. Ÿ Formula has a left and right side and represents assignment – – s[‘vcr’, 2002] = s[‘vcr’, 2001] + s[‘vcr’, 2000] s[‘vcr’, 2002] = AVG(s)[‘vcr’, t<2002] – single ref – multi ref on right Ÿ Left side can qualify multiple cells – – – s[p IN (‘vcr’, ’dvd’), t<2002] = 1000 – multi ref on left s[ANY, t=2002] = 2 * s[CV(p), CV(t)-1] – left-right correlation s[p IN (SELECT prod FROM prod_tb), 2000] = 1000 Ÿ Formula can operate in update or upsert mode – – update s[‘vcr’, 2002] = s[‘vcr’, 2001] + s[‘vcr’, 2000] upsert s[‘vcr’, 2002] = s[‘vcr’, 2001] + s[‘vcr’, 2000]
Formula Fundamentals (2) Ÿ Function CV(dimension) propagates values from left to the right side. In example, products in 2002 are sum of two previous years. s[ANY, 2002] = s[CV(p), CV(t)-1] +s[CV(p), CV(t) – 2]
Formula Fundamentals (2) Ÿ Function CV(dimension) propagates values from left to the right side. In example, products in 2002 are sum of two previous years. s[ANY, 2002] = s[CV(p), CV(t) -1] + s[CV(p), CV(t) – 2] Ÿ Formula result can depend on processing order. Can specify order in each formula. E. g. , shift by time: s[‘vcr’, ANY] ORDER BY t = s[‘vcr’, CV(t) - 1]
Formula Fundamentals (2) Ÿ Function CV(dimension) propagates values from left to the right side. In example, products in 2002 are sum of two previous years. s[ANY, 2002] = s[CV(p), CV(t) -1] + s[CV(p), CV(t) – 2] Ÿ Formula result can depend on processing order. Can specify order in each formula. E. g. , shift by time: s[‘vcr’, ANY] ORDER BY t = s[‘vcr’, CV(t) - 1] vcr vcr vcr 2001 2002 2003 2004 2005 300. 00 350. 00 400. 00 450. 00 500. 00 ORDER BY t 0 300. 00
Formula Fundamentals (2) Ÿ Function CV(dimension) propagates values from left to the right side. E. g, products in 2002 are sum of two previous years s[ANY, 2002] = s[CV(p), CV(t) -1] + s[CV(p), CV(t) – 2] Ÿ Formula result can depend on processing order. Can specify order in each formula. E. g. , shift by time: s[‘vcr’, ANY] ORDER BY t = s[‘vcr’, CV(t) - 1] vcr vcr vcr 2001 2002 2003 2004 2005 300. 00 350. 00 400. 00 450. 00 500. 00 ORDER BY t 0 300. 00 350. 00
Formula Fundamentals (2) Ÿ Function CV(dimension) propagates values from left to the right side. E. g, products in 2002 are sum of two previous years s[ANY, 2002] = s[CV(p), CV(t) -1] + s[CV(p), CV(t) – 2] Ÿ Formula result can depend on processing order. Can specify order in each formula. E. g. , shift by time: s[‘vcr’, ANY] ORDER BY t = s[‘vcr’, CV(t) - 1] vcr vcr vcr 2001 2002 2003 2004 2005 300. 00 350. 00 400. 00 450. 00 500. 00 ORDER BY t 0 300. 00 350. 00 400. 00 450. 00
Model Options – Fundamentals global options MODEL [ UNIQUE DIMENSIONS | UNIQUE SINGLE REFERENCE ] [ IGNORE NAV | KEEP NAV ] PARTITION BY (r) DIMENSION BY (p, t) MEASURES (s) RULES [ UPDATE | UPSERT ] rule [ AUTOMATIC ORDER | SEQUENTIAL ORDER ] ( s[ANY, 2002] = 1. 2 * s[CV(product), 2002], s[‘dvd’, 2003] = s[‘dvd’, 2002] + s[‘dvd’, 2001], s[‘vcr’, 2003] = AVG(s) [‘vcr’, t BETWEEN 1995 AND 2000], s[‘video’, 2003] = s[‘dvd’, 2003], s[‘vcr’, 2003] ) options
NAV Options: Handling Sparse Data West dvd 2001 300. 00 West tv vcr 2002 2001 2002 500. 00 200. 00 400. 00 MODEL KEEP NAV PARTITION BY (r) DIMENSION BY (p, t) MEASURES (s) RULES UPSERT ( s[‘dvd’, 2003] = s[‘dvd’, 2002] + s[‘dvd’, 2001] s[‘tv’ , 2003] = sum(s) [‘tv’, t BETWEEN 2001 AND 2002] ) keep nav 2001 West West dvd tv vcr 2001 2002 2003 2001 2002 300. 00 500. 00 200. 00 400. 00 ?
NAV Options: Handling Sparse Data West dvd 2001 300. 00 West tv vcr 2002 2001 2002 500. 00 200. 00 400. 00 MODEL INGNORE NAV PARTITION BY (r) DIMENSION BY (p, t) MEASURES (s) RULES UPSERT ( s[‘dvd’, 2003] = s[‘dvd’, 2002] + s[‘dvd’, 2001] s[‘tv’ , 2003] = sum(s) [‘tv’, t BETWEEN 2001 AND 2002] ) ignore nav assume 0 West West dvd tv vcr 2001 2002 2003 2001 2002 300. 00 500. 00 200. 00 400. 00
NAV Options: Handling Sparse Data West dvd 2001 300. 00 West tv vcr 2002 2001 2002 500. 00 200. 00 400. 00 MODEL KEEP NAV PARTITION BY (r) DIMENSION BY (p, t) MEASURES (s) RULES UPSERT ( s[‘dvd’, 2003] = s[‘dvd’, 2002] + s[‘dvd’, 2001] s[‘tv’ , 2003] = sum(s) [‘tv’, t BETWEEN 2001 AND 2002] ) keep nav ignore nav 2001 West West dvd tv vcr 2001 2002 2003 2001 2002 300. 00 500. 00 200. 00 400. 00 assume 0 ? West West dvd tv vcr 2001 2002 2003 2001 2002 300. 00 500. 00 200. 00 400. 00
Automatic Formula Ordering MODEL PARTITION BY (r) DIMENSION BY (p, t) MEASURES (s) RULES UPDATE AUTOMATIC ORDER ( F 1: s[‘vcr’, 2003] = AVG(s) [‘vcr’, t BETWEEN 1995 AND 2002], F 2: s[‘dvd’, 2003] = s[‘dvd’, 2002] + s[‘dvd’, 2001], F 3: s[ANY, 2002] = s[CV(p), 2001] + s[CV(p), 1999] )
Automatic Formula Ordering MODEL PARTITION BY (r) DIMENSION BY (p, t) MEASURES (s) RULES UPDATE AUTOMATIC ORDER ( F 1: s[‘vcr’, 2003] = AVG(s) [‘vcr’, t BETWEEN 1995 AND 2002], F 2: s[‘dvd’, 2003] = s[‘dvd’, 2002] + s[‘dvd’, 2001], F 3: s[ANY, 2002] = s[CV(p), 2001] + s[CV(p), 1999] ) F 1 depends on F 3 and F 2 depends on F 3, thus F 3 automatically first:
Automatic Formula Ordering MODEL PARTITION BY (r) DIMENSION BY (p, t) MEASURES (s) RULES UPDATE AUTOMATIC ORDER ( F 1: s[‘vcr’, 2003] = AVG(s) [‘vcr’, t BETWEEN 1995 AND 2002], F 2: s[‘dvd’, 2003] = s[‘dvd’, 2002] + s[‘dvd’, 2001], F 3: s[ANY, 2002] = s[CV(p), 2001] + s[CV(p), 1999] ) F 1 depends on F 3 and F 2 depends on F 3, thus F 3 automatically first: RULES UPDATE AUTOMATIC ORDER ( F 3: s[ANY, 2002] = s[CV(p), 2001] + s[CV(p), 1999] F 2: s[‘dvd’, 2003] = s[‘dvd’, 2002] + s[‘dvd’, 2001], F 1: s[‘vcr’, 2003] = AVG(s) [‘vcr’, t BETWEEN 1995 AND 2000] )
UPDATE, UPSERT & Partitions Region East West Product Time dvd 2001 dvd 2002 vcr 2002 dvd 2001 s 100 150 100 200 MODEL PARTITION BY (r) DIMENSION BY (p, t) MEASURES (s) RULES UPDATE IGNORE NAV ( UPDATE s[‘vcr’, 2002] = s[‘vcr’, 2002] * 1. 2, UPSERT s[‘dvd’, 2003] = s[‘dvd’, 2002] + s[‘dvd’, 2001] )
UPDATE, UPSERT & Partitions Region East West Product Time dvd 2001 dvd 2002 vcr 2002 dvd 2001 s 100 150 100 200 MODEL PARTITION BY (r) DIMENSION BY (p, t) MEASURES (s) RULES UPDATE IGNORE NAV ( UPDATE s[‘vcr’, 2002] = s[‘vcr’, 2002] * 1. 2, UPSERT s[‘dvd’, 2003] = s[‘dvd’, 2002] + s[‘dvd’, 2001] ) Region East Product Time dvd 2001 dvd 2002 vcr 2002 Old s 100 150 100 New s 100 120 West dvd 200 2001 updated
UPDATE, UPSERT & Partitions Region East West Product Time dvd 2001 dvd 2002 vcr 2002 dvd 2001 s 100 150 100 200 MODEL PARTITION BY (r) DIMENSION BY (p, t) MEASURES (s) RULES UPDATE IGNORE NAV ( UPDATE s[‘vcr’, 2002] = s[‘vcr’, 2002] * 1. 2, UPSERT s[‘dvd’, 2003] = s[‘dvd’, 2002] + s[‘dvd’, 2001] ) Region East West Product Time dvd 2001 dvd 2002 vcr 2002 dvd 2003 dvd 2001 dvd 2003 Old s 100 150 100 200 - New s 100 120 250 200 updated upserted
Different dimensions: Reference Relate Models with different dimensions. Represent each as ndimensional array: one main, others as reference or lookup arrays. Sales Table c p USA dvd USA tv Poland vcr France vcr t 2001 s 300. 00 $ 500. 00 $ 200. 00 zl 100. 00 fr Conv table converts currency to $ c USA Poland France ratio 1 0. 24 0. 12 SELECT c, p, t, s FROM sales MODEL REFERENCE convert ON (SELECT c, ratio FROM conv) DBY (c) MEASURES(r) MAIN DIMENSION BY (c, p, t) MEASURES (s) RULES UPSERT ( s[ANY, ANY] = r[CV(c)] * s[CV(c), CV(p), CV(t)] )
Different dimensions: Reference Sales Table c p USA dvd USA tv Poland vcr France vcr t 2001 Conv table converts currency to $ c ratio USA 1 Poland 0. 24 France 0. 12 s 300. 00 $ 500. 00 $ 200. 00 zl 100. 00 fr SELECT c, p, t, s FROM sales MODEL REFERENCE convert ON (SELECT c, ratio FROM conv) DBY (c) MEASURES(r) MAIN DIMENSION BY (c, p, t) MEASURES (s) RULES UPSERT ( s[ANY, ANY] = r[CV(c)] * s[CV(c), CV(p), CV(t)] ) USA dvd USA tv Poland vcr France vcr 2001 300. 00 $ 500. 00 $ 48. 00 $ 12. 00 $ Converted values
Recursive Model Solving Ÿ Model can contain cyclic (recursive) formulas. - If cyclic formulas desired, use ITERATE option - If ITERATE not present, cyclic formulas automatically detected, and an error reported. Ÿ Use ITERATE clause to specify # of iterations or Ÿ Use UNTIL clause to specify convergence conditions SELECT x, s FROM dual MODEL DIMENSION BY (1 x) MEASURES (1024 s) RULES ITERATE (8) ( s[1] = s[1] / 2 ) Iteration 1 2 S value 1024 512 3 128 4 64 5 32 6 16 7 8 8 4
Recursive Model Solving Ÿ Model can contain cyclic (recursive) formulas. - If cyclic formulas desired, use ITERATE option - If ITERATE not present, cyclic formulas automatically detected, and an error reported. Ÿ Use ITERATE clause to specify # of iterations or Ÿ Use UNTIL clause to specify convergence conditions SELECT x, s FROM dual MODEL DIMENSION BY (1 x) MEASURES (1024 s) RULES ITERATE (8) ( s[1] = s[1] / 2 ) Iteration 1 2 S value 1024 512 3 128 4 64 5 32 6 16 7 8 8 4
Recursive Model Solving Ÿ Model can contain cyclic (recursive) formulas. - If cyclic formulas desired, use ITERATE option - If ITERATE not present, cyclic formulas automatically detected, and an error reported. Ÿ Use ITERATE clause to specify # of iterations or Ÿ Use UNTIL clause to specify convergence conditions SELECT x, s FROM dual MODEL DIMENSION BY (1 x) MEASURES (1024 s) RULES ITERATE (8) ( s[1] = s[1] / 2 ) Iteration 1 2 S value 1024 512 3 128 4 64 5 32 6 16 7 8 8 4
Recursive Model Solving Ÿ Model can contain cyclic (recursive) formulas. - If cyclic formulas desired, use ITERATE option - If ITERATE not present, cyclic formulas automatically detected, and an error reported. Ÿ Use ITERATE clause to specify # of iterations or Ÿ Use UNTIL clause to specify convergence conditions SELECT x, s FROM dual MODEL DIMENSION BY (1 x) MEASURES (1024 s) RULES ITERATE (8) ( s[1] = s[1] / 2 ) Iteration 1 2 S value 1024 512 3 128 4 64 5 32 6 16 7 8 8 4
Recursive Model Solving with Until Ÿ Model can contain cyclic (recursive) formulas. - If cyclic formulas desired, use ITERATE option - If ITERATE not present, cyclic formulas automatically detected, and an error reported. Ÿ Use ITERATE clause to specify # of iterations or Ÿ Use UNTIL clause to specify convergence conditions SELECT x, s FROM dual MODEL DIMENSION BY (1 x) MEASURES (1024 s) RULES ITERATE (10000) UNTIL (PREVIOUS(s[1]) – s[1] <= 1) ( s[1] = s[1] / 2 ) Iteration 1 2 S value 1024 512 3 256 previous(s[1]) - s[1] = 512 4 128 5 64 6 32 7 16 8 8 9 4 10 2
Recursive Model Solving with Until Ÿ Model can contain cyclic (recursive) formulas. - If cyclic formulas desired, use ITERATE option - If ITERATE not present, cyclic formulas automatically detected, and an error reported. Ÿ Use ITERATE clause to specify # of iterations or Ÿ Use UNTIL clause to specify convergence conditions SELECT x, s FROM dual MODEL DIMENSION BY (1 x) MEASURES (1024 s) RULES ITERATE (10000) UNTIL (PREVIOUS(s[1]) – s[1] <= 1) ( s[1] = s[1] / 2 ) Iteration 1 2 S value 1024 512 3 256 previous(s[1]) - s[1] = 256 4 128 5 64 6 32 7 16 8 8 9 4 10 2
Recursive Model Solving with Until Ÿ Model can contain cyclic (recursive) formulas. They are automatically detected, and error is reported. Unless cycles are intentional which is indicated with ITERATE option Ÿ Use ITERATE clause to specify # of iterations or Ÿ Use UNTIL to specify convergence conditions. Stop if true. SELECT x, s FROM dual MODEL DIMENSION BY (1 x) MEASURES (1024 s) RULES ITERATE (10000) UNTIL (PREVIOUS(s[1]) – s[1] <= 1) ( s[1] = s[1] / 2 ) Iteration 1 2 S value 1024 512 3 256 4 128 5 64 6 32 7 16 8 8 9 4 10 2 previous(s[1]) - s[1] = 4
SQL Model Business Examples
Time Series Calculation (1) Compute the ratio of current month sales of each product to sales one year ago, one quarter ago and one month ago. Assume: Sales cube with product sales per year, quarter, and month & a time table mapping periods to prior year, quarter and month time table: maps t to y_ago, q_ago, m_ago Sales cube: prod sales per y, q, m t y_ago q_ago m_ago t product sales 1999 m 01 1998 m 10 1998 m 12 1999 m 01 vcr 100. 00 1999 m 02 1998 m 11 1999 m 02 vcr 120. 00 … … … … 1999 q 01 1998 -q 04 NULL 1999 q 01 vcr 360. 00 … … … …
Time Series Calculation (2) • Reference model with Time table acts like look-up table • CV carries values from the left side to the right side • Without Model, you need 3 outer joins and a regular join SELECT product, sales, r_y_ago, r_q_ago, r_m_ago FROM sales_cube MODEL REFERENCE r ON (SELECT * from time) DIMENSION BY (t) MEASURES (y_ago, q_ago, m_ago) MAIN PARTITION BY (product) DIMENSION BY (t) MEASURES (sales, 0 r_y_ago, 0 r_q_ago, 0 r_m_ago) RULES ( r_y_ago[ANY] = s[CV(t)] / s[ y_ago[CV(t)] ], -- year ago r_q_ago[ANY] = s[CV(t)] / s[ q_ago[CV(t)] ], -- quarter ago r_m_ago[ANY] = s[CV(t)] / s[ m_ago[CV(t)] ] -- month ago );
Time Series Calculation (3) Compute the ratio of current period sales of each product to sales a year ago, quarter ago and a month ago. For each row, we use the reference Model to find 3 other rows. Sales cube: prod sales per y, q, m t product sales r_y_ago r_q_ago a_m_ago 1999 m 01 vcr 100. 00 0. 050 0. 280 0. 830 1999 m 02 vcr 120. 00 … … … 1999 q 01 vcr 360. 00 1998 q 04 vcr 370. 00 … … …
Time Series Calculation (3) Compute the ratio of current period sales of each product to sales a year ago, quarter ago and a month ago. For each row, we use the reference Model to find 3 other rows. Sales cube: prod sales per y, q, m t product sales r_y_ago r_q_ago a_m_ago 1999 m 01 vcr 100. 00 0. 050 0. 280 0. 830 1999 m 02 vcr 120. 00 0. 055 0. 330 … … 1999 q 01 vcr 360. 00 0. 160 0. 970 null 1998 q 04 vcr 370. 00 … … …
Time Series Calculation (3) Compute the ratio of current period sales of each product to sales a year ago, quarter ago and a month ago. For each row, we use the reference Model to find 3 other rows. Sales cube: prod sales per y, q, m t product sales r_y_ago r_q_ago a_m_ago 1999 m 01 vcr 100. 00 0. 050 0. 280 0. 830 1999 m 02 vcr 120. 00 0. 055 0. 330 . . . … … … 1999 q 01 vcr 360. 00 0. 160 0. 970 null 1998 q 04 vcr 370. 00 … … null … … …
Recursive Model Solving: Ledger (1) In my ledger, I have accounts: Net income, Interest, Taxes, etc. • I want to have 30 % of my Net income as Interest (F 1) • My Net income is Salary minus Interest, minus Tax (F 2) • Taxes are 38% of Gross (salary–interest) and 28% of Capital_gain (F 3) SELECT account, b FROM ledger MODEL IGNORE NAV DIMENSION (account) MEASURES (balance b) RULES ITERATE (. . ) UNTIL. . ( b[‘interest’] = b[‘net’] * 0. 30, --F 1 b[‘net’] = b[‘salary’] – b[‘interest’] – b[‘tax’], --F 2 b[‘tax’] = (b[‘salary’] – b[‘interest’]) * 0. 38 + b[‘capital_gain’] *0. 28 --F 3 )
Recursive Model Solving: Ledger (1) In my ledger, I have accounts: Net income, Interest, Taxes, etc. • I want to have 30 % of my Net income as Interest (F 1) • My Net income is Salary minus Interest, minus Tax (F 2) • Taxes are 38% of Gross (salary–interest) and 28% of Capital_gain (F 3) SELECT account, b FROM ledger MODEL IGNORE NAV DIMENSION (account) MEASURES (balance b) RULES ITERATE (. . ) UNTIL. . ( b[‘interest’] = b[‘net’] * 0. 30, --F 1 b[‘net’] = b[‘salary’] – b[‘interest’] – b[‘tax’], --F 2 b[‘tax’] = (b[‘salary’] – b[‘interest’]) * 0. 38 + b[‘capital_gain’] *0. 28 --F 3 ) net F 1 interest F 2 tax F 3 two cycles in the formulas
Recursive Model Solving: Ledger (2) In my ledger, I know Salary & Capital_gains. What are my Net income, Interest expense & Taxes? SELECT account, b FROM ledger MODEL IGNORE NAV DIMENSION BY (account) MEASURES (balance b) RULES ITERATE (10000) UNTIL (ABS(b[‘net’] - PREVIOUS(b[‘net’])) < 0. 01) ( b[‘interest’] = b[‘net’] * 0. 30, b[‘net’] = b[‘salary’] – b[‘interest’] – b[‘tax’], b[‘tax’] = (b[‘salary’] – b[‘interest’]) * 0. 38 + b[‘capital_gain’] *0. 28 ) Input Ledger Account salary capital_gains net tax interest Balance 100, 000 15, 000 0 Output Iterate till accuracy of. 01 after 1 st iteration Account salary capital_gains net tax interest Balance 100, 000 15, 000 100, 000 42, 220 30, 000
Recursive Model Solving: Ledger (2) In my ledger, I know Salary & Capital_gains. What is my Net & Taxes? SELECT account, b FROM ledger MODEL IGNORE NAV DIMENSION BY (account) MEASURES (balance b) RULES ITERATE (10000) UNTIL (ABS(b[‘net’] - PREVIOUS(b[‘net’])) < 0. 01) ( b[‘interest’] = b[‘net’] * 0. 30, b[‘net’] = b[‘salary’] – b[‘interest’] – b[‘tax’], b[‘tax’] = (b[‘salary’] – b[‘interest’]) * 0. 38 + b[‘capital_gain’] *0. 28 ) Input Ledger Account salary capital_gains net tax interest Balance 100, 000 15, 000 0 Output Iterate till accuracy of. 01 after 2 nd iteration Account salary capital_gains net tax interest Balance 100, 000 15, 000 27, 800 30, 800 8, 340
Recursive Model Solving: Ledger (2) In my ledger, I know Salary & Capital_gains. What is my Net & Taxes? SELECT account, b FROM ledger MODEL IGNORE NAV DIMENSION BY (account) MEASURES (balance b) RULES ITERATE (10000) UNTIL (ABS(b[‘net’] - PREVIOUS(b[‘net’])) < 0. 01) ( b[‘interest’] = b[‘net’] * 0. 30, b[‘net’] = b[‘salary’] – b[‘interest’] – b[‘tax’], b[‘tax’] = (b[‘salary’] – b[‘interest’]) * 0. 38 + b[‘capital_gain’] *0. 28 ) Input Ledger Account salary capital_gains net tax interest Balance 100, 000 15, 000 0 Output Iterate till accuracy of. 01 after reaching accuracy (26 iterations) Account salary capital_gains net tax interest Balance 100, 000 15, 000 48, 735 36, 644 14, 620
Financial Functions: NPV – net present value of a series of periodic cash flows. Cash_Flow table year i prod amount npv 1999 0 vcr -100. 00 2000 1 vcr 12. 00 2001 2 vcr 10. 00 2002 3 vcr 20. 00 1999 0 dvd -200. 00 2000 1 dvd 22. 00 2001 2 dvd 12. 00 2002 3 dvd 14. 00
Financial Functions: NPV – net present value of a series of periodic cash flows. Cash_Flow table year i prod amount npv 1999 0 vcr -100. 00 2000 1 vcr 12. 00 2001 2 vcr 10. 00 2002 3 vcr 20. 00 1999 0 dvd -200. 00 2000 1 dvd 22. 00 2001 2 dvd 12. 00 2002 3 dvd 14. 00 amount[1]/power(1+rate, 1) + npv[1 -1]
Financial Functions: NPV – net present value of a series of periodic cash flows. Cash_Flow table year i prod amount npv 1999 0 vcr -100. 00 2000 1 vcr 12. 00 2001 2 vcr 10. 00 2002 3 vcr 20. 00 1999 0 dvd -200. 00 2000 1 dvd 22. 00 2001 2 dvd 12. 00 2002 3 dvd 14. 00 amount[2]/power(1+rate, 2) + npv[2 -1]
Financial Functions: NPV – net present value of a series of periodic cash flows. Cash_Flow table year i prod amount npv 1999 0 vcr -100. 00 2000 1 vcr 12. 00 2001 2 vcr 10. 00 2002 3 vcr 20. 00 1999 0 dvd -200. 00 2000 1 dvd 22. 00 2001 2 dvd 12. 00 2002 3 dvd 14. 00 amount[3]/power(1+rate, 3) + npv[3 -1]
Financial Functions: NPV – net present value of a series of periodic cash flows. Cash_Flow table year i prod amount npv 1999 0 vcr -100. 00 2000 1 vcr 12. 00 2001 2 vcr 10. 00 2002 3 vcr 20. 00 1999 0 dvd -200. 00 2000 1 dvd 22. 00 2001 2 dvd 12. 00 2002 3 dvd 14. 00 amount[i]/power(1+rate, i) + npv[i-1] npv[ANY] ORDER BY i = amount[ CV(i) ] / power(1+rate, CV(i)) + npv[CV(i) – 1]
Financial Functions: NPV (2) NPV – Net present value of a series of periodic cash flows. Cash_Flow table and npv for rate = 0. 14 year i prod amount npv 1999 0 vcr -100. 00 2000 1 vcr 12. 00 -89. 47 2001 2 vcr 10. 00 -81. 78 2002 3 vcr 20. 00 -68. 28 1999 0 dvd -200. 00 2000 1 dvd 22. 00 180. 70 2001 2 dvd 12. 00 171. 47 SELECT year, i, prod, amount, npv FROM cash_flow MODEL PARTITION BY (prod) DIMENSION BY (i) MEASURES (amount, 0 npv, year) RULES ( npv[ 0] = amount[0], npv[i !=0] ORDER BY i = amount[ CV() ] / POWER(1. 14, CV() ) + npv[CV() - 1] )
SQL Model Performance
SQL Model – Time Series Earlier example: ratio of sales to year, quarter and month ago SELECT product, sales, r_y_ago, r_q_ago, r_m_ago FROM sales_cube MODEL REFERENCE r ON (SELECT * from time) DIMENSION BY (t) MEASURES (y_ago, q_ago, m_ago) MAIN PARTITION BY (product) DIMENSION BY (t) MEASURES (sales, 0 r_y_ago, 0 r_q_ago, 0 r_m_ago) RULES ( r_y_ago[ANY] = s[CV(t)] / s[ y_ago[CV(t)] ], -- year ago r_q_ago[ANY] = s[CV(t)] / s[ q_ago[CV(t)] ], -- quarter ago r_m_ago[ANY] = s[CV(t)] / s[ m_ago[CV(t)] ] -- month ago ); • ANSI SQL version needs outer join for each formula plus a join for reference model. • N formulas, M reference models N+M joins 4 joins in this example: sales_cube time sales_cube
SQL Model vs. ANSI Joins 400 Query response time ANSI joins 350 300 250 200 SQL Model 150 100 50 Number of rules or joins 1 2 3 4 5 6 7 8 9 10 11 12 13 14
Summary Ÿ New facility for spreadsheet-like computations in SQL Ÿ High Performance – – – Replaces multiple joins, unions Scalable in size and parallel processing Powerful optimizations Ÿ Collaborative analysis Ÿ Move external processing such as spreadsheets into RDBMs for manageability and consolidation
Next Steps…. Ÿ Demonstration at Oracle DEMOgrounds – – Exhibit hall, Booth 1326, Database Area Monday: 5: 00 PM - 8: 00, Tuesday: 10: 30 - 1: 00, 3: 00 6: 00, Wednesday: 11: 00 - 4: 30, Thursday: 10: 30 - 2: 00 Ÿ Hands-on Lab – – – Marriott Hotel - Golden Gate B 1 Lab Section: Use Information from your Data Warehouse Lesson 1: Using the SQL Model clause Monday: 10: 30 - 5: 00, Tuesday: 8: 30 - 12: 30, 3: 00 5: 00, Wednesday: 8: 30 - 4: 30, Thursday: 8: 30 - 2: 30
Reminder – please complete the Oracle. World online session survey Thank you.
Q& A Q U E S T I O N S A N S W E R S
- Slides: 65