Explicit and Implicit LIST Aggregate Function for Relational

  • Slides: 27
Download presentation
Explicit and Implicit LIST Aggregate Function for Relational Databases Witold Litwin Université Paris 9

Explicit and Implicit LIST Aggregate Function for Relational Databases Witold Litwin Université Paris 9 Dauphine mailto: Witold. litwin@dauphine. fr 1

Summary • New Aggregate Function • Transforms a set of values into single one

Summary • New Aggregate Function • Transforms a set of values into single one – Char type • A basic long time need • Should be highly useful 2

Plan • • • Motivating Examples Explicit LIST Implicit LIST Conlusion Further Work 3

Plan • • • Motivating Examples Explicit LIST Implicit LIST Conlusion Further Work 3

Motivating Example 1 • The Supplier-Part (SP) table of the bestknown S-P database 4

Motivating Example 1 • The Supplier-Part (SP) table of the bestknown S-P database 4

Motivating Example 1 • The classical query : select SP. [S#], Sum(SP. Qty) AS

Motivating Example 1 • The classical query : select SP. [S#], Sum(SP. Qty) AS [Total Qty] from SP group By SP. [S#]; S# S 1 S 2 S 3 S 4 Total Qty 1300 700 200 900 • How to get also the individual quantities ? 5

Motivating Example 2 • A database of persons having: – Multiple Hobbies – Multiple

Motivating Example 2 • A database of persons having: – Multiple Hobbies – Multiple preferred Restaurants – Many Friends • Best design: – four 4 -NF tables • P (SS #, Name), H (SS#, Hobby), R (SS#, Rest), F (SS#, Friend) 6

Database 7

Database 7

Fragment 8

Fragment 8

Query Select Name, Friends, Restaurants, Hobbies, of Person ‘SS 1’ SQL : select P.

Query Select Name, Friends, Restaurants, Hobbies, of Person ‘SS 1’ SQL : select P. [SS#], P. Name, F. Friend, R. Rest, H. Hobby from ((P INNER JOIN F ON P. [SS#] = F. [SS#]) INNER JOIN H ON P. [SS#] = H. [SS#]) INNER JOIN R ON P. [SS#] = R. [SS#] where P. [SS#] ="ss 1" ; 9

Result Usable ? ? ? 10

Result Usable ? ? ? 10

General Problem • Current RDBs manage tables in 1 NF – All attributes are

General Problem • Current RDBs manage tables in 1 NF – All attributes are single-valued (atomic values) • Example 1 ; We wished – Single-valued attribute : • SUM(QTY) – Multi-valued attribute • Individual quantities • The result would not be 1 NF 11

General Problem • RDB manages tables in 1 NF – All attributes are single-valued

General Problem • RDB manages tables in 1 NF – All attributes are single-valued • Example 2 ; We wished : – Single-valued attributes : • S#, Name – Multi-valued attributes (multi-sets): • Hobby, Rest, Friend • The result is normalized to 1 NF – {(ss 1, Witold, x, y, z) : x Hobby, y Rest, z Friend } – The table is not in 4 NF – Subject to well-known anomalies 12

Solutions • Design RDBS for 0 NF tables – A revolution – 0 NF

Solutions • Design RDBS for 0 NF tables – A revolution – 0 NF RDBS will not be here for years • Aggregate set or multi-set values into atomic values – An evolution – All RDBS already do it using: • SUM, AVG, COUNT… • perhaps with GROUP BY – We need a new aggregate leaving the entire set visible • E. g: (multi)-set of values X => (single) list of values X 13

Local Culinary Example • The set-valued attribute: – (Schwarz, Wälder; Kirchen, Chocoladen, Torte) •

Local Culinary Example • The set-valued attribute: – (Schwarz, Wälder; Kirchen, Chocoladen, Torte) • The aggregated attribute: – Schwarzwälderkirchenchocoladentorte • Local specialty, try it ! 14

Explicit LIST function Select S#, sum (Qty) AS [Total Qty], LIST (Qty) AS Histogram

Explicit LIST function Select S#, sum (Qty) AS [Total Qty], LIST (Qty) AS Histogram from SP group by S#; 15

Explicit LIST function select P. SS#, Name, LIST (DISTINCT (Friend)), LIST (DISTINCT (Rest)), LIST

Explicit LIST function select P. SS#, Name, LIST (DISTINCT (Friend)), LIST (DISTINCT (Rest)), LIST (DISTINCT (Hobby)) from P, F, R, H where P. SS# = F. SS# and F. SS# = R. SS# and R. SS# = H. SS# and P. SS# ="ss 1" group by P. SS#, Name ; P Name Friend Rest SS 1 Witold Alexis, Christopher, Ron, Jim, Donna, Elisabeth, Dave, Peter, Per-Ake, Thomas Bengal, Cantine Paris 9, Chef Wu, Ferme de Condé, Miyake, Louis XIII, Mela, North Beach Pizza, Pizza Napoli, Sushi Etoile Hobby Bike, Classical Music, Good food, Hike, Movie, Science Fiction, Ski, Swim, Tennis, Wine 16

Explicit LIST function • Simulated actual output using Ms. Access forms with list boxes

Explicit LIST function • Simulated actual output using Ms. Access forms with list boxes • Form with three subforms • No SQL query used 17

Explicit LIST function select P#, SUM (Qty) as [Total Qty], LIST (S#, Qty) as

Explicit LIST function select P#, SUM (Qty) as [Total Qty], LIST (S#, Qty) as [Per supplier] from SP group by P#; P# Total Qty Per supplier p 1 600 s 1 300 s 2 300 p 2 1000 s 1 200 s 2 400 s 3 200 s 4 200 p 3 400 s 1 400 p 4 500 s 1 200 s 4 300 p 5 500 s 1 100 s 4 400 p 6 100 s 1 100 18

Implicit LIST function • For any single-valued A : – A = LIST (A)

Implicit LIST function • For any single-valued A : – A = LIST (A) • Any non-aggregated attribute in an SQL query has to be in the GROUP BY clause • Now, any non-aggregated perhaps composite attribute A from a single table and not in GROUP BY clause is implicitly under – LIST (DISTINCT (A)) • Queries may become less procedural 19

Implicit LIST function select P#, SUM (Qty) as [Total Qty], S#, Qty from SP

Implicit LIST function select P#, SUM (Qty) as [Total Qty], S#, Qty from SP group by P# having ‘S# QTY’ like ‘*s 4*’; • Implicit LIST is LIST (S#, QTY) 20

Implicit LIST function • Query Select S. *, P#, Qty From S, SP Where

Implicit LIST function • Query Select S. *, P#, Qty From S, SP Where S. S# = SP. S# Repeats all the data of the supplier S in every resulting tuple – 6 times for S 1: its Name, City, Status • Query Select S. *, P#, Qty From S, SP Where S. S# = SP. S# Group By S. S# Does it only once per supplier • Less redundancy 21

Implicit Joins and From • Equijoins following the referential semantic links or integrity may

Implicit Joins and From • Equijoins following the referential semantic links or integrity may be implicit • Ms. Access, SQL Server… • FROM clause content can be inferred from the attribute names • Even less procedural formulation may result: select P. SS#, Name, Friend, Rest, Hobby group by P. SS#, Name ; 22

Implementation Issues • Should be easy for the RDBS owner – Any RDB already

Implementation Issues • Should be easy for the RDBS owner – Any RDB already processes the aggregates Already done hiding the list Should also be shown 23

Implementation Issues • For explicit LIST, foreign function interface may suffice – Oracle, DB

Implementation Issues • For explicit LIST, foreign function interface may suffice – Oracle, DB 2, Yukon… – See related work in the paper for current (limited) proposals • Oracle & i. Anywhere (core code) • Not for the implicit LIST – Access to core code is necessary 24

Conclusion • LIST is a new aggregate function • Aggregates a multi-valued attribute into

Conclusion • LIST is a new aggregate function • Aggregates a multi-valued attribute into a single value • Responds to a long-standing fundamental RDBS user need - 30 years ? • Should be rather easy to implement • Future work should start with the implementation – Using foreign functions for explicit LIST 25

Research Support • European Commission ICONS Project – no. IST-2001 -32429. • Microsoft Research

Research Support • European Commission ICONS Project – no. IST-2001 -32429. • Microsoft Research 26

Thank You for Your Attention Witold Litwin Université Paris 9 Dauphine mailto: Witold. litwin@dauphine.

Thank You for Your Attention Witold Litwin Université Paris 9 Dauphine mailto: Witold. litwin@dauphine. fr 27