What is a Query Language Universality of Data

  • Slides: 19
Download presentation
What is a Query Language? Universality of Data Retrieval Languages, Aho and Ullman, POPL

What is a Query Language? Universality of Data Retrieval Languages, Aho and Ullman, POPL 1979 Raghu Ramakrishnan CS 286, UC Berkeley, Spring 2007 , R. Ramakrishnan 1

What is …? v What Is A Query Language? • A language that allows

What is …? v What Is A Query Language? • A language that allows retrieval and manipulation of data From a database. v What Is A Database? • A large collection of DATA • The data can be grouped into sets whose elements have similar structure. v v What Kind of Structure Can the Data Have? What Kind of Manipulation Should Be Allowed? CS 286, UC Berkeley, Spring 2007 , R. Ramakrishnan 2

Some Ideas Relations should be treated as sets of tuples. v The query language

Some Ideas Relations should be treated as sets of tuples. v The query language must have a simple, nonoperational meaning that is independent of physical data representation. v There must be efficient ways to process queries over (large) sets of similarly structured facts. v We will focus on the relational model CS 286, UC Berkeley, Spring 2007 , R. Ramakrishnan 3

Principles for A Relational Query Language* * Proposed by Aho & Ullman 1) Relation

Principles for A Relational Query Language* * Proposed by Aho & Ullman 1) Relation = Set of Tuples. Ordering & other storage details should not be visible. 2) Data Values should not be ‘Interpreted’. Note: (2) Says that no special meaning should be attached to data values (as far as the query language is concerned); thus, Arithmetic is Disallowed! 5+6 = 11, 8<9, … CS 286, UC Berkeley, Spring 2007 , R. Ramakrishnan 4

Principles – Refinement v v Principle (2) is too restrictive. Relax it slightly: Note:

Principles – Refinement v v Principle (2) is too restrictive. Relax it slightly: Note: If we include +, ×, etc. to P, soon only the identity function will preserve P! CS 286, UC Berkeley, Spring 2007 , R. Ramakrishnan 5

Allowable Fns – Transitive Closure v Aho & Ullman’s notation of allowable function is

Allowable Fns – Transitive Closure v Aho & Ullman’s notation of allowable function is rather restrictive. However: 1. All Relational Algebra queries are allowable. 2. Transitive Closure is allowable. v And they prove that: • There is no Relational Algebra query that computes the Transitive Closure of a Relation. Any R. A expression has a fixed size, say n. Choose Relation R: a 1 a 2 ak k>n The relational algebra expression cannot deal with (a 1, ak). CS 286, UC Berkeley, Spring 2007 , R. Ramakrishnan 6

Proposal v We should extent RA to support a least fixpoint operator. • Leads

Proposal v We should extent RA to support a least fixpoint operator. • Leads to recursive queries • Some systems (e. g. , Oracle) support limited forms of recursion like transitive closure. Others (DB 2) support linear recursion, following SQL: 1999. CS 286, UC Berkeley, Spring 2007 , R. Ramakrishnan 7

Least Fixpoints v v The LFP operator is defined as follows: Theorem (Tarski): There

Least Fixpoints v v The LFP operator is defined as follows: Theorem (Tarski): There is a least fixpoint satisfying monotone. LFP(R=f(R)) if ‘f ’ is Note: If ‘f’ is a relation algebra expression without ‘−’ (set diff. ), then it is monotone. CS 286, UC Berkeley, Spring 2007 , R. Ramakrishnan 8

Least Fixpoint – Cont. v Theorem (Kleene) v Example: Transitive Closure CS 286, UC

Least Fixpoint – Cont. v Theorem (Kleene) v Example: Transitive Closure CS 286, UC Berkeley, Spring 2007 , R. Ramakrishnan 9

LFP - Cont. v Claim: The LFP operator satisfies principles 1&2 v Theorem (Aho-Ullman):

LFP - Cont. v Claim: The LFP operator satisfies principles 1&2 v Theorem (Aho-Ullman): There is no relational algebra expression E(R) that computes the transitive closure of an arbitrary input relation R. CS 286, UC Berkeley, Spring 2007 , R. Ramakrishnan 10

Proof a 1 a 2 CS 286, UC Berkeley, Spring 2007 , R. Ramakrishnan

Proof a 1 a 2 CS 286, UC Berkeley, Spring 2007 , R. Ramakrishnan a 3 al 11

a 1 a 2 am-c am bj bj+c al Note: Here (bj+c) ≡ am

a 1 a 2 am-c am bj bj+c al Note: Here (bj+c) ≡ am s. t. bj=am-c CS 286, UC Berkeley, Spring 2007 , R. Ramakrishnan 12

ai am am+c c CS 286, UC Berkeley, Spring 2007 , R. Ramakrishnan am+d

ai am am+c c CS 286, UC Berkeley, Spring 2007 , R. Ramakrishnan am+d 13

am am+d b 2 b 1 CS 286, UC Berkeley, Spring 2007 , R.

am am+d b 2 b 1 CS 286, UC Berkeley, Spring 2007 , R. Ramakrishnan 14

Proof of lemma CS 286, UC Berkeley, Spring 2007 , R. Ramakrishnan 15

Proof of lemma CS 286, UC Berkeley, Spring 2007 , R. Ramakrishnan 15

Transitive closure - more a 1 a 2 a 3 CS 286, UC Berkeley,

Transitive closure - more a 1 a 2 a 3 CS 286, UC Berkeley, Spring 2007 , R. Ramakrishnan al 16

Transitive closure - more a 1 a 2 a 3 al YES! But it

Transitive closure - more a 1 a 2 a 3 al YES! But it is NOT a relation algebra expression! a 1 a 2 What does “ai<aj” mean now? ! a 4 a 3 CS 286, UC Berkeley, Spring 2007 , R. Ramakrishnan 17

BP-Completeness v A query language is BP-complete if: • All functions that can be

BP-Completeness v A query language is BP-complete if: • All functions that can be expressed in the language are allowable. • Let r 1 and r 2 be two relations (instances), such that for all renamings μ Then there is a function f in the language such that CS 286, UC Berkeley, Spring 2007 , R. Ramakrishnan 18

Example of BP-Complete A B C D E F 5 6 5 6 5

Example of BP-Complete A B C D E F 5 6 5 6 5 6 6 5 7 8 6 5 6 5 7 8 10 11 7 8 8 7 5 5 6 6 1. If ‘A’ is used as ‘r 1’ in previous slide, which of the others qualifies as ‘r 2’? 2. For each such relation, find relational algebra function f. CS 286, UC Berkeley, Spring 2007 , R. Ramakrishnan 19