Querying and storing XML Week 6 XML Updates

  • Slides: 85
Download presentation
Querying and storing XML Week 6 XML Updates February 26 -March 1, 2013 1

Querying and storing XML Week 6 XML Updates February 26 -March 1, 2013 1

XML Updates • Input: an XML tree ΔT and XML update T • Output:

XML Updates • Input: an XML tree ΔT and XML update T • Output: updated XML tree T’ = T + ΔT February 26 -March 1, 2013 QSX 2

Challenges • • • The study of the following is still in its infancy:

Challenges • • • The study of the following is still in its infancy: update languages for XML data • XQuery Update Facility: relatively new standard implementation of XML updates • • native storage of XML data validation, consistency and integrity concurrency control for XML “databases”; crash recovery Question: is there a method to • • • support updates commonly found in practice? provide XML query engines with XML update support? avoid the troubles of concurrency control, consistency checking, etc? February 26 -March 1, 2013 QSX 3

Update Support • How to update? • • • Flat streams: overwrite document (slow)

Update Support • How to update? • • • Flat streams: overwrite document (slow) Colonial: SQL updates? (hard to translate) Native: DOM, proprietary APIs • How do you know you have not violated schema? • • • Flat streams: re-parse document Colonial: need to understand the mapping and translate/maintain integrity constraints Native: supported in some systems (e. g. , e. Xcelon) February 26 -March 1, 2013 QSX 4

Atomic updates February 26 -March 1, 2013 QSX 5

Atomic updates February 26 -March 1, 2013 QSX 5

Atomic updates • Basic changes that can be applied to tree u : :

Atomic updates • Basic changes that can be applied to tree u : : = insert. Into(n, t) | insert. As. First. Into(n, t) | insert. As. Last. Into(n, t) | insert. Before(n, t) | insert. After(n, t) | delete(n) | replace(n, t) | replace. Value(n, s) | rename(n, a) February 26 -March 1, 2013 QSX 6

Atomic updates: insertion • Insert. Into (c, <x><y/></x>) a c b d b x

Atomic updates: insertion • Insert. Into (c, <x><y/></x>) a c b d b x e y February 26 -March 1, 2013 QSX 7

Atomic updates: insertion • Insert. As. First. Into (c, <x><y/></x>) a c b x

Atomic updates: insertion • Insert. As. First. Into (c, <x><y/></x>) a c b x b d e y February 26 -March 1, 2013 QSX 8

Atomic updates: insertion • Insert. As. Last. Into (c, <x><y/></x>) a c b b

Atomic updates: insertion • Insert. As. Last. Into (c, <x><y/></x>) a c b b d e x y February 26 -March 1, 2013 QSX 9

Atomic updates: insertion • Insert. Before (c, <x><y/></x>) a b c x y b

Atomic updates: insertion • Insert. Before (c, <x><y/></x>) a b c x y b d e February 26 -March 1, 2013 QSX 10

Atomic updates: insertion • Insert. After (c, <x><y/></x>) a c b x d e

Atomic updates: insertion • Insert. After (c, <x><y/></x>) a c b x d e b y February 26 -March 1, 2013 QSX 11

Atomic updates: deletion • Delete (c) a c b b d e February 26

Atomic updates: deletion • Delete (c) a c b b d e February 26 -March 1, 2013 QSX 12

Atomic updates: replace text value • Replace. Value (d, "foo") a c b b

Atomic updates: replace text value • Replace. Value (d, "foo") a c b b d e 1234 foo February 26 -March 1, 2013 QSX 13

Atomic updates: replace subtree • Replace (c, <x><y/><z/></x>) a xc b b y d

Atomic updates: replace subtree • Replace (c, <x><y/><z/></x>) a xc b b y d z e February 26 -March 1, 2013 QSX 14

Atomic updates: rename • Rename (c, x) a xc b b d e February

Atomic updates: rename • Rename (c, x) a xc b b d e February 26 -March 1, 2013 QSX 15

Updating XML stored in relations February 26 -March 1, 2013 QSX 16

Updating XML stored in relations February 26 -March 1, 2013 QSX 16

Approaches • We've considered several approaches to storing XML in relations • • naive

Approaches • We've considered several approaches to storing XML in relations • • naive "edge relation: shared inlining Dewey Decimal interval encoding • How can we update data in these representations? • What are the tradeoffs? February 26 -March 1, 2013 QSX 17

Updating XML: naive • Remember this? db o 1 book title o 3 parent

Updating XML: naive • Remember this? db o 1 book title o 3 parent child node. Id tag type o 1 o 2 o 1 db ELT o 2 o 3 o 2 book ELT o 3 o 4 o 3 title TEXT o 2 o 5 o 4 o 5 o 6 o 5 o 2 o 7 o 8 QSX TEXT author o 6 o 7 o 8 Database o 4 Managemen t Systems author o 2 author o 5 o 7 author Ramakrishnao 6 n Gehrke node. Id text TEXT o 4 Database Management Systems ELT o 6 Ramakrishnan TEXT o 8 Gehrke ELT February 26 -March 1, 2013 18 o 8

Updating XML: naive • INSERT INTO Nodes Insert. Into(o 5, <foo/>) VALUE (o 9,

Updating XML: naive • INSERT INTO Nodes Insert. Into(o 5, <foo/>) VALUE (o 9, foo, ELT) INSERT INTO Edges title o 3 VALUE (o 5, o 9) parent child node. Id tag type o 1 o 2 o 1 db ELT o 2 o 3 o 2 book ELT o 3 o 4 o 3 title TEXT o 2 o 5 o 4 o 5 o 6 o 5 o 2 o 7 o 6 o 7 o 8 o 7 o 5 o 9 o 8 QSX o 9 Database o 4 Managemen t Systems TEXT author foo db o 1 book o 2 author o 5 o 7 author fooo 9 Ramakrishnao 6 n Gehrke node. Id text TEXT o 4 Database Management Systems ELT o 6 Ramakrishnan TEXT o 8 Gehrke ELT February 26 -March 1, 2013 19 o 8

Updating XML: naive • DELETE FROM Nodes WHEREDelete(o 5) node. Id=o 5; <Trigger. .

Updating XML: naive • DELETE FROM Nodes WHEREDelete(o 5) node. Id=o 5; <Trigger. . . > DELETE FROM Edges WHERE parent=o 5 title o 3 OR child=o 5; node. Id tag type parent child DELETE FROM Nodes o 1 db ELT Database o 4 o 1 o 2 WHERE node. Id=o 6; o 2 book ELT Managemen o 2 o 3 DELETE FROM Text t Systems o 3 title TEXT o 3 o 4 WHERE node. Id=o 6 o 2 o 5 o 4 o 5 o 6 o 5 o 2 o 7 o 8 QSX TEXT author o 6 o 7 o 8 author db o 1 book o 2 author o 5 o 7 author Ramakrishnao 6 n Gehrke node. Id text TEXT o 4 Database Management Systems ELT o 6 Ramakrishnan TEXT o 8 Gehrke ELT February 26 -March 1, 2013 20 o 8

Updating XML: naive • UPDATE Nodes Rename(o 2, text) SET tag='text' WHERE node. Id=o

Updating XML: naive • UPDATE Nodes Rename(o 2, text) SET tag='text' WHERE node. Id=o 2 db o 1 text book title o 3 parent child node. Id tag type o 1 o 2 o 1 db ELT o 2 o 3 o 2 book text ELT o 3 o 4 o 3 title TEXT o 2 o 5 o 4 o 5 o 6 o 5 o 2 o 7 o 8 QSX TEXT author o 6 o 7 o 8 Database o 4 Managemen t Systems author o 2 author o 5 o 7 author Ramakrishnao 6 n Gehrke node. Id text TEXT o 4 Database Management Systems ELT o 6 Ramakrishnan TEXT o 8 Gehrke ELT February 26 -March 1, 2013 21 o 8

Updating XML: naive • UPDATE Text Replace. Value(o 5, 'Raghu') SET text='Raghu' WHERE node.

Updating XML: naive • UPDATE Text Replace. Value(o 5, 'Raghu') SET text='Raghu' WHERE node. Id=o 6 title o 3 parent child node. Id tag type o 1 o 2 o 1 db ELT o 2 o 3 o 2 book ELT o 3 o 4 o 3 title TEXT o 2 o 5 o 4 o 5 o 6 o 5 o 2 o 7 o 8 QSX TEXT author o 6 o 7 o 8 Database o 4 Managemen t Systems author db o 1 book o 2 author o 5 o 7 author Ramakrishnao 6 Raghu n Gehrke node. Id text TEXT o 4 Database Management Systems ELT o 6 Raghu Ramakrishnan TEXT o 8 Gehrke ELT February 26 -March 1, 2013 22 o 8

Updating XML: shared inlining dbdb 1 book 1 authorauth 1 title Database Managemen t

Updating XML: shared inlining dbdb 1 book 1 authorauth 1 title Database Managemen t Systems db. ID db 1 QSX book. ID parent. ID code book 1 db. ID author. ID book. ID 0 authorauth 2 Ramakrishna n Gehrke title Database Management Systems auth 1 Ramakrishna book 1 n auth 2 book 1 Gehrke February 26 -March 1, 2013 23

Updating XML: shared inlining dbdb 1 Insert. Into(auth 1, <foo/>) book 1 authorauth 1

Updating XML: shared inlining dbdb 1 Insert. Into(auth 1, <foo/>) book 1 authorauth 1 title Database Not allowed by Managemen schema! t Systems db. ID db 1 QSX book. ID parent. ID code book 1 db. ID author. ID book. ID 0 authorauth 2 Ramakrishna n Gehrke title Database Management Systems auth 1 Ramakrishna book 1 n auth 2 book 1 Gehrke February 26 -March 1, 2013 24

Updating XML: shared inlining dbdb 1 DELETE FROM authors Delete(auth 1) WHERE author. Id=auth

Updating XML: shared inlining dbdb 1 DELETE FROM authors Delete(auth 1) WHERE author. Id=auth 1 book 1 authorauth 1 title Database Managemen t Systems db. ID db 1 QSX book. ID parent. ID code book 1 db. ID author. ID book. ID auth 1 auth 2 0 authorauth 2 Ramakrishna n Gehrke title Database Management Systems Ramakrishna book 1 n book 1 Gehrke 25 February 26 -March 1, 2013

Updating XML: shared inlining dbdb 1 Rename(book 1, text) book 1 authorauth 1 title

Updating XML: shared inlining dbdb 1 Rename(book 1, text) book 1 authorauth 1 title Database Not allowed by Managemen schema! t Systems db. ID db 1 QSX book. ID parent. ID code book 1 db. ID author. ID book. ID 0 authorauth 2 Ramakrishna n Gehrke title Database Management Systems auth 1 Ramakrishna book 1 n auth 2 book 1 Gehrke February 26 -March 1, 2013 26

Updating XML: shared inlining UPDATE authors Replace. Value(auth 1. author, 'Raghu') SET author='Raghu' WHERE

Updating XML: shared inlining UPDATE authors Replace. Value(auth 1. author, 'Raghu') SET author='Raghu' WHERE author. Id=auth 1 dbdb 1 book 1 authorauth 1 title Database Managemen t Systems db. ID db 1 QSX book. ID parent. ID code book 1 db. ID author. ID book. ID auth 1 auth 2 0 authorauth 2 Ramakrishna Raghu n Gehrke title Database Management Systems book 1 Ramakrishna Raghu book 1 n book 1 Gehrke 27 February 26 -March 1, 2013

Naive/Shared inlining: summary • Rename, replace value, insert relatively straightforward • when allowed by

Naive/Shared inlining: summary • Rename, replace value, insert relatively straightforward • when allowed by schema (how to check this? ) • Delete can require recursion or triggers • • can avoid this for a non-recursive schema (or delete of an element with no recursive children) since there is a fixed upper bound on subtree depth February 26 -March 1, 2013 QSX 28

Updating XML: Dewey Decimal • Remember this? db [] book title 1. 1 author

Updating XML: Dewey Decimal • Remember this? db [] book title 1. 1 author 1 tag type [] db ELT 1 book ELT 1. 1 title ELT 1. 1. 1 1. 3 1. 2 node. I D author 1. 2 TEXT author 1. 2. 1 Database 1. 1. 1 Ramakrishna Managemen n t Systems Gehrke 1. 3. 1 ELT TEXT author ELT TEXT February 26 -March 1, 2013 QSX 29

 • Updating XML: Dewey Decimal INSERTInsert. As. First. Into(1. 2, foo) INTO Nodes

• Updating XML: Dewey Decimal INSERTInsert. As. First. Into(1. 2, foo) INTO Nodes VALUE (1. 2. 0, 'foo', 'ELT') db [] book title 1. 1 author tag type [] db ELT 1 book ELT 1. 1 title ELT 1. 1. 1 1 1. 2 1. 3 1. 2 node. I D author foo 1. 2. 0 1. 2. 1 Database 1. 1. 1 Ramakrishna Managemen n t Systems Gehrke TEXT author 1. 2. 1 1. 3. 1 TEXT author 1. 3. 1 1. 2. 0 ELT TEXT foo ELT February 26 -March 1, 2013 QSX 30

 • Updating XML: Dewey Decimal INSERT INTO Nodes Insert. Before(1. 2, foo) Problem!

• Updating XML: Dewey Decimal INSERT INTO Nodes Insert. Before(1. 2, foo) Problem! VALUE (1. ? ? ? , 'foo', 'ELT') Doesn't fit. Need to shift labels db [] (expensive!) book title 1. 1 foo 1. ? ? ? Database 1. 1. 1 Managemen t Systems author tag type [] db ELT 1 book ELT 1. 1 title ELT 1. 1. 1 1 1. 2 1. 3 1. 2 node. I D author 1. 3 Gehrke author 1. 2. 1 Ramakrishna n TEXT 1. 3. 1 TEXT author 1. 3. 1 1. ? ? ? ELT TEXT foo ELT February 26 -March 1, 2013 QSX 31

Updating XML: Dewey Decimal UPDATE <node. IDs> INSERT INTO Nodes VALUE (1. 2, 'foo',

Updating XML: Dewey Decimal UPDATE <node. IDs> INSERT INTO Nodes VALUE (1. 2, 'foo', 'ELT') db [] book title 1. 1 foo Database 1. 1. 1 Managemen t Systems author tag type [] db ELT 1 book ELT 1. 1 title ELT 1. 1. 1 1 1. 3 1. 4 1. 3 node. I D author 1. 4 1. 3. 1 Ramakrishna Gehrke author 1. 3. 1 1. 2 n TEXT 1. 4. 1 TEXT author 1. 4. 1 1. 2 ELT TEXT foo ELT February 26 -March 1, 2013 QSX 32

 • Updating XML: Dewey Decimal DELETE x from Nodes Delete(1. 2) WHERE PREFIX([1.

• Updating XML: Dewey Decimal DELETE x from Nodes Delete(1. 2) WHERE PREFIX([1. 2], x. node. Id) db [] book title 1. 1 author 1 tag type [] db ELT 1 book ELT 1. 1 title ELT 1. 1. 1 1. 3 1. 2 node. I D author 1. 2 TEXT author 1. 2. 1 Database 1. 1. 1 Ramakrishna Managemen n t Systems Gehrke 1. 3. 1 ELT TEXT author ELT TEXT February 26 -March 1, 2013 QSX 33

 • Updating XML: Dewey Decimal UPDATE Nodes x Rename(1, 'text') SET x. tag

• Updating XML: Dewey Decimal UPDATE Nodes x Rename(1, 'text') SET x. tag = 'text' WHERE x. node. Id = 1 db [] text book title 1. 1 author 1 tag type [] db ELT 1 book text ELT 1. 1 title ELT 1. 1. 1 1. 3 1. 2 node. I D author 1. 2 TEXT author 1. 2. 1 Database 1. 1. 1 Ramakrishna Managemen n t Systems Gehrke 1. 3. 1 ELT TEXT author ELT TEXT February 26 -March 1, 2013 QSX 34

 • Updating XML: Dewey Decimal UPDATE Text x Replace. Value(1. 2. 1, Raghu)

• Updating XML: Dewey Decimal UPDATE Text x Replace. Value(1. 2. 1, Raghu) SET x. txt = 'Raghu' WHERE x. node. Id = '1. 2' db [] book title 1. 1 author 1 1. 2. 1 Database 1. 1. 1 Ramakrishna Raghu Managemen n t Systems node. Id tag type D [] db ELT 1 book ELT 1. 1 title ELT node. Id 1. 1. 1 text. TEXT 1. 1. 1 TEXT Database Management 1. 3 1. 2 author ELT 1. 1. 1 author 1. 2 author ELT Systems 1. 2. 1 TEXT 1. 2. 1 Raghu 1. 2. 1 Ramakrishnan TEXT 1. 3 author ELT 1. 3. 1 Gehrke 1. 3 author ELT 1. 3. 1 TEXT Gehrke February 26 -March 1, 2013 QSX 35

Updating XML: interval encodings • Remember the interval approach: begi end par n 1

Updating XML: interval encodings • Remember the interval approach: begi end par n 1 db 16 2 book 15 3 title 7 6 Database Managemen t Systems 5 4 8 14 author 10 11 author Ramakrishna 9 n 12 Gehrke 13 tag type db ELT 1 16 2 15 1 book ELT 3 6 2 title ELT 4 5 3 7 10 2 8 9 7 11 14 2 12 13 11 TEXT author ELT TEXT February 26 -March 1, 2013 QSX 36

 • Updating XML: interval encodings UPDATE x from Nodes Insert. Before(8, 9, <foo/>)

• Updating XML: interval encodings UPDATE x from Nodes Insert. Before(8, 9, <foo/>) trickier SET x. begin=x. begin+2 Problem! begi WHERE x. begin >=Doesn't 8 end par tag fit. db n 18 1 16 UPDATE x from Nodes 1 18 db to shift labels 16 SET x. par =2 Need x. par+2 17 book 15 2 2 17 15 1 1 book WHERE x. par >= (expensive!) 8 16 14 12 13 author 10 title 6 11 author 3 6 title x 7 FROM Nodes 3 UPDATE 3 6 2 2 title 4 5 3 SET x. end = x. end+2 4 5 3 8 9 11 9 Database Ramakrishna WHEREfoo x. end >= 8 7 12 2 author 10 8 Managemen t Systems 5 4 n 14 12 Gehrke INSERT INTO Nodes VALUES (8, 9, 7, foo, ELT) 13 15 7 8 8 10 11 13 10 9 9 11 14 16 2 7 7 7 2 2 12 14 13 13 11 15 type ELT ELT ELT TEXT ELT author ELT foo ELT TEXT author ELT author TEXT February 26 -March 1, 2013 QSX 37

Updating XML: interval encodings DELETE x from Nodes • Delete(7, 10) - easy WHERE

Updating XML: interval encodings DELETE x from Nodes • Delete(7, 10) - easy WHERE 7 <= x. begin 1 db 16 AND x. end <= 10 begi end par n 2 book 15 3 title 7 6 Database Managemen t Systems 5 4 8 14 author 10 11 author Ramakrishna 9 n 12 Gehrke gaps are fine 13 tag type db ELT 1 16 2 15 1 book ELT 3 6 2 title ELT 4 5 3 7 10 2 8 9 7 11 14 2 12 13 11 TEXT author ELT TEXT February 26 -March 1, 2013 QSX 38

Updating XML: interval encodings UPDATE x from Nodes • Rename(2, 15, "text") - easy

Updating XML: interval encodings UPDATE x from Nodes • Rename(2, 15, "text") - easy SET x. tag='text' begi end par db n 1 16 WHERE x. node. Id =8 2 text book 15 3 title 7 6 Database Managemen t Systems 5 4 8 14 author 10 11 author Ramakrishna 9 n 12 Gehrke 13 tag type db ELT 1 16 2 15 1 book text ELT 3 6 2 title ELT 4 5 3 7 10 2 8 9 7 11 14 2 12 13 11 TEXT author ELT TEXT February 26 -March 1, 2013 QSX 39

Updating XML: interval encodings UPDATE x from Nodes • Replace. Val(8, 9, "Raghu") -

Updating XML: interval encodings UPDATE x from Nodes • Replace. Val(8, 9, "Raghu") - easy SET x. text='Raghu' begi end par db n 1 16 WHERE x. node. Id =8 1 16 2 book 15 3 title 6 Database Managemen t Systems 5 4 7 2 14 author 10 11 author 9 Ramakrishna 8 Raghu n 12 Gehrke 13 15 1 tag type db ELT book ELT 3 6 2 title ELT node. Id text 4 Database 5 3 Management TEXT 4 7 10 2 Systems author ELT 8 8 9 Ramakrishnan 7 Raghu TEXT 12 Gehrke 11 14 2 author ELT 12 13 11 TEXT February 26 -March 1, 2013 QSX 40

Updating interval encodings: summary • Replace = delete + insert • can recycle "gap"

Updating interval encodings: summary • Replace = delete + insert • can recycle "gap" left by delete, if inserted tree smaller • Over time, deletes + inserts lead to "gaps" • • Naive: export DB as XML and re-import (O(n) traversal of db though). periodically clean up by finding gaps and shrinking them? • Some work on leaving "holes" to decrease amortized cost of insertion February 26 -March 1, 2013 QSX 41

 • • Atomic updates: summary Atomic updates: primitive change operators on XML data

• • Atomic updates: summary Atomic updates: primitive change operators on XML data How can they be implemented? • • • Shared inlining: using recursive updates/triggers Dewey: can translate to SQL using prefix, length Interval: can translate to SQL using < All of these are expensive for some operations • • Some recent work on "dynamic DDE" avoids this for Dewey approach at cost of making XPath steps more complex There has been no comprehensive comparison of these approaches (AFAIK) February 26 -March 1, 2013 QSX 42

XQuery Update Facility February 26 -March 1, 2013 QSX 43

XQuery Update Facility February 26 -March 1, 2013 QSX 43

Bulk updates • Atomic updates allow for changing one thing at a time •

Bulk updates • Atomic updates allow for changing one thing at a time • Widely supported • But tedious to: • • delete all 2012 students increase all salaries where employee is in top 10% of sales • Solution: XQuery Update Facility • extends XQuery to allow updating February 26 -March 1, 2013 QSX 44

Compare with SQL • • Besides queries, SQL has "Data Manipulation Language" • •

Compare with SQL • • Besides queries, SQL has "Data Manipulation Language" • • i. e. , bulk updates which we've already seen in action Table updates: • • • INSERT INTO table VALUES r 1, . . . , rn DELETE FROM table WHERE condition UPDATE table SET t. a = v. . . WHERE condition But also create/delete tables (CREATE TABLE, DROP TABLE), add/remove columns, (ALTER TABLE) etc. For XML, no built-in distinction between table, record, field: need uniform mechanism for updates February 26 -March 1, 2013 QSX 45

Updating XML [Tatarinov et al. ] • First paper to propose XQuery extensions for

Updating XML [Tatarinov et al. ] • First paper to propose XQuery extensions for XML updates • Idea: • • • use XPath/XQuery operations to select target nodes of updates (including conditional tests) use XQuery to construct subtrees (for insert/replace) Create a pending update list that is applied in one shot • XQuery Update Facility uses similar approach February 26 -March 1, 2013 QSX 46

Example XPathbased updates db course cno title prereq cs 101 Advanced Quantum Query Languages

Example XPathbased updates db course cno title prereq cs 101 Advanced Quantum Query Languages course . . . course insert node <course>. . . </course> into //course[cno='cs 101']/prereq February 26 -March 1, 2013 QSX 47

Example XPathbased updates db course cno title prereq cs 101 Advanced Quantum Query Languages

Example XPathbased updates db course cno title prereq cs 101 Advanced Quantum Query Languages course . . . course delete nodes //course[cno='cs 101']/title February 26 -March 1, 2013 QSX 48

XQuery Update Facility • Extends queries with updating expressions u : : = insert

XQuery Update Facility • Extends queries with updating expressions u : : = insert node(s) exp (as first|as last) into path | insert node(s) exp (before|after) path | delete node(s) path | replace node(s) path with exp | replace value of node path with exp | rename node path as name February 26 -March 1, 2013 QSX 49

Insert Into insert node(s) exp (as first|as last) into path • where exp is

Insert Into insert node(s) exp (as first|as last) into path • where exp is an XQuery expression that constructs subtree value • and path is an XPath expression that selects node(s) • • • these will be parents of inserted exp "as first"/"as last" govern where nodes inserted if not specified, it's up to implementation February 26 -March 1, 2013 QSX 50

Example insert node <x><y/></x> as first into /a/* a c b x x y

Example insert node <x><y/></x> as first into /a/* a c b x x y y b d e x y February 26 -March 1, 2013 QSX 51

Insert Before/After insert node(s) exp (before|after) into path • where exp is an XQuery

Insert Before/After insert node(s) exp (before|after) into path • where exp is an XQuery expression that constructs subtree value • and path is an XPath expression that selects node(s) • • • these will be siblings of inserted exp "as first"/"as last" govern where nodes inserted if not specified, it's up to implementation February 26 -March 1, 2013 QSX 52

Example insert node <x><y/></x> before /a/b a x y c b b x d

Example insert node <x><y/></x> before /a/b a x y c b b x d e y February 26 -March 1, 2013 QSX 53

Example: Copy-paste for $x in /a/c insert node $x before /a/*[3] a c b

Example: Copy-paste for $x in /a/c insert node $x before /a/*[3] a c b b c d e February 26 -March 1, 2013 QSX 54

Other updates • delete: obvious (delete selected nodes) • replace: similar to delete +

Other updates • delete: obvious (delete selected nodes) • replace: similar to delete + insert • replace value allows changing string values • rename: changes name while keeping structure fixed February 26 -March 1, 2013 QSX 55

Evaluating complex updates • Update evaluation is multi-stage / snapshot: • Evaluate query &

Evaluating complex updates • Update evaluation is multi-stage / snapshot: • Evaluate query & update expressions to form pending update list (PUL) • • Check PUL is minimally sensible • • all constructed values are built using old version e. g. does not rename same node to two different names Finally, reorder & apply updates • Good in that it avoids strange behaviors • But this may not do what you expect! February 26 -March 1, 2013 QSX 56

Example $x / c a a b a delete $x//a, insert <foo>bar</foo> before $x//a

Example $x / c a a b a delete $x//a, insert <foo>bar</foo> before $x//a February 26 -March 1, 2013 QSX 57

First collect updates / / c a b a a a b a delete

First collect updates / / c a b a a a b a delete $x//a, insert <foo>bar</foo> before $x//a February 26 -March 1, 2013 QSX 58

First collect updates c a / / / c a b a a c

First collect updates c a / / / c a b a a c a b a foo a a foo b foo a delete $x//a, insert <foo>bar</foo> before $x//a February 26 -March 1, 2013 QSX 59

Then reorder & apply c a / / / c a b a a

Then reorder & apply c a / / / c a b a a c a b a foo a a foo b foo a delete $x//a, insert <foo>bar</foo> before $x//a February 26 -March 1, 2013 QSX 60

XQuery Update Facility: Transforms • Queries can perform updates locally u : : =

XQuery Update Facility: Transforms • Queries can perform updates locally u : : = insert node(s) exp (as first|as last) into path | insert node(s) exp (before|after) path | delete node(s) path | replace node(s) path with exp | replace value of node path with exp | rename node path as name q : : = copy $x : = exp, . . . modify u return exp February 26 -March 1, 2013 QSX 61

Transform queries (hypothetical) copy $x 1 : = exp 1, . . . ,

Transform queries (hypothetical) copy $x 1 : = exp 1, . . . , $xn : = expn modify u return exp • Evaluate exp , bind to $x , . . . • Evaluate exp , bind to $x • Apply update u • 1 1 n n only $x 1. . . $xn are mutable, other vars cannot be updated • Evaluate & return exp • Key point: No side-effects on database February 26 -March 1, 2013 QSX 62

Example $x / c a a b a copy $y : = $x/a modify

Example $x / c a a b a copy $y : = $x/a modify insert nodes $x/c as first into $y return <result>$y</result> February 26 -March 1, 2013 QSX 63

Example: Make copy $x $y / a c a b a a copy $x

Example: Make copy $x $y / a c a b a a copy $x : = $doc/a modify insert nodes $doc/c as first into $x return <result>$x</result> February 26 -March 1, 2013 QSX 64

Example: Apply updates to copy $x $y / a c a b b a

Example: Apply updates to copy $x $y / a c a b b a a a No side copy $x : = $doc/a effects on $xmodify insert nodes $doc/c as first into $x return <result>$x</result> February 26 -March 1, 2013 QSX 65

Example: Final result $x / a c a b b a a a copy

Example: Final result $x / a c a b b a a a copy $x : = $doc/a modify insert nodes $doc/c as first into $x return <result>$x</result> February 26 -March 1, 2013 QSX 66

Incremental maintenance of XML views February 26 -March 1, 2013 QSX 67

Incremental maintenance of XML views February 26 -March 1, 2013 QSX 67

Goal: Incremental Update February 26 -March 1, 2013 QSX 68

Goal: Incremental Update February 26 -March 1, 2013 QSX 68

Why incremental update? [Bohannon et al. 2004] • • Goal: update external materialized XML

Why incremental update? [Bohannon et al. 2004] • • Goal: update external materialized XML tree in response to changes ΔI to the underlying database Batch computation: recompute the entire tree from scratch; • large XML views may take multiple hours or days to produce! Incremental computation: compute XML change ΔT • • • Idea: the new view T’ = the old view T + ΔT Why? the new view T’ often differs only slightly from the old view T – reuse partial results computed earlier Typically more efficient to compute ΔT (small) and update the old view T with ΔT Incremental computation: an effective technique with a wide range of applications February 26 -March 1, 2013 QSX 69

Coping with source updates • • Problem: the underlying database may be updated constantly

Coping with source updates • • Problem: the underlying database may be updated constantly (ΔI) • e. g. movies database - new movies coming out all the time Goal: update the published (materialized) XML tree in response to source changes ΔI -- updating the treatment hierarchies Incrementally compute XML change ΔT such that the new view T’ = the old view T + ΔT February 26 -March 1, 2013 QSX 70

Example Actors aid 1 2 lname fname Maguire Tobey Dunst Kirsten Movies mid title

Example Actors aid 1 2 lname fname Maguire Tobey Dunst Kirsten Movies mid title year 11 Spider-Man 2002 32 Elizabethtow 2005 n Appears QSX mid aid 11 1 11 2 32 2 <Movie id="11"> <Title>Spider-Man</Title> <Year>2002</Year> <Actor $Movie : = id="1"> select mid, title, year <LName>Maguire</LName> from Movies <FName>Tobey</FName> </Actor> <Actor id="2"> <LName>Dunst</LName> $id : = $Movie. mid $year = $Movie. year $title : =<FName>Kirsten</FName> $Movie. title $Actors = $Movie. mid </Actor> </Movie> <Movie id="32"> $Actor : = select a. aid, a. lname, a. fname <Title>Elizabethtown</Title> from actors a, appears app <Year>1999</Year> where app. mid=$Actors. mid, app. aid=a. aid <Actor id="2"> <LName>Dunst</LName> <FName>Kirsten</FName> $id : = $Actor. aid </Actor> $lname : = $Actor. lname </Movie> $fname : = $Actor. fname doc -> Movie* Movie -> id, title, year, Actors -> Actor* Actor -> id, lname, fname $fname : = $Actor. fname February 26 -March 1, 2013 71

Example insert Movies(42, Star Wars, 1977) Actors aid 1 2 lname fname Maguire Tobey

Example insert Movies(42, Star Wars, 1977) Actors aid 1 2 lname fname Maguire Tobey Dunst Kirsten Movies mid 11 title year Spider-Man 2002 Elizabethtow 32 2005 n n Appears 42 Star Wars 1977 mid aid QSX 11 1 11 2 32 2 <Movie id="11"> <Title>Spider-Man</Title> <Year>2002</Year> <Actor id="1"> <LName>Maguire</LName> <FName>Tobey</FName> </Actor> $Movie : = select mid, title, year <Actor id="2"> from Movies <LName>Dunst</LName> <FName>Kirsten</FName> </Actor> $id : = $Movie. mid </Movie> $year = $Movie. year $title : = $Movie. title $Actors = $Movie. mid <Movie id="32"> <Title>Elizabethtown</Title> <Year>1999</Year> <Actor id="2"> $Actor : = select a. aid, a. lname, a. fname <LName>Dunst</LName> from actors a, appears app <FName>Kirsten</FName> where app. mid=$Actors. mid, app. aid=a. aid </Actor> </Movie> <Movie id="42"> $id : = $Actor. aid <Title>Star Wars</Title> $lname : = $Actor. lname <Year>1977</Year> $fname : = $Actor. fname </Movie> doc -> Movie* Movie -> id, title, year, Actors -> Actor* Actor -> id, lname, fname February 26 -March 1, 2013 72

Example {+Movies(42, Star Wars, 1977)} Actors aid 1 2 lname fname Maguire Tobey Dunst

Example {+Movies(42, Star Wars, 1977)} Actors aid 1 2 lname fname Maguire Tobey Dunst Kirsten mid title year 11 Spider-Man 2002 32 Elizabethtow 2005 n QSX +<Movie>. . . </Movie> $Movie : = select mid, title, year from Movies Movie -> id, title, year, Actors +@id='42', <Title>Star Wars</Title><Year>1977</Year>. . . Movies Appears 42 doc -> Movie* Star Wars mid aid 11 1 11 2 32 2 1977 $id : = $Movie. mid $year = $Movie. year $title : = $Movie. title $Actors = $Movie. mid Actors -> Actor* + no new actors $Actor : = select a. aid, a. lname, a. fname from actors a, appears app where app. mid=$Actors. mid, app. aid=a. aid Actor -> id, lname, fname $id : = $Actor. aid $lname : = $Actor. lname $fname : = $Actor. fname February 26 -March 1, 2013 73

More complex example insert Actors(3, 'Bloom', 'Orlando') Actors aid 1 lname fname Maguire Tobey

More complex example insert Actors(3, 'Bloom', 'Orlando') Actors aid 1 lname fname Maguire Tobey Dunst Kirsten Trickier - need 2 2 to find 3 Bloom Orlando Movies keys/paths to mid title year subtrees that insert Appears(32, 3) 11 Spider-Man 2002 delete Appears(11, 2) need Elizabethtow to change 32 Appears QSX 2005 n mid aid 11 1 11 2 32 3 <Movie id="11"> <Title>Spider-Man</Title> <Year>2002</Year> $Movie : = select <Actor id="1">mid, title, year <LName>Maguire</LName> from Movies <FName>Tobey</FName> </Actor> <!-- deleted --> $id : = $Movie. mid $year = $Movie. year </Movie> $title : = $Movie. title $Actors = $Movie. mid <Movie id="32"> <Title>Elizabethtown</Title> <Year>1999</Year> <Actor id="2"> $Actor : = select a. aid, a. lname, a. fname from<LName>Dunst</LName> actors a, appears app where app. mid=$Actors. mid, app. aid=a. aid <FName>Kirsten</FName> </Actor> <Actor id="3"> <LName>Bloom</LName> $id : = $Actor. aid $lname : =<FName>Orlando</FName> $Actor. lname </Actor> $fname : = $Actor. fname </Movie> doc -> Movie* Movie -> id, title, year, Actors -> Actor* Actor -> id, lname, fname February 26 -March 1, 2013 74

Updating XML views of relations February 26 -March 1, 2013 QSX 75

Updating XML views of relations February 26 -March 1, 2013 QSX 75

View updates • XML view updates: propagation from XML to relations • • Input:

View updates • XML view updates: propagation from XML to relations • • Input: a mapping σ from DB R to XML T, and XML updates ΔT Output: updated database R such that T + ΔT = σ (R + ΔR) • Already hard for relational views February 26 -March 1, 2013 QSX 76

View updates: hard even for relational views • • • Input: a relational view

View updates: hard even for relational views • • • Input: a relational view definition σ, an instance of relational database I of schema R, a view V = σ(I), and view updates ΔV Output: database updates ΔI such that V + ΔV = σ (I + ΔI) May not be updatable: • • Schema: R(A, B), S(B, C); View: ΠAC (R � S) View delete: remove (a 1, c 1). Not doable without side effect (the deletion of (a 1, c 2) or (a 2, c 1)) February 26 -March 1, 2013 QSX 77

More on relational view updates • May not have a unique answer • •

More on relational view updates • May not have a unique answer • • Schema: R(A, B), S(B, C); View: ΠAC (R � S) View delete: remove (a 1, c 1). Not doable without side effect (the deletion of (a 1, c 2) or (a 2, c 1)) February 26 -March 1, 2013 QSX 78

Complexity of relational updates • [Buneman et al. 2002] View updatability problem: • •

Complexity of relational updates • [Buneman et al. 2002] View updatability problem: • • given a relational view definition σ, an instance of relational database I of schema R, a view V = σ(I), and view updates ΔV decide whether the view is updatable, ie, whethere exists a side-effect-free database update ΔI such that V + ΔV = σ (I + ΔI) NP-hard for relational views defined with Projection+Join (PJ) or Join+Union (JU) only, even for only deletions Minimal view update problem: • • • given a relational view definition σ, an instance of relational database I of schema R, a view V = σ(I), and view insertions (resp. deletions) ΔV find the smallest database update ΔI such that V + ΔV = σ (I + ΔI) NP-hard for relational views defined with PJ or JU only, even for only deletions February 26 -March 1, 2013 QSX 79

XML view updates • • • Input: an ATG σ from DB R to

XML view updates • • • Input: an ATG σ from DB R to XML T, and XML updates ΔT Output: updated database R such that T + ΔT = σ (R + ΔR) XML updates: • • insert e into p delete p where p: XPath query; e: an XML element/subtree Suppose that T is stored as edge relations – relational view V Approach: • • Translate ΔT to ΔV Resolve the relational view update problem: from ΔV to base relational updates ΔR February 26 -March 1, 2013 QSX 80

Example delete /Movie[@id=11]/Actor[@id=2] Actors aid lname fname <Movie id="11"> <Title>Spider-Man</Title> <Year>2002</Year> <Actor id="1"> <LName>Maguire</LName>

Example delete /Movie[@id=11]/Actor[@id=2] Actors aid lname fname <Movie id="11"> <Title>Spider-Man</Title> <Year>2002</Year> <Actor id="1"> <LName>Maguire</LName> <FName>Tobey</FName> </Actor> <Actor id="2"> <LName>Dunst</LName> <FName>Kirsten</FName> </Actor> </Movie> <Movie id="32"> <Title>Elizabethtown</Title> <Year>1999</Year> <Actor id="2"> <LName>Dunst</LName> <FName>Kirsten</FName> </Actor> </Movie> delete Kirsten Dunst 1 Maguire Tobey 2 from Dunst. Spiderman? Kirsten Movies Another way: delete mid title year But this has side. Actors(2, Dunst, Kirsten) 11 Spider-Man 2002 effects! Elizabethtow 32 2005 way: delete One n Appears(11, 2) mid aid QSX 11 1 11 2 32 2 February 26 -March 1, 2013 82

Another tricky one insert nodes <Actor id='1'>. . . </Actor But this violates key

Another tricky one insert nodes <Actor id='1'>. . . </Actor But this violates key into /Movie[@id='32'] Actors aid 1 2 2 1 Movies lname fname Maguire Tobey Dunst Kirsten Maguire Tobey (and has side effects) <Movie id="11"> <Title>Spider-Man</Title> <Year>2002</Year>. . . </Movie> <Movie id="32"> <Title>Elizabethtown</Title> <Year>1999</Year> <Actor id="2"> <LName>Dunst</LName> <FName>Kirsten</FName> </Actor> </Movie> <Actor id="1"> <LName>Maguire</LName> <FName>Tobey</FName> </Actor> </Movie> Another way: insert mid title year Actors(1, Maguire, Tobe insert Tobey Maguire 11 Spider-Man 2002 into Elizabethtown? y) 32 2005 way: add to One and Appears(32, 1) n Appears mid aid QSX 11 1 11 2 32 1 February 26 -March 1, 2013 83

Summary • XQuery Update Facility • • allows for bulk updates semantics somewhat complex

Summary • XQuery Update Facility • • allows for bulk updates semantics somewhat complex • Updating views of XML stored in relations • • view update: challenging in general special case: XPath on ATG-defined publishing views • XML/semistructured updates & optimizations remain active research topics February 26 -March 1, 2013 QSX 84

Summary; research areas • Controlling when/how updates performed • • snapshot (all at once

Summary; research areas • Controlling when/how updates performed • • snapshot (all at once at end) vs. explicit commit concurrency control • Dynamic optimization of updates • Translating XML updates to relational updates • Typechecking results of updates • Analyzing when update interferes with query • avoiding view recomputation February 26 -March 1, 2013 QSX 85

Next time • XML typechecking & static analysis • • • XDuce: A statically

Next time • XML typechecking & static analysis • • • XDuce: A statically typed XML processing language Static analysis for path correctness of XML queries Schema-based independence analysis for XML updates February 26 -March 1, 2013 QSX 86