Indexing of XML Data
Raghuraman Rangarajan
KReSIT, IIT Bombay
September 2000, XML Workshop, IIT Bombay

Plan of Talk
• Why is indexing needed?
• Queries and Indexes in Traditional DBMS
• Querying in XML
• Indexes: Path, Value
• Conclusion

Why is Indexing Needed?
• Allows fast access to data by replicating portions of the data in special-purpose structures.
• Despite the additional cost (storage, maintenance and complexity), indexes have been shown to be useful in evaluating queries.

Queries and Indexes in Traditional DBMS

Databases    Query             Example
Relational   Associative       SELECT name FROM account WHERE acctNo = 14
OO           Path expressions  SELECT X.name FROM dept.empl X
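As a rough illustration of the associative query in the relational row, the sketch below compares a full scan with a lookup through a simple in-memory index on acctNo. The rows, the account holder names and the helper names (`scan`, `build_index`, `lookup`) are invented for this example; a real DBMS would typically use a B-tree or hash index on disk.

```python
# Hypothetical "account" table; the names are made up for illustration.
accounts = [
    {"acctNo": 12, "name": "Asha"},
    {"acctNo": 14, "name": "Ravi"},
    {"acctNo": 15, "name": "Meera"},
]

def scan(rows, acct_no):
    """Answer the query without an index: inspect every row."""
    return [r["name"] for r in rows if r["acctNo"] == acct_no]

def build_index(rows, key):
    """Map each key value to the positions of the rows holding it."""
    index = {}
    for pos, row in enumerate(rows):
        index.setdefault(row[key], []).append(pos)
    return index

def lookup(rows, index, acct_no):
    """Answer the query through the index: touch only matching rows."""
    return [rows[pos]["name"] for pos in index.get(acct_no, [])]

acct_index = build_index(accounts, "acctNo")

# SELECT name FROM account WHERE acctNo = 14
print(scan(accounts, 14))
print(lookup(accounts, acct_index, 14))
```

The index duplicates the acctNo column in a separate structure, which is the storage and maintenance cost mentioned earlier in the talk.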

An XML Fragment
[Figure: tree of part, subpart, supplier, name and address elements, with leaf values omitted]

Queries in XML
1. SELECT X
   FROM part._*.supplier.name X
2. SELECT X
   FROM part._*.supplier: {name X, address: "Mumbai"}
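The two queries can be sketched without any index by walking the whole tree and testing each node's label path. The code below is only an illustration: the dict-based node layout, the sample values and the helper names are assumptions, and `_*` is read as "any sequence of labels, possibly empty".

```python
import re

def node(label, *children, value=None):
    """Build a tree node; the dict layout is an assumption of this sketch."""
    return {"label": label, "value": value, "children": list(children)}

# Hypothetical data loosely shaped like the fragment slide (values invented).
doc = node("part",
    node("name", value="engine"),
    node("supplier",
         node("name", value="ABC"),
         node("address", value="Delhi")),
    node("subpart",
         node("name", value="piston"),
         node("supplier",
              node("name", value="XYZ"),
              node("address", value="Mumbai"))))

def compile_path(expr):
    """Turn 'part._*.supplier.name' into a regex over '/'-terminated labels."""
    parts = []
    for step in expr.split("."):
        if step == "_*":
            parts.append(r"(?:[^/]+/)*")      # any sequence of labels
        elif step == "_":
            parts.append(r"[^/]+/")           # exactly one arbitrary label
        else:
            parts.append(re.escape(step) + "/")
    return re.compile("".join(parts))

def select(root, expr):
    """Return nodes whose root-to-node label path matches expr (full scan)."""
    pattern = compile_path(expr)
    hits = []
    def walk(n, prefix):
        path = prefix + n["label"] + "/"
        if pattern.fullmatch(path):
            hits.append(n)
        for child in n["children"]:
            walk(child, path)
    walk(root, "")
    return hits

def children(n, label):
    return [c for c in n["children"] if c["label"] == label]

# Query 1: names of all suppliers anywhere below part.
q1 = [n["value"] for n in select(doc, "part._*.supplier.name")]

# Query 2: names of suppliers whose address is "Mumbai".
q2 = [name["value"]
      for s in select(doc, "part._*.supplier")
      if any(a["value"] == "Mumbai" for a in children(s, "address"))
      for name in children(s, "name")]

print(q1, q2)
```

Every query here visits every node, which is the cost that path and value indexes aim to avoid.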

Indexes for XML
• Path indexes: regular path expressions
• Value indexes: locating atomic objects

Building A Path Index
[Figure: data tree of part, subpart, supplier, name and address elements, annotated with path index nodes h1 to h7]

Path Index
[Figure: path index with nodes h1 to h7 over the labels part, subpart, supplier, name and address]
• Index summarises path information
• Each entry: list of pointers to data nodes
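One way to picture such an index for tree-shaped data is a table with one entry per distinct label path, each entry holding the data nodes reached by that path. The sketch below makes that assumption; it is not necessarily the structure drawn on the slides, and the sample tree and the `build_path_index` helper are hypothetical.

```python
from collections import defaultdict

def node(label, *children, value=None):
    """Build a tree node; the dict layout is an assumption of this sketch."""
    return {"label": label, "value": value, "children": list(children)}

# Hypothetical data: one part with two subparts (values invented).
doc = node("part",
    node("name", value="engine"),
    node("subpart",
         node("name", value="piston"),
         node("supplier", node("name", value="XYZ"))),
    node("subpart",
         node("name", value="valve")))

def build_path_index(root):
    """Map each distinct label path to the list of data nodes it reaches."""
    index = defaultdict(list)
    stack = [(root, ())]
    while stack:
        n, prefix = stack.pop()
        path = prefix + (n["label"],)
        index[path].append(n)             # entry = list of pointers to data
        # Push children in reverse so they are visited in document order.
        for child in reversed(n["children"]):
            stack.append((child, path))
    return index

index = build_path_index(doc)
for path, nodes in index.items():
    print(".".join(path), len(nodes))
```

Both subpart names share the single entry `part.subpart.name`, so the index is smaller than the data whenever paths repeat.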

Using Path Index for Regular Path Expressions
[Figure: path index with nodes h1 to h7 over the labels part, subpart, supplier, name and address]
(R1) part.name
(R2) part.supplier.name
(R3) _*.supplier.name
(R4) part._*.subpart.name
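To illustrate how R1 to R4 could be answered from the index alone, the sketch below matches each expression against the index entries instead of the data tree. The index contents and node ids (`n1`, `n2`, ...) are invented, and the dictionary-of-label-paths layout is an assumption rather than the structure used on the slides.

```python
import re

# Hypothetical path index: label path -> ids of the data nodes it reaches.
path_index = {
    ("part",): ["n1"],
    ("part", "name"): ["n2"],
    ("part", "supplier"): ["n3"],
    ("part", "supplier", "name"): ["n4"],
    ("part", "subpart"): ["n5", "n8"],
    ("part", "subpart", "name"): ["n6", "n9"],
    ("part", "subpart", "supplier"): ["n7"],
    ("part", "subpart", "supplier", "name"): ["n10"],
}

def compile_path(expr):
    """Turn a dotted path expression into a regex over '/'-terminated labels."""
    parts = []
    for step in expr.split("."):
        if step == "_*":
            parts.append(r"(?:[^/]+/)*")      # any sequence of labels
        elif step == "_":
            parts.append(r"[^/]+/")           # exactly one arbitrary label
        else:
            parts.append(re.escape(step) + "/")
    return re.compile("".join(parts))

def lookup(index, expr):
    """Collect the data nodes of every index entry whose path matches expr."""
    pattern = compile_path(expr)
    result = []
    for path, nodes in index.items():
        if pattern.fullmatch("/".join(path) + "/"):
            result.extend(nodes)
    return result

for name, expr in [("R1", "part.name"),
                   ("R2", "part.supplier.name"),
                   ("R3", "_*.supplier.name"),
                   ("R4", "part._*.subpart.name")]:
    print(name, expr, lookup(path_index, expr))
```

Fixed paths such as R1 and R2 hit exactly one entry, while expressions with `_*` must be tested against many entries, which is one reason path index performance depends on the type of query.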

Path Indexes
• XSet project (Berkeley)
• DataGuides (Lore, Stanford)

Value Index
• Useful for comparisons (=, <, etc.)
• Example: Find the supplier whose name is "XYZ"
[Figure: VIndex(name) pointing to name nodes with values "XYZ" and "ABC" in the part, subpart, supplier tree]
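A value index on name can be sketched as a sorted list of (value, node id) pairs searched by bisection, which supports both equality and range comparisons. The `ValueIndex` class, the node ids and the extra value "PQR" are assumptions made for this example; the talk does not say how VIndex is implemented.

```python
import bisect

class ValueIndex:
    """Sorted (value, node id) pairs; a stand-in for a B-tree style index."""

    def __init__(self, pairs):
        self._entries = sorted(pairs)
        self._keys = [value for value, _ in self._entries]

    def equal(self, value):
        """Node ids whose indexed value equals `value`."""
        lo = bisect.bisect_left(self._keys, value)
        hi = bisect.bisect_right(self._keys, value)
        return [nid for _, nid in self._entries[lo:hi]]

    def less_than(self, value):
        """Node ids whose indexed value is strictly less than `value`."""
        hi = bisect.bisect_left(self._keys, value)
        return [nid for _, nid in self._entries[:hi]]

# Hypothetical VIndex(name): supplier name value -> id of the name node.
vindex_name = ValueIndex([("XYZ", "s2"), ("ABC", "s1"), ("PQR", "s3")])

print(vindex_name.equal("XYZ"))       # supplier whose name is "XYZ"
print(vindex_name.less_than("XYZ"))   # names sorting before "XYZ"
```

From a matching name node, the query processor would still have to navigate up to the enclosing supplier element; that step is omitted here.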

Other Indexes
• Text indexes: information-retrieval-style keyword search.
  Example: Find the suppliers in Mumbai ("address").
  Also supports search features like AND, OR, NEAR, etc.
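A text index of this kind is commonly built as an inverted index from words to the nodes (and word positions) where they occur. The sketch below assumes that design; the address strings, node ids and function names are invented, and NEAR is interpreted here as "within a given number of words".

```python
import re
from collections import defaultdict

def build_text_index(texts):
    """texts: node id -> text. Returns word -> {node id: [word positions]}."""
    index = defaultdict(dict)
    for nid, text in texts.items():
        for pos, word in enumerate(re.findall(r"\w+", text.lower())):
            index[word].setdefault(nid, []).append(pos)
    return index

def search(index, word):
    return sorted(index.get(word, {}))

def search_and(index, a, b):
    return sorted(set(index.get(a, {})) & set(index.get(b, {})))

def search_or(index, a, b):
    return sorted(set(index.get(a, {})) | set(index.get(b, {})))

def search_near(index, a, b, distance=2):
    """Nodes where a and b occur at most `distance` words apart."""
    hits = []
    for nid in search_and(index, a, b):
        if any(abs(p - q) <= distance
               for p in index[a][nid] for q in index[b][nid]):
            hits.append(nid)
    return hits

# Hypothetical address values keyed by supplier node id.
addresses = {
    "s1": "MG Road, Delhi",
    "s2": "IIT Bombay, Powai, Mumbai",
    "s3": "Andheri East, Mumbai",
}
text_index = build_text_index(addresses)

print(search(text_index, "mumbai"))                  # suppliers in Mumbai
print(search_and(text_index, "powai", "mumbai"))
print(search_near(text_index, "iit", "mumbai", distance=1))
```

Query words are expected in lower case because the index lower-cases text when it is built.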

Conclusion
• Performance improves significantly when indexing is used for query processing (Lore).
• Performance of the path indexes depends on the type of queries.

References
• The Lore Project (www-db.stanford.edu/lore)
• Work done by Dan Suciu (www.research.att.com/~suciu/)
• Data on the Web: Serge Abiteboul, et al.
