Indexing of XML Data
Raghuraman Rangarajan
KReSIT, IIT Bombay
September 2000, XML Workshop, IIT Bombay

Plan of Talk
- Why is indexing needed?
- Queries and indexes in traditional DBMS
- Querying in XML
- Indexes: path, value
- Conclusion

Why is Indexing Needed?
- An index allows fast access to data by replicating portions of the data in special-purpose structures.
- Despite their additional cost (storage, maintenance and complexity), indexes have proven useful in evaluating queries.

Queries and Indexes in Traditional DBMS
Databases    Query             Example
Relational   Associative       SELECT name FROM account WHERE acct.No = 14
OO           Path expressions  SELECT X.name FROM dept.empl X

An XML Fragment
[Figure: tree representation of an XML fragment with element labels part, subpart, supplier, name and address; leaf values omitted]

Queries in XML
1. SELECT X
   FROM part._*.supplier.name X
2. SELECT X
   FROM part._*.supplier: {name X, address: "Mumbai"}
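
To make the meaning of the two queries concrete, here is a minimal sketch that evaluates them by plain tree traversal, without any index. It assumes that `_` matches any single label and `_*` matches any sequence of zero or more labels. The `Node` class, the helper names and the leaf values "engine", "piston" and "Pune" are hypothetical and are not taken from the slides.

```python
class Node:
    """Hypothetical element node: a label, child elements, optional leaf value."""
    def __init__(self, label, children=(), value=None):
        self.label = label
        self.children = list(children)
        self.value = value


def descendants_or_self(nodes):
    """All nodes reachable from `nodes` by zero or more edges (no duplicates)."""
    seen, out, stack = set(), [], list(nodes)
    while stack:
        node = stack.pop()
        if id(node) in seen:
            continue
        seen.add(id(node))
        out.append(node)
        stack.extend(node.children)
    return out


def evaluate(root, path):
    """Evaluate a dotted path expression such as 'part._*.supplier.name'."""
    current = [Node("", [root])]  # virtual node above the root
    for step in path.split("."):
        if step == "_*":
            current = descendants_or_self(current)
        else:
            current = [c for n in current for c in n.children
                       if step in ("_", c.label)]
    return current


def child_values(node, label):
    return [c.value for c in node.children if c.label == label]


# Assumed sample data; the slide's figure omits leaf values.
root = Node("part", [
    Node("name", value="engine"),
    Node("supplier", [Node("name", value="ABC"),
                      Node("address", value="Mumbai")]),
    Node("subpart", [
        Node("name", value="piston"),
        Node("supplier", [Node("name", value="XYZ"),
                          Node("address", value="Pune")]),
    ]),
])

# Query 1: names of all suppliers anywhere below part.
query1 = sorted(n.value for n in evaluate(root, "part._*.supplier.name"))

# Query 2: names of suppliers whose address is "Mumbai".
query2 = sorted(name
                for s in evaluate(root, "part._*.supplier")
                if "Mumbai" in child_values(s, "address")
                for name in child_values(s, "name"))
```

Every query here visits the whole tree in the worst case, which is the cost an index is meant to avoid.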

Indexes for XML
- Path indexes: regular path expressions
- Value indexes: locating atomic objects

Building a Path Index
[Figure: the XML data tree next to the path index built from it; index nodes h1 to h7, edges labeled part, subpart, name, supplier and address]

Path Index
[Figure: path index with nodes h1 to h7 and edges labeled part, subpart, name, supplier and address]
- The index summarises path information.
- Each entry holds a list of pointers to data nodes.
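
The two points above can be sketched in code. This is a simplified illustration, assuming tree-shaped data and one index entry per distinct root-to-node label path; it is not a reconstruction of the h1 to h7 structure in the figure. The `Node` class, `build_path_index` and the sample values are hypothetical.

```python
class Node:
    """Hypothetical element node: a label, child elements, optional leaf value."""
    def __init__(self, label, children=(), value=None):
        self.label = label
        self.children = list(children)
        self.value = value


def build_path_index(root):
    """Map each distinct label path (a tuple) to the data nodes it reaches."""
    index = {}
    stack = [(root, (root.label,))]
    while stack:
        node, path = stack.pop()
        index.setdefault(path, []).append(node)  # pointer to the data node
        for child in node.children:
            stack.append((child, path + (child.label,)))
    return index


def supplier(name, address):
    return Node("supplier", [Node("name", value=name),
                             Node("address", value=address)])


# Assumed sample data: two suppliers share the path part.supplier.
root = Node("part", [
    Node("name", value="engine"),
    supplier("ABC", "Mumbai"),
    supplier("DEF", "Delhi"),
    Node("subpart", [Node("name", value="piston"),
                     supplier("XYZ", "Pune")]),
])

index = build_path_index(root)
names = sorted(n.value for n in index[("part", "supplier", "name")])
```

The tree has 13 nodes but only 10 distinct label paths, so the index is smaller than the data it summarises, and the entry for `part.supplier.name` points to both matching name nodes.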

Using Path Index for Regular Path Expressions
[Figure: path index with nodes h1 to h7 and edges labeled part, subpart, name, supplier and address]
(R1) part.name
(R2) part.supplier.name
(R3) _*.supplier.name
(R4) part._*.subpart.name
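
As a rough sketch of how R1 to R4 could be answered from a path index rather than from the data, the code below matches each expression against the label paths stored in the index and returns the pointers of the matching entries. It assumes an index keyed by label-path tuples, with `_` matching one label and `_*` matching zero or more labels. The index contents and the node identifiers "n1" to "n10" are hypothetical stand-ins for data-node pointers.

```python
def matches(steps, labels):
    """True if the label path `labels` matches the path expression `steps`."""
    if not steps:
        return not labels
    head, rest = steps[0], steps[1:]
    if head == "_*":
        # Let _* absorb the first i labels, for every possible i.
        return any(matches(rest, labels[i:]) for i in range(len(labels) + 1))
    return bool(labels) and head in ("_", labels[0]) and matches(rest, labels[1:])


def lookup(index, expr):
    """Return the data-node pointers of all index entries matching `expr`."""
    steps = tuple(expr.split("."))
    return sorted(ref for path, refs in index.items()
                  if matches(steps, path) for ref in refs)


# Assumed path index: label path -> list of pointers to data nodes.
path_index = {
    ("part",): ["n1"],
    ("part", "name"): ["n2"],
    ("part", "supplier"): ["n3"],
    ("part", "supplier", "name"): ["n4"],
    ("part", "supplier", "address"): ["n5"],
    ("part", "subpart"): ["n6"],
    ("part", "subpart", "name"): ["n7"],
    ("part", "subpart", "supplier"): ["n8"],
    ("part", "subpart", "supplier", "name"): ["n9"],
    ("part", "subpart", "supplier", "address"): ["n10"],
}

r1 = lookup(path_index, "part.name")
r2 = lookup(path_index, "part.supplier.name")
r3 = lookup(path_index, "_*.supplier.name")
r4 = lookup(path_index, "part._*.subpart.name")
```

Only the index entries are scanned; the data nodes themselves are touched only through the returned pointers.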

Path Indexes
- XSet project (Berkeley)
- DataGuides (Lore, Stanford)

Value Index
- Useful for comparisons (=, <, etc.)
- Example: find the supplier whose name is "XYZ".
[Figure: the XML tree with VIndex(name) pointing at the supplier name values "XYZ" and "ABC"]
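
One possible way to realise a value index such as VIndex(name) is a sorted list of (value, pointer) pairs searched by binary search, which supports both equality and range comparisons. The sketch below uses Python's standard `bisect` module; the `ValueIndex` class and the pointer strings are hypothetical, and a real system would more likely use a B-tree or similar structure.

```python
import bisect


class ValueIndex:
    """Hypothetical value index: sorted (value, pointer) pairs."""

    def __init__(self, pairs):
        self._entries = sorted(pairs)
        self._keys = [value for value, _ in self._entries]

    def equal(self, value):
        """Pointers whose indexed value equals `value`."""
        lo = bisect.bisect_left(self._keys, value)
        hi = bisect.bisect_right(self._keys, value)
        return [ref for _, ref in self._entries[lo:hi]]

    def less_than(self, value):
        """Pointers whose indexed value is strictly smaller than `value`."""
        hi = bisect.bisect_left(self._keys, value)
        return [ref for _, ref in self._entries[:hi]]


# VIndex(name): supplier name value -> pointer to the supplier node (assumed ids).
vindex_name = ValueIndex([("XYZ", "supplier#2"), ("ABC", "supplier#1")])

found = vindex_name.equal("XYZ")        # supplier whose name is "XYZ"
before = vindex_name.less_than("XYZ")   # suppliers whose name sorts before "XYZ"
```

Keeping the entries sorted is what makes `<` as cheap as `=`; a hash table would only help with equality.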

Other Indexes
- Text indexes: information-retrieval-style keyword search.
- Example: find the suppliers in Mumbai ("address").
- Also support search features such as AND, OR, NEAR, etc.
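
A text index of this kind is commonly built as an inverted index from words to the places where they occur. The sketch below is one simple way to do that, assuming address values are tokenised into lowercase words and that NEAR means "within k word positions". The function names, supplier ids and address strings are hypothetical.

```python
import re
from collections import defaultdict


def build_text_index(texts):
    """Map each word to {text id: [word positions]}."""
    index = defaultdict(lambda: defaultdict(list))
    for text_id, text in texts.items():
        for pos, word in enumerate(re.findall(r"\w+", text.lower())):
            index[word][text_id].append(pos)
    return index


def search(index, word):
    return set(index.get(word, {}))


def search_and(index, a, b):
    return search(index, a) & search(index, b)


def search_or(index, a, b):
    return search(index, a) | search(index, b)


def search_near(index, a, b, k):
    """Ids of texts where `a` and `b` occur within k word positions."""
    hits = set()
    for text_id in search_and(index, a, b):
        pos_a, pos_b = index[a][text_id], index[b][text_id]
        if any(abs(i - j) <= k for i in pos_a for j in pos_b):
            hits.add(text_id)
    return hits


# Assumed address values, keyed by supplier id.
addresses = {
    "s1": "12 MG Road, Mumbai",
    "s2": "4 Park Street, Pune",
    "s3": "Andheri East, Mumbai",
}
text_index = build_text_index(addresses)

in_mumbai = search(text_index, "mumbai")
road_and_mumbai = search_and(text_index, "road", "mumbai")
pune_or_mumbai = search_or(text_index, "pune", "mumbai")
mg_near_mumbai = search_near(text_index, "mg", "mumbai", 2)
```

Storing word positions, not just text ids, is what makes a NEAR operator possible on top of AND.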

Conclusion
- Performance improves significantly when indexing is used for query processing (Lore).
- The performance of path indexes depends on the type of queries.

References
- The Lore Project (www-db.stanford.edu/lore)
- Work done by Dan Suciu (www.research.att.com/~suciu/)
- Data on the Web: Serge Abiteboul, et al.