GLASS: A Graphical Query Language for Semi-Structured Data

GLASS: A Graphical Query Language for Semi-Structured Data
Wei Ni, Tok Wang Ling
Department of Computer Science, National University of Singapore, Singapore
E-mail: {niwei, lingtw}@comp.nus.edu.sg
2003-3-28, DASFAA 2003, Kyoto, Japan

Roadmap
1. Introduction and motivation
2. ORA-SS, the data model of our language
3. GLASS, our graphical query language
4. Related works and comparison
5. Conclusion and future work

1. Introduction and motivation
• XML: a standard for the representation, manipulation and exchange of data.
• Query engines: a crucial application for exploiting the full power of XML.
• XQuery: a standard for querying XML data, but too difficult for non-technical users to use.
• Graphical query language: a way to help users query XML data.

1. Introduction and motivation (Cont.)
• Criteria of a good query language:
  - Expressiveness
  - Completeness
  - User-friendliness
• Graphical query language: user-friendly and easy to understand.

2. ORA-SS, the data model of our language
• ORA-SS (Object-Relationship-Attribute model for Semi-Structured data): a rich semantic data model.

Example 1: the DTD of “Department.xml”
<!ELEMENT department (course+)>
<!ATTLIST department name ID #REQUIRED>
<!ELEMENT course (title?, student+)>
<!ATTLIST course code ID #REQUIRED>
<!ELEMENT title (#PCDATA)>
<!ELEMENT student (name?, grade+)>
<!ATTLIST student number #IMPLIED>
<!ELEMENT name (#PCDATA)>
<!ELEMENT grade (#PCDATA)>

[Figure: the ORA-SS schema diagram of “Department.xml”. Object classes: department (name), course (code, title), student (number, name). Relationship labels: “2, 1:n, 1:1” and “cs, 2, 1:n”. The attribute grade is labelled “cs”.]
• There is a binary relationship type, named “cs”, between course and student: one course may have one or many students, and one student can take one or many courses.
• <grade> belongs to the binary relationship type “cs” between <course> and <student>.

3. GLASS, our graphical query language
• GLASS (Graphical Query Language for Semi-Structured Data)
• Targets of GLASS:
  - support various queries, including aggregation functions and negation;
  - be clear and concise, without ambiguity;
  - provide “freedom” to non-technical users in querying.

3. GLASS, our graphical query language
3.1 General concepts in GLASS
1) Data icons
   i. Rectangle: the object class in ORA-SS; a non-terminal element in XML (an element with subelements or attributes).
   ii. Circle: the attribute in ORA-SS; a terminal element with PCDATA only, or an attribute (or attribute list), in XML.
2) Connections
   i. Arrow: relationship type
   ii. Dashed arrow: IDREF in XML
   iii. Line: link between output and original entities

3. GLASS, our graphical query language
3.1 General concepts in GLASS (Cont.)
3) Box: the group entity in GLASS query graphs; it consists of all rectangles or circles inside the box.
4) Derived entities: derived object classes and attributes are new data types defined by users; they are represented as dashed rectangles and dashed circles.
5) Condition Logic Window (CLW): an optional part of a GLASS query, used to write logic expressions and statements (e.g. IF-THEN) for complex query conditions and/or constructions.

3. GLASS, our graphical query language
3.1 General concepts in GLASS (Cont.)
6) Path Identifier
   - a unique name given to data icons or boxes;
   - indicated by the prefix “$”.
7) Condition Identifier
   - a unique name given to the connections;
   - without the prefix “$”.
8) Logic Expression: specifies the logic in query conditions.
9) Statement: helps construct complex outputs.

3. GLASS, our graphical query language
3.1 General concepts in GLASS (Cont.)
Structure of a GLASS query graph
[Figure: a GLASS query graph with three parts: query condition, result construction, and the CLW holding logic expressions and statements. The black parts are optional.]

3. GLASS, our graphical query language
3.2 Output construction
Example 1:

The DTD of “Department.xml”
<!ELEMENT department (course+)>
<!ATTLIST department name ID #REQUIRED>
<!ELEMENT course (title?, student+)>
<!ATTLIST course code ID #REQUIRED>
<!ELEMENT title (#PCDATA)>
<!ELEMENT student (name?, grade+)>
<!ATTLIST student number #IMPLIED>
<!ELEMENT name (#PCDATA)>
<!ELEMENT grade (#PCDATA)>

[Figure: the ORA-SS schema diagram of “Department.xml”. Object classes: department (name), course (code, title), student (number, name). Relationship labels: “2, 1:n, 1:1” and “cs, 2, 1:n”. The attribute grade is labelled “cs”.]

The content of “Department.xml”
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE department SYSTEM "Department.dtd">
<department name="CS">
  <course code="201">
    <title>Software Engineering</title>
    <student number="1001">
      <name>John Smith</name>
      <grade>A</grade>
    </student>
    <student number="1002">
      <name>Mel Green</name>
      <grade>C</grade>
    </student>
  </course>
  <course code="303">
    <title>Database Design</title>
    <student number="1001">
      <name>John Smith</name>
      <grade>B</grade>
    </student>
  </course>
</department>

3. GLASS, our graphical query language
3.2 Output construction (Cont.)
(a) Extract courses and all information one level under the course elements, using the default output method.

QUERY
[Figure: GLASS query graph with the rectangle “course”.]
• The default entity type implies both the attributes and the simple subelements of the course element.
• The default number of levels is “1”. We use “/2” to represent 2 levels under the course element.
• In the default output method, everything is kept in its original style from the source data.

RESULT
<course code="201">
  <title>Software Engineering</title>
</course>
<course code="303">
  <title>Database Design</title>
</course>

SOURCE DATA: “Department.xml” (content and ORA-SS schema diagram as in Example 1).
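The default output method can be imitated outside GLASS with a short Python sketch. It rests on assumptions not stated in the slides: a “simple” subelement is taken to be one with no attributes and no child elements, a level count of n descends n levels below course, and None stands in for the “*” wildcard. The names default_output and is_simple are hypothetical.

```python
import xml.etree.ElementTree as ET

DEPARTMENT_XML = """
<department name="CS">
  <course code="201">
    <title>Software Engineering</title>
    <student number="1001"><name>John Smith</name><grade>A</grade></student>
    <student number="1002"><name>Mel Green</name><grade>C</grade></student>
  </course>
  <course code="303">
    <title>Database Design</title>
    <student number="1001"><name>John Smith</name><grade>B</grade></student>
  </course>
</department>
"""


def is_simple(elem):
    """Assumption: a simple subelement has no attributes and no children."""
    return len(elem) == 0 and not elem.attrib


def default_output(elem, levels=1):
    """Copy elem with its attributes and simple subelements.

    Complex subelements are followed only while levels remain;
    levels=None plays the role of the "*" wildcard (all levels).
    """
    out = ET.Element(elem.tag, dict(elem.attrib))
    for child in elem:
        if is_simple(child):
            ET.SubElement(out, child.tag).text = child.text
        elif levels is None or levels > 1:
            remaining = None if levels is None else levels - 1
            out.append(default_output(child, remaining))
    return out


root = ET.fromstring(DEPARTMENT_XML)
for course in root.findall("course"):
    print(ET.tostring(default_output(course), encoding="unicode"))
```

Calling default_output(course, levels=None) instead keeps the student elements as well, which corresponds to the “*” form of the query.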

3. GLASS, our graphical query language
3.2 Output construction (Cont.)
(b) Extract courses and all information at all levels under the course elements, using the default output method.

QUERY
[Figure: GLASS query graph with the rectangle “course” and “*”.]
• “*” is a wildcard which means all nested levels under the course element.

RESULT
<course code="201">
  <title>Software Engineering</title>
  <student number="1001">
    <name>John Smith</name>
    <grade>A</grade>
  </student>
  <student number="1002">
    <name>Mel Green</name>
    <grade>C</grade>
  </student>
</course>
<course code="303">
  <title>Database Design</title>
  <student number="1001">
    <name>John Smith</name>
    <grade>B</grade>
  </student>
</course>

SOURCE DATA: “Department.xml” (content and ORA-SS schema diagram as in Example 1).

3. GLASS, our graphical query language
3.2 Output construction (Cont.)
(c) Extract courses with all attributes of the course element in the original XML document.

QUERY
[Figure: GLASS query graph with the rectangle “course” and a circle with “@” inside.]
• The circle with “@” inside implies all attributes of the course element in the original XML document.

RESULT
<course code="201"></course>
<course code="303"></course>

SOURCE DATA: “Department.xml” (content and ORA-SS schema diagram as in Example 1).

3. GLASS, our graphical query language
3.2 Output construction (Cont.)
(d) Extract courses and all subelements one level under the course elements (except the attributes of the course elements).

QUERY
[Figure: GLASS query graph with the rectangle “course” and a circle with “e” inside.]
• The circle with “e” inside indicates all simple subelements of the course element.

RESULT
<course>
  <title>Software Engineering</title>
</course>
<course>
  <title>Database Design</title>
</course>

SOURCE DATA: “Department.xml” (content and ORA-SS schema diagram as in Example 1).

3. GLASS, our graphical query language
3.2 Output construction (Cont.)
(e) Extract courses with their attributes and/or simple subelements, as well as the complex subelements one level under the course element.

QUERY
[Figure: GLASS query graph with the rectangle “course”, a blank circle and a blank rectangle.]
• The blank circle implies both the attributes and the simple subelements of the course element.
• The blank rectangle stands for all complex subelements of the course element.

RESULT
<course code="201">
  <title>Software Engineering</title>
  <student></student>
</course>
<course code="303">
  <title>Database Design</title>
  <student></student>
</course>
• Since only the entities one level under the course element are displayed, the student number, at the second level under course, does not appear in the result.

SOURCE DATA: “Department.xml” (content and ORA-SS schema diagram as in Example 1).
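The “@” circle, the “e” circle and the blank rectangle of cases (c) to (e) can be read as three independent switches on what is copied into the output. The following Python sketch illustrates that reading; the function construct and its flags are hypothetical names, and treating a simple subelement as one without attributes or children is an assumption.

```python
import xml.etree.ElementTree as ET

COURSE_XML = """
<course code="201">
  <title>Software Engineering</title>
  <student number="1001"><name>John Smith</name><grade>A</grade></student>
</course>
"""


def construct(elem, attributes=False, simple=False, complex_=False):
    """Rebuild elem one level deep, keeping only the requested parts.

    attributes -> the "@" circle (all attributes)
    simple     -> the "e" circle (all simple subelements)
    complex_   -> the blank rectangle (complex subelements, left empty)
    """
    out = ET.Element(elem.tag, dict(elem.attrib) if attributes else {})
    for child in elem:
        child_is_simple = len(child) == 0 and not child.attrib
        if child_is_simple and simple:
            ET.SubElement(out, child.tag).text = child.text
        elif not child_is_simple and complex_:
            # Contents of a complex subelement lie below level one.
            ET.SubElement(out, child.tag)
    return out


def show(elem):
    return ET.tostring(elem, encoding="unicode")


course = ET.fromstring(COURSE_XML)
print(show(construct(course, attributes=True)))   # case (c)
print(show(construct(course, simple=True)))       # case (d)
print(show(construct(course, attributes=True, simple=True, complex_=True)))  # case (e)
```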

3. GLASS, our graphical query language
3.2 Output construction (Cont.)
(f) Extract courses with their titles as attributes and their codes as subelements in the output: a demonstration of the conversion between attribute and subelement.

QUERY
[Figure: GLASS query graph. The original course with title and code is linked to an output course with a new title attribute (circle with “@”) and a new code element (circle with “e”).]
• The links here imply that the values of the newly defined title attribute and code element come from the title and code in the original data.

RESULT
<course title="Software Engineering">
  <code>201</code>
</course>
<course title="Database Design">
  <code>303</code>
</course>

SOURCE DATA: “Department.xml” (content as in Example 1).
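The conversion between attribute and subelement in case (f) amounts to reading each value from one kind of node and writing it to the other. A minimal Python sketch, with the hypothetical helper name swap_title_and_code:

```python
import xml.etree.ElementTree as ET


def swap_title_and_code(course):
    """Build a new course whose title is an attribute and whose code is a subelement."""
    out = ET.Element("course", {"title": course.findtext("title", default="")})
    ET.SubElement(out, "code").text = course.get("code")
    return out


course = ET.fromstring(
    '<course code="201"><title>Software Engineering</title></course>'
)
print(ET.tostring(swap_title_and_code(course), encoding="unicode"))
```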

3. GLASS, our graphical query language
3.3 Basic query operators: Selection
To select all courses whose codes begin with “2”, we can write an XQuery expression as:
FOR $c IN $department/course
WHERE $c/@code/data() = '2%'
RETURN $c
[Figure: the equivalent GLASS query graph. LHS: course with the condition code = '2%'. RHS: course with “*”.]
• The output course in the RHS is the course that satisfies the condition in the LHS.
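For comparison, the same selection can be sketched in Python. The sketch reads the pattern '2%' as a prefix match, by analogy with SQL LIKE; that reading is an assumption, as is the trimmed sample data.

```python
import xml.etree.ElementTree as ET

DEPARTMENT_XML = """
<department name="CS">
  <course code="201"><title>Software Engineering</title></course>
  <course code="303"><title>Database Design</title></course>
</department>
"""

root = ET.fromstring(DEPARTMENT_XML)

# Keep the whole course element when its code attribute starts with "2".
selected = [c for c in root.findall("course") if c.get("code", "").startswith("2")]

for course in selected:
    print(course.get("code"), course.findtext("title"))
```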

3. GLASS, our graphical query language
3.4 Basic query operators: Projection
To project all courses with their titles, XQuery gives the following expression:
FOR $c IN $department/course
RETURN <course>{ $c/title }</course>
[Figure: the equivalent GLASS query graph: course with title.]
• Without the specification of subelements or attributes, the title remains unchanged as a subelement of course in the result.
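A Python sketch of the same projection, for illustration only. The helper name project_titles and the trimmed sample data are assumptions; each output course keeps its title subelement and nothing else.

```python
import xml.etree.ElementTree as ET

DEPARTMENT_XML = """
<department name="CS">
  <course code="201"><title>Software Engineering</title></course>
  <course code="303"><title>Database Design</title></course>
</department>
"""


def project_titles(root):
    """Return new course elements that contain only the title subelement."""
    projected = []
    for course in root.findall("course"):
        out = ET.Element("course")
        for title in course.findall("title"):
            ET.SubElement(out, "title").text = title.text
        projected.append(out)
    return projected


root = ET.fromstring(DEPARTMENT_XML)
for course in project_titles(root):
    print(ET.tostring(course, encoding="unicode"))
```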

3. GLASS, our graphical query language
3.5 Basic query operators: Join
Suppose we have another XML document, named “Description.xml”, containing the descriptions of all courses.

The DTD of “Description.xml”
<!ELEMENT course (title?, description)>
<!ATTLIST course code ID #REQUIRED>
<!ELEMENT title (#PCDATA)>
<!ELEMENT description (#PCDATA)>

[Figure: the ORA-SS schema diagram of “Description.xml”: course with code, title and description.]

3. GLASS, our graphical query language
3.5 Basic query operators: Join (Cont.)
Query: Extract everything about the courses from “Department.xml” that have descriptions in “Description.xml”, and put the corresponding descriptions under the courses in the results.
[Figure: GLASS query graph. LHS: course with code “FROM Department.xml” and course with code and description “FROM Description.xml”. RHS: course with “*” and description.]
• “FROM Department.xml” and “FROM Description.xml” are the URLs of the data sources.
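The join can be sketched in Python as a lookup on the course code. The sample “Description.xml” content, its root element descriptions, and the helper name join_descriptions are all hypothetical; the slides give only the DTD of the second document.

```python
import copy
import xml.etree.ElementTree as ET

DEPARTMENT_XML = """
<department name="CS">
  <course code="201">
    <title>Software Engineering</title>
    <student number="1001"><name>John Smith</name><grade>A</grade></student>
  </course>
  <course code="303"><title>Database Design</title></course>
</department>
"""

# Hypothetical sample data; only course 201 has a description.
DESCRIPTION_XML = """
<descriptions>
  <course code="201">
    <title>Software Engineering</title>
    <description>Hypothetical description text.</description>
  </course>
</descriptions>
"""


def join_descriptions(department, descriptions):
    """Keep the courses that have a description and append it to a full copy."""
    by_code = {c.get("code"): c.findtext("description")
               for c in descriptions.iter("course")}
    joined = []
    for course in department.findall("course"):
        code = course.get("code")
        if code in by_code:
            out = copy.deepcopy(course)  # "*": keep everything under course
            ET.SubElement(out, "description").text = by_code[code]
            joined.append(out)
    return joined


department = ET.fromstring(DEPARTMENT_XML)
descriptions = ET.fromstring(DESCRIPTION_XML)
for course in join_descriptions(department, descriptions):
    print(course.get("code"), course.findtext("description"))
```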

3. GLASS, our graphical query language
3.6 Aggregation functions: group & average
Query: For all courses, display the courses with their information (in the default way) and the average grade of each course.
[Figure: GLASS query graph. course with student grouped under it by “_group”; AVG applied to grade (relationship type “cs”); output course with the derived entity avg_grade, drawn as a dotted circle with “e” inside.]
• “_group” groups student under course.
• avg_grade is a derived entity; it will be constructed as a subelement of “course”.
• The character “e” inside the dotted circle implies that avg_grade is of element type.
• Aggregation functions are available after the “_group” functionality is done.

3. GLASS, our graphical query language
3.6 Aggregation functions: group & average
Query: For all courses, display the courses with their information (in the default way) and the average grade of each course. (Presented as an XQuery expression as follows.)
for $ccd in distinct-values(document("Department.xml")//course/@code)
let $c := document("Department.xml")//course[@code = $ccd],
    $s := $c/student
return
  <course code="{$c/@code}">
    { $c/title }
    <avg_grade>{ avg($s/grade) }</avg_grade>
  </course>
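A Python sketch of the group-and-average query. The sample document uses letter grades, so the sketch assumes a hypothetical mapping from letters to grade points before averaging; the mapping GRADE_POINTS and the helper name courses_with_average are not from the slides.

```python
import xml.etree.ElementTree as ET

DEPARTMENT_XML = """
<department name="CS">
  <course code="201">
    <title>Software Engineering</title>
    <student number="1001"><name>John Smith</name><grade>A</grade></student>
    <student number="1002"><name>Mel Green</name><grade>C</grade></student>
  </course>
  <course code="303">
    <title>Database Design</title>
    <student number="1001"><name>John Smith</name><grade>B</grade></student>
  </course>
</department>
"""

# Hypothetical mapping from letter grades to numbers, needed to average them.
GRADE_POINTS = {"A": 4.0, "B": 3.0, "C": 2.0}


def courses_with_average(root):
    """Default output of each course plus a derived avg_grade subelement."""
    result = []
    for course in root.findall("course"):
        out = ET.Element("course", dict(course.attrib))
        for title in course.findall("title"):
            ET.SubElement(out, "title").text = title.text
        # Students are already nested under their course, so the grouping
        # step is simply "all grades below this course".
        points = [GRADE_POINTS[g.text] for g in course.findall("student/grade")]
        if points:
            ET.SubElement(out, "avg_grade").text = str(sum(points) / len(points))
        result.append(out)
    return result


root = ET.fromstring(DEPARTMENT_XML)
for course in courses_with_average(root):
    print(ET.tostring(course, encoding="unicode"))
```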

3. GLASS, our graphical query language
3.7 Querying order-sensitive data
Suppose we have order-sensitive data, “Bib.xml”. The order of <author> elements is important to the <book>.
Example 2:
[Figure: the ORA-SS schema diagram of “Bib.xml”, order-sensitive data: book (isbn, title, content) related to author (firstname, lastname), with the relationship label “2, +, +, <”.]

3. GLASS, our graphical query language
3.7 Querying order-sensitive data (Cont.)
Query: Display all books with their ISBNs, titles and first authors.
[Figure: GLASS query graph: book with isbn, title, and author marked “[1]”.]
• “[1]” means the first element in order.
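The positional marker “[1]” has a direct counterpart in the XPath subset supported by Python's ElementTree. The sketch below uses a hypothetical “Bib.xml” sample; storing isbn as an attribute and wrapping the books in a bib root are assumptions, since the slides show only the schema diagram.

```python
import copy
import xml.etree.ElementTree as ET

# Hypothetical sample data in the shape suggested by the schema diagram.
BIB_XML = """
<bib>
  <book isbn="0-000-00000-1">
    <title>Sample Book</title>
    <author><firstname>Ann</firstname><lastname>Lee</lastname></author>
    <author><firstname>Bob</firstname><lastname>Tan</lastname></author>
  </book>
</bib>
"""


def books_with_first_author(root):
    """Return each book with its isbn, title and first author only."""
    result = []
    for book in root.findall("book"):
        out = ET.Element("book", {"isbn": book.get("isbn", "")})
        ET.SubElement(out, "title").text = book.findtext("title")
        first = book.find("author[1]")  # positional predicate, 1-based
        if first is not None:
            out.append(copy.deepcopy(first))
        result.append(out)
    return result


root = ET.fromstring(BIB_XML)
for book in books_with_first_author(root):
    print(book.get("isbn"), book.findtext("title"), book.findtext("author/lastname"))
```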

3. GLASS, our graphical query language
3.8 Advanced features of GLASS
Suppose we have more complex XML data about projects, members and publications.
Example 3:
[Figure: the ORA-SS schema diagram of the data about “project, member and publication”: project (id, name), member (name, job_title), publication (number, title), with the relationship labels “2, +, +” and “3, *, +”.]
• There is a ternary relationship among <project>, <member> and <publication>: one member in one project can have 0 or many publications, and one publication can belong to one or many (project, member) pairs.

3. GLASS, our graphical query language
3.8.1 Group entity: the use of the box
Query: Display the members, with their names, who have taken part in fewer than 5 projects but have written more than 6 publications in some project they attended, and whose names begin with the character “S”.
[Figure: GLASS query graph. member with name = ‘S%’; project under “_group” with CNT < 5; a box with publication under “_group” and CNT > 6; output: member with name.]
• The box: to group publications under each pair of (member, project).
• “_group” on project: to group projects under member.
• “CNT” means “count”. (“CNT” counts without eliminating duplicates.) We can also use other aggregation functions like SUM, AVG, etc.

3. GLASS, our graphical query language
3.8.1 Group entity: the use of the box (Cont.)
The previous query: Display the members, with their names, who have taken part in fewer than 5 projects but have written more than 6 publications in some project they attended, and whose names begin with the character “S”.
[Figure: GLASS query graph. member (name = ‘S%’) with project under “_group” and CNT < 5; a box with publication under “_group” and CNT > 6.]
• Publications are selected GROUP BY <member, project> pairs.

A new query: Display the members, with their names, who have in total taken part in fewer than 5 projects and in total written more than 6 publications, and whose names begin with the character “S”.
[Figure: GLASS query graph. member (name = ‘S%’) with project under “_group” and CNT < 5, and publication under “_group” with CNT_UNIQUE > 6.]
• Publications are selected GROUP BY <member> only.
• CNT_UNIQUE eliminates duplicates in counting.
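The difference between grouping publications per (member, project) pair and grouping them per member can be made concrete with a small Python sketch. The nested dictionary, the member names and the function names are hypothetical; len() on a list stands in for CNT and len() on a set for CNT_UNIQUE.

```python
# Hypothetical data: member name -> project id -> publication numbers.
DATA = {
    "Sam": {"p1": [1, 2, 3, 4, 5, 6, 7]},
    "Sue": {"p1": [1, 2, 3, 4], "p2": [3, 4, 5, 6, 7]},
    "Sid": {"p1": [1, 2, 3, 4], "p2": [1, 2, 3, 4]},
    "Tom": {"p1": [1, 2, 3, 4, 5, 6, 7]},
}


def per_pair_query(data):
    """More than 6 publications in SOME project (grouped by member, project)."""
    return sorted(
        member for member, projects in data.items()
        if member.startswith("S")
        and len(projects) < 5
        and any(len(pubs) > 6 for pubs in projects.values())
    )


def per_member_query(data):
    """More than 6 distinct publications in TOTAL (grouped by member only)."""
    return sorted(
        member for member, projects in data.items()
        if member.startswith("S")
        and len(projects) < 5
        and len({p for pubs in projects.values() for p in pubs}) > 6
    )


print(per_pair_query(DATA))
print(per_member_query(DATA))
```

In this sample, Sue has at most five publications in any single project but seven distinct ones overall, so she satisfies only the second query. Sid has eight entries in total but only four distinct publications, which is why the per-member count needs duplicate elimination.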

3. GLASS, our graphical query language
3.8.2 Negation: the use of the Condition Logic Window
Query: Display the names of those members who have not written the publication titled “Introduction to XML”.
[Figure: GLASS query graph. member with name; connection labelled “:A:” to publication with title = “Introduction to XML”.]
CLW: ¬A;
• The colons are not part of the identifiers; they distinguish the identifiers from relationship type names.
• The identifier “A” means “has a publication with title = ‘Introduction to XML’”.
• The logic expression means “does not have A” or “there does not exist A”.
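The negation “¬A” is an existence test that must fail for the whole member, not just for one of the member's occurrences. A Python sketch under assumed sample data (the document shape, the member names and the helper name members_without are hypothetical):

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical sample data; Ann appears in two projects.
PROJECTS_XML = """
<projects>
  <project id="p1">
    <member><name>Ann</name>
      <publication number="1"><title>Introduction to XML</title></publication>
    </member>
    <member><name>Bob</name>
      <publication number="2"><title>XML Views</title></publication>
    </member>
  </project>
  <project id="p2">
    <member><name>Ann</name></member>
    <member><name>Cy</name></member>
  </project>
</projects>
"""


def members_without(root, title):
    """Names of members for whom no publication with the given title exists."""
    titles_by_member = defaultdict(set)
    for member in root.iter("member"):
        name = member.findtext("name")
        titles_by_member[name].update(
            t.text for t in member.findall("publication/title")
        )
    return sorted(n for n, titles in titles_by_member.items() if title not in titles)


root = ET.fromstring(PROJECTS_XML)
print(members_without(root, "Introduction to XML"))
```

Collecting the titles per member before testing is the design choice that matters here: Ann has no publications in project p2, yet she is excluded because the title exists under her in p1.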

3. GLASS, our graphical query language
3.8.3 IF-THEN statement
Query: Display the members, with their names, who have written a publication titled “Introduction to XML” or “Introduction to Internet”; and for those members who have written “Introduction to XML”, also display all information about the projects that they have taken part in.
[Figure: GLASS query graph. member with name; connection “:A:” to publication with title = “Introduction to XML”; connection “:B:” to publication with title = “Introduction to Internet”; project labelled “:$pro:”.]
CLW: A ∨ B; {IF (A) THEN EXTRACT $pro;}
• The path identifier “$pro” represents the project element with its attributes and/or simple subelements.
• Without the statement in the CLW, the information about the projects of all members that satisfy the condition would be displayed.
• With the statement in the CLW, the information about the projects is extracted only when condition A is satisfied.
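The CLW of this query combines a selection condition with a conditional output step. A Python sketch of that control flow, using hypothetical member records as plain dictionaries; the field names and the function run_query are assumptions.

```python
TITLE_A = "Introduction to XML"
TITLE_B = "Introduction to Internet"

# Hypothetical member records.
MEMBERS = [
    {"name": "Ann", "publications": [TITLE_A], "projects": ["p1", "p2"]},
    {"name": "Bob", "publications": [TITLE_B], "projects": ["p1"]},
    {"name": "Cy", "publications": ["XML Views"], "projects": ["p3"]},
]


def run_query(members):
    """Select members satisfying A or B; add projects only when A holds."""
    results = []
    for member in members:
        a = TITLE_A in member["publications"]
        b = TITLE_B in member["publications"]
        if a or b:                      # logic expression: A or B
            entry = {"name": member["name"]}
            if a:                       # statement: IF (A) THEN EXTRACT $pro
                entry["projects"] = list(member["projects"])
            results.append(entry)
    return results


for entry in run_query(MEMBERS):
    print(entry)
```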

4. Related works and comparison
• Form-based graphical XML query languages have limited expressive power:
  - Graphical XML Query Language [15]
  - XMLApe Query Language [17]
• XML-GL [4, 5] is not strong in expressing query logic and may cause misunderstanding in some cases.

Ref [4] S. Ceri, S. Comai, E. Damiani, P. Fraternali, S. Paraboschi, and L. Tanca. XML-GL: a graphical language for querying and restructuring XML documents. In Proc. WWW8, Toronto, Canada, May 1999.
Ref [5] S. Ceri, S. Comai, E. Damiani, P. Fraternali, and L. Tanca. Complex Queries in XML-GL. SAC (2) 2000: 888-893.
Ref [15] Leo Mark et al. XMLApe. College of Computing, Georgia Institute of Technology. http://www.cc.gatech.edu/projects/XMLApe/
Ref [17] Jan Paredaens, Peter Peelman, Letizia Tanca. G-Log: A Graph-Based Query Language. IEEE Transactions on Knowledge and Data Engineering, 7(3): 436-453, June 1995.

4. Related works and comparison
Features of XML-GL
• XML-GL is the world’s first visual language that explicitly addresses the full complexity of querying XML data.
• It has a bipartite structure to express querying and restructuring.
• It supports various XML queries:
  - selection and projection;
  - join from one or more input documents;
  - construction of new documents and new elements;
  - arithmetic and aggregate functions;
  - union, difference, heterogeneous union, and Cartesian product.

4. Related works and comparison
An example of XML-GL
The first interpretation: For all <manufacturer> elements where rank <= 10, we output MN-NAME, YEAR and the <model> subelement with MO-NAME and RANK; and for the remaining, we only output MN-NAME and YEAR. (No <manufacturer> will appear twice in the result, and all <manufacturer>s are output together in the original order.)
The second interpretation: For all <manufacturer> elements where rank <= 10, we output MN-NAME, YEAR and the <model> subelement with MO-NAME and RANK; and for all <manufacturer> elements we output MN-NAME and YEAR again but do not output the <model> subelement. (Some <manufacturer>s will appear twice in the result.)

4. Related works and comparison
How we express both (the 1st interpretation)
[Figure: GLASS query graph. LHS: MANUFACTURER with MODEL and the condition RANK <= 10. RHS: MANUFACTURER with MNNAME, YEAR and MODEL (MONAME, RANK); a link joins the two MODELs.]
• Without any links to the entities in the LHS, all <MANUFACTURER>s are directly extracted from the data.
• With the link between the two <MODEL>s, only the <MODEL>s with RANK <= 10 appear in the results.
For all <manufacturer> elements where rank <= 10, we output MN-NAME, YEAR and the <model> subelement with MO-NAME and RANK; and for the remaining, we only output MN-NAME and YEAR. (No <manufacturer> will appear twice in the output, and all <manufacturer>s are output together in the original order.)

4. Related works and comparison
How we express both (the 2nd interpretation)
[Figure: GLASS query graph. LHS: MANUFACTURER with MODEL and the condition RANK <= 10. RHS: a MANUFACTURER linked to the LHS, with MNNAME, YEAR and MODEL (MONAME, RANK), and a second MANUFACTURER without links, with MNNAME and YEAR.]
• The link between the two MANUFACTURERs implies that only the MANUFACTURER that satisfies the condition in the LHS will be displayed.
• With the link between the MODELs, only <model>s with rank <= 10 are kept in the outputs.
• In comparison, the MANUFACTURER without any links to the LHS means all MANUFACTURER elements from the source data.
For all <manufacturer> elements where rank <= 10, we output MN-NAME, YEAR and the <model> subelement with MO-NAME and RANK; and for all <manufacturer> elements we output MN-NAME and YEAR again but do not output the <model> subelement. (The results of both queries are put separately in one file.)

4. Related works and comparison
Comparison

Feature | XML-GL | Graphical XML Query Language | XMLApe Query Language | GLASS
Support Queries on Ordered Data | Yes | No | No | Yes
Support “group by” operator | Yes | No | No | Yes
Support Negation | No | No | No | Yes
Support Complex Condition | Yes | No | No | Yes

Rows whose cell positions did not survive extraction (values listed in extracted order):
Data Model: XML DTD, XML Schema, ORA-SS, XML DTD
Support Selection, Projection and Join: Yes, Yes, Yes
Support Aggregation Function: Yes, No

4. Related works and comparison
Comparison (Cont.)

Feature | XML-GL | Graphical XML Query Language | XMLApe Query Language | GLASS
Support XPath | Yes | No | No | Yes
Support Qualifiers ( , ) | No | No | No | Yes
Support User-defined View | Yes | No | No | Yes
Support View Validation | No | No | No | Yes
Support Conditional Output Construction (e.g. with IF-THEN clause) | | | |

5. Conclusion
• Query logic can be explicitly expressed in the CLW.
• In comparison with XML-GL, GLASS can express negation and logic among conditions easily and clearly.
• By separating logic expressions from graphical data structures, we can now build a clear and concise graphical query language.

5. Conclusion (Cont.)
• We aim to express the complexity of XML queries in the XQuery standard, including aggregation functions and negation.
• We support view transformation and validation, especially swapping elements at different levels [6].

Ref [6] Yabing Chen, Tok Wang Ling, Mong Li Lee. Designing Valid XML Views. To appear in the proceedings of the 21st International Conference on Conceptual Modeling (ER'2002), October 7-11, 2002, Tampere, Finland.

Future work
• Enhance the language;
• Interpret the graphical expression into the XQuery standard;
• Expand the content of GLASS:
  - data manipulation (e.g., INSERT, etc.)
  - data integration
  - view definition
  - view maintenance

References
[1] S. Abiteboul, D. Quass, J. McHugh, J. Widom and J. Wiener. The Lorel Query Language for Semistructured Data. Department of Computer Science, Stanford University. International Journal on Digital Libraries, 1(1): 68-88, Apr. 1997.
[2] M. Angelaccio, T. Catarci, and G. Santucci. QBD*: A graphical query language with recursion. IEEE Transactions on Software Engineering, 16(10): 1150-1163, 1990.
[3] T. Catarci, S. K. Chang, M. F. Costabile, S. Levialdi, and G. Santucci. A graph-based framework for multiparadigmatic visual access to databases. IEEE Transactions on Knowledge and Data Engineering, 8(3): 455-475, 1996.
[4] S. Ceri, S. Comai, E. Damiani, P. Fraternali, S. Paraboschi, and L. Tanca. XML-GL: a graphical language for querying and restructuring XML documents. In Proc. WWW8, Toronto, Canada, May 1999.
[5] S. Ceri, S. Comai, E. Damiani, P. Fraternali, and L. Tanca. Complex Queries in XML-GL. SAC (2) 2000: 888-893.
[6] Yabing Chen, Tok Wang Ling, Mong Li Lee. Designing Valid XML Views. To appear in the proceedings of the 21st International Conference on Conceptual Modeling (ER'2002), October 7-11, 2002, Tampere, Finland.
[7] Zhuo Chen. Extracting Schema from XML Documents. SoC, NUS. Honours Year Project Report.
[8] Sara Comai, Ernesto Damiani, Letizia Tanca. The WG-Log System: Data Model and Semantics. INTERDATA technical report, T2-R06, July 1998.
[9] C. J. Date. An Introduction to Database Systems. 3rd Edition, Addison-Wesley Publishing Company, 1981.
[10] Gillian Dobbie, Wu Xiaoying, Tok Wang Ling, Mong Li Lee. ORA-SS: An Object-Relationship-Attribute Model for Semistructured Data. TR 21/00, Technical Report, Department of Computer Science, National University of Singapore, December 2000.

References (Cont.)
[11] P. W. Eklund, J. Leane, and C. Nowak. GrIT: An implementation of a graphical user interface for conceptual structures. Technical Report TR 94-03, Computer Science Department, The University of Adelaide, February 1994.
[12] Extensible Stylesheet Language (XSL) Specification. W3C Working Draft. Apr 1999. http://www.w3.org/TR/1999/WD-xsl-19990421/
[13] Ankur Gupta, Zahid Khan. Graphical XML Query Language. Course paper. College of Computing, Georgia Institute of Technology, Sep 2000.
[14] Joshua S. Hodas, Robert M. Keller, Ingo Muschenets, Jeffrey Polakow, Amy R. Ward and Will Ballard. Condor: A Simple, Expressive Graphical Database Query Language. Department of Computer Science, Harvey Mudd College. Computer Science Technical Report HMC-CS-97-04.
[15] Leo Mark et al. XMLApe. College of Computing, Georgia Institute of Technology. http://www.cc.gatech.edu/projects/XMLApe/
[16] Yuanying Mo, Tok Wang Ling. Storing and Maintaining Semistructured Data Efficiently in an Object-Relational Database. Research Report. SoC, NUS.
[17] Jan Paredaens, Peter Peelman, Letizia Tanca. G-Log: A Graph-Based Query Language. IEEE Transactions on Knowledge and Data Engineering, 7(3): 436-453, June 1995.
[18] Jayavel Shanmugasundaram, Kristin Tufte, Gang He, Chun Zhang, David DeWitt and Jeffrey Naughton. Relational Databases for Querying XML Documents: Limitations and Opportunities. VLDB 1999: 302-314. Department of Computer Sciences, University of Wisconsin-Madison.
[19] XML Path Language (XPath) 2.0. W3C. Apr 2002. http://www.w3.org/TR/xpath20/

References (Cont.)
[20] XML Query Requirements. W3C. Feb 2001. http://www.w3.org/TR/xmlquery-req
[21] XML Syntax for XQuery 1.0 (XQueryX). W3C. Jun 2001. http://www.w3.org/TR/xqueryx
[22] XQuery 1.0 and XPath 2.0 Data Model. W3C. Apr 2002. http://www.w3.org/TR/query-datamodel/
[23] XQuery 1.0 and XPath 2.0 Functions and Operators Version 1.0. W3C. Apr 2002. http://www.w3.org/TR/xqueryoperators/
[24] XQuery 1.0 Formal Semantics. W3C. Mar 2002. http://www.w3.org/TR/query-semantics/

The End
Thank you very much