Graphically Querying RDF Using RDF-GL
Frederik Hogenboom, Viorel Milea, Flavius Frasincar, Uzay Kaymak
fhogenboom@ese.eur.nl, milea@ese.eur.nl, frasincar@ese.eur.nl, kaymak@ese.eur.nl
Erasmus University Rotterdam, PO Box 1738, NL-3000 DR Rotterdam, the Netherlands
This talk is based on the paper RDF-GL: A SPARQL-Based Graphical Query Language for RDF. Hogenboom, F. P., Milea, D. V., Frasincar, F. & Kaymak, U. (2010). In Y. Badr, A. Abraham, A.-E. Hassanien & R. Chbeir (Eds.), Emergent Web Intelligence: Advanced Information Retrieval (Advanced Information and Knowledge Processing) (pp. 87-116). Springer London.
November 22, 2010, Dutch-Belgian Database Day 2010 (DBDBD 2010)
2 Introduction
• Increasing amount of information
• Querying large databases quickly and efficiently is desired
• This need is addressed by several tools and languages
• However, intuitiveness is often of less importance (expressivity over clarity)
3 Graphical Query Languages
• Enable users to create queries by arranging and connecting symbols on a canvas
• Build on existing textual query languages
• Abstract away from difficult syntax and hence aim to enhance intuitiveness
• Trade-off: coverage of textual equivalents vs. maintaining usability
• Available for semi-structured representations such as XML, but no SPARQL-based GQL for RDF
• RDF-GL is a GQL for a SPARQL subset
4 RDF-GL
• Covers most SPARQL SELECT queries
• No support for:
  – Prologue: BASE and PREFIX
  – Query type: ASK, CONSTRUCT, DESCRIBE
  – REDUCED, FROM, NAMED, GRAPH
• However, the design of the language allows for extension
• Based on SPARQL and not on SPARQL 1.1
5 RDF-GL: Example (1)
PREFIX j.1: <http://www.daml.org/2003/09/factbook-ont#>
SELECT DISTINCT ?name ?oil
WHERE {
  ?country j.1:localShortCountryName ?name .
  ?country j.1:grossDomesticProductPerCapita ?gdp .
  { FILTER (?gdp < 1500) . }
  UNION
  { FILTER (?gdp > 2500) . }
  OPTIONAL { ?country j.1:oilProvedReserves ?oil . }
}
ORDER BY ASC(?gdp)
6 RDF-GL: Example (2)
7 RDF-GL: Basic Elements
8 RDF-GL: Mapping to SPARQL (1)
Query type and sequence modifiers:
PREFIX j.1: <http://www.daml.org/2003/09/factbook-ont#>
SELECT DISTINCT ?elevation
WHERE {
  ?country j.1:elevation ?elevation .
}
ORDER BY DESC(?elevation)
OFFSET 5
LIMIT 4
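To make the effect of the sequence modifiers on this slide concrete, the following minimal Python sketch mimics how ORDER BY, DISTINCT, OFFSET, and LIMIT act on a list of already retrieved values. The function name `apply_modifiers` and the sample elevations are hypothetical and only serve as an illustration; a real SPARQL engine applies these steps to solution mappings, not to plain numbers.

```python
from typing import List, Optional


def apply_modifiers(values: List[int], descending: bool = True,
                    distinct: bool = True, offset: int = 0,
                    limit: Optional[int] = None) -> List[int]:
    """Apply ORDER BY, DISTINCT, OFFSET and LIMIT, in that order."""
    ordered = sorted(values, reverse=descending)      # ORDER BY ASC/DESC
    if distinct:
        ordered = list(dict.fromkeys(ordered))        # DISTINCT, order preserved
    end = None if limit is None else offset + limit   # LIMIT counts from the offset
    return ordered[offset:end]                        # OFFSET, then LIMIT


# Hypothetical elevation values, including one duplicate.
elevations = [10, 50, 50, 40, 30, 20, 90, 80, 70, 60, 5]
print(apply_modifiers(elevations, offset=5, limit=4))
```

With OFFSET 5 and LIMIT 4, the first five distinct values in descending order are skipped and the next four are returned.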
9 RDF-GL: Mapping to SPARQL (2)
Filters:
FILTER (?gdp > 1250) .
Triples:
j.1:EthnicGroup rdfs:subClassOf ?class .
10 RDF-GL: Mapping to SPARQL (3)
Unions:
?country j.1:highwaysTotal ?hw .
{ FILTER (?hw < 20000) . }
UNION
{ FILTER (?hw > 150000) . }
Options:
OPTIONAL { ?country j.1:heliports ?heli . }
11 RDF-GL: Mapping to SPARQL (4)
Nesting:
?country j.1:highwaysTotal ?hw .
{ FILTER (?hw < 20000) . }
UNION
{
  FILTER (?hw > 150000) .
  OPTIONAL { ?country j.1:heliports ?heli . }
}
12 RDF-GL: Converting to SPARQL
• Generate default PREFIXes (RDF, RDFS, XSD, etc.)
• Determine query TYPE
• Generate WHERE clause:
  – Convert all arrows and their connected boxes to triples
  – Add triples not connected to UNIONs or OPTIONALs to query
  – Add UNION and OPTIONAL circles and their children to query recursively
• Determine ORDER BY
• Determine LIMIT and OFFSET
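The conversion steps on this slide can be sketched in Python as a small recursive serializer. This is not the SPARQLinG implementation: the classes `Block`, `UnionNode`, and `OptionalNode` and the functions `render_block` and `to_sparql` are hypothetical names chosen for illustration, and the sketch assumes that arrows and boxes have already been turned into (subject, predicate, object) triples.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple, Union

Triple = Tuple[str, str, str]


@dataclass
class Block:
    """Triples and filters of one group, plus nested UNION/OPTIONAL children."""
    triples: List[Triple] = field(default_factory=list)
    filters: List[str] = field(default_factory=list)
    children: List["Node"] = field(default_factory=list)


@dataclass
class OptionalNode:
    body: Block


@dataclass
class UnionNode:
    branches: List[Block]


Node = Union[OptionalNode, UnionNode]


def render_block(block: Block) -> str:
    """Emit plain triples and filters first, then recurse into the children."""
    parts = [f"{s} {p} {o} ." for s, p, o in block.triples]
    parts += [f"FILTER ({expr}) ." for expr in block.filters]
    for child in block.children:
        if isinstance(child, OptionalNode):
            parts.append("OPTIONAL { " + render_block(child.body) + " }")
        else:
            branches = ["{ " + render_block(b) + " }" for b in child.branches]
            parts.append(" UNION ".join(branches))
    return " ".join(parts)


def to_sparql(select_vars: List[str], where: Block, distinct: bool = False,
              order_by: Optional[str] = None, offset: Optional[int] = None,
              limit: Optional[int] = None,
              prefixes: Optional[Dict[str, str]] = None) -> str:
    """Assemble prologue, query type, WHERE clause and solution modifiers."""
    lines = [f"PREFIX {name}: <{iri}>" for name, iri in (prefixes or {}).items()]
    lines.append(("SELECT DISTINCT " if distinct else "SELECT ") + " ".join(select_vars))
    lines.append("WHERE { " + render_block(where) + " }")
    if order_by:
        lines.append(f"ORDER BY {order_by}")
    if offset is not None:
        lines.append(f"OFFSET {offset}")
    if limit is not None:
        lines.append(f"LIMIT {limit}")
    return "\n".join(lines)


# The nested UNION/OPTIONAL pattern from the "Nesting" slide.
nested = Block(
    triples=[("?country", "j.1:highwaysTotal", "?hw")],
    children=[UnionNode([
        Block(filters=["?hw < 20000"]),
        Block(filters=["?hw > 150000"],
              children=[OptionalNode(Block(
                  triples=[("?country", "j.1:heliports", "?heli")]))]),
    ])],
)
print(to_sparql(["?hw", "?heli"], nested,
                prefixes={"j.1": "http://www.daml.org/2003/09/factbook-ont#"}))
```

Modelling UNION and OPTIONAL as children of a block keeps the recursion in one place, which mirrors the slide's "add circles and their children recursively" step.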
13 Implementation: SPARQLinG (1)
14 Implementation: SPARQLinG (2)
15 Experiments (1)
• Query the CIA World Factbook for countries that have imports or exports to neighbors worth more than $10 billion a year
• The query needs to return:
  – The names of both the countries and their neighboring trading partners
  – The percentages of imports and exports
  – Optionally, the inflation rate of the neighboring partners
• Only the first 20 results are desired, ordered by country name (ascending)
16 Experiments (2)
PREFIX j.1: <http://www.daml.org/2003/09/factbook-ont#>
SELECT DISTINCT ?nameC ?nameN ?percentExp ?percentImp ?inflation
WHERE {
  ?country j.1:conventionalShortCountryName ?nameC .
  ?country j.1:border ?border .
  ?border j.1:country ?neighbor .
  ?neighbor j.1:conventionalShortCountryName ?nameN .
  ?country j.1:exportPartner ?partnerExp .
  ?partnerExp j.1:percent ?percentExp .
  ?partnerExp j.1:country ?neighbor .
  ?country j.1:importPartner ?partnerImp .
  ?partnerImp j.1:percent ?percentImp .
  ?partnerImp j.1:country ?neighbor .
  { ?country j.1:imports ?imports . FILTER (?imports > 100000) . }
  UNION
  { ?country j.1:exports ?exports . FILTER (?exports > 100000) . }
  OPTIONAL { ?neighbor j.1:inflationRate ?inflation . }
}
ORDER BY ASC(?nameC)
LIMIT 20
17 Experiments (3)
18 Experiments (4)
• The creation of RDF-GL queries is not necessarily faster than the creation of SPARQL queries
• However, the GQL and tool offer:
  – Overview
  – Easy reuse of variables
  – Maintenance
19 Order Dependency in SPARQL
• OPTIONAL is not commutative, which causes problems when OPTIONAL clauses share variables:
SELECT ?name ?alias
WHERE {
  ?x foaf:name ?name .
  OPTIONAL { ?x foaf:nick ?alias . }
  OPTIONAL { ?x foaf:mbox ?alias . }
}
SELECT ?name ?alias
WHERE {
  ?x foaf:name ?name .
  OPTIONAL { ?x foaf:mbox ?alias . }
  OPTIONAL { ?x foaf:nick ?alias . }
}
• We need a binary OPTIONAL operator which is commutative and associative
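The order dependency can be reproduced with a small Python sketch that treats OPTIONAL as a left join over variable bindings, which is how the SPARQL algebra defines it. The helper names (`match`, `compatible`, `left_join`) and the sample triples are hypothetical; the sketch handles only single triple patterns and no filters.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]
Binding = Dict[str, str]

# Hypothetical data: one person with both a nickname and a mailbox.
DATA: List[Triple] = [
    ("_:a", "foaf:name", "Alice"),
    ("_:a", "foaf:nick", "Al"),
    ("_:a", "foaf:mbox", "mailto:alice@example.org"),
]


def match(pattern: Triple, data: List[Triple]) -> List[Binding]:
    """Match one triple pattern; terms starting with '?' are variables."""
    results = []
    for triple in data:
        binding: Binding = {}
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                if binding.setdefault(term, value) != value:
                    break  # same variable bound to two different values
            elif term != value:
                break      # constant term does not match
        else:
            results.append(binding)
    return results


def compatible(m1: Binding, m2: Binding) -> bool:
    """Two bindings are compatible if they agree on every shared variable."""
    return all(m1[v] == m2[v] for v in m1.keys() & m2.keys())


def left_join(left: List[Binding], right: List[Binding]) -> List[Binding]:
    """OPTIONAL: extend each left binding where possible, else keep it as is."""
    out = []
    for m1 in left:
        extended = [{**m1, **m2} for m2 in right if compatible(m1, m2)]
        out.extend(extended or [m1])
    return out


base = match(("?x", "foaf:name", "?name"), DATA)
nick = match(("?x", "foaf:nick", "?alias"), DATA)
mbox = match(("?x", "foaf:mbox", "?alias"), DATA)

nick_first = left_join(left_join(base, nick), mbox)
mbox_first = left_join(left_join(base, mbox), nick)
print(nick_first[0]["?alias"], mbox_first[0]["?alias"])
```

Whichever OPTIONAL is evaluated first binds `?alias`; the second one then finds no compatible binding and contributes nothing, so the two queries return different aliases for the same data.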
20 Discussion
• AND-relationship is implicit, OR-relationship is explicit
• Issues:
  – No coverage for the complete SPARQL language
  – Determining statement order is difficult
• Solutions:
  – Extend language by using:
    • Extra symbols
    • Extra colors/shadings
    • Extra shapes
  – Add numbers for ordering, make use of the horizontal or vertical position of elements, etc.
21 Conclusions
• RDF-GL is a GQL for SPARQL
• The GQL covers a subset of SPARQL (i.e., almost all SELECT queries)
• Future work:
  – Implement a way to determine query order
  – Extend the language to include more of SPARQL, i.e.,
  – Visualization:
    • Widget size indicating importance
    • Layout algorithm
  – SPARQL to RDF-GL conversion
22 Questions
23 Appendix: EBNF non-terminals (1)
[1] Query ::= SelectQuery
[2] SelectQuery ::= 'SELECT' 'DISTINCT'? ( Var+ | '*' ) WhereClause SolutionModifier
[3] WhereClause ::= 'WHERE'? GroupGraphPattern
[4] SolutionModifier ::= OrderClause? LimitOffsetClauses?
[5] LimitOffsetClauses ::= ( LimitClause OffsetClause? | OffsetClause LimitClause? )
[6] OrderClause ::= 'ORDER' 'BY' OrderCondition+
[7] OrderCondition ::= ( ( 'ASC' | 'DESC' ) BrackettedExpression ) | ( Constraint | Var )
[8] LimitClause ::= 'LIMIT' INTEGER
[9] OffsetClause ::= 'OFFSET' INTEGER
[10] GroupGraphPattern ::= '{' TriplesBlock? ( ( GraphPatternNotTriples | Filter ) '.'? TriplesBlock? )* '}'
[11] TriplesBlock ::= TriplesSameSubject ( '.' TriplesBlock? )?
[12] GraphPatternNotTriples ::= OptionalGraphPattern | GroupOrUnionGraphPattern
24 Appendix: EBNF non-terminals (2)
[13] OptionalGraphPattern ::= 'OPTIONAL' GroupGraphPattern
[14] GroupOrUnionGraphPattern ::= GroupGraphPattern ( 'UNION' GroupGraphPattern )*
[15] Filter ::= 'FILTER' Constraint
[16] Constraint ::= BrackettedExpression | BuiltInCall | FunctionCall
[17] FunctionCall ::= IRIref ArgList
[18] ArgList ::= ( NIL | '(' Expression ( ',' Expression )* ')' )
[19] TriplesSameSubject ::= VarOrTerm PropertyListNotEmpty | TriplesNode PropertyList
[20] PropertyListNotEmpty ::= Verb ObjectList ( ';' ( Verb ObjectList )? )
[21] PropertyList ::= PropertyListNotEmpty?
[22] ObjectList ::= Object ( ',' Object )*
[23] Object ::= GraphNode
[24] Verb ::= VarOrIRIref | 'a'
[25] TriplesNode ::= Collection | BlankNodePropertyList
[26] BlankNodePropertyList ::= '[' PropertyListNotEmpty ']'
25 Appendix: EBNF non-terminals (3)
[27] Collection ::= '(' GraphNode+ ')'
[28] GraphNode ::= VarOrTerm | TriplesNode
[29] VarOrTerm ::= Var | GraphTerm
[30] VarOrIRIref ::= Var | IRIref
[31] Var ::= VAR1
[32] GraphTerm ::= IRIref | RDFLiteral | NumericLiteral | BooleanLiteral | BlankNode | NIL
[33] Expression ::= ConditionalOrExpression
[34] ConditionalOrExpression ::= ConditionalAndExpression ( '||' ConditionalAndExpression )*
[35] ConditionalAndExpression ::= ValueLogical ( '&&' ValueLogical )*
[36] ValueLogical ::= RelationalExpression
[37] RelationalExpression ::= NumericExpression ( '=' NumericExpression | '!=' NumericExpression | '<' NumericExpression | '>' NumericExpression | '<=' NumericExpression | '>=' NumericExpression )?
26 Appendix: EBNF non-terminals (4)
[38] NumericExpression ::= AdditiveExpression
[39] AdditiveExpression ::= MultiplicativeExpression ( '+' MultiplicativeExpression | '-' MultiplicativeExpression | NumericLiteralPositive | NumericLiteralNegative )*
[40] MultiplicativeExpression ::= UnaryExpression ( '*' UnaryExpression | '/' UnaryExpression )*
[41] UnaryExpression ::= '!' PrimaryExpression | '+' PrimaryExpression | '-' PrimaryExpression | PrimaryExpression
[42] PrimaryExpression ::= BrackettedExpression | BuiltInCall | IRIrefOrFunction | RDFLiteral | NumericLiteral | BooleanLiteral | Var
[43] BrackettedExpression ::= '(' Expression ')'
27 Appendix: EBNF non-terminals (5)
[44] BuiltInCall ::= 'STR' '(' Expression ')' | 'LANGMATCHES' '(' Expression ',' Expression ')' | 'DATATYPE' '(' Expression ')' | 'BOUND' '(' Var ')' | 'sameTerm' '(' Expression ',' Expression ')' | 'isIRI' '(' Expression ')' | 'isURI' '(' Expression ')' | 'isBLANK' '(' Expression ')' | 'isLITERAL' '(' Expression ')' | RegexExpression
[45] RegexExpression ::= 'REGEX' '(' Expression ',' Expression ( ',' Expression )? ')'
[46] IRIrefOrFunction ::= IRIref ArgList?
[47] RDFLiteral ::= String ( LANGTAG | ( '^^' IRIref ) )?
[48] NumericLiteral ::= NumericLiteralUnsigned | NumericLiteralPositive | NumericLiteralNegative
[49] NumericLiteralUnsigned ::= INTEGER | DECIMAL | DOUBLE
[50] NumericLiteralPositive ::= INTEGER_POSITIVE | DECIMAL_POSITIVE | DOUBLE_POSITIVE
28 Appendix: EBNF non-terminals (6)
[51] NumericLiteralNegative ::= INTEGER_NEGATIVE | DECIMAL_NEGATIVE | DOUBLE_NEGATIVE
[52] BooleanLiteral ::= 'true' | 'false'
[53] String ::= STRING_LITERAL1 | STRING_LITERAL2 | STRING_LITERAL_LONG1 | STRING_LITERAL_LONG2
[54] IRIref ::= IRI_REF | PrefixedName
[55] PrefixedName ::= PNAME_LN | PNAME_NS
[56] BlankNode ::= BLANK_NODE_LABEL | ANON
29 Appendix: EBNF terminals (1)
[57] IRI_REF ::= '<' ( [^<>"{}|^`\]-[#x00-#x20] )* '>'
[58] PNAME_NS ::= PN_PREFIX? ':'
[59] PNAME_LN ::= PNAME_NS PN_LOCAL
[60] BLANK_NODE_LABEL ::= '_:' PN_LOCAL
[61] VAR1 ::= '?' VARNAME
[62] LANGTAG ::= '@' [a-zA-Z]+ ( '-' [a-zA-Z0-9]+ )*
[63] INTEGER ::= [0-9]+
[64] DECIMAL ::= [0-9]+ '.' [0-9]* | '.' [0-9]+
[65] DOUBLE ::= [0-9]+ '.' [0-9]* EXPONENT | '.' ([0-9])+ EXPONENT | ([0-9])+ EXPONENT
[66] INTEGER_POSITIVE ::= '+' INTEGER
[67] DECIMAL_POSITIVE ::= '+' DECIMAL
[68] DOUBLE_POSITIVE ::= '+' DOUBLE
[69] INTEGER_NEGATIVE ::= '-' INTEGER
[70] DECIMAL_NEGATIVE ::= '-' DECIMAL
[71] DOUBLE_NEGATIVE ::= '-' DOUBLE
[72] EXPONENT ::= [eE] [+-]? [0-9]+
[73] STRING_LITERAL1 ::= "'" ( ( [^#x27#x5C#xA#xD] ) | ECHAR )* "'"
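As an illustration of how the numeric terminals [63] to [65] and [72] can be read, the following Python sketch translates them into regular expressions. The constant names and the `classify` helper are hypothetical; the sketch covers only unsigned numeric tokens and is not a full SPARQL lexer.

```python
import re
from typing import Optional

# Direct transcriptions of the grammar terminals as regular expressions.
EXPONENT = r"[eE][+-]?[0-9]+"
INTEGER = r"[0-9]+"
DECIMAL = r"(?:[0-9]+\.[0-9]*|\.[0-9]+)"
DOUBLE = rf"(?:[0-9]+\.[0-9]*{EXPONENT}|\.[0-9]+{EXPONENT}|[0-9]+{EXPONENT})"

# The three terminals are mutually exclusive when the whole token must match.
TERMINALS = [("INTEGER", INTEGER), ("DECIMAL", DECIMAL), ("DOUBLE", DOUBLE)]


def classify(token: str) -> Optional[str]:
    """Return the name of the terminal that matches the whole token, if any."""
    for name, pattern in TERMINALS:
        if re.fullmatch(pattern, token):
            return name
    return None


for token in ["42", "4.2", ".5", "1e10", "1.5E-3", "abc"]:
    print(token, classify(token))
```

The signed variants [66] to [71] would follow by prefixing the same patterns with an escaped '+' or '-'.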
30 Appendix: EBNF terminals (2)
[74] STRING_LITERAL2 ::= '"' ( ( [^#x22#x5C#xA#xD] ) | ECHAR )* '"'
[75] STRING_LITERAL_LONG1 ::= "'''" ( ( "'" | "''" )? ( [^'\] | ECHAR ) )* "'''"
[76] STRING_LITERAL_LONG2 ::= '"""' ( ( '"' | '""' )? ( [^"\] | ECHAR ) )* '"""'
[77] ECHAR ::= '\' [tbnrf\"']
[78] NIL ::= '(' WS* ')'
[79] WS ::= #x20 | #x9 | #xD | #xA
[80] ANON ::= '[' WS* ']'
[81] PN_CHARS_BASE ::= [A-Z] | [a-z] | [#x00C0-#x00D6] | [#x00D8-#x00F6] | [#x00F8-#x02FF] | [#x0370-#x037D] | [#x037F-#x1FFF] | [#x200C-#x200D] | [#x2070-#x218F] | [#x2C00-#x2FEF] | [#x3001-#xD7FF] | [#xF900-#xFDCF] | [#xFDF0-#xFFFD] | [#x10000-#xEFFFF]
[82] PN_CHARS_U ::= PN_CHARS_BASE | '_'
31 Appendix: EBNF terminals (3)
[83] VARNAME ::= ( PN_CHARS_U | [0-9] ) ( PN_CHARS_U | [0-9] | #x00B7 | [#x0300-#x036F] | [#x203F-#x2040] )*
[84] PN_CHARS ::= PN_CHARS_U | '-' | [0-9] | #x00B7 | [#x0300-#x036F] | [#x203F-#x2040]
[85] PN_PREFIX ::= PN_CHARS_BASE ( ( PN_CHARS | '.' )* PN_CHARS )?
[86] PN_LOCAL ::= ( PN_CHARS_U | [0-9] ) ( ( PN_CHARS | '.' )* PN_CHARS )?