Event-Condition-Action Rule Languages over Semistructured Data
George Papamarkos
13/10/2006

Outline
• What Event-Condition-Action (ECA) rules are and what we can do with them
• ECA Rules for XML
  • ECA Language
  • System Architecture
  • Performance
• ECA Rules for RDF
  • ECA Language
  • System Architecture
  • Performance

What is an ECA Rule?
• An Event-Condition-Action rule performs actions in response to events, provided that a stated condition holds
• An event in a database system can be the insertion of a new tuple
• The condition can be a query
• The action may be a relational table update
• This behaviour is called reactive functionality

What is an ECA Rule?
• An ECA rule has the general syntax: on event if condition do action
• The event part specifies when the rule is triggered
• The condition part determines whether the data are in a particular state, in which case the rule fires
• The action part describes the actions to be performed if the rule fires
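The on/if/do structure above can be sketched in a few lines of Python. This is only an illustration of the general idea, not the syntax or implementation of any language in this talk; the names `Rule` and `dispatch` and the stock-reorder example are assumptions made up for the sketch.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List


@dataclass
class Rule:
    """One ECA rule: on <event> if <condition> do <action>."""
    event: str                                    # event name that triggers the rule
    condition: Callable[[Dict[str, Any]], bool]   # query over the current data state
    action: Callable[[Dict[str, Any]], None]      # update performed when the rule fires


def dispatch(rules: List[Rule], event: str, db: Dict[str, Any]) -> int:
    """Trigger every rule registered for `event` and fire those whose condition holds.

    Returns the number of rules that fired.
    """
    fired = 0
    for rule in rules:
        if rule.event != event:     # event part: is the rule triggered?
            continue
        if rule.condition(db):      # condition part: is the data in the stated state?
            rule.action(db)         # action part: perform the update
            fired += 1
    return fired


# Hypothetical example: request a reorder when an order is inserted and stock is low.
db = {"stock": 3, "reorder": False}
rules = [
    Rule(
        event="insert_order",
        condition=lambda d: d["stock"] < 5,
        action=lambda d: d.update(reorder=True),
    )
]
fired = dispatch(rules, "insert_order", db)
```

Keeping the rules in one list, separate from the application code, mirrors the "single rule base" argument made on the next slide.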

Advantages of using ECA Rules
• They allow an application's reactive functionality to be defined and managed within a single rule base rather than being encoded in the programs
• They use a high-level declarative syntax and are thus amenable to analysis and optimisation techniques that cannot be applied when the functionality is encoded in programming code


ECA Rules for XML - Outline
• Design issues of an ECA language for XML
• The XTL Language
• Implementing an XTL rule processing system
• Performance Study

Design issues of an ECA language for XML
• Compared with relational triggers, the most important XML-specific issues in designing an ECA language are:
  • Event Granularity: specifying where data has been modified is more complex and requires path expressions
  • Action Granularity: an action may affect an entire subdocument, meaning that:
    • An action can trigger a different set of events
    • The analysis of which events are triggered by an action cannot be based on syntax alone

The XTL Language
• The general syntax of XTL rules is: on event if condition do action
• Fragments of XPath and XQuery are used to specify the event, condition and action parts of XTL rules
• XPath is used for selecting and matching fragments of XML
• XQuery is used within actions where a new XML fragment needs to be constructed

The XTL Language
• Event Part
  • Syntax: (INSERT | DELETE) e, where e is an XPath expression evaluating to a set of nodes
  • A rule is triggered if this set of nodes includes any node in the XML fragment inserted or deleted
  • The system-defined variable $delta contains this set of nodes and is available for use in the condition and action parts of the rule

The XTL Language
• Condition Part
  • The condition part is either the constant TRUE or one or more XPath expressions connected by the boolean connectives and, or, not
  • Each of these expressions is evaluated on the data to determine whether the condition is TRUE or FALSE

The XTL Language
• Action Part
  • The action part is a sequence of one or more actions
  • Syntax:
    • INSERT r BELOW e (BEFORE | AFTER) q
      r is an XQuery expression specifying the XML fragment to be inserted, e is an XPath expression specifying the set of nodes under which the new fragment will be inserted, and q is either a constant or an XPath qualifier specifying the set of nodes BEFORE or AFTER which the new nodes will be placed
    • DELETE e
      e is an XPath expression specifying the set of nodes to be deleted

XTL Language
• Example rule:
  ON INSERT doc('s.xml')/shares/share/dayinfo/prices/price
  IF $delta > $delta/../high
  DO DELETE $delta/../high;
     INSERT <high>$delta/text()</high> BELOW $delta/.. AFTER prices
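To make the effect of the example rule concrete, here is a rough Python emulation using the standard library's ElementTree. It is not an XTL engine: the function name `on_insert_price` is hypothetical, the paths are read literally from the slide (so `high` is assumed to be a child of the inserted price's parent), and the BEFORE/AFTER placement is ignored (the new `high` element is simply appended).

```python
import xml.etree.ElementTree as ET


def on_insert_price(parent: ET.Element, price: ET.Element) -> bool:
    """Emulate the example rule for one newly inserted <price> node ($delta).

    `parent` plays the role of $delta/.. ; returns True if the rule fired.
    """
    high = parent.find("high")                      # $delta/../high
    # Condition part: $delta > $delta/../high (false when no <high> exists).
    if high is None or float(price.text) <= float(high.text):
        return False
    # Action part 1: DELETE $delta/../high
    parent.remove(high)
    # Action part 2: INSERT <high>$delta/text()</high> BELOW $delta/..
    new_high = ET.SubElement(parent, "high")
    new_high.text = price.text
    return True


# Hypothetical data: the current high is 10, then prices 12 and 11 are inserted.
prices = ET.fromstring("<prices><high>10</high></prices>")

p1 = ET.SubElement(prices, "price")
p1.text = "12"
fired_first = on_insert_price(prices, p1)    # 12 > 10, so the rule fires

p2 = ET.SubElement(prices, "price")
p2.text = "11"
fired_second = on_insert_price(prices, p2)   # 11 <= 12, so the rule does not fire
```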

[Figure: XTL rule processing system]

XTL rule processing system Architecture
• ECA Rules Management: validates a rule and registers it in the Rule Base
• ECA Rule Processing Engine: evaluates the event and condition parts of the rules and schedules their actions for execution in the Action Schedule

System Performance
• The system performance was studied by:
  • Developing an analytical model of the system
  • Performing experiments on the actual system
• We studied the effect of rule base indexes on system performance
• Performance criterion:
  • Update response time: the mean time taken to complete all rule execution resulting from a single update submitted by a top-level update transaction

System Performance
• Varying quantities:
  • Number of rules in the rule base
• Experiments on the actual system were performed with three different rule sets
• XML data set: a fragment of the DBLP database

System Performance - Analytical Model
• The analytical model is a mathematical description of the system behaviour
• It uses queueing theory to model the transaction queues and database processing
• It uses a set of simplifying assumptions to approximate the behaviour of some system parameters (e.g. triggering probability, transaction arrival rate)

[Figure: System Performance - Analytical Model Results]

System Performance - Analytical Model
• Response time increases non-linearly for as long as the system is stable (i.e. the arrival rate in the transaction queue is less than the service rate)
• Beyond the stability point the transaction queue grows without bound, flooding the memory and slowing the system down
• Reasons:
  • Everything is served by a single queue
  • A high number of event query evaluations is needed to find which rules are triggered
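The stability condition above can be illustrated with the simplest single-queue model. The slides do not say which queueing model was used, so the M/M/1 queue below is an assumption chosen only for illustration: its mean response time is 1 / (service rate - arrival rate) while the arrival rate stays below the service rate, and unbounded otherwise.

```python
import math


def mm1_response_time(arrival_rate: float, service_rate: float) -> float:
    """Mean response time of an M/M/1 queue (assumed model, not taken from the slides).

    Returns math.inf when the queue is unstable (arrival rate >= service rate).
    """
    if arrival_rate >= service_rate:
        return math.inf
    return 1.0 / (service_rate - arrival_rate)


# Hypothetical rates in transactions per second, with a service rate of 10.
low_load = mm1_response_time(5.0, 10.0)     # half loaded
high_load = mm1_response_time(9.0, 10.0)    # close to the stability point
overload = mm1_response_time(10.0, 10.0)    # at the stability point
```

Moving from half load to 90% load multiplies the response time by five in this model, which is the kind of non-linear growth the slide describes.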

[Figure: System Performance - Experimental Results]

System Performance - Experimental Results
• Differences from the analytical model are due to:
  • implementation choices (use of DOM etc.)
  • the simplifying assumptions made in the analytical model


[Figure: System Performance - Indexing Rule Base]

System Performance - Indexing Rule Base
• Better overall behaviour and scalability, because fewer rules need to be checked for triggering
• Fewer rules checked means fewer queries to evaluate
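The slides do not describe how the rule base index is organised. As one possible sketch, rules could be keyed by the update operation and the last element name of their event path, so that an update only needs to check the rules filed under matching keys. The key choice and the names `build_index` and `candidates` are assumptions; a real system would look up every element name occurring in the inserted or deleted fragment.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

RuleId = str
Key = Tuple[str, str]  # (operation, element name): a hypothetical index key


def build_index(rules: Dict[RuleId, Tuple[str, str]]) -> Dict[Key, List[RuleId]]:
    """Index rules by (operation, last step of the event path)."""
    index: Dict[Key, List[RuleId]] = defaultdict(list)
    for rule_id, (operation, path) in rules.items():
        leaf = path.rstrip("/").split("/")[-1]   # last element name in the path
        index[(operation, leaf)].append(rule_id)
    return dict(index)


def candidates(index: Dict[Key, List[RuleId]], operation: str, element: str) -> List[RuleId]:
    """Rules whose event query still has to be evaluated for this update."""
    return index.get((operation, element), [])


# Hypothetical rule base: rule id -> (operation, event path).
rules = {
    "r1": ("INSERT", "/shares/share/dayinfo/prices/price"),
    "r2": ("DELETE", "/shares/share"),
    "r3": ("INSERT", "/dblp/article/author"),
}
index = build_index(rules)
to_check = candidates(index, "INSERT", "price")   # only r1 needs its event query run
```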


ECA Rules for RDF
• The RDFTL ECA Language
• Implementing an RDFTL processing system in P2P environments
• System performance

The RDFTL Language
• We designed the language from scratch specifically for RDF
• General syntax: ON event IF condition DO action

The RDFTL Language
• Event Part:
  • May contain let expressions of the form: LET $var := e
  • (INSERT | DELETE) e
    e is a path expression that evaluates to a set of RDF nodes. Catches the insertion or deletion of a node
  • (INSERT | DELETE) triple
    triple is an expression of the form (source, arc, target) specifying an RDF triple. Catches the insertion or deletion of a property in an RDF triple
  • UPDATE upd_triple
    upd_triple is an expression of the form (source, arc, old_target->new_target). Catches the update of a property from one RDF node to another
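A triple event of the form (source, arc, target) can be pictured as a pattern matched against each inserted or deleted triple. The sketch below is not RDFTL: the class names, the function `triggers`, and the use of `None` as a wildcard are all assumptions made for illustration, and the example URIs are invented.

```python
from typing import NamedTuple, Optional


class Triple(NamedTuple):
    source: str
    arc: str
    target: str


class TripleEvent(NamedTuple):
    operation: str           # "INSERT" or "DELETE"
    source: Optional[str]    # None matches anything (assumed wildcard convention)
    arc: Optional[str]
    target: Optional[str]


def triggers(event: TripleEvent, operation: str, triple: Triple) -> bool:
    """True if an update (operation, triple) is caught by the event pattern."""
    if event.operation != operation:
        return False
    pattern = (event.source, event.arc, event.target)
    return all(p is None or p == v for p, v in zip(pattern, triple))


# Hypothetical event: any insertion of an "author" arc, whatever its endpoints.
event = TripleEvent("INSERT", None, "author", None)
inserted = Triple("paper42", "author", "person7")

caught = triggers(event, "INSERT", inserted)
not_caught = triggers(event, "DELETE", inserted)
```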

The RDFTL Language
• Condition Part:
  • It is a boolean-valued expression
  • May consist of conjunctions, disjunctions and negations
  • May also contain let expressions
  • The $delta variable is bound to the set of nodes or arcs modified and caught by the event part
• Action Part:
  • A sequence of actions
  • Each action has a syntax similar to that of the event part

[Figure: RDFTL Rules in P2P Environments - System Architecture]

RDFTL Rules in P2P Environments
• Each peer (P) is supervised by a superpeer (SP)
• The set of Ps supervised by an SP forms a peergroup
• An RDFTL processing engine is installed at each SP
• Each P or SP hosts a fragment of the RDF schema that may change due to updates
• Hybrid fragmentation with possible replication

RDFTL Rules in P2P Environments
• Ps notify their SPs of any updates to their local data
• An ECA rule generated at one P or SP may be replicated, triggered, evaluated or executed at different sites in the network

Distributed Rule Registration
• A newly generated rule is sent from P to its SP for validation and storage
• From there it is sent to all other SPs
• A replica of it is also stored at those SPs that are e-relevant to the rule, i.e. those on which the event part queries of the rule can be evaluated
• At each SP, each rule is annotated with the IDs of the local peers that are e-, c- and a-relevant to the rule
• c- and a-relevance have a meaning similar to e-relevance, for the condition and action parts respectively

Distributed Rule Execution
• Each SP manages its own rule execution schedule
• Each execution schedule is a sequence of updates to be executed on the local peergroup
• Once an update u occurs at P, its SP is notified
• SP determines whether u may trigger any rule whose event part is annotated with P's ID
• If so, the event query is sent to P for evaluation
• If a rule r is triggered, its condition is evaluated
• If the condition is true, SP sends each instance of r's action part to the local peers that are a-relevant to it
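The steps above can be sketched as a single function run at a superpeer. This is a simplified illustration and not the RDFTL engine: the class `DistributedRule`, the function `handle_update`, and the representation of event and condition queries as plain Python callables are all assumptions, and network communication is replaced by returning a list of outgoing messages.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Set, Tuple


@dataclass
class DistributedRule:
    name: str
    event: Callable[[str, str], bool]   # (peer_id, update) -> is the rule triggered?
    condition: Callable[[str], bool]    # update -> does the condition hold?
    action: str                         # action part, kept as opaque text here
    e_relevant: Set[str] = field(default_factory=set)   # peers annotated on the event part
    a_relevant: Set[str] = field(default_factory=set)   # peers annotated on the action part


def handle_update(rules: List[DistributedRule], peer_id: str, update: str) -> List[Tuple[str, str]]:
    """Process one update notification at a superpeer.

    Returns the (target peer, action) messages the superpeer would send.
    """
    outbox: List[Tuple[str, str]] = []
    for rule in rules:
        if peer_id not in rule.e_relevant:      # rule not annotated with P's ID
            continue
        if not rule.event(peer_id, update):     # event query (evaluated at P in the real system)
            continue
        if not rule.condition(update):          # condition part
            continue
        for target in sorted(rule.a_relevant):  # send the action to a-relevant peers
            outbox.append((target, rule.action))
    return outbox


# Hypothetical rule: when peer p1 inserts something, notify peers p2 and p3.
rule = DistributedRule(
    name="r",
    event=lambda peer, upd: upd.startswith("INSERT"),
    condition=lambda upd: True,
    action="INSERT notification",
    e_relevant={"p1"},
    a_relevant={"p2", "p3"},
)
messages = handle_update([rule], "p1", "INSERT (a, b, c)")
ignored = handle_update([rule], "p9", "INSERT (a, b, c)")
```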

System Performance
• The system performance was studied by:
  • Developing an analytical model of the system
  • Developing a system simulator and performing experiments with it
• Performance criterion:
  • Update response time: the mean time taken to complete all rule execution resulting from a single update submitted by a top-level update transaction

System Performance
• Cases studied with both the analytical model and the simulator:
  • Random network topology between SPs, with various degrees of data replication
  • HyperCuP network topology between SPs, with various degrees of data replication
• Varying quantities:
  • Number of peergroups
  • Number of rules

[Figure: System Performance, random topology, replication 10%. Panels: Analytical Model, Simulation]

System Performance
• With a random topology the system does not scale well, even with low replication and small numbers of rules and peergroups
• Update response time grows exponentially
• The system becomes unusable due to high load

System Performance
• HyperCuP organises the SPs into hypercubes
• The HyperCuP topology guarantees that:
  • Each peer receives a message only once
  • A total of N-1 hops is necessary to broadcast a message to N peers
  • The most distant peers are reached after log2 N hops
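The three guarantees can be checked on a small example. The sketch below broadcasts over a complete hypercube of N = 2^d nodes, where each node forwards only along dimensions higher than the one it received the message on. It assumes a complete hypercube with integer node IDs whose bits are the coordinates; the actual HyperCuP protocol also has to cope with incomplete hypercubes and peers joining or leaving, which is not modelled here. The function name `hypercube_broadcast` is made up for the sketch.

```python
from collections import deque
from typing import Dict, Tuple


def hypercube_broadcast(dimensions: int, origin: int = 0) -> Tuple[int, int, int]:
    """Broadcast from `origin` in a complete hypercube with 2**dimensions nodes.

    Returns (messages sent, hops to the most distant node, nodes reached).
    """
    hops_to: Dict[int, int] = {origin: 0}   # node -> hops at which it got the message
    messages = 0
    # Each queue entry: (node, lowest dimension it may still forward on, hops so far).
    queue = deque([(origin, 0, 0)])
    while queue:
        node, start_dim, hops = queue.popleft()
        for dim in range(start_dim, dimensions):
            neighbour = node ^ (1 << dim)    # flip one coordinate bit
            hops_to[neighbour] = hops + 1
            messages += 1
            # The neighbour forwards only on dimensions above `dim`.
            queue.append((neighbour, dim + 1, hops + 1))
    return messages, max(hops_to.values()), len(hops_to)


result_8 = hypercube_broadcast(3)    # N = 8 superpeers
result_16 = hypercube_broadcast(4)   # N = 16 superpeers
```

For N = 8 the sketch sends 7 messages, reaches all 8 nodes, and the farthest node is 3 hops away, matching N-1 and log2 N.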

[Figure: System Performance, HyperCuP, replication 10%. Panels: Analytical Model, Simulation]

[Figure: System Performance, HyperCuP, replication 90%. Panels: Analytical Model, Simulation]

System Performance
• With HyperCuP we achieve higher performance for various replication levels and numbers of peergroups
• The system scales better
• The system remains stable and the update response time stays within acceptable values
• The analytical and simulation approaches show good agreement

Conclusions
• We have described two ECA languages, for XML and RDF
• We have studied and defined the architectural characteristics of an ECA rule processing system in centralised and distributed environments
• We have conducted a study to determine the system performance in both the centralised and the distributed case

Conclusions
• The study as a whole shows that ECA rules are a usable technology for a variety of application environments over semistructured data

Thank you!