Sensor Data Management In Sensor Networks


Towards Sensor Database Systems [Bonnet+ 2001] I. Introduction
– This paper defines a model for sensor databases
– Stored data are represented as relations, while sensor data are represented as time series
– Each long-running query formulated over a sensor database defines a persistent view, which is maintained during a given time interval
– The design and implementation of the COUGAR sensor database system are also described
– Applications monitor the world by querying and analyzing sensor data

Towards Sensor Database Systems [Bonnet+ 2001] I. Introduction
– Examples of monitoring applications include:
  o supervising items in a factory warehouse
  o gathering information in a disaster area
  o organizing vehicle traffic in a large city
– These applications involve a combination of stored data (a list of sensors and their related attributes, such as their location) and sensor data
– Such combinations are called sensor databases
– This paper focuses on sensor query processing: the design, algorithms, and implementations used to run queries over sensor databases
– A sensor query is defined as a query expressed over a sensor database

Towards Sensor Database Systems [Bonnet+ 2001] I. Introduction Factory Warehouse Example
– A sensor query is defined as a query expressed over a sensor database
– Each item in a factory warehouse has a stick-on temperature sensor attached to it; other sensors are attached to walls and embedded in floors and ceilings
– Each sensor provides two signal-processing functions:
  1. getTemperature() returns the measured temperature at regular intervals
  2. detectAlarmTemperature(threshold) returns the temperature whenever it crosses a certain threshold
– Each sensor is able to communicate this data and/or to store it locally

Towards Sensor Database Systems [Bonnet+ 2001] Factory Warehouse Example
– The sensor database stores the identifiers of all sensors in the warehouse along with their locations, and is connected to the sensor network
– The sensor database is used to make sure that items do not overheat
– Typical queries that are run continuously may include:
  o Query 1: “Return repeatedly the abnormal temperatures measured”
  o Query 2: “Every minute, return the temperature measured on the third floor”
  o Query 3: “Generate a notification whenever two sensors within 5 yards of each other simultaneously measure an abnormal temperature”
  o Query 4: “Every five minutes retrieve the maximum temperature measured over the last five minutes”
  o Query 5: “Return the average temperature measured on each floor over the last 10 minutes”

Towards Sensor Database Systems [Bonnet+ 2001] Factory Warehouse Example
– These example queries have the following characteristics:
  o Monitoring queries are long running
  o The desired result of a query is typically a series of notifications of system activity (periodic or triggered by special situations)
  o Queries need to correlate data produced simultaneously by different sensors
  o Queries need to aggregate sensor data over time windows
  o Most queries contain some condition restricting the set of sensors that are involved (usually geographical conditions)
– Queries are formulated regardless of the physical structure or organization of the sensor network, since the actual structure and population of a sensor network may vary over the lifespan of a query
– There are similarities with relational database query processing: most applications combine sensor data with stored data

Towards Sensor Database Systems [Bonnet+ 2001]
– Sensor data differs from traditional relational data: it is not stored in a database server, and it varies over time
– There are two approaches for processing sensor queries:
  o Warehousing approach:
    ✓ represents the current state of the art
    ✓ processing of sensor queries and access to the sensor network are separated; the sensor network is used by a data collection mechanism
    ✓ suited for answering predefined queries over historical data
    ✓ proceeds in two steps: (i) data is extracted from the sensor network in a predefined manner and stored in a database located on a unique front-end server; (ii) query processing takes place on the centralized database

Towards Sensor Database Systems [Bonnet+ 2001]
– Distributed approach:
    ✓ this approach is the focus of this paper
    ✓ the query workload determines the data to be extracted from sensors
    ✓ it is flexible (different queries extract different data from the sensor network) and efficient (only relevant data are extracted from the sensor network)
    ✓ it allows the sensor database system to leverage the computing resources on the sensor nodes: a sensor query can be evaluated at the front-end server, in the sensor network, at the sensors, or at some combination of the three
– A sensor database system should:
  o deal with sensor and communication failures
  o treat sensor data as measurements with an associated uncertainty, not as facts
  o establish and run a distributed query execution plan without assuming global knowledge

Towards Sensor Database Systems [Bonnet+ 2001]
– The paper makes the following contributions:
  o It builds on the results of [Seshadri+ 1995] to define a data model and long-running query semantics for sensor databases. A sensor database mixes stored data and sensor data. Stored data are represented as relations while sensor data are represented as time series. Each long-running query defines a persistent view that is maintained during a given time interval.
  o It describes the design and implementation of the Cornell COUGAR sensor database system, which extends the Cornell PREDATOR object-relational database system. In COUGAR, each type of sensor is modeled as a new Abstract Data Type (ADT). Signal-processing functions are modeled as ADT functions that return sensor data. Long-running queries are formulated in SQL. To support the evaluation of long-running queries, the query execution engine is extended with a new mechanism for the execution of sensor ADT functions.

Towards Sensor Database Systems [Bonnet+ 2001] II. A Model for Sensor Database Systems
– Builds on existing work by [Seshadri+ 1995] to define a data model for sensor data and an algebra of operators to formulate sensor queries
II. A. Sensor Data
– A sensor database involves stored data and sensor data
– Stored data include the set of sensors participating in the sensor database along with characteristics of the sensors (e.g., their location) or characteristics of the physical environment
– These stored data are represented as relations
– The question: how to represent sensor data?

Towards Sensor Database Systems [Bonnet+ 2001] II. A. Sensor Data
– Sensor data are generated by signal-processing functions, and the representation chosen for sensor data should make it easy to formulate sensor queries (data collection, correlation in time, and aggregates over time windows)
– Time is essential: signal-processing functions may return output repeatedly over time, and each output has a time-stamp
– In addition, monitoring queries introduce constraints on the sensor data time-stamps, e.g., Query 3 in the factory warehouse example assumes that the abnormal temperatures are detected either simultaneously or within a certain time interval
– Queries 4 and 5, on the other hand, aggregate over time windows and reference time explicitly

Towards Sensor Database Systems [Bonnet+ 2001] II. A. Sensor Data
– Sensor data is represented as time series
– The representation of sensor time series is based on the sequence model introduced by [Seshadri+ 1995]
– A sequence is defined as a 3-tuple comprising:
  o a set of records R
  o a countable, totally ordered domain O (the ordering domain; its elements are referred to as positions)
  o an ordering of R by O (defined as a relation between O and R such that every record in R is associated with some position in O)
– Sequence operators are n-ary mappings on sequences; they operate on a given number of input sequences and produce a unique output sequence

Towards Sensor Database Systems [Bonnet+ 2001] II. A. Sensor Data
– All sequence operators can be composed
– Sequence operators include: select, project, compose (natural join on the position), and aggregates over a set of positions
– Sensor data as a time series is represented with the following properties:
  1. The set of records corresponds to the outputs of a signal-processing function over time
  2. The ordering domain is a discrete time scale, i.e., a set of time quanta where each time quantum corresponds to a position. Natural numbers are used as the time-series ordering domain. Each natural number represents the number of time units elapsed between a given origin and any (discrete) point in time. It is assumed that clocks are synchronized and thus all sensors share the same time scale

Towards Sensor Database Systems [Bonnet+ 2001] II. A. Sensor Data
  3. All outputs of the signal-processing function generated during a time quantum are associated with the same position p. If a sensor does not generate events during the time quantum associated with a position, the Null record is associated with that position
  4. Whenever a signal-processing function produces an output, the base sequence is updated at the position corresponding to the production time. Updates to sensor time series occur in increasing position order
II. B. Sensor Queries
– A sensor database involves stored data and sensor data, i.e., relations and sequences
– A sensor query is defined as an acyclic graph of relational and sequence operators
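The time-series properties 1 to 4 listed under II. A can be illustrated with a minimal Python sketch. The class name `SensorTimeSeries` and its methods are hypothetical and are not part of COUGAR; `None` stands in for the Null record.

```python
from typing import Dict, List, Optional


class SensorTimeSeries:
    """Sensor outputs ordered by discrete time positions (natural numbers)."""

    def __init__(self) -> None:
        self._records: Dict[int, List[float]] = {}
        self._last_position = -1  # highest position updated so far

    def append(self, position: int, value: float) -> None:
        # Property 4: updates must arrive in increasing position order.
        if position < self._last_position:
            raise ValueError("updates must occur in increasing position order")
        # Property 3: all outputs of one time quantum share the same position.
        self._records.setdefault(position, []).append(value)
        self._last_position = position

    def at(self, position: int) -> Optional[List[float]]:
        # None plays the role of the Null record for a quantum with no output.
        return self._records.get(position)


series = SensorTimeSeries()
series.append(0, 21.5)
series.append(2, 22.0)
series.append(2, 22.4)  # second output within the same time quantum
print(series.at(0), series.at(1), series.at(2))
```

Position 1 received no output, so `at(1)` yields the Null stand-in, while position 2 holds both outputs produced in that quantum.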

Towards Sensor Database Systems [Bonnet+ 2001] II. B. Sensor Queries
– The inputs of a relational operator are either base relations or the output of another relational operator; the inputs of a sequence operator are either base sequences or the output of another sequence operator. In other words, relations are manipulated using relational operators and sequences are manipulated using sequence operators
– There are three exceptions to this rule: three operators allow relations and sequences to be combined
  o the relational projection operator can take a sequence as input and project out the position attribute to obtain a relation
  o a cross product operator can take as input a relation and a sequence to produce a sequence
  o an aggregate operator can take a sequence as input and a grouping list that does not include the position attribute

Towards Sensor Database Systems [Bonnet+ 2001] II. B. Sensor Queries
– Sensor queries are long running
– Each sensor query is associated with a time interval of the form [O, O + T], where O is the time at which it is submitted and T is the number of time quanta during which it runs
– During the life of a long-running query, relations and sensor sequences may be updated
– An update to a relation R can be an insertion, a deletion, or a modification of a record in R, whereas an update to a sensor sequence S is the insertion of a new record associated with a position greater than or equal to all the undefined positions in S

Towards Sensor Database Systems [Bonnet+ 2001] II. B. Sensor Queries
– A sensor query defines a view that is persistent during its associated time interval; this persistent view is maintained to reflect the updates that are repeatedly performed on sensor time series
– [Jagadish+ 1995] showed that persistent views over relations and sequences can be maintained incrementally without accessing the complete sequences
– Informally, persistent views can be maintained incrementally if updates occur in increasing position order and if the algebra used to compose queries does not allow sequences to be combined using any relational operators
– Both conditions hold in the definition of a sensor database used in this paper
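To illustrate why increasing position order makes incremental maintenance possible, here is a small Python sketch of a windowed maximum (in the spirit of Query 4) that is updated from each new record alone, without rescanning the full sequence. The class `WindowMax` and the monotonic-queue technique are illustrative assumptions; they are not taken from [Jagadish+ 1995] or from COUGAR.

```python
from collections import deque
from typing import Deque, Optional, Tuple


class WindowMax:
    """Maximum over the last `width` positions, maintained incrementally."""

    def __init__(self, width: int) -> None:
        self.width = width
        # (position, value) pairs whose values are strictly decreasing.
        self._candidates: Deque[Tuple[int, float]] = deque()

    def update(self, position: int, value: Optional[float]) -> Optional[float]:
        # Drop candidates that left the window (position - width, position].
        while self._candidates and self._candidates[0][0] <= position - self.width:
            self._candidates.popleft()
        if value is not None:  # None stands for the Null record
            # Older candidates that are not larger can never be the maximum again.
            while self._candidates and self._candidates[-1][1] <= value:
                self._candidates.pop()
            self._candidates.append((position, value))
        return self._candidates[0][1] if self._candidates else None


view = WindowMax(width=3)
readings = [20.0, 25.0, None, 22.0, 21.0, None, None, None]
results = [view.update(p, v) for p, v in enumerate(readings)]
print(results)  # [20.0, 25.0, 25.0, 25.0, 22.0, 22.0, 21.0, None]
```

The sketch relies on positions arriving in increasing order; out-of-order updates would require access to older parts of the sequence.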

Towards Sensor Database Systems [Bonnet+ 2001] III. The COUGAR Sensor Database System
– The initial version of the COUGAR system is examined with respect to the following aspects:
  1. User representation:
    o How are sensors and signal-processing functions modeled in the database schema?
    o How are queries formulated?
  2. Internal representation:
    o How is sensor data represented within the database components that perform query processing?
    o How are sensor queries evaluated to provide the semantics of long-running queries?

Towards Sensor Database Systems [Bonnet+ 2001] III. A User Representation
– In COUGAR, signal-processing functions are represented as Abstract Data Type (ADT) functions, and one Sensor ADT is defined for all sensors of the same type (e.g., temperature sensors, seismic sensors)
– The public interface of a Sensor ADT corresponds to the specific signal-processing functions supported by a type of sensor, whereas an ADT object in the database corresponds to a physical sensor in the real world
– Sensor queries are formulated in SQL with small modifications to the language
– The ‘FROM’ clause of a sensor query includes a relation whose schema contains a sensor ADT attribute, while the expressions over sensor ADTs are included in either the ‘SELECT’ or the ‘WHERE’ clause of a sensor query

Towards Sensor Database Systems [Bonnet+ 2001] III. A User Representation
– The queries introduced earlier are formulated in COUGAR as follows:
  o The simplified schema of the sensor database contains one relation R(loc point, floor int, s sensorNode), where loc is a point ADT that stores the coordinates of the sensor, floor is the floor of the warehouse where the sensor is located, and sensorNode is a Sensor ADT that supports the methods getTemp() and detectAlarmTemp(threshold), where threshold is the temperature above which abnormal temperatures are returned
  o Both ADT functions return the temperature represented as a float

Towards Sensor Database Systems [Bonnet+ 2001] III. A User Representation
– Query 1: “Return repeatedly the abnormal temperatures measured by all sensors”
    SELECT R.s.detectAlarmTemp(100)
    FROM R
    WHERE $every();
  The expression $every() is introduced as a syntactical construct to indicate that the query is long-running
– Query 2: “Every minute, return the temperature measured by all sensors on the third floor”
    SELECT R.s.getTemp()
    FROM R
    WHERE R.floor = 3 AND $every(60);
  The expression $every() takes as argument the time in seconds between successive outputs of the sensor ADT functions in the query
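The semantics of Query 2 can be sketched in Python as a repeated evaluation over the stored relation. This is only an illustration under assumptions: `SensorRow`, `run_every`, and the simulated clock are hypothetical names, the callable `get_temp` stands in for the sensor ADT function getTemp(), and time is simulated rather than waited for.

```python
from dataclasses import dataclass
from typing import Callable, Iterator, List, Tuple


@dataclass
class SensorRow:
    floor: int
    get_temp: Callable[[int], float]  # stand-in for the ADT function getTemp()


def run_every(rows: List[SensorRow], floor: int, period: int,
              duration: int) -> Iterator[Tuple[int, List[float]]]:
    """Emit one result per period (simulated seconds) for sensors on `floor`."""
    for t in range(0, duration, period):
        # WHERE R.floor = 3 is applied to stored data; getTemp() is evaluated per tick.
        yield t, [row.get_temp(t) for row in rows if row.floor == floor]


R = [
    SensorRow(floor=3, get_temp=lambda t: 20.0 + t / 60),
    SensorRow(floor=2, get_temp=lambda t: 18.0),
    SensorRow(floor=3, get_temp=lambda t: 21.0),
]
outputs = list(run_every(R, floor=3, period=60, duration=180))
print(outputs)
```

Each tuple in `outputs` pairs a simulated time with the temperatures returned by the third-floor sensors at that time.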

Towards Sensor Database Systems [Bonnet+ 2001] III. A User Representation
– Query 3: “Generate a notification whenever two sensors within 5 yards of each other simultaneously measure an abnormal temperature”
    SELECT R1.s.detectAlarmTemp(100), R2.s.detectAlarmTemp(100)
    FROM R R1, R R2
    WHERE $SQRT($SQR(R1.loc.x - R2.loc.x) + $SQR(R1.loc.y - R2.loc.y)) < 5
      AND R1.s > R2.s AND $every();
  This formulation assumes that the system incorporates an equality condition on the time at which the temperatures are obtained from both sensors
– Queries 4 and 5 cannot be expressed in the initial version of COUGAR since aggregates over time windows are not supported
– The time interval associated with a long-running query in COUGAR is the interval between the instant the query is submitted and the instant the query is explicitly stopped
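As a rough illustration of what Query 3 computes at a single time position, the following Python sketch pairs sensors that are less than 5 units apart and that both report an alarm. The function `correlated_alarms` and its data layout are assumptions made for this example; iterating over each unordered pair once plays the role of the `R1.s > R2.s` condition.

```python
import math
from typing import Dict, List, Optional, Tuple

Sensor = Tuple[str, float, float]  # (sensor id, x, y)


def correlated_alarms(sensors: List[Sensor],
                      alarms: Dict[str, Optional[float]],
                      max_distance: float = 5.0) -> List[Tuple[str, float, str, float]]:
    """Pairs of nearby sensors that both raised an alarm at the same position.

    `alarms` maps a sensor id to its alarm temperature, or None if no alarm.
    """
    notifications = []
    for i, (id1, x1, y1) in enumerate(sensors):
        for id2, x2, y2 in sensors[i + 1:]:  # each unordered pair once
            t1, t2 = alarms.get(id1), alarms.get(id2)
            if t1 is None or t2 is None:
                continue  # both sensors must report at the same position
            if math.hypot(x1 - x2, y1 - y2) < max_distance:
                notifications.append((id1, t1, id2, t2))
    return notifications


sensors = [("a", 0.0, 0.0), ("b", 3.0, 0.0), ("c", 10.0, 0.0)]
alarms = {"a": 101.0, "b": 105.0, "c": 110.0}
pairs = correlated_alarms(sensors, alarms)
print(pairs)  # [('a', 101.0, 'b', 105.0)]
```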

Towards Sensor Database Systems [Bonnet+ 2001] III. B Internal Representation
– Query processing takes place on a database front-end, while signal-processing functions are executed on the sensor nodes involved in the query
– The query execution engine on the database front-end includes a mechanism for interacting with remote sensors; a query execution engine on each sensor executes signal-processing functions and sends data back to the front-end
– In COUGAR, it is assumed that there are no modifications to the stored data during the execution of a long-running query; strict two-phase locking on the database front-end ensures that this assumption holds

Towards Sensor Database Systems [Bonnet+ 2001] III. B Internal Representation
– The initial version of COUGAR does not treat a long-running query as a persistent view; instead, the system computes the incremental results that could be used to maintain a view
– These incremental results are obtained by evaluating sensor ADT functions repeatedly and by combining the outputs they produce over time with stored data
– The execution of Sensor ADT functions is essential for the execution of sensor queries

Towards Sensor Database Systems [Bonnet+ 2001] Advantages:
– The distributed approach is efficient since only the relevant data are extracted from the WSN under consideration, which reduces the communication and processing overhead
– Representing the signal-processing functions as ADT functions provides controlled access to encapsulated data through a well-defined set of functions
– The authors chose not to reinvent the wheel: sensor queries are formulated in SQL with little modification to the language
– The use of Virtual Relations introduces more flexibility

Towards Sensor Database Systems [Bonnet+ 2001] Disadvantages:
– Since the authors are proposing a sensor DB system, they should state how the system adheres to the ACID properties of a DB
– The protocol assumes that the sensed data is all stored in the sensor node; this may introduce a space constraint in the sensor node
– The sensor data is time variant and becomes outdated after a certain time
– The authors do not suggest any rule for how long the data should be held in the node

Towards Sensor Database Systems [Bonnet+ 2001] Suggestions/Improvements/Future Work:
– Handle multiple copies of the same kind of data available from nearby resources
– Provide an adaptive query processing mechanism
– Handle mobile nodes and sinks

Supporting Aggregate Queries Over Ad-Hoc Wireless Sensor Networks [Madden+ 2002] Introduction
– The paper discusses the challenges associated with implementing the five basic database aggregates (COUNT, MIN, MAX, SUM, and AVERAGE)
– The network aggregation approach discussed in this paper is driven by a general-purpose, SQL-style interface that can execute queries over any kind of sensor data regardless of the application
– This approach has two benefits over the traditional network solution, which is generally application dependent:
  o Computation can be optimized by defining the language that users use to express aggregates
  o Since the same aggregation language can be applied to all data types, the burden on programmers is substantially lower: they can issue declarative, SQL-style queries rather than implementing custom networking protocols to extract the needed data from the network

Supporting Aggregate Queries Over Ad-Hoc Wireless Sensor Networks [Madden+ 2002] Introduction
– The paper presents a variety of techniques to improve the reliability and performance of the proposed solution
– In addition, it shows how grouped aggregates can be computed efficiently and offers a comparison to related systems and database projects
– Two properties of radio communication need to be pointed out:
  o Radio is a broadcast medium: any sensor within hearing distance can hear any message, irrespective of whether or not it is the intended recipient
  o Radio links are typically symmetric: if sensor a can hear sensor b, it is assumed that sensor b can also hear sensor a; however, this may not hold true in some cases

Supporting Aggregate Queries Over Ad-Hoc Wireless Sensor Networks [Madden+ 2002] Background
– Messages in the current generation of TinyOS have a fixed size that is preprogrammed into sensors
– Each message type has a message id that distinguishes it from other types of messages, and each sensor has a unique sensor id that distinguishes it from other sensors
– All messages specify their recipient (or broadcast), allowing sensors to ignore messages not intended for them; non-broadcast messages are still received by all sensors within range, and unintended recipients drop messages not addressed to them
– The technique adopted is to build a routing tree to route sensor data
– One sensor, typically the one that interfaces the querying user to the rest of the network, is chosen to be the root from which the tree will be built and where the aggregated data will converge

Supporting Aggregate Queries Over Ad-Hoc Wireless Sensor Networks [Madden+ 2002] Background
– The root broadcasts a message asking sensors to organize into a routing tree; the message includes the root's own id and level (or distance from the root)
– Any sensor hearing this message sets its own level to the level in the message plus one, but only if its current level is not already less than or equal to the level in the message
– It chooses the sender of the message as its parent
– Each of these sensors then broadcasts the routing message, along with its own id and level
– The routing message floods down the tree, with each node rebroadcasting the message until all nodes have been assigned a level and a parent
– Nodes that hear multiple parents choose one at random
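The tree-building flood described above can be simulated with a short Python sketch. Several simplifications are assumptions of this example rather than details from the paper: broadcasts are processed in first-in, first-out order, a node adopts a new parent only when that strictly lowers its level, and the first parent heard is kept instead of a random one. The function names are hypothetical.

```python
from collections import deque
from typing import Dict, List, Optional, Tuple


def build_routing_tree(neighbors: Dict[str, List[str]],
                       root: str) -> Tuple[Dict[str, int], Dict[str, Optional[str]]]:
    """Flood a routing message from the root and assign levels and parents."""
    level: Dict[str, int] = {root: 0}
    parent: Dict[str, Optional[str]] = {root: None}
    pending = deque([root])  # nodes that still have to broadcast
    while pending:
        sender = pending.popleft()
        for node in neighbors[sender]:  # every neighbor hears the broadcast
            offered = level[sender] + 1
            if node not in level or offered < level[node]:
                level[node] = offered
                parent[node] = sender
                pending.append(node)  # rebroadcast with own id and level
    return level, parent


def path_to_root(parent: Dict[str, Optional[str]], node: str) -> List[str]:
    """Follow parent pointers, as a message forwarded up the tree would."""
    path = [node]
    while parent[path[-1]] is not None:
        path.append(parent[path[-1]])
    return path


neighbors = {
    "root": ["a", "b"],
    "a": ["root", "c"],
    "b": ["root", "c"],
    "c": ["a", "b", "d"],
    "d": ["c"],
}
level, parent = build_routing_tree(neighbors, "root")
print(level)                      # {'root': 0, 'a': 1, 'b': 1, 'c': 2, 'd': 3}
print(path_to_root(parent, "d"))  # ['d', 'c', 'a', 'root']
```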

Supporting Aggregate Queries Over Ad-Hoc Wireless Sensor Networks [Madden+ 2002] Background
– These routing messages are periodically broadcast from the root, so the process of topology discovery goes on continuously
– This topology maintenance makes adaptation to network changes easier, since each sensor looks at the history of received routing messages, chooses the best parent, and ensures that no routing cycles are created
– This method allows data to be routed efficiently towards the root
– For a message to reach the root, a sensor node sends the message to its parent, which in turn forwards it to its own parent, and so on
– Even though this approach does not address point-to-point routing, flooding aggregation requests down the tree and routing replies up the tree to the root is sufficient

Supporting Aggregate Queries Over Ad-Hoc Wireless Sensor Networks [Madden+ 2002] Aggregation in Database Systems
– Aggregation in SQL-based database systems is defined by an aggregate function and a grouping predicate
– The aggregate function specifies how a set of values should be combined to compute an aggregate; the standard set of SQL aggregate functions is COUNT, MIN, MAX, AVERAGE, and SUM
– For example, the SQL statement
    SELECT AVERAGE(temp) FROM sensors
  computes the average temperature from some table sensors, which represents a set of sensor readings that have been read into the system
– Similarly, the COUNT function counts the number of items in a set, the MIN and MAX functions compute minimal and maximal values, and SUM calculates the total of all values

Supporting Aggregate Queries Over Ad-Hoc Wireless Sensor Networks [Madden+ 2002] Aggregation in Database Systems
– In addition, most database systems allow user-defined functions (UDFs) that specify more complex aggregates than the five listed above
– Rather than merely computing a single aggregate value over the entire set of data values, a grouping predicate partitions the values into groups based on some attribute
– For example, the query
    SELECT TRUNC(temp/10), AVERAGE(light)
    FROM sensors
    GROUP BY TRUNC(temp/10)
    HAVING AVERAGE(light) > 50
  partitions sensor readings into groups according to their temperature reading and computes the average light reading within each group; the HAVING clause excludes groups whose average light readings are less than or equal to 50
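For readers less familiar with GROUP BY and HAVING, the grouped query above can be mirrored in a few lines of Python. The function name `grouped_average_light` and the sample readings are made up for illustration.

```python
from collections import defaultdict
from math import trunc
from typing import Dict, List, Tuple


def grouped_average_light(readings: List[Tuple[float, float]],
                          threshold: float = 50.0) -> Dict[int, float]:
    """Group (temp, light) readings by TRUNC(temp/10) and keep groups
    whose average light exceeds the threshold (the HAVING condition)."""
    groups: Dict[int, List[float]] = defaultdict(list)
    for temp, light in readings:
        groups[trunc(temp / 10)].append(light)  # GROUP BY TRUNC(temp/10)
    averages = {key: sum(vals) / len(vals) for key, vals in groups.items()}
    return {key: avg for key, avg in averages.items() if avg > threshold}


readings = [(21.0, 40.0), (25.0, 80.0), (32.0, 30.0), (38.0, 50.0), (12.0, 90.0)]
result = grouped_average_light(readings)
print(result)  # {2: 60.0, 1: 90.0}
```

The group for temperatures in the 30s has an average light reading of 40 and is therefore excluded.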

Supporting Aggregate Queries Over Ad-Hoc Wireless Sensor Networks [Madden+ 2002] Generic Aggregation Techniques
– There are two approaches:
  1. Server-based: a centralized approach where all sensor readings are sent to the host PC, which computes the aggregates
  2. In-network: a distributed approach where aggregates are partially or fully computed by the sensors themselves as readings are routed through the network towards the host PC
– This paper focuses on the distributed in-network aggregation approach
– Figure 1 illustrates the benefits of the in-network approach; dotted lines represent connections between sensors and solid lines represent the routing tree imposed on top of this graph to allow sensors to propagate data to the root along a single path

Supporting Aggregate Queries Over Ad-Hoc Wireless Sensor Networks [Madden+ 2002] Generic Aggregation Techniques
– In the centralized approach, each sensor value must be routed to the root of the network; for a node at depth n, this requires n-1 messages to be transmitted per sensor
– The sensors in Figure 1(a) have been labeled with their distance from the root; summing these numbers gives a total of sixteen messages required to route all aggregation information to the root
– In Figure 1(b), sensors with no children transmit their readings to their parents
– Intermediate nodes (with children) combine their own readings with the readings of their children via the aggregation function f and propagate the partial aggregate, along with any extra data required to update the aggregate, up the tree

Supporting Aggregate Queries Over Ad-Hoc Wireless Sensor Networks [Madden+ 2002] Generic Aggregation Techniques
Figure 1: Server-based (a) vs. In-network (b) aggregation. In (a), each node is labelled with the number of messages required to get data to the host PC: a total of 16 messages are required. In (b), only one message is sent along each edge as aggregation is performed by the sensors themselves
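The message counts for the two approaches can be compared with a small Python sketch. The tree below is a made-up example and does not reproduce Figure 1; the sketch also assumes that the cost of routing one reading equals the number of hops between the node and the root.

```python
from typing import Dict


def hops_to_root(node: str, parent: Dict[str, str]) -> int:
    """Number of edges between a node and the root (the root has no parent entry)."""
    hops = 0
    while node in parent:
        node = parent[node]
        hops += 1
    return hops


def server_based_messages(parent: Dict[str, str]) -> int:
    # Every reading is forwarded hop by hop all the way to the root.
    return sum(hops_to_root(node, parent) for node in parent)


def in_network_messages(parent: Dict[str, str]) -> int:
    # Each node sends exactly one partial aggregate to its parent.
    return len(parent)


# Hypothetical routing tree: child -> parent
parent = {"a": "root", "b": "root", "c": "a", "d": "a", "e": "c"}
print(server_based_messages(parent))  # 9
print(in_network_messages(parent))    # 5
```

The gap between the two counts widens as the tree gets deeper, because deep nodes pay their full hop distance in the server-based case.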

Supporting Aggregate Queries Over Ad-Hoc Wireless Sensor Networks [Madden+ 2002] Generic Aggregation Techniques
– The amount of data transmitted depends on the aggregate
– For example, AVERAGE requires both the sum and the count of all children’s sensor readings, while the other standard SQL aggregates (COUNT, MIN, MAX, and SUM) can be computed by a parent node given only the sensor or partial aggregate values at all of its child nodes
– This work focuses on a class of aggregation predicates that can be expressed as an aggregate function f over the sets a and b such that:
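A minimal Python sketch of partial aggregation for AVERAGE follows, assuming the property in question is that the aggregate over the union of two sets can be computed from partial results over each set. The helper names `init_avg`, `merge_avg`, and `final_avg` are hypothetical.

```python
from typing import Tuple

PartialAvg = Tuple[float, int]  # (sum of readings, number of readings)


def init_avg(reading: float) -> PartialAvg:
    """Partial state for a single sensor reading."""
    return (reading, 1)


def merge_avg(a: PartialAvg, b: PartialAvg) -> PartialAvg:
    """Combine the partial states of two disjoint sets of readings."""
    return (a[0] + b[0], a[1] + b[1])


def final_avg(state: PartialAvg) -> float:
    """Turn the partial state into the final AVERAGE value."""
    return state[0] / state[1]


# A parent with reading 30 and two leaf children with readings 10 and 20.
children = merge_avg(init_avg(10.0), init_avg(20.0))
parent_state = merge_avg(children, init_avg(30.0))
print(final_avg(parent_state))  # 20.0

# Forwarding only the children's average loses the count and gives a wrong result.
naive = (final_avg(children) + 30.0) / 2
print(naive)  # 22.5
```

This is why AVERAGE needs the extra count to travel up the tree, whereas MIN, MAX, SUM, and COUNT can be merged from single values.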

Supporting Aggregate Queries Over Ad-Hoc Wireless Sensor Networks [Madden+ 2002] Injecting a Query
– Computing an aggregate consists of two phases: a propagation phase, in which aggregate queries are pushed down into the sensor network, and an aggregation phase, in which the aggregate values are propagated up from children to parents
– Leaf nodes must find out that they are leaves and propagate singular aggregates up to their parents
– When a sensor p receives an aggregate a, it retransmits a and begins listening
– If p has any children, it will hear those children retransmit a to their own children, and so figure out that it is not a leaf
– If p has not heard any children after a time interval t, it concludes that it is a leaf and transmits its current sensor value up the routing tree

Supporting Aggregate Queries Over Ad-Hoc Wireless Sensor Networks [Madden+ 2002] Injecting a Query
– If p has children, they are expected to report within time t; after time t, p computes the value of a applied to its own value and the values of its children, and forwards this partial aggregate to its parent
– Note that too short a duration for t can lead to missed reports from children, and that the proper value of t varies with the depth of the routing tree

Supporting Aggregate Queries Over Ad-Hoc Wireless Sensor Networks [Madden+ 2002] Streaming Aggregates
– Since sensor networks are inherently unreliable, it is difficult to guarantee that a portion of the sensor network was not detached during an aggregate computation
– The paper proposes pipelined aggregates
– The aggregate request is propagated into the network, and time is divided into intervals of duration i
– During each interval, each sensor that has heard the request transmits a partial aggregate obtained by applying a to its local reading and the values its children reported during the previous interval
– After the first interval, the root hears from sensors one hop away
– After the second, it hears aggregates of sensors one and two hops away, and so on
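The interval-by-interval behaviour of a pipelined aggregate can be simulated with a short Python sketch. It uses COUNT on a made-up chain topology and assumes lossless links and that the root only listens; the function name `pipelined_count` is hypothetical.

```python
from typing import Dict, List


def pipelined_count(children: Dict[str, List[str]], root: str,
                    intervals: int) -> List[int]:
    """Return the COUNT value visible at the root after each interval."""
    previous: Dict[str, int] = {}  # partial aggregates sent in the last interval
    seen_at_root: List[int] = []
    for _ in range(intervals):
        current: Dict[str, int] = {}
        for node, kids in children.items():
            if node == root:
                continue
            # Own reading counts as 1, plus what the children reported last interval.
            current[node] = 1 + sum(previous.get(kid, 0) for kid in kids)
        # The root combines what its direct children transmit in this interval.
        seen_at_root.append(sum(current[kid] for kid in children[root]))
        previous = current
    return seen_at_root


# Hypothetical chain: root <- a <- b <- c
children = {"root": ["a"], "a": ["b"], "b": ["c"], "c": []}
counts = pipelined_count(children, "root", intervals=4)
print(counts)  # [1, 2, 3, 3]
```

The count at the root grows by one hop's worth of sensors per interval until the deepest node is included, after which it stays stable as long as the network does not change.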

Supporting Aggregate Queries Over Ad-Hoc Wireless Sensor Networks [Madden+ 2002] Streaming Aggregates – The pipelined solution has the following properties: o After aggregates have propagated up from the leaves, a new aggregate arrives every i seconds o The total time for an aggregation request to propagate down to the leaves and back to the root is about t, but the user starts seeing approximations of the aggregate after the first interval has elapsed – These properties give users a stream of aggregate values that changes as sensor readings and the underlying network change – Continuous results are more useful than a single aggregate since they give users an understanding of how the network is behaving over time – Figure 2 illustrates a simple aggregate running in a pipelined approach

Supporting Aggregate Queries Over Ad-Hoc Wireless Sensor Networks [Madden+ 2002] Streaming Aggregates Figure 2: Pipelined computation of aggregates – The important drawback of this approach is the number of additional messages transmitted to extract the first aggregate over all sensors – The non-pipelined aggregate requires one message down and one back along each edge
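
A rough message count illustrates the drawback. The non-pipelined figure (one message down and one back per edge) comes from the slide; the pipelined figure rests on a simplified cost model that is an assumption of this sketch, not something stated in the paper: one request message per edge, then every non-root sensor sends one partial aggregate per interval until the root's value covers the whole tree, which takes as many intervals as the tree is deep. The tree and the function names are hypothetical.

```python
from typing import Dict, List, Tuple


def tree_depth(children: Dict[str, List[str]], node: str) -> int:
    """Number of hops from `node` down to its deepest descendant."""
    kids = children.get(node, [])
    if not kids:
        return 0
    return 1 + max(tree_depth(children, k) for k in kids)


def message_counts(children: Dict[str, List[str]],
                   root: str) -> Tuple[int, int]:
    """Return (non_pipelined, pipelined) messages for the first full aggregate."""
    edges = sum(len(kids) for kids in children.values())
    depth = tree_depth(children, root)
    # From the slide: one message down and one back along each edge.
    non_pipelined = 2 * edges
    # ASSUMED model: one request per edge, then one upward message per edge
    # in each of `depth` intervals.
    pipelined = edges + edges * depth
    return non_pipelined, pipelined


# Hypothetical routing tree: root -> a, b and a -> c, d.
children = {"root": ["a", "b"], "a": ["c", "d"]}
print(message_counts(children, "root"))
```

Under this model the gap between the two schemes grows with the depth of the routing tree, since every extra level adds another full round of upward messages before the first complete aggregate arrives.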

Supporting Aggregate Queries Over Ad-Hoc Wireless Sensor Networks [Madden+ 2002] Advantages: – Routing messages for tree building are broadcast periodically, so any changes in the topology are picked up – Pipelined aggregation removes the need for multiple transmissions of aggregate queries: nodes that miss the query in one interval have time to catch up in later ones – Using pipelined aggregates, the user sees approximations of the aggregates. These continuous results are more useful than a single, isolated result

Supporting Aggregate Queries Over Ad-Hoc Wireless Sensor Networks [Madden+ 2002] Disadvantages: – No suggestions are made for a protocol to choose a node as the root – The node designated as the root has to cope with a lot of data, which strains its battery power – The radios of the nodes would have to be kept on most of the time to facilitate snooping – Pipelined aggregation increases the number of messages transmitted, which contradicts the goal of reducing the number of messages

Supporting Aggregate Queries Over Ad-Hoc Wireless Sensor Networks [Madden+ 2002] Suggestions/Improvements/Future Work: – Introduce a hybrid pipelining scheme to balance the tradeoff between robustness (incorporating nodes that lose the initial aggregation request) and the number of messages passed – Work towards adaptive aggregation algorithms that are self-configuring – Introduce a query layer for WSNs that accepts queries in a declarative language, which can be optimized to generate an efficient query execution plan with in-network processing

References [Bonnet+ 2001] P. Bonnet, J. E. Gehrke, and P. Seshadri, Towards Sensor Database Systems, In Proceedings of the Second International Conference on Mobile Data Management, Hong Kong, January 2001. [Jagadish+ 1995] H. V. Jagadish, I. S. Mumick, and A. Silberschatz, View Maintenance Issues for the Chronicle Data Model, PODS 1995, pp. 113-124. [Madden+ 2002] S. Madden, R. Szewczyk, M. J. Franklin, and D. Culler, Supporting Aggregate Queries Over Ad-Hoc Wireless Sensor Networks, In Proceedings of the 4th IEEE Workshop on Mobile Computing and Systems Applications, 2002. [Seshadri+ 1995] P. Seshadri, M. Livny, and R. Ramakrishnan, SEQ: A Model for Sequence Databases, ICDE 1995, pp. 232-239. [Yao+ 2003] Y. Yao and J. E. Gehrke, Query Processing in Sensor Networks, In Proceedings of the First Biennial Conference on Innovative Data Systems Research (CIDR 2003), Asilomar, California, January 2003.