TSAR A Two Tier Sensor Storage Architecture Using

  • Slides: 33
Download presentation
TSAR*: A Two Tier Sensor Storage Architecture Using Interval Skip Graphs (*Tiered Storage ARchitecture)

TSAR*: A Two Tier Sensor Storage Architecture Using Interval Skip Graphs (*Tiered Storage ARchitecture) Peter Desnoyers, Deepak Ganesan, and Prashant Shenoy University of Massachusetts, Amherst Department of Computer Science University of Massachusetts, Amherst

Why do we need archival storage? Applications need historical sensor information. Why? Trigger events:

Why do we need archival storage? Applications need historical sensor information. Why? Trigger events: • Traffic monitoring - crash • Surveillance - break-in • Environmental monitoring - natural disaster lead to requests for past information. This requires archival storage. UNIVERSITY OF MASSACHUSETTS, AMHERST

Existing storage and indexing approaches ◊Streaming query systems ¨ Tiny. DB (Madden 2005), etc.

Existing storage and indexing approaches ◊Streaming query systems ¨ Tiny. DB (Madden 2005), etc. ¨ Data storage and indexing is performed outside of network. Optimized for continuous queries. High energy cost if used for archival - data must be transmitted to central data store. ◊In-network storage and indexing ¨ DCS, GHT (Ratnasamy 2002) ¨ Dimensions (Ganesan 2003) ¨ Directed Diffusion (Intangonwiwat 2000) Limited by lack of sufficient, energy-efficient storage and of communication and computation resources on current sensor platforms. UNIVERSITY OF MASSACHUSETTS, AMHERST

Technology Trends Radio m. J/byte Max Flash m. J/byte size Mica 2 30 4.

Technology Trends Radio m. J/byte Max Flash m. J/byte size Mica 2 30 4. 5 0. 5 MB Mica. Z 3. 4 4. 5 0. 5 MB Telos 3. 4 1 1 MB 0. 01 >1 GB UMass NAND New flash technologies enable large storage systems on small energyconstrained sensors. 100 x 1000 x UNIVERSITY OF MASSACHUSETTS, AMHERST

Hierarchical Storage and Indexing Hierarchical deployments are being used to provide scaling: • James

Hierarchical Storage and Indexing Hierarchical deployments are being used to provide scaling: • James Reserve (CENS) Higher powered micro-servers are deployed alongside resource constrained sensor nodes. Application Proxies Key challenge: • Exploit proxy resources to perform intelligent search across data on resourceconstrained nodes. Sensors UNIVERSITY OF MASSACHUSETTS, AMHERST

Key Ideas in TSAR ◊ Exploit storage trends for archival. ¨ Use cheap, low-power,

Key Ideas in TSAR ◊ Exploit storage trends for archival. ¨ Use cheap, low-power, high capacity flash memory in preference to communication. ◊ Index at proxies and store at sensors. ¨ Exploit proxy resources to conserve sensor resources and improve system performance. ◊ Extract key searchable attributes. ¨ Distill sensor data into concise attributes such as ranges of time or value that may be used for location and retrieval but require less energy to transmit. UNIVERSITY OF MASSACHUSETTS, AMHERST

TSAR Architecture 1. Interval Skip Graph-based index between proxies. • Exploit proxy resources to

TSAR Architecture 1. Interval Skip Graph-based index between proxies. • Exploit proxy resources to locate data stored on sensors in response to queries. 2. Summarization process • Extracts identifying information: e. g. time period during which events were detected, range of event values, etc. 3. Local sensor data archive • Stores detailed sensor information: e. g. images, events. Sensor node archive UNIVERSITY OF MASSACHUSETTS, AMHERST

TSAR Architecture 1. Interval Skip Graph-based index between proxies. • Exploit proxy resources to

TSAR Architecture 1. Interval Skip Graph-based index between proxies. • Exploit proxy resources to locate data stored on sensors in response to queries. 2. Summarization process Summarization function • Extracts identifying information: e. g. time period during which events were detected, range of event values, etc. 3. Local sensor data archive • Stores detailed sensor information, e. g. images, events. UNIVERSITY OF MASSACHUSETTS, AMHERST

TSAR Architecture 1. Interval Skip Graph-based index between proxies Distributed index • Exploit proxy

TSAR Architecture 1. Interval Skip Graph-based index between proxies Distributed index • Exploit proxy resources to locate data stored on sensors in response to queries. 2. Summarization process • Extracts identifying information: e. g. time period during which events were detected, range of event values, etc. 3. Local sensor data archive • Stores detailed sensor information, e. g. images, events. UNIVERSITY OF MASSACHUSETTS, AMHERST

Example - Camera Sensing Sensor archives information and transmits summary to proxy. Cyclops camera

Example - Camera Sensing Sensor archives information and transmits summary to proxy. Cyclops camera image Birds(t 1, t 2)=1 summarize <id> storage UNIVERSITY OF MASSACHUSETTS, AMHERST <id> Summary handle Sensor node

Example - Indexing Index Network of proxies Birds(t 1, t 2)=1 <id> Birds t

Example - Indexing Index Network of proxies Birds(t 1, t 2)=1 <id> Birds t 1, t 2 1 <id> proxy Summary and location information are stored and indexed at proxy. UNIVERSITY OF MASSACHUSETTS, AMHERST

Example - Querying and Retrieval Cyclops camera Birds in interval (t 1, t 2)?

Example - Querying and Retrieval Cyclops camera Birds in interval (t 1, t 2)? summarize Birds Cyclops camera t 1, t 2 1 <id> summarize proxy Query is sent to any proxy. UNIVERSITY OF MASSACHUSETTS, AMHERST

Example - Querying and Retrieval Cyclops camera Birds in interval (t 1, t 2)?

Example - Querying and Retrieval Cyclops camera Birds in interval (t 1, t 2)? summarize Birds Cyclops camera t 1, t 2 1 <id> summarize <id> proxy Index is used to locate sensors holding matching records. UNIVERSITY OF MASSACHUSETTS, AMHERST

Example - Querying and Retrieval Cyclops camera summarize Birds Cyclops camera t 1, t

Example - Querying and Retrieval Cyclops camera summarize Birds Cyclops camera t 1, t 2 1 <id> proxy Record is retrieved from storage and returned to application. UNIVERSITY OF MASSACHUSETTS, AMHERST

Outline of Talk ◊ Introduction and Motivation ◊ Architecture ◊ Example ◊ Design Skip

Outline of Talk ◊ Introduction and Motivation ◊ Architecture ◊ Example ◊ Design Skip Graph ¨ Interval Search ¨ Interval and Sparse Interval Skip Graph ¨ ◊ Experimental Results ◊ Related Work ◊ Conclusion and Future Directions UNIVERSITY OF MASSACHUSETTS, AMHERST

Goals of Index Structure The index should: • support range queries over time or

Goals of Index Structure The index should: • support range queries over time or value, (| |)? • be fully distributed among proxies, and • Support interval keys indicating a range in time or value. Distributed index insert(| UNIVERSITY OF MASSACHUSETTS, AMHERST |)

What is a Skip Graph? (Aspnes & Shah, 2003, Harvey et al. 2003) Distributed

What is a Skip Graph? (Aspnes & Shah, 2003, Harvey et al. 2003) Distributed extension of Skip Lists (Pugh ‘ 90): Probabilistically balanced no global rebalancing needed. Ordered by key - provides efficient range queries. Fully distributed - data is indexed in place. 2 3 5 6 9 12 18 19 Properties: Log(N) search and insert No single root - load balancing, robustness UNIVERSITY OF MASSACHUSETTS, AMHERST Single key and associated pointers

Interval search Query: x=4 9 8 10 6 8 5 4 2 3 2

Interval search Query: x=4 9 8 10 6 8 5 4 2 3 2 5 1 3 0 0 1 2 3 4 5 6 7 8 Given intervals [low, high] and query X: 1 - order by low 2 - find first interval with high <= X 3 - search until low > X UNIVERSITY OF MASSACHUSETTS, AMHERST 9 10

Interval search Query: x=4 8 9 6 10 5 2 4 2 3 1

Interval search Query: x=4 8 9 6 10 5 2 4 2 3 1 5 0 0 8 3 1 2 3 4 5 6 7 8 9 10 Given intervals [low, high] and query X: 1 - order by low 2 - find first interval with high <= X 3 - search until low > X UNIVERSITY OF MASSACHUSETTS, AMHERST

Interval search Query: x=4 8 9 6 10 8 5 2 4 2 3

Interval search Query: x=4 8 9 6 10 8 5 2 4 2 3 1 5 0 0 3 1 2 3 4 5 6 7 8 Given intervals [low, high] and query X: 1 - order by low 2 - find first interval with high <= X 3 - search until low > X UNIVERSITY OF MASSACHUSETTS, AMHERST 9 10

Simple Interval Skip Graph Method: Index two increasing values: low, max Search on either

Simple Interval Skip Graph Method: Index two increasing values: low, max Search on either as needed. 0 -3 0 -1 1 -5 2 -4 5 -8 6 -10 8 -9 9 -12 3 3 5 5 8 10 10 12 Derived from Interval Tree, Interval keys: YES Cormen et al. 1990 log. N search: YES log. N update: NO - (worst case O(N)) UNIVERSITY OF MASSACHUSETTS, AMHERST

Sparse Interval Skip Graph Goal: efficient update of max(high) values in Interval Skip Graph.

Sparse Interval Skip Graph Goal: efficient update of max(high) values in Interval Skip Graph. M proxies Approach: take advantage of ratio of proxies (M) to data items (N) Solution: eliminate redundant links and corresponding updates. Before: complete search tree rooted at each data item. After: retain M trees, one rooted at each proxy, keeping robustness and load balancing properties. - - - Worst-case complexity: Search: O(log. M) Update: O(M) UNIVERSITY OF MASSACHUSETTS, AMHERST N data items

Adaptive Summarization How accurately should the summary information represent the original data? Detailed summaries

Adaptive Summarization How accurately should the summary information represent the original data? Detailed summaries = more summaries, precise index updates queries Precise index = fewer wasted queries UNIVERSITY OF MASSACHUSETTS, AMHERST

Adaptive Summarization How accurately should the summary information represent the original data? Approximate summaries

Adaptive Summarization How accurately should the summary information represent the original data? Approximate summaries = fewer summaries, imprecise index updates queries ? ? imprecise index = more wasted queries UNIVERSITY OF MASSACHUSETTS, AMHERST

Adaptive Summarization Goal: balance update and query cost. Approach: adaptation. a = summarization (summaries

Adaptive Summarization Goal: balance update and query cost. Approach: adaptation. a = summarization (summaries / data) r = EWMA( wasted queries / data ) Target range: r 0 Decrease a if: r > r 0 Increase a if: r < updates queries UNIVERSITY OF MASSACHUSETTS, AMHERST +e r 0 - e

Prototype and Experiments ◊ Software: Em* (proxy), Tiny. OS (sensor) ◊ Hardware: Stargate Mica

Prototype and Experiments ◊ Software: Em* (proxy), Tiny. OS (sensor) ◊ Hardware: Stargate Mica 2 mote ◊ Network: 802. 11 ad-hoc, multihop BMAC 11% ◊ Data: James Reserve [CENS] dataset 30 s temperature readings 34 days For physical experiments, data stream was stored on sensor node and replayed. UNIVERSITY OF MASSACHUSETTS, AMHERST

Index performance How does the index performance scale with the number of proxies and

Index performance How does the index performance scale with the number of proxies and size of dataset? Queries Interval skip graph index Tested in: Em* emulation Tasks: insert, query Variables: number of proxies (1 -48) size of dataset Metric: proxy-to-proxy messages UNIVERSITY OF MASSACHUSETTS, AMHERST Sensor data

Index results Sparse skip graph provides >2 x decrease in message traffic for small

Index results Sparse skip graph provides >2 x decrease in message traffic for small numbers of proxies. Sparse skip graph shows virtually flat message cost for larger index sizes. UNIVERSITY OF MASSACHUSETTS, AMHERST

Query performance What is the query performance on real hardware and real data? queries

Query performance What is the query performance on real hardware and real data? queries 4 -proxy network Tested on: 4 Stargate proxies 12 Mica 2 sensors in tree configuration Task: query 3 -level multi-hop sensor field Variables: size of dataset Metric: query latency (ms) data UNIVERSITY OF MASSACHUSETTS, AMHERST

Query results Sensor link latency dominates Proxy link delay is negligible The sensor communication

Query results Sensor link latency dominates Proxy link delay is negligible The sensor communication consists only of a query and a response the minimal communication needed to retrieve the data. Validates the approach of using proxy resources to minimize the number of expensive sensor operations. UNIVERSITY OF MASSACHUSETTS, AMHERST

Adaptive Summarization How well does the adaptation mechanism track changes in conditions? Varied query

Adaptive Summarization How well does the adaptation mechanism track changes in conditions? Varied query rate Query/data = 0. 2 Query/data = 0. 1 1/a Query/data =0. 03 Tested in: Em*, EMTOSSIM emulation Task: data and queries Variables: query/data ratio Metric: Summary rate adapts summarization factor a Summary algorithm adapts to data and query dynamics. UNIVERSITY OF MASSACHUSETTS, AMHERST

Related Work ◊ In-network Storage: ¨ ¨ ¨ DCS (Ratnasamy 2002) Dimensions (Ganesan 2003)

Related Work ◊ In-network Storage: ¨ ¨ ¨ DCS (Ratnasamy 2002) Dimensions (Ganesan 2003) … ◊ In-network Indexing: ¨ ¨ GHT (Ratnasamy 2002) DIFS (Greenstein 2003) DIM (Li 2003) … ◊ Hierarchical Sensor Systems: ¨ Tenet (CENS, USC) ◊ Sensor Flash File Systems: ¨ ¨ ELF (Dai 2004) Matchbox (Hill et al. 2000) UNIVERSITY OF MASSACHUSETTS, AMHERST

Conclusions and Future Work ◊ Proposed novel Interval Skip Graph-based index structure and adaptive

Conclusions and Future Work ◊ Proposed novel Interval Skip Graph-based index structure and adaptive summarization mechanism for multi-tier sensor archival storage. ◊ Implemented these ideas in the TSAR system. ◊ Demonstrated index scalability, query performance, and adaptation of summarization factor, both in emulation and running on real hardware. Future Work ◊ Investigate other index structures. ◊ Alternate interval- and non-interval-based summary mechanisms. For more information: http: //presto. cs. umass. edu UNIVERSITY OF MASSACHUSETTS, AMHERST