Querying in Wireless Sensor Networks Bhaskar Krishnamachari Ming Hsieh Department of Electrical Engineering USC Viterbi School of Engineering AISP Workshop, May 2, 2007 1

Prior Work: Phase Transitions and Complexity in Wireless Networks Example: Interference-Free Channel Allocation Work with Ramon Bejar, Stephen Wicker, Cesar Fernandez, Bart Selman, Ashish Goel, Sanatan Rai 2

Wireless Sensor Networks • Large scale networks of small embedded devices, each with sensing, computation and communication capabilities. • Use of wireless networks of embedded computers “could well dwarf previous milestones in the information revolution” – National Research Council Report: Embedded, Everywhere, 2001. 3

Wide-Ranging Applications • Structural monitoring • Disaster management • Bio-habitat monitoring • Military surveillance • Industrial monitoring • Home/building security • Note: images used may be copyrighted. Used here for limited educational purposes only. Not intended for commercial or public use. 4

Two Paradigms • Continuous collection • Distributed storage and querying 5

Focus of this Talk • Analysis and Design of Mechanisms for Storage and Querying: – Fundamental Scaling Laws – Comparison of Push-Pull Query Mechanisms – Enhancing Random Walk-based Queries 6

Fundamental Scaling Laws for Store and Query Sensor Networks Joon Ahn and Bhaskar Krishnamachari, "Fundamental Scaling Laws for Energy-Efficient Storage and Querying in Wireless Sensor Networks", ACM MobiHoc, May 2006. 7

In a Nutshell • Race between increasing supply and demand: - Energy and storage - Application-specific event and query traffic • The winner of this race determines scalability. 8

Preliminaries • N nodes deployed in a 2D area with constant density for some time duration T • m atomic events and q_i queries for the ith event, all uniformly distributed • Can create r_i replicas for event i to reduce search cost (at the expense of increased replication cost) • Each transmission incurs a unit energy cost 9

Data-Centric Querying Approaches • Unstructured: expanding ring searches, random walks. • Structured: Geographic Hash Table, DIFS, DIM 10

Energy Cost Scaling • C_replication = c_1 • C_search(unstructured) = c_2 • C_search(structured) = c_3 • [Figure panels: STRUCTURED and UNSTRUCTURED, each showing EVENT REPLICATION and QUERY] • r: # of copies of an event • N: # of nodes 11

Energy Optimization Formulation • C_r(r) = c_1 • C_s(r) = c_2 • S: total storage size • m: total number of events • q_i: query rate for the ith event • r_i: number of copies of the ith event • C_s(r_i): expected minimum search cost of the ith event • C_r(r_i): expected replication cost of the ith event 12

Optimization Solution • Minimizer: (inactive constraint), (active constraint) • The Optimized Total Cost • q_i: # of queries for event i • N: # of nodes • S: total storage size • m: # of events 13
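
The inactive-constraint case can be sketched as a short derivation. The cost forms below are assumptions, not taken from the slide: they are chosen because they reproduce the scaling conditions stated in the results and summary slides (m q^{1/2} = O(N^{1/4}) for unstructured and m q^{2/3} = O(N^{1/2}) for structured networks). The storage constraint \sum_i r_i \le S is likewise an assumed reading of "S: total storage size".

```latex
% Assumed cost forms (hypothetical, chosen to match the stated scaling laws)
C_r(r_i) = c_1 \sqrt{N}\, r_i, \qquad
C_s^{\mathrm{unstr}}(r_i) = c_2 \frac{N}{r_i}, \qquad
C_s^{\mathrm{str}}(r_i) = c_3 \sqrt{\frac{N}{r_i}}

% Assumed optimization problem
\min_{r_1,\dots,r_m} \; \sum_{i=1}^{m} \bigl[ C_r(r_i) + q_i\, C_s(r_i) \bigr]
\quad \text{subject to} \quad \sum_{i=1}^{m} r_i \le S

% Unstructured search, storage constraint inactive
c_1 \sqrt{N} - \frac{c_2\, q_i\, N}{r_i^{2}} = 0
\;\Longrightarrow\;
r_i^{*} = \sqrt{\frac{c_2\, q_i}{c_1}}\; N^{1/4},
\qquad
C_r(r_i^{*}) + q_i\, C_s^{\mathrm{unstr}}(r_i^{*}) = 2 \sqrt{c_1 c_2\, q_i}\; N^{3/4}

% Structured search, storage constraint inactive
c_1 \sqrt{N} - \frac{c_3\, q_i \sqrt{N}}{2\, r_i^{3/2}} = 0
\;\Longrightarrow\;
r_i^{*} = \left( \frac{c_3\, q_i}{2 c_1} \right)^{2/3},
\qquad
C_r(r_i^{*}) + q_i\, C_s^{\mathrm{str}}(r_i^{*}) = 3 c_1 \left( \frac{c_3\, q_i}{2 c_1} \right)^{2/3} \sqrt{N}

% Per-node energy with q_i = q for all events (total cost divided by N)
\frac{E^{\mathrm{unstr}}}{N} = 2 \sqrt{c_1 c_2}\; m\, q^{1/2} N^{-1/4},
\qquad
\frac{E^{\mathrm{str}}}{N} = 3 c_1 \left( \frac{c_3}{2 c_1} \right)^{2/3} m\, q^{2/3} N^{-1/2}
```

Under these assumed forms, the per-node energy stays bounded exactly when m q^{1/2} = O(N^{1/4}) (unstructured) or m q^{2/3} = O(N^{1/2}) (structured). The bounds r_i >= 1 and r_i <= N are ignored in this sketch.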

Optimal Total Cost, Simplified • [Equation: piecewise total cost with two "if" cases] • q: # of queries per event • N: # of nodes • S: total storage size • m: # of events 14

Illustration of Energy Scaling m : # of events q : # of queries per event 15

I - Storage and Energy Scalability Results • Energy Condition: The energy requirement per node is bounded if and only if m q^{1/2} = O(N^{1/4}) • Storage Condition: A network scales efficiently with bounded storage per node if m q^{1/2} = o(N^{3/4}) ⇒ Energy constraint is stricter than storage constraint • m: # of events • q: # of queries per event • N: # of nodes 16
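
A small numeric sketch of the energy condition for unstructured networks follows. It assumes (this is not stated on the slide) a replication cost of c1 * sqrt(N) * r and an unstructured search cost of c2 * N / r per query, which are forms consistent with the conditions above; the function name and the unit constants are hypothetical.

```python
import math


def unstructured_optimum(N, m, q, c1=1.0, c2=1.0):
    """Optimal replication under ASSUMED costs:
    replication = c1 * sqrt(N) * r, search = c2 * N / r per query."""
    # Setting the derivative of c1*sqrt(N)*r + q*c2*N/r to zero gives r.
    r = math.sqrt(c2 * q / c1) * N ** 0.25
    cost_per_event = c1 * math.sqrt(N) * r + q * c2 * N / r
    return {
        "replicas": r,
        "energy_per_node": m * cost_per_event / N,
        "storage_per_node": m * r / N,
    }


q = 4
sizes = [10**4, 10**6, 10**8]

# m grows as N^(1/4): m * q^(1/2) = O(N^(1/4)), so energy per node stays flat.
within = [unstructured_optimum(N, m=N ** 0.25, q=q) for N in sizes]

# m grows as N^(1/2): the condition is violated, so energy per node grows.
beyond = [unstructured_optimum(N, m=N ** 0.5, q=q) for N in sizes]

for N, a, b in zip(sizes, within, beyond):
    print(N, round(a["energy_per_node"], 3), round(b["energy_per_node"], 3))
```

In the first case the storage per node shrinks as N grows while the energy per node stays constant, which is one way to see why the energy condition is the stricter of the two.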

II - Fixed Energy Budget Results N : # of nodes e: per-node energy budget S – successful operation region 17

III - Network Lifetime Scaling Results Network Lifetime as a function of Network Size 18

Summary • Only certain classes of applications can be sustained in arbitrarily large sensor networks. • Specifically, if m q^{1/2} = O(N^{1/4}) for unstructured networks, and m q^{2/3} = O(N^{1/2}) for structured networks: a. The network can operate with bounded energy and storage per node. b. The network lifetime does not decrease with network size for a given energy budget. • These results generalize in a straightforward manner to 1D and 3D deployments; 3D deployments are inherently more scalable. • The results can be reinterpreted to understand how to tier sensor networks into zones with localized queries. 19

Comparison of Push-Pull Schemes for Querying Shyam Kapadia and Bhaskar Krishnamachari, "Comparative Analysis of Push-Pull Query Strategies for Wireless Sensor Networks," DCOSS, 2006. 20

Overview • Two Hybrid Push-Pull Schemes: – Geographic Hash Tables/Data Centric Storage [1] – Comb-Needles [2] [1] S. Shenker et al., Data-centric storage in sensornets, ACM CCR, Jan 2003. [2] X. Liu et al., Combs, needles, haystacks: balancing push and pull for discovery in large-scale sensor networks, ACM SenSys '04. 21

Data Centric Storage (DCS) [Figure legend: sink/querier; source/event node; hashed location where events are stored] 22
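
The idea of a hashed storage location can be sketched as follows. This is a minimal illustration, not the GHT protocol itself: it assumes a square grid in which every cell is a node, uses Manhattan distance as a stand-in for the number of unicast transmissions, and the function names, grid size, and event name are all hypothetical.

```python
import hashlib


def hashed_location(event_type, side):
    """Map an event type to a cell (x, y) on a side x side grid (assumed layout)."""
    digest = hashlib.sha256(event_type.encode("utf-8")).digest()
    x = int.from_bytes(digest[:4], "big") % side
    y = int.from_bytes(digest[4:8], "big") % side
    return x, y


def hops(a, b):
    """Manhattan distance, used here as a proxy for unicast hop count."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


side = 10            # 10 x 10 grid, so N = 100 nodes
sink = (0, 0)        # sink at the bottom-left corner
source = (7, 3)      # hypothetical node that detects the event

# Every node computes the same home location from the event type alone.
home = hashed_location("temperature", side)

store_cost = hops(source, home)     # push: source sends the event to its home
query_cost = 2 * hops(sink, home)   # pull: query travels out, reply travels back
print(home, store_cost, query_cost)
```

Because the sink and the sources derive the same location from the event type, neither needs to know where the other is.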

Comb Needles (CN) [Figure labels: needles; query path (comb)] [Figure legend: s: sink/querier; source/event node] 23

Model Assumptions • Square Grid of N nodes • Sink located at bottom-left corner • Events (say E) valid for an epoch – Single attribute (event type) – Uniform distribution of events across nodes • Energy measured in number of unicast transmissions • Query probability Q • Aggregation – One packet summary of all events • No modeling of collisions and contention 24

ALL-Type Query: DCS vs CN (Without Summaries) 25

ALL-Type Query: DCS vs CN (With Summaries) Θ ~ 39.78 26

ANY-Type Query: DCS vs SCN Θ_lower ~ 1.56 Θ_upper ~ 3.16 27

Random Walk Queries For Heterogeneous Networks Marco Zuniga, Chen Avin, and Bhaskar Krishnamachari, "Using Heterogeneity to Enhance Random Walk-based Queries," USC Computer Engineering Technical Report CENG-20068, August 2006. 28

Simple Enhancement for Heterogeneous Networks • Push event greedily to high degree nodes (local maximum) • Querier issues simple random walk 30
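
The two steps above can be sketched as follows. This is a minimal illustration under stated assumptions: the graph is a connected adjacency dictionary, ties in degree stop the push, and the function names and the example topology are hypothetical rather than taken from the technical report.

```python
import random


def push_to_local_max(adj, source):
    """Greedily forward the event to the highest-degree neighbor until no
    neighbor has a strictly higher degree (a local maximum)."""
    node = source
    while True:
        best = max(adj[node], key=lambda n: len(adj[n]))
        if len(adj[best]) <= len(adj[node]):
            return node
        node = best


def random_walk_query(adj, querier, holder, rng, max_steps=10**6):
    """Simple random walk from the querier; returns steps until the holder is hit."""
    node, steps = querier, 0
    while node != holder and steps < max_steps:
        node = rng.choice(adj[node])
        steps += 1
    return steps


# Hypothetical topology: hub 0 with leaves 1..5, plus a tail 5 - 6.
adj = {
    0: [1, 2, 3, 4, 5],
    1: [0], 2: [0], 3: [0], 4: [0],
    5: [0, 6],
    6: [5],
}

holder = push_to_local_max(adj, source=6)   # event climbs 6 -> 5 -> 0
rng = random.Random(0)
cost_with_push = random_walk_query(adj, querier=1, holder=holder, rng=rng)
cost_without_push = random_walk_query(adj, querier=1, holder=6, rng=rng)
print(holder, cost_with_push, cost_without_push)
```

The intuition is that a random walk visits high-degree nodes more often, so an event stored there is found sooner than one left at a low-degree source.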

Simulation Results A small fraction of high-degree cluster-heads (<10%) can provide a query cost improvement between 30% and 90%. 31

Analysis on Linear Topology [Figure: linear topology annotated with k, k and d] 32

Resistance Method • Hitting time (h_uv): expected time taken by a random walk starting at u to reach v. • Commute time (C_uv): expected time taken by a random walk starting at u to reach v and come back to u. • C_uv = h_uv + h_vu; in general h_uv ≠ h_vu, but in case of symmetry h_uv = h_vu. • With 1 ohm resistors on every edge: C_uv = 2 m R_uv • m: number of edges • R_uv: effective resistance between u and v • Chandra et al., 1989, The electrical resistance of a graph captures its commute and cover times, ACM STOC 33
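
The identity C_uv = 2 m R_uv can be checked exactly on a small graph. The sketch below is an illustration, not the analysis used in the talk: it solves the hitting-time equations and the unit-resistor circuit equations with exact fractions, and the helper names and the example graph (a triangle with a one-edge tail) are hypothetical.

```python
from fractions import Fraction


def solve(A, b):
    """Gauss-Jordan elimination over Fractions; A must be square and nonsingular."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        M[col] = [x / M[col][col] for x in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0:
                f = M[r][col]
                M[r] = [x - f * y for x, y in zip(M[r], M[col])]
    return [M[i][n] for i in range(n)]


def hitting_time(adj, u, v):
    """h_uv: solve h(x) = 1 + average of h over neighbors of x, with h(v) = 0."""
    nodes = [x for x in adj if x != v]
    idx = {x: i for i, x in enumerate(nodes)}
    A = [[Fraction(0)] * len(nodes) for _ in nodes]
    b = [Fraction(1)] * len(nodes)
    for x in nodes:
        A[idx[x]][idx[x]] += 1
        for y in adj[x]:
            if y != v:
                A[idx[x]][idx[y]] -= Fraction(1, len(adj[x]))
    return solve(A, b)[idx[u]]


def effective_resistance(adj, u, v):
    """R_uv with 1 ohm per edge: ground v, inject 1 A at u, read the voltage at u."""
    nodes = [x for x in adj if x != v]
    idx = {x: i for i, x in enumerate(nodes)}
    A = [[Fraction(0)] * len(nodes) for _ in nodes]
    b = [Fraction(1 if x == u else 0) for x in nodes]
    for x in nodes:
        A[idx[x]][idx[x]] += len(adj[x])
        for y in adj[x]:
            if y != v:
                A[idx[x]][idx[y]] -= 1
    return solve(A, b)[idx[u]]


# Hypothetical example: triangle 0-1-2 with a tail 2-3.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
m = sum(len(nbrs) for nbrs in adj.values()) // 2   # number of edges

h_uv = hitting_time(adj, 0, 3)
h_vu = hitting_time(adj, 3, 0)
r_uv = effective_resistance(adj, 0, 3)
commute = h_uv + h_vu
print(h_uv, h_vu, commute, 2 * m * r_uv)
```

On this graph the two hitting times differ, since the walk leaves the tail more easily than it enters it, yet their sum still equals 2 m R_uv.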

3 Regions • Region 1: 2k <= d • Region 2: k < d < 2k • Region 3: d <= k • [Figure: r(k) across the three regions, annotated with k and d] 34

Region 1 [2k <= d] [Figure: annotated with k, k and d] 35

Region 2 [k < d < 2k] • α = 2k - d • [Figure labels: d-k, r(d-k), 1/2; r(d-k) < 1/2 < r(d-k)] 36

Region 3 [d <= k] = d 37

Expected Hitting Time 38

Result The first local minimum for the query cost is obtained when the fraction of high-degree nodes is 4/5 k, where cost is reduced by a factor of Θ(k^2) 39

Enhancing Random Walks Using Power of Choice Chen Avin and Bhaskar Krishnamachari, "The Power of Choice in Random Walks: An Empirical Study," 9th ACM/IEEE International Symposium on Modeling, Analysis and Simulation of Wireless and Mobile Systems (MSWiM), Malaga, Spain, October 2006. (Best Paper Award) 40

Cover Time Visit Load 41
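
A random walk with choice can be sketched as below, measuring the two quantities named above. The slides do not spell out the rule, so this assumes the usual description: at each step the walk samples d neighbors uniformly at random (with replacement here, an assumption) and moves to the one visited least often, with d = 1 giving the simple random walk. The function names and the torus topology are hypothetical.

```python
import random


def torus(k):
    """k x k grid with wrap-around edges; every node has 4 neighbors."""
    return {
        (x, y): [((x + 1) % k, y), ((x - 1) % k, y),
                 (x, (y + 1) % k), (x, (y - 1) % k)]
        for x in range(k) for y in range(k)
    }


def cover_walk(adj, start, d, rng, max_steps=10**6):
    """Walk until every node is visited.
    Returns (cover time in steps, maximum visit load on any node)."""
    visits = {n: 0 for n in adj}
    node = start
    visits[node] = 1
    unvisited = len(adj) - 1
    steps = 0
    while unvisited and steps < max_steps:
        # Sample d candidate neighbors and step to the least-visited one.
        candidates = [rng.choice(adj[node]) for _ in range(d)]
        node = min(candidates, key=lambda n: visits[n])
        if visits[node] == 0:
            unvisited -= 1
        visits[node] += 1
        steps += 1
    return steps, max(visits.values())


adj = torus(5)
simple = cover_walk(adj, (0, 0), d=1, rng=random.Random(1))
choice = cover_walk(adj, (0, 0), d=2, rng=random.Random(1))
print("d=1:", simple, "d=2:", choice)
```

A single run is noisy, so a fair comparison of cover time and visit load would average over many seeds and start nodes.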

Thanks 42