Growth Codes: Maximizing Sensor Network Data Persistence
Abhinav Kamra, Vishal Misra, Dan Rubenstein (Department of Computer Science, Columbia University)
Jon Feldman (Google Labs)
DNA Research Group
Outline
- Problem Description
- Solution Approach: Growth Codes
- Experiments and Simulations
- Conclusions and Ongoing Work
ACM Sigcomm 2006
Background: A generic sensor network
- Sensor nodes hold sensed data (x1 ... x13 in the figure) and forward it to the sink(s)
- Data follows a multi-hop path to the sink(s)
- A few node failures can break the data flow
- Generic aim: collect data from all nodes at the sink(s)
Specific Context: Disaster Scenarios
- e.g., monitoring earthquakes, fires, floods, war zones
- Problems in this setting: congestion near the sink(s)
  - All nodes simultaneously forward data
  - This overwhelms the capacity of the sink(s)
[Figure: virtual queue; congestion near the sink]
Specific Context: Disaster Scenarios (2)
- Problems in this setting: network collapsing, with nodes failing rapidly
  - Pre-computed routes may fail
  - Data from failed nodes can be lost
  - Data recovery from a subset of nodes is acceptable
Challenges
- Networking challenges:
  - Frequent disruptions to the routing tree, if one is set up
  - Difficult to predict node failures: sink locations unknown, surviving routes unknown
  - Difficult to synchronize nodes' clocks
- Coding challenges:
  - Data source is distributed (among all sensor nodes)
  - Prior approaches (Turbo codes, LDPC codes) aim at fast complete recovery
  - Sensor nodes have very limited memory, CPU, and bandwidth
- Disaster scenarios: feedback often infeasible
Data Objectives
- Persistence: fraction of data that eventually reaches the sink(s)
- Preserve data from failed sensor nodes
- Deliver data to the sink(s) as fast as possible
- Example (figure): 6 of 10 symbols reach the sink, so persistence = 60%
- Goal: maximize data persistence
Limitations of Previous Work
- Channel coding based (e.g., Turbo Codes [Anderson-ISIT 94], LT Codes [Luby 02])
  - Aim for complete recovery in minimum time
  - Difficult to implement with distributed sources
- Routing based (e.g., Directed Diffusion [Govindan 00], Cougar [Yao-SIGMOD 02])
  - Conjecture: too fragile (easily disrupted) for disaster scenarios
Our Approach
- Two main ideas
  - Randomized routing and replication
    - Avoid actively maintaining routes
    - Replicate data to increase data survival
  - Distributed channel codes (Growth Codes)
    - Expedite data delivery and survivability
    - First (to our knowledge) distributed channel codes
Outline
- Problem Description
- Our Solution: Growth Codes
- Experiments and Simulations
- Conclusions and Ongoing Work
Network Assumptions
- N-node sensor network
- Limited storage: each node stores a small number of data units
- Large storage at sink(s): a sink receives codewords from random node(s)
- All sensed data assumed independent (no source coding)
[Figure: example network with nodes 1 to 7 and sinks S]
High Level View of the Protocol
- Nodes send data at random times (current implementation: exponentially distributed timers)
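Note (illustrative sketch, not from the slides): exponentially distributed timers can be simulated as below. The function names, the `mean_interval` parameter, and the seeding scheme are assumptions made for this example; the slides do not state the timer rate used on the motes.

```python
import random


def next_send_delay(mean_interval: float, rng: random.Random) -> float:
    """Draw the waiting time before a node's next transmission."""
    # random.expovariate expects the rate (1 / mean), not the mean itself.
    return rng.expovariate(1.0 / mean_interval)


def send_times(mean_interval: float, horizon: float, seed: int = 0) -> list:
    """Return the transmission instants of one node within [0, horizon)."""
    rng = random.Random(seed)  # hypothetical per-node seed
    times = []
    t = next_send_delay(mean_interval, rng)
    while t < horizon:
        times.append(t)
        t += next_send_delay(mean_interval, rng)
    return times
```

Because each node draws its delays independently, no coordination between nodes is needed to decide who sends when.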
High Level View of the Protocol (2)
- Degree 1 codewords: the symbols themselves
- After time K1, nodes start sending degree 2 codewords
- Degree 2 codeword: the sender picks a random symbol and XORs it with its own symbol
- Even if node 3 fails, node 3's data survives
[Figure: nodes 1 to 4 exchanging codewords; timeline marked 0, K1, K2, K3]
High Level View of the Protocol (3)
- After time K1, nodes start sending degree 2 codewords
- After time K2, nodes start sending degree 3 codewords
- ...
- After time Ki, nodes start sending degree i+1 codewords
- What are good values for {Ki}? Please refer to our paper
- Note: no need to tightly synchronize clocks (times Ki can be out of sync at different nodes)
[Figure: timeline marked 0, K1, K2, K3]
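Note (illustrative sketch, not from the slides): the degree schedule and the XOR step can be written as below. This is a simplification that assumes a node keeps raw symbols as integers in a dict; the names `degree_at`, `make_codeword`, `switch_times`, and `known` are hypothetical, and the actual switch times {Ki} are given in the paper, not here.

```python
import random
from bisect import bisect_right


def degree_at(t: float, switch_times: list) -> int:
    """Degree to send at local time t, given increasing switch times [K1, K2, ...]."""
    # Before K1 the degree is 1; after Ki it is i + 1.
    return 1 + bisect_right(switch_times, t)


def make_codeword(own_id: int, known: dict, degree: int, rng: random.Random):
    """Build a codeword that always includes the sender's own symbol.

    known maps symbol id -> integer payload and must contain own_id.
    Returns (ids, payload), where payload is the XOR of the chosen symbols.
    """
    others = [i for i in known if i != own_id]
    # A node cannot exceed the number of distinct symbols it stores.
    picked = rng.sample(others, min(degree - 1, len(others)))
    ids = frozenset([own_id, *picked])
    payload = 0
    for i in ids:
        payload ^= known[i]
    return ids, payload
```

Each node evaluates `degree_at` against its own clock, which is consistent with the note that the times Ki need not be synchronized across nodes.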
The Intuition behind Growth Codes
- When very few symbols have been decoded, low degree codewords are easy to decode
[Figure: codewords arriving over time and the set of symbols decoded at the sink]
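Note (illustrative sketch, not from the slides): one way a sink could decode XOR codewords is to repeatedly strip already known symbols from each codeword and recover a symbol whenever exactly one unknown remains. The codeword representation `(ids, payload)` and the function name `decode` are assumptions for this example, not the authors' implementation.

```python
def decode(codewords):
    """Recover symbols from (ids, payload) pairs.

    ids is an iterable of symbol ids; payload is the XOR of those symbols.
    Returns a dict mapping each recovered symbol id to its value.
    """
    decoded = {}
    pending = [(set(ids), payload) for ids, payload in codewords]
    progress = True
    while progress:
        progress = False
        remaining = []
        for ids, payload in pending:
            # Remove every symbol the sink already knows from this codeword.
            for i in [i for i in ids if i in decoded]:
                ids.discard(i)
                payload ^= decoded[i]
            if len(ids) == 1:
                # Exactly one unknown left: the payload is that symbol.
                decoded[ids.pop()] = payload
                progress = True
            elif len(ids) > 1:
                remaining.append((ids, payload))
            # An empty id set means the codeword was redundant; drop it.
        pending = remaining
    return decoded
```

With nothing decoded yet, only degree 1 codewords make progress in this loop, which matches the intuition that low degree codewords are the easy ones early on.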
The Intuition behind Growth Codes (2)
- When a significant number of symbols have been decoded, low degree codewords are often redundant
- Higher degree codewords are more likely to be useful
[Figure: codewords arriving over time and the set of symbols decoded at the sink]
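Note (illustrative sketch, not from the slides): the intuition can be checked with a simple counting model. Assume a codeword of degree d is the XOR of d distinct symbols chosen uniformly at random from N, and call it immediately useful if exactly d-1 of them are among the r symbols already decoded. The function name and this model are assumptions for the example.

```python
from math import comb


def p_immediately_useful(n: int, r: int, d: int) -> float:
    """Probability that a random degree-d codeword yields one new symbol.

    n: total number of symbols, r: symbols already decoded, d: codeword degree.
    """
    if d < 1 or d > n:
        return 0.0
    # Choose d-1 symbols from the decoded set and 1 from the undecoded set.
    return comb(r, d - 1) * (n - r) / comb(n, d)
```

For n = 10 under this model, a degree 1 codeword is always useful when r = 0, but with r = 9 it is useful with probability 0.1 while a degree 2 codeword is useful with probability 0.2.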
Outline
- Problem Description
- Growth Codes
- Simulations and Experiments
- Conclusions and Ongoing Work
Simulations/Experiments: compare data persistence of various approaches
1. Simulations:
   - Centralized setting: compare GC with other channel coding schemes
   - Distributed simulation: assess large-scale performance of coding vs no coding
2. Experiments on motes:
   - Compare time of complete recovery for GC vs routing
   - Measure resilience to node failures
Comparison with various coding schemes (N = 1500)
- Centralized simulation (to compare with other channel coding schemes for which only centralized versions exist)
  - Single source, single sink
  - Source generates random codewords according to the coding scheme (GC, Soliton)
  - Zero failure rate
- No coding is fast in the beginning; the later slowdown is explained by the Coupon Collector's problem
- Soliton/R-Soliton: poor partial recovery (reason: high degree codewords sent too early)
- Growth Codes are closest to the theoretical upper bound (reason: right degree at the right time)
Growth Codes vs No Coding (Varying N)
- Distributed simulation (to assess the performance gain of coding)
  - N sources, single sink
  - Random graph topology (avg degree 10)
  - Sink receives 1 codeword per time unit
- Complete recovery takes:
  - O(N log N) time without coding (Coupon Collector's effect)
  - Linear time with Growth Codes
- Soliton/R-Soliton: cannot compare in a distributed setup
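Note (illustrative sketch, not from the slides): the Coupon Collector's effect can be made concrete under the assumption that, without coding, each reception at the sink is one symbol drawn uniformly at random from the N originals. The function name is hypothetical.

```python
def expected_receptions(n: int, m: int) -> float:
    """Expected receptions until m distinct symbols (out of n) have arrived.

    Assumes every reception is an independent, uniformly random symbol.
    """
    # With k symbols already collected, a new one arrives with
    # probability (n - k) / n, so the expected wait is n / (n - k).
    return sum(n / (n - k) for k in range(m))
```

Under this model, with N = 1500 the first half of the symbols costs fewer than N receptions on average, while collecting all of them costs more than N ln N: the last few missing symbols dominate the total.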
Experiments with (micaz) motes
- Goal: measure data persistence with time
- GC vs TinyOS's "MultiHop" routing protocol
- No routing state at time 0 (scenario where sensor nodes are deployed rapidly)
- "MultiHop" for persistence: takes a long time to complete route setup
- Comparison with the GC simulator validates simulator performance
[Figure: experimental topology with sink S]
Motes experiments: Resilience to node failures
- Nodes generate data every 300 seconds
- 3 nodes fail just after the 3rd data generation
[Figure: experimental topology with sink S, and a timeline marked 0, 300, 600, 900 with the events: "MultiHop" sets up routing; nodes generate data; nodes send data to sink; 3 random nodes fail; "MultiHop" repairs routes]
Motes experiments: Resilience to node failures
- 1st generation: GC faster, MH takes time to set up routes
- 2nd generation: routing already set up, MH very fast
- 3rd generation: MH needs to repair routes
[Figure: timeline marked 0, 300, 600, 900 with the events: "MultiHop" sets up routing; nodes generate data; nodes send data to sink; 3 random nodes fail; "MultiHop" repairs routes]
Other Results: Please refer to our paper
- Good values for K1, K2, ...
- More simulations/experiments
  - Various topologies
  - Other failure scenarios
- Implementation details:
  - Memory usage at sensor nodes: how it affects performance
  - How to handle periodic data generation
  - How to reduce overhead of coefficients
Conclusions
- Data persistence in sensor networks:
  - First distributed channel codes (GC)
  - Protocol requires minimal configuration
  - Is robust to node failures
- Simulations and experiments on micaz motes show (compared to prior coding and routing methods):
  - GC achieves complete recovery faster
  - GC recovers more partial data at any time
Ongoing Work
- Adapt Growth Codes to scenarios where sensor data is correlated
- Take advantage of any available routing information (e.g., before a disaster)
- Estimate network size on the fly to use in Growth Codes
Thanks for your patience!
For more information: DNA Research Lab, Columbia University
http://dna-wsl.cs.columbia.edu/