Robustness
Prabal Dutta
In collaboration with: Jonathan Hui, David Chu, Joe Polastre, David Culler, Anish Arora, Mike Grimmer, and Bob Cuenin
Jan 13, 2005
Robustness – Is it an application, a service, … or a way of thinking?
• full of health and strength
• suited to endurance
• rough or crude; boisterous
• full-bodied
Robustness – It’s all of these things and more
ro·bust (rō-bŭst) adj.
• is said of a system that has demonstrated an ability to recover gracefully from the whole range of exceptional inputs and situations in a given environment;
• one step below bulletproof;
• carrying the additional connotation of elegance in addition to just careful attention to detail
• compare with smart; opposite is brittle
Have we been building smart dust or brittle dust?
Motivation
• Designing sensor networks for military apps
  – Harsh environments
  – Little or no ongoing physical access
  – Large scale
• Some “common” wisdom
  – Every touch breaks
  – Human-in-the-loop operations are not scalable
    • Calibration
    • Manual reprogramming
  – The real world is a hostile place
    • Weather
    • Terrain
    • Animals
• Of course, there is much more…
  – Self-Stabilization (important literature)
  – “Six-Sigma Approach to Robust Design”
Motivation – DARPA NEST Extreme Scale Project
• Multi-University effort led by Ohio State
• Goal: Detection, Classification, and Tracking of Civilians, Soldiers, and Vehicles
• Size: 10,000 nodes (objective); ~1500 (deployed)
Physical Robustness
• Considerations
  – Weather-proofing
    • Water (battery shorts)
    • Solar (over-heated electronics)
    • Wind (false detections)
    • Shock (batteries)
  – Reducing Human-in-the-loop
    • One-touch
    • One-glance
    • One-listen
  – Modular
    • Antennas
    • Batteries
    • Stackable
  – Self-Correcting
  – Tamper Proof (FIPS-140)
Physical Robustness: eXtreme Scale Mote (XSM)
Deluge (Jonathan Hui)
• Reliably disseminate large objects (i.e. size >> RAM) over a multi-hop sensor network from few to many nodes.
• Epidemic propagation
  – Continuous propagation effort by advertising
  – Reaches nodes with intermittent connectivity (cf. GDI)
  – Will always find a path if it exists: Very ROBUST
• Isolated bootloader
  – Trusted code guaranteed to execute on reset
• Golden Image
  – Trusted TinyOS app write-protected in external flash
• Rollback gesture
  – Reset node to Golden Image by resetting the node several times
• Data Integrity
  – PC-generated CRCs on program images and data structures
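The rollback gesture and the CRC check can be pictured as a small decision made by the trusted bootloader on every reset. The following is a minimal sketch under stated assumptions, not Deluge's actual bootloader: the threshold `ROLLBACK_RESETS`, the function names, the use of CRC-32 from Python's `zlib`, and the choice to fall back to the Golden Image on a CRC mismatch are all hypothetical and chosen only for illustration.

```python
import zlib

# Hypothetical threshold: the slide only says "several" resets.
ROLLBACK_RESETS = 3


def image_is_intact(image: bytes, expected_crc: int) -> bool:
    """Compare the image's CRC against the value generated on the PC."""
    return zlib.crc32(image) == expected_crc


def select_image(consecutive_resets: int, image: bytes, expected_crc: int) -> str:
    """Decide which image the bootloader should run after a reset."""
    # Rollback gesture: repeated resets force the trusted Golden Image.
    if consecutive_resets >= ROLLBACK_RESETS:
        return "golden"
    # Assumption for this sketch: a corrupted image also falls back.
    if not image_is_intact(image, expected_crc):
        return "golden"
    return "new"


new_image = b"application image bytes"
pc_crc = zlib.crc32(new_image)  # stands in for the PC-generated CRC
choice = select_image(consecutive_resets=0, image=new_image, expected_crc=pc_crc)
```

Keeping this decision in isolated, write-protected code is what lets a node recover even when the disseminated program itself misbehaves.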
Robust Wireless Multi-hop Network Reprogramming
• Wireless multi-hop programming is extremely useful…
• But what happens if the program image is bad?
• Manually reprogramming 10,000 nodes is impossible!
• Current approaches provide robust dissemination but no mechanism for recovering from Byzantine programs
Recoverability through the Grenade Timer
• No hardware protection
• Basic idea presented by Stajano and Anderson
• Once started
  – You can’t turn it off
  – You can only speed it up
• Our implementation: (figure not reproduced)
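Separately from the authors' implementation, which is not reproduced in this transcript, the two rules above (cannot be turned off, can only be sped up) can be sketched in software. This is a toy model, assuming a tick-based countdown; the class name `GrenadeTimer` and its methods are hypothetical and do not describe the XSM circuit.

```python
class GrenadeTimer:
    """Toy grenade timer: once armed it cannot be stopped, only shortened."""

    def __init__(self):
        self.remaining = None  # None means the timer has not been armed

    def arm(self, ticks: int) -> None:
        """Arm the timer, or shorten it if it is already running."""
        if ticks <= 0:
            raise ValueError("ticks must be positive")
        # A request for a longer timeout is ignored: the fuse never grows.
        if self.remaining is None or ticks < self.remaining:
            self.remaining = ticks

    def tick(self) -> bool:
        """Advance one tick; return True when the timer fires (forced reset)."""
        if self.remaining is None:
            return False
        self.remaining -= 1
        if self.remaining == 0:
            self.remaining = None  # after the reset, trusted code may re-arm
            return True
        return False

    # Deliberately no disarm() method: untrusted code cannot cancel the reset.


timer = GrenadeTimer()
timer.arm(5)
timer.arm(10)  # ignored, would lengthen the fuse
timer.arm(2)   # accepted, shortens the fuse
fired = [timer.tick() for _ in range(3)]
```

Because the interface offers no way to lengthen or cancel the countdown, even a Byzantine program eventually hands control back to trusted code.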
ROSEBUDS
• Work with David Chu and Jonathan Hui
• ROSEBUDS:
  – Recovery-Oriented (Network is recoverable)
  – Security-Enabled (Program Dissemination is Secure)
  – Broadcast Using a Dissemination Service (Deluge)
• Security Goals
  – Nodes only accept signed objects
  – Compromised node cannot be used to violate SG 1
  – Incremental authentication (no buffering needed)
  – Delay-tolerant (no time synchronization)
ROSEBUDS
• Implementation
  – Components: Nodes, (Owner’s) Server, Factory
  – Factory assigns node id (IEEE OUI + serial #)
  – Node generates ECC keys, gives pub key to Server
  – Factory signs [id, ECC pub key] at mfg time
  – Node preloaded w/ id, cert, Server RSA pub key
  – Server queries network for object version
  – Creates new package with version + 1
  – Performs Object Transmission
• Security Overhead: ~14% more octets, larger packets
[Figure: packet chain. Packet 0: Head, Hash, Sign. Packets 1 to n-1: Head, Data, Hash. Packet n: Head, Data, Nonce.]
• Crypto Suite
  – SHA-1 for hash (upper 64 bits): ~13 ms/hash
  – RSA-1024 for signatures: ~1.5 s/check
  – ECCDH for node pair-wise key exchange: ~1-2 min/key exchange
• Status: prototype implementation of security but not yet integrated with dissemination service
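The packet layout on this slide (a signed first packet carrying a hash, followed by data packets that each carry a hash) suggests a hash chain, which is one way to get incremental authentication without buffering. The sketch below illustrates that general technique under stated assumptions and is not the ROSEBUDS code: `build_chain` and `verify_stream` are hypothetical names, the header and nonce fields are left out, and the RSA signature check on packet 0 is omitted, so the root digest is simply assumed to have been authenticated already.

```python
import hashlib

DIGEST_LEN = 8  # upper 64 bits of SHA-1, as listed in the crypto suite


def digest(data: bytes) -> bytes:
    """Truncated SHA-1 digest."""
    return hashlib.sha1(data).digest()[:DIGEST_LEN]


def build_chain(chunks):
    """Return (root_digest, packets); packet i ends with the digest of packet i+1."""
    packets = []
    next_digest = b""  # the last packet carries no trailing digest
    for chunk in reversed(chunks):
        packet = chunk + next_digest
        packets.append(packet)
        next_digest = digest(packet)
    packets.reverse()
    # next_digest now covers packet 0 of the data; the sender would sign it.
    return next_digest, packets


def verify_stream(root_digest, packets):
    """Yield each chunk as soon as its packet authenticates; raise on tampering."""
    expected = root_digest
    for i, packet in enumerate(packets):
        if digest(packet) != expected:
            raise ValueError(f"packet {i} failed authentication")
        if i == len(packets) - 1:
            yield packet
        else:
            yield packet[:-DIGEST_LEN]
            expected = packet[-DIGEST_LEN:]


chunks = [b"page-0", b"page-1", b"page-2"]
root, packets = build_chain(chunks)
received = list(verify_stream(root, packets))
```

Each packet is checked against a single stored digest before the next one arrives, so a node needs to hold only one expensive signature result and one 8-byte value at a time.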
Future Work - Trio
• Build on
  – XSM (sensors, signal cond)
  – Telos (MCU, radio, flash)
• Remove
  – AA alkaline batteries
  – XSM Antenna
• Add
  – Telos-XSM Interface board
    • GPIO, ADC, I2C
    • Prometheus (solar cell and recharging circuit)
  – Solar Cell
  – Integrated Antenna (PIFA)
  – Push buttons
  – Lithium batteries
  – Super capacitor
  – Humidity sensor
  – Acoustic wakeup
• Expose
  – Pushbuttons, LEDs
• In Support of
  – “Permanent” deployments
  – Weather resistance
  – Exceptional event detection
[Figure labels: 802.15.4 Radio, USB Port, Telos, Interface Board, XSM, Lithium Battery, Super Capacitor]
Trio comes from:
- size 3” x 3”
- 3 PCBs
- cube-shaped
Challenge - Robust Signal Processing
• Are traditional techniques appropriate?
  – Signal variability
  – Limited computational resources
  – Non-optimal Space-Time-Message complexity
  – Non-Gaussian noise (frequently)
  – Frequency-domain analysis
  – Wavelets
  – Bayesian frameworks
  – Particle filtering
The real world is its own best model. - Rodney A. Brooks
Challenge - Robust Parameter Calibration
• Evolution of local techniques
  – Hard-coded constants (*.h files)
  – Service-specific (routing)
  – Config service (//!! …)
  – SNMS
• Challenge is robust, distributed, cross-layer calibration and tuning
  – Example: shower with two knobs
[Figure labels: Service A, Service B, Service C]
Conclusions
• Recall: ro·bust (rō-bŭst) adj.
  – is said of a system that has demonstrated an ability to recover gracefully from the whole range of exceptional inputs and situations in a given environment;
• Robustness is a global attribute (weakest link problem)
• Robustness opportunities: sensors, modules, packaging, signal processing, network algorithms, middleware services, design methodologies, and a way of thinking
• Robustness is about grace under duress
Discussion
Conclusions and Future Work
• Improve (or obviate) sensor wakeup circuits
  – Lower false-alarm rate
  – Low-power (zero-power?) wakeup
• Reduce sensing power (op amp FET ASIC)
• Decrease signal processing power consumption
  – Consider space, time, message (and energy) complexity
Sensor Suite
• Passive infrared
  – Long range (15 m)
  – Low power (10s of microwatts)
  – Wide FOV (360 degrees with 4 sensors)
  – Gain: 80 dB
  – Wakeup
• Microphone
  – LPF: fc = 100 Hz – 10 kHz
  – HPF: fc = 20 Hz – 4.7 kHz
  – Gain: 40 dB – 80 dB (100-8300)
  – Wakeup
• Magnetometer
  – High power, long startup latency
  – Gain: 86 dB (20,000)
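The gain figures above pair decibel values with linear ratios. As a quick cross-check, assuming these are voltage (amplitude) gains so that dB = 20·log10(ratio), the conversion can be sketched as follows; the function names are hypothetical.

```python
import math


def db_to_ratio(db: float) -> float:
    """Convert a voltage gain in dB to a linear ratio (assumes 20*log10)."""
    return 10 ** (db / 20)


def ratio_to_db(ratio: float) -> float:
    """Convert a linear voltage ratio to dB (assumes 20*log10)."""
    return 20 * math.log10(ratio)


mic_low = db_to_ratio(40)      # lower end of the microphone gain range
magnetometer = db_to_ratio(86)  # magnetometer gain
mic_high_db = ratio_to_db(8300)  # upper microphone ratio, expressed in dB
```

Under this assumption 40 dB corresponds to a ratio of 100 and 86 dB to just under 20,000, matching the slide, while a ratio of 8300 works out to a little over 78 dB, so the "80 dB" upper bound for the microphone appears to be rounded.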
The eXtreme Scale Mote
• Key Differences between XSM and MICA2
  – Low-power Sensors
  – Grenade Timer
  – Radio Performance
Hardware Evolution
• Telos = Low-power CPU + 802.15.4 Radio + Easy to use (Sleep-Wakeup-Active)
• MICAz = MICA2 - CC1000 + 802.15.4 Radio (Sleep-Wakeup-Active)
• XSM = MICA2 + Improved RF + Low-power sensing + Recoverability (Passive Vigilance-Wakeup-Active)
• XSM2 = XSM + Improvements + Bug Fixes
Genesis: The Case for a New Platform
• Cost
  – Eliminate expensive parts from BOM
  – Eliminate unnecessary parts from BOM
  – Optimize for large quantity manufacturing and use
• Network Scale by 100x (10,000 nodes)
  – Reliability: How to deal with 10K nodes with bad image
• Detection range by 6x (10 m)
  – New sensors to satisfy range/density/cost tradeoff
• Lifetime 8x (720 hrs to 1000 hrs)
  – Magnetometer: Tstartup = 40 ms, Pss = 18 mW
  – UWB Radar: Tstartup = 30 s, Pss = 45 mW
  – Optimistic lifetime: 6000 mWh / 63 mW < 100 hrs
  – Must lower power
• Radio
  – Fix anisotropic radiation and impedance mismatch
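The lifetime bullet is a short energy-budget calculation, worked through below. This is a sketch that assumes the 63 mW figure is the sum of the magnetometer and radar steady-state powers, that the full 6000 mWh is usable, and that nothing else draws power; the variable names are hypothetical, and the 1000-hour target is taken from the lifetime figure above.

```python
BATTERY_MWH = 6000.0   # energy budget from the slide
MAGNETOMETER_MW = 18.0  # steady-state power, magnetometer
RADAR_MW = 45.0         # steady-state power, UWB radar
TARGET_HOURS = 1000.0   # lifetime goal


def lifetime_hours(capacity_mwh: float, load_mw: float) -> float:
    """Lifetime of an ideal battery under a constant load."""
    return capacity_mwh / load_mw


# Both sensors always on: 18 mW + 45 mW = 63 mW.
always_on_mw = MAGNETOMETER_MW + RADAR_MW
always_on_hours = lifetime_hours(BATTERY_MWH, always_on_mw)

# Average power the whole node may draw to reach the lifetime goal.
power_budget_mw = BATTERY_MWH / TARGET_HOURS
```

With both sensors always on, the battery lasts roughly 95 hours, which is the slide's "< 100 hrs". Reaching 1000 hours on the same battery would allow an average draw of only 6 mW, about a tenth of the always-on figure, which is the case for duty cycling and low-power wakeup sensors.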
ExScal
[Figure labels: Magnetic Target Detected; Radar Target Detected]
Requirements (of the hardware platform)
• Functional
  – Detection, Classification (and Tracking) of: Civilians, Soldiers and Vehicles
• Reliability
  – Recoverable: Even from a Byzantine program image
• Performance
  – Intrusion Rate: 10 intrusions per day
  – Lifetime: 1000 hrs of continuous operation (> 30 days)
  – Latency: 10 – 30 seconds
  – Coverage: 10 km^2 (could not meet given constraints)
• Supportability
  – Adaptive: Dynamic reconfiguration of thresholds, etc.