An Event-Based Data Fusion Algorithm for Smart Cities

Avinash Kalyanaraman, Kamin Whitehouse

Sensors, sensors everywhere!
Source: Cisco

Static and personal sensing systems need to be fused to capture the complete story

Motivating Examples

Example-1: Energy Footprint
(Time: 12:30:03 PM, Appliance: Living Room Lamp, Power: 150 W)
(Time: 12:30:14 PM, Appliance: Living Room Lamp, User: Bob)
(Time: 12:30:08 PM, Appliance: Living Room Lamp, User: Bob, Power: 150 W)

Example-2: Automatic dietary monitoring
(Time: 12:30:03 PM, Activity: Cutting, Entity: Apple)
(Time: 12:30:08 PM, Activity: Cutting, User: Bob, Entity: Apple)
(Time: 12:30:14 PM, Activity: Cutting, User: Bob)

Fusion Properties
1. The two systems observe different attributes of the same event (e.g. power vs. identity for the same Light ON event)
2. The two systems are not time-synchronized
3. One system has a notion of identity, with per-identity event ordering

Fusion Properties (contd)
4. The other system has a global event ordering (e.g. NILM)
5. False positives and false negatives occur
6. Only similar events of the two systems can be matched (e.g. a NILM microwave event must match a gesture microwave event, not a fridge event)
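Property 6 can be read as a compatibility check applied before any cost is computed. The sketch below is only an illustration: the Event fields, the can_match name and the max_skew bound are assumptions introduced here, not names from the slides.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Event:
    """Hypothetical event record; field names are illustrative."""
    time: float                 # seconds, in the reporting system's own clock
    kind: str                   # e.g. "microwave", "fridge"
    user: Optional[str] = None  # set only by the identity-aware system


def can_match(a: Event, b: Event, max_skew: float = 10.0) -> bool:
    """Allow a match only between similar events that are close in time.

    max_skew is an assumed bound on the clock offset between the systems.
    """
    return a.kind == b.kind and abs(a.time - b.time) <= max_skew
```

With this check, a NILM microwave event is never paired with a gesture fridge event, however close their timestamps are.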

Different from traditional Bipartite Matching Algorithms
Traditional bipartite matching simply maximizes matches without any notion of identity, which results in crossing matches

Algorithm
i) Naive MHT
ii) Naive MHT with optimal pruning
iii) The Divide and Conquer approach


Iteration 1: Naive MHT
A hypothesis is a set of associations between events of the two systems.
Cost of an association = timestamp difference, e.g. |d - a|, |e - a|.
Cost of a hypothesis = total cost of all its associations.
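The two cost definitions can be written down directly. This is a minimal sketch: the dictionary representation of a hypothesis and the timestamps are assumptions made for illustration, with None standing for an unmatched event.

```python
def association_cost(t_first, t_second):
    """Cost of one association: absolute timestamp difference."""
    return abs(t_first - t_second)


def hypothesis_cost(hypothesis, times):
    """Total cost of all associations in a hypothesis.

    hypothesis maps an event of one system to an event of the other,
    or to None when the event is left unmatched (no cost is added).
    """
    return sum(
        association_cost(times[p], times[q])
        for p, q in hypothesis.items()
        if q is not None
    )


# Hypothetical timestamps (seconds) for events a, d and e.
times = {"a": 3, "d": 5, "e": 11}
```

Under these made-up timestamps, associating a with d is cheaper than associating a with e, since |d - a| is smaller than |e - a|.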

Iteration 1: Naive MHT (contd)
Finally, choose the hypothesis with the maximum number of associations at minimum cost.

Iteration 1: Naive MHT (contd)
Problem: exponential explosion in the number of hypotheses.
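To make the explosion concrete, the sketch below enumerates every hypothesis by brute force and then applies the selection rule (most associations first, lowest cost second). It is an illustration only: it ignores identity and ordering constraints, and the function names and timestamps are assumptions.

```python
from itertools import product


def enumerate_hypotheses(first, second):
    """Yield every assignment of events in `first` to distinct events in
    `second`, or to None (unmatched)."""
    options = list(second) + [None]
    for choice in product(options, repeat=len(first)):
        used = [c for c in choice if c is not None]
        if len(used) == len(set(used)):  # each event matched at most once
            yield dict(zip(first, choice))


def best_hypothesis(t_first, t_second):
    """Pick the hypothesis with the most associations, then lowest cost."""
    def key(h):
        pairs = [(p, q) for p, q in h.items() if q is not None]
        cost = sum(abs(t_first[p] - t_second[q]) for p, q in pairs)
        return (-len(pairs), cost)
    return min(enumerate_hypotheses(list(t_first), list(t_second)), key=key)


# Hypothetical timestamps (seconds).
t_first = {"a": 3, "b": 10, "c": 20}
t_second = {"d": 5, "e": 11, "f": 22}
best = best_hypothesis(t_first, t_second)
```

Even with three events on each side, the enumeration already produces 34 hypotheses, and the count grows combinatorially with more events.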

Iteration 2: Naive MHT with optimal pruning
State of a hypothesis
Intuition: two hypotheses ending in the same state will behave identically moving forward.
Optimal pruning condition: “If two hypotheses end in the same state, keep the better one.”
Maintains O(|P1| * |P2| * … * |Pn|) hypotheses, where |Pi| = number of events of the i-th identity in SS2
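The pruning rule can be sketched as a dynamic program. The slides do not spell out what the state is, so this sketch assumes one plausible reading: the state is a tuple holding, for each identity, how many of that identity's events have already been passed. All names and timestamps here are hypothetical.

```python
def fuse(global_times, per_user_times):
    """Match globally ordered events to per-identity ordered events.

    global_times: sorted timestamps from the system with global ordering.
    per_user_times: {user: sorted timestamps} from the identity-aware system.
    Returns (number_of_matches, total_cost, assignment per global event).
    """
    users = sorted(per_user_times)
    # state (assumed): per-user count of events already passed
    frontier = {tuple(0 for _ in users): (0, 0.0, [])}
    for t in global_times:
        nxt = {}
        for state, (neg, cost, assign) in frontier.items():
            # Option 1: leave this event unmatched (FP here or FN there).
            candidates = [(state, (neg, cost, assign + [None]))]
            # Option 2: match it to a not-yet-passed event of some user.
            for i, u in enumerate(users):
                for j in range(state[i], len(per_user_times[u])):
                    new_state = state[:i] + (j + 1,) + state[i + 1:]
                    step = abs(t - per_user_times[u][j])
                    candidates.append(
                        (new_state, (neg - 1, cost + step, assign + [(u, j)])))
            # Optimal pruning: per state, keep only the better hypothesis.
            for s, score in candidates:
                if s not in nxt or score[:2] < nxt[s][:2]:
                    nxt[s] = score
        frontier = nxt
    neg, cost, assign = min(frontier.values(), key=lambda s: s[:2])
    return -neg, cost, assign


result = fuse([3, 10, 20], {"Bob": [5, 22], "Alice": [11]})
```

Because at most one hypothesis survives per state, the frontier never holds more than (|P1| + 1) * … * (|Pn| + 1) entries under this assumed state definition, in line with the bound above. Advancing a user's counter past a matched event also keeps matches for that identity from crossing.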

Iteration 3: Divide and Conquer Approach
Hypotheses { [(a→d), (b→e), (c→ɸ)], [(a→d), (b→f), (c→ɸ)], [(a→e), (b→d), (c→ɸ)], … } are worse than [(a→d), (b→e), (c→f)].
Any hypothesis forked from the above set will be worse than those forked from [(a→d), (b→e), (c→f)].
No event at or after w can match an event at or before f.
This gives two independent sub-problems: [SS1 = {a, b, c}, SS2 = {d, e, f}] and [SS1 = {w, y}, SS2 = {x, z}]

Iteration 3: Divide and Conquer Approach
“How can the given matching problem be partitioned into sub-problems that can be solved independently?”
Maintains O(|P1| * |P2| * … * |Pn|) hypotheses, where |Pi| = number of events of the i-th identity in the largest sub-problem
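One simple way to obtain independent sub-problems, not necessarily the criterion used by the authors, is to cut the merged timeline wherever two consecutive events are further apart than an assumed maximum skew. The partition function, the max_skew parameter and the timestamps below are all hypothetical.

```python
def partition(ss1_times, ss2_times, max_skew):
    """Split the matching problem at gaps that no valid match can span.

    Assumes a match is only allowed when the two timestamps differ by at
    most max_skew. Returns a list of (ss1_part, ss2_part) pairs.
    """
    tagged = sorted([(t, 0) for t in ss1_times] + [(t, 1) for t in ss2_times])
    parts = []
    previous = None
    for t, source in tagged:
        # A gap larger than max_skew separates everything before it from
        # everything after it, so a new sub-problem starts here.
        if previous is None or t - previous > max_skew:
            parts.append(([], []))
        parts[-1][source].append(t)
        previous = t
    return parts


# Hypothetical timestamps: {a, b, c} and {w, y} in SS1; {d, e, f} and {x, z} in SS2.
parts = partition([3, 10, 20, 100, 110], [5, 11, 22, 104, 112], max_skew=10)
```

Each sub-problem can then be solved on its own, so the number of hypotheses kept is governed by the largest sub-problem rather than by the whole timeline.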

Experimental setup: System evaluation
Tracking Algorithm
(Time: 12:30:03 PM, Doorway: Bedroom, Height: 180 cm)
(Time: 12:30:17 PM, Doorway: Bedroom, Person: Bob, Direction: IN)
(EO, GT mappings): 12:30:03 → 12:30:17 ...
12:30:03 :: Bob, IN

Experimental Setup
Doorjamb-like setup
Timeline-1: (Timestamp, person, doorway, direction)
Timeline-2 (Doorjamb-like): empirically modified Timeline-1, to study the effect of skew, FP and FN
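A rough sketch of how a second timeline could be derived from the first is shown below. The slides do not give the exact procedure, so this assumes a constant clock offset, uniformly chosen dropped events (FN) and uniformly placed spurious events (FP), and it works on bare timestamps only. The perturb function and its parameters are hypothetical.

```python
import random


def perturb(timeline, skew, fp_rate, fn_rate, seed=0):
    """Derive a noisy timeline from a list of true timestamps.

    skew: constant offset in seconds added to every event (assumed model).
    fp_rate / fn_rate: fractions of the true event count to add / drop.
    """
    rng = random.Random(seed)
    n = len(timeline)
    # False negatives: drop a uniformly chosen subset of true events.
    kept = rng.sample(timeline, n - round(fn_rate * n))
    # False positives: add spurious events at uniformly random times.
    lo, hi = min(timeline), max(timeline)
    spurious = [rng.uniform(lo, hi) for _ in range(round(fp_rate * n))]
    return sorted(t + skew for t in kept + spurious)


timeline_1 = list(range(0, 1000, 10))  # 100 hypothetical true events
timeline_2 = perturb(timeline_1, skew=5, fp_rate=0.2, fn_rate=0.1)
```

Fixing the seed keeps a run reproducible, which helps when comparing algorithms on the same perturbed timeline.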

Experimental Setup (contd)
Skew: -10 to +10 s
FP: 20% FP, according to a uniform distribution
FN: 10% FN, according to a uniform distribution
Metric: Matching accuracy = % of phone transitions correctly matched with their Doorjamb event
Baselines:
Greedy closest match (used by Saha et al.)
Greedy min cost (used by Hnat et al.)
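The metric and a greedy baseline can be sketched as follows. This is a generic closest-match heuristic written for illustration; it is not claimed to reproduce the exact baseline of Saha et al., and the function names and timestamps are assumptions.

```python
def greedy_closest_match(phone_times, door_times):
    """Match each phone transition, in order, to the closest unused
    Doorjamb event. Returns {phone_index: door_index}."""
    unused = set(range(len(door_times)))
    matches = {}
    for i, t in enumerate(phone_times):
        if not unused:
            break
        # Ties are broken by the lower door index to stay deterministic.
        j = min(unused, key=lambda k: (abs(door_times[k] - t), k))
        matches[i] = j
        unused.remove(j)
    return matches


def matching_accuracy(predicted, truth):
    """Percentage of phone transitions matched to their true Doorjamb event."""
    correct = sum(1 for i, j in truth.items() if predicted.get(i) == j)
    return 100.0 * correct / len(truth)


door = [10, 20]                # hypothetical Doorjamb timestamps
truth = {0: 0, 1: 1}           # phone event i belongs to door event i
small_skew = matching_accuracy(greedy_closest_match([12, 22], door), truth)
large_skew = matching_accuracy(greedy_closest_match([18, 28], door), truth)
```

With a 2 s skew the greedy choice is right for both events. With an 8 s skew the first phone event is closer to the wrong Doorjamb event, and that local decision forces the second match to be wrong as well.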

Evaluation
a. Skew: greedy algorithms suffer with larger skews because of a higher likelihood of local optima
b. Skew + FP
c. Skew + FN
G-opt has the lowest accuracy variance despite FP, FN and time skew.

Conclusion
Presented three diverse use-cases for static-personal sensing system fusion
G-opt algorithm: MHT + optimal pruning + Divide and Conquer
Maintains O(|P1| * |P2| * … * |Pn|) hypotheses, where |Pi| = number of events of the i-th identity in the largest sub-problem
G-opt has lower accuracy variance than greedy algorithms despite FP, FN and skew
Becomes more important over time as more diverse sensors get deployed for smart-city applications