Declarative Networking: Extensible Networks with Declarative Queries
  Boon Thau Loo
  University of California, Berkeley
Era of change for the Internet
  "in the thirty-odd years since its invention, new uses and abuses, ..., are pushing the Internet into realms that its original design neither anticipated nor easily accommodates...."
  Overcoming Barriers to Disruptive Innovation in Networking, NSF Workshop Report '05
Efforts at Internet Innovation
  Evolution: Overlay Networks
    - Commercial (Akamai, VPN, MS Exchange servers)
    - P2P (filesharing, telephony)
    - Research prototypes on testbed (PlanetLab)
  Revolution: Clean slate design
    - NSF Future Internet Design (FIND) program
    - NSF Global Environment for Network Investigations (GENI) initiative
  Missing: software tools that can significantly accelerate Internet innovation
Approach: Declarative Networking
  A declarative framework for networks:
    - Declarative language: "ask for what you want, not how to implement it"
    - Declarative specifications of networks, compiled to distributed dataflows
    - Runtime engine to execute distributed dataflows
  Observation: Recursive queries are a natural fit for routing
P2 Declarative Networking System
  http://p2.cs.berkeley.edu
  [Diagram labels: Network Specifications as Queries; Query Planner; Dataflow Engine; Dataflow; Network Protocols]
The Case for Declarative
  Ease of programming:
    - Compact and high-level representation of protocols
    - Orders of magnitude reduction in code size
    - Easy customization
  Safety:
    - Queries are "sandboxed" within query processor
    - Potential for static analysis techniques on safety
  What about efficiency?
    - No fundamental overhead when executing standard routing protocols
    - Application of well-studied query optimizations
    - Note: Same question was asked of relational databases in the 70's.
Main Contributions
  Declarative Routing [HotNets '04, SIGCOMM '05]:
    - Extensible routers (balance of flexibility, efficiency and safety)
  Declarative Overlays [SOSP '05]:
    - Rapid prototyping of new overlay networks
  Database Fundamentals [SIGMOD '06]:
    - Network-specific query language and semantics
    - Distributed recursive query execution strategies
    - Query optimizations, classical and new
A Breadth of Use Cases
  Implemented to date:
    - Textbook routing protocols (3-8 lines, UCB/Wisconsin)
    - Chord DHT overlay routing (47 lines, UCB/IRB)
    - Narada mesh (16 lines, UCB/Intel)
    - Distributed Gnutella/Web crawlers (Dataflow, UCB)
    - Lamport/Chandy snapshots (20 lines, Intel/Rice/MPI)
    - Paxos distributed consensus (44 lines, Harvard)
  In Progress:
    - OSPF routing (UCB)
    - Distributed Junction Tree statistical inference (UCB)
Outline
  - Background
  - The Connection: Routing as a Query
      - Execution Model
      - Path-Vector Protocol Example
          - Query specification -> protocol implementation
      - More Examples
  - Realizing the Connection
      - P2: Declarative Routing Engine
  - Beyond routing: Declarative Overlays
  - Conclusion
Traditional Router
  [Diagram labels: Control Plane (Routing Protocol); Neighbor Table updates; Forwarding Table updates; Forwarding Plane; Packets]
Review: Path Vector Protocol
  - Advertisement: entire path to a destination
  - Each node receives an advertisement, adds itself to the path, and forwards it to its neighbors
  [Figure: chain a - b - c - d; c advertises [c, d], b advertises [b, c, d], and a obtains path=[a, b, c, d]]
Declarative Router
  [Diagram labels: P2 Engine (Declarative Queries); Control Plane (Routing Protocol); Neighbor Table updates; Forwarding Table updates; Forwarding Plane; Packets]
Introduction to Datalog
  Datalog rule syntax:
    <result> :- <condition1>, <condition2>, ..., <conditionN>.
    (the part before ":-" is the head; the part after it is the body)
  Types of conditions in the body:
    - Input tables: link(src, dst) predicate
    - Arithmetic and list operations
  Head is an output table
    - Recursive rules: result of head in rule body
All-Pairs Reachability
  R1: reachable(S, D) :- link(S, D).
  R2: reachable(S, D) :- link(S, Z), reachable(Z, D).
  link(a, b): "there is a link from node a to node b"
  reachable(a, b): "node a can reach node b"
  R1 reads: "For all nodes S, D: if there is a link from S to D, then S can reach D."
  Input: link(source, destination)
  Output: reachable(source, destination)
All-Pairs Reachability
  R1: reachable(S, D) :- link(S, D).
  R2: reachable(S, D) :- link(S, Z), reachable(Z, D).
  R2 reads: "For all nodes S, D and Z: if there is a link from S to Z, AND Z can reach D, then S can reach D."
  Input: link(source, destination)
  Output: reachable(source, destination)
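The two reachability rules can be read as a fixpoint computation: keep applying R2 until no new reachable facts appear. The following is a minimal Python sketch of that reading, not the P2 implementation; the function name `reachable` and the four-node example topology are assumptions made for illustration.

```python
def reachable(links):
    """Naive fixpoint over R1 and R2; `links` is a set of (src, dst) pairs."""
    reach = set(links)  # R1: every link is a reachable fact
    while True:
        # R2: join link(S, Z) with reachable(Z, D) on the shared variable Z
        derived = {(s, d) for (s, z) in links for (z2, d) in reach if z == z2}
        if derived <= reach:  # nothing new was derived: fixpoint reached
            return reach
        reach |= derived


# Hypothetical chain a - b - c - d with links in both directions
links = {("a", "b"), ("b", "a"), ("b", "c"), ("c", "b"), ("c", "d"), ("d", "c")}
result = reachable(links)
print(("a", "d") in result)
```

Because every link here is bidirectional, each node can also reach itself through a neighbor, so the result contains all ordered pairs of the four nodes.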
Towards Network Datalog
  Specify tuple placement
    - Value-based partitioning of tables
  Tuples to be combined are co-located
    - Rule rewrite ensures body is always single-site
  All communication is among neighbors
    - No multihop routing during basic rule execution
    - Enforced via simple syntactic restrictions
Network Datalog
  Location specifier "@S":
    R1: reachable(@S, D) :- link(@S, D).
    R2: reachable(@S, D) :- link(@S, Z), reachable(@Z, D).
  Queries: reachable(@a, N); reachable(@M, N)
  Input table link, on the network a - b - c - d:
    @S  D
    @a  b
    @b  c
    @c  b
    @d  c
    @b  a
    @c  d
  Output table reachable(@S, D), partitioned by @S (for example @a b, @a c, @a d)
  [Figure: All-Pairs Reachability; reachable tuples stored at nodes a, b, c and d]
Path Vector in Network Datalog
  R1: path(@S, D, P) :- link(@S, D), P=(S, D).
  R2: path(@S, D, P) :- link(@Z, S), path(@Z, D, P2), P=S∘P2.
      (P=S∘P2: add S to front of P2)
  Query: path(@S, D, P)
  Input: link(@source, destination)
  Query output: path(@source, destination, pathVector)
Query Execution
  R1: path(@S, D, P) :- link(@S, D), P=(S, D).
  R2: path(@S, D, P) :- link(@Z, S), path(@Z, D, P2), P=S∘P2.
  Query: path(@a, d, P)
  [Figure: network a - b - c - d; each node holds its neighbor table (link) and forwarding table (path); node c starts with path(@c, d, [c, d])]
Query Execution
  R1: path(@S, D, P) :- link(@S, D), P=(S, D).
  R2: path(@S, D, P) :- link(@Z, S), path(@Z, D, P2), P=S∘P2.
  Query: path(@a, d, P)
  Matching variable Z = "Join"
  Messages sent: path(@b, d, [b, c, d]), then path(@a, d, [a, b, c, d])
  Resulting forwarding table (path) entries:
    @S  D  P
    @a  d  [a, b, c, d]
    @b  d  [b, c, d]
    @c  d  [c, d]
  Communication patterns are identical to those in the actual path vector protocol
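The execution on the two slides above can be imitated with a small worklist simulation in which every derived path tuple is "advertised" across the links of the node that holds it. This is a sketch under stated assumptions rather than the P2 dataflow: the function name `path_vector` is hypothetical, and the `s not in p2` loop check is an addition that is not part of the rules on the slides; it is needed here so that the simulation terminates on a topology with cycles.

```python
from collections import deque


def path_vector(links):
    """Derive path(S, D, P) facts by propagating advertisements hop by hop."""
    out_neighbors = {}
    for z, s in links:
        out_neighbors.setdefault(z, set()).add(s)

    paths = set()
    # R1: each link(@S, D) yields the one-hop path (S, D), stored at S
    queue = deque((s, d, (s, d)) for s, d in sorted(links))
    while queue:
        z, d, p2 = queue.popleft()
        if (z, d, p2) in paths:
            continue
        paths.add((z, d, p2))
        # R2: Z joins link(@Z, S) with path(@Z, D, P2) and sends the result to S
        for s in sorted(out_neighbors.get(z, ())):
            if s not in p2:  # assumed loop check, not in the rules on the slide
                queue.append((s, d, (s,) + p2))
    return paths


# Hypothetical chain a - b - c - d with links in both directions
links = {("a", "b"), ("b", "a"), ("b", "c"), ("c", "b"), ("c", "d"), ("d", "c")}
paths = path_vector(links)
print(("a", "d", ("a", "b", "c", "d")) in paths)
```

Each tuple only ever travels across a single link, which mirrors the observation that the communication pattern matches the hand-written path vector protocol.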
Sanity Check
  All-pairs shortest latency path query:
    - Query convergence time: proportional to the diameter of the network; same as hand-coded PV
    - Per-node communication overhead: increases linearly with the number of nodes
    - Same scalability trends compared with PV/DV protocols
Outline
  - Background
  - The Connection: Routing as a Query
      - Execution Model
      - Path-Vector Protocol Example
          - Query specifications -> protocol implementation
      - Example Queries
  - Realizing the Connection
  - Declarative Overlays
  - Conclusion
Example Routing Queries
  - Best-Path Routing
  - Distance Vector
  - Dynamic Source Routing
  - Policy Decisions
  - QoS-based Routing
  - Link-state
  - Multicast Overlays (Single-Source & CBT)
  Takeaways:
    - Compact, natural representation
    - Customization: easy to make modifications to get new protocols
    - Connection between query optimization and protocols
All-pairs All-paths
  R1: path(@S, D, P, C) :- link(@S, D, C), P=(S, D).
  R2: path(@S, D, P, C) :- link(@S, Z, C1), path(@Z, D, P2, C2), C=C1+C2, P=S∘P2.
  Query: path(@S, D, P, C)
All-pairs Best-path
  R1: path(@S, D, P, C) :- link(@S, D, C), P=(S, D).
  R2: path(@S, D, P, C) :- link(@S, Z, C1), path(@Z, D, P2, C2), C=C1+C2, P=S∘P2.
  R3: bestPathCost(@S, D, min<C>) :- path(@S, D, P, C).
  R4: bestPath(@S, D, P, C) :- bestPathCost(@S, D, C), path(@S, D, P, C).
  Query: bestPath(@S, D, P, C)
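A small Python sketch may help show how the four best-path rules fit together: R1 and R2 enumerate paths with costs, R3 is a group-by with a min aggregate, and R4 joins the aggregate back against the path table. The function names, the link costs, and the `s not in p2` loop check are assumptions introduced for illustration and are not taken from the slides.

```python
def all_paths(link_cost):
    """R1 and R2: enumerate (S, D, P, C) tuples; link_cost maps (S, D) -> C."""
    paths = {(s, d, (s, d), c) for (s, d), c in link_cost.items()}  # R1
    frontier = set(paths)
    while frontier:
        new = set()
        for (s, z), c1 in link_cost.items():  # R2: join on the variable Z
            for (z2, d, p2, c2) in frontier:
                if z == z2 and s not in p2:  # assumed loop check
                    new.add((s, d, (s,) + p2, c1 + c2))
        frontier = new - paths
        paths |= frontier
    return paths


def best_paths(paths):
    """R3 (min aggregate per (S, D)) followed by R4 (join back with path)."""
    best_cost = {}
    for s, d, _, c in paths:  # R3: bestPathCost(@S, D, min<C>)
        best_cost[(s, d)] = min(c, best_cost.get((s, d), c))
    return {(s, d, p, c) for s, d, p, c in paths if best_cost[(s, d)] == c}  # R4


# Hypothetical directed links: the two-hop route a -> b -> c beats the direct link
link_cost = {("a", "b"): 1, ("b", "c"): 1, ("a", "c"): 5}
best = best_paths(all_paths(link_cost))
print(sorted(best))
```

Swapping `min` for another aggregate, or `c1 + c2` for another combining function, corresponds to the customization of AGG and FN described on the next slide.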
Customizable Best-Paths
  R1: path(@S, D, P, C) :- link(@S, D, C), P=(S, D).
  R2: path(@S, D, P, C) :- link(@S, Z, C1), path(@Z, D, P2, C2), C=FN(C1, C2), P=S∘P2.
  R3: bestPathCost(@S, D, AGG<C>) :- path(@S, D, P, C).
  R4: bestPath(@S, D, P, C) :- bestPathCost(@S, D, C), path(@S, D, P, C).
  Query: bestPath(@S, D, P, C)
  Customizing C, AGG and FN: lowest RTT, lowest loss rate, highest capacity, best-k
All-pairs All-paths
  R1: path(@S, D, P, C) :- link(@S, D, C), P=(S, D).
  R2: path(@S, D, P, C) :- link(@S, Z, C1), path(@Z, D, P2, C2), C=C1+C2, P=S∘P2.
  Query: path(@S, D, P, C)
Distance Vector
  R1: path(@S, D, D, C) :- link(@S, D, C).
  R2: path(@S, D, Z, C) :- link(@S, Z, C1), path(@Z, D, W, C2), C=C1+C2.
  R3: shortestLength(@S, D, min<C>) :- path(@S, D, Z, C).
  R4: nextHop(@S, D, Z, C) :- path(@S, D, Z, C), shortestLength(@S, D, C).
  Query: nextHop(@S, D, Z, C)
  Count to Infinity problem?
Distance Vector with Split Horizon
  R1: path(@S, D, D, C) :- link(@S, D, C).
  R2: path(@S, D, Z, C) :- link(@S, Z, C1), path(@Z, D, W, C2), C=C1+C2, W!=S.
  R3: shortestLength(@S, D, min<C>) :- path(@S, D, Z, C).
  R4: nextHop(@S, D, Z, C) :- path(@S, D, Z, C), shortestLength(@S, D, C).
  Query: nextHop(@S, D, Z, C)
Distance Vector with Poisoned Reverse
  R1: path(@S, D, D, C) :- link(@S, D, C).
  R2: path(@S, D, Z, C) :- link(@S, Z, C1), path(@Z, D, W, C2), C=C1+C2, W!=S.
  R3: path(@S, D, Z, C) :- link(@S, Z, C1), path(@Z, D, W, C2), C=∞, W=S.
  R4: shortestLength(@S, D, min<C>) :- path(@S, D, Z, C).
  R5: nextHop(@S, D, Z, C) :- path(@S, D, Z, C), shortestLength(@S, D, C).
  Query: nextHop(@S, D, Z, C)
All-pairs All-Paths
  R1: path(@S, D, P, C) :- link(@S, D, C), P=(S, D).
  R2: path(@S, D, P, C) :- link(@S, Z, C1), path(@Z, D, P2, C2), C=C1+C2, P=S∘P2.
  Query: path(@S, D, P, C)
Dynamic Source Routing
  R1: path(@S, D, P, C) :- link(@S, D, C), P=(S, D).
  R2: path(@S, D, P, C) :- link(@Z, D, C2), path(@S, Z, P1, C1), C=C1+C2, P=P1∘D.
  Query: path(@S, D, P, C)
  Predicate reordering: path vector protocol -> dynamic source routing
Other Routing Examples
  - Best-Path Routing
  - Distance Vector
  - Dynamic Source Routing
  - Policy Decisions
  - QoS-based Routing
  - Link-state
  - Multicast Overlays (Single-Source & CBT)
Outline
  - Background
  - The Connection: Routing as a Query
  - Realizing the Connection
      - Dataflow Generation and Execution
      - Recursive Query Processing
      - Optimizations
      - Semantics in a dynamic network
  - Beyond routing: Declarative Overlays
  - Conclusion
Dataflow Graph
  [Diagram: a single P2 node; messages flow from Network In through strands to Network Out]
  Nodes in dataflow graph ("elements"):
    - Network elements (send/recv, cc, retry, rate limitation)
    - Flow elements (mux, demux, queues)
    - Relational operators (selects, projects, joins, aggregates)
Dataflow Strand
  [Diagram: Input Tuples -> Element 1 -> Element 2 -> ... -> Element n -> Output Tuples]
  - Input: Incoming network messages, local table changes, local timer events
  - Condition: Process input tuple using strand elements
  - Output: Outgoing network messages, local table updates
Rule Dataflow "Strands"
  R2: path(@S, D, P) :- link(@S, Z), path(@Z, D, P2), P=S∘P2.
Localization Rewrite
  Rules may have body predicates at different locations:
    R2: path(@S, D, P) :- link(@S, Z), path(@Z, D, P2), P=S∘P2.
    Matching variable Z = "Join"
  Rewritten rules:
    R2a: linkD(S, @D) :- link(@S, D).
    R2b: path(@S, D, P) :- linkD(S, @Z), path(@Z, D, P2), P=S∘P2.
    Matching variable Z = "Join"
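The effect of the rewrite can be sketched in Python by modelling each node's local tables as dictionary entries: R2a ships every link tuple to its destination, after which R2b can be evaluated entirely at node Z. The function names `r2a` and `r2b_at` and the per-node dictionaries are hypothetical; this is an illustration of the idea, not how P2 stores tables.

```python
def r2a(link_at):
    """R2a: each node S ships link(@S, D) to D, which stores it as linkD(S, @D)."""
    link_d_at = {}
    for s, dests in link_at.items():
        for d in dests:
            link_d_at.setdefault(d, set()).add(s)
    return link_d_at


def r2b_at(z, link_d_at, path_at):
    """R2b evaluated locally at node z: join linkD(S, @Z) with path(@Z, D, P2)."""
    messages = []
    for s in sorted(link_d_at.get(z, ())):
        for d, p2 in sorted(path_at.get(z, ())):
            # Result tuple path(@S, D, P) with P = S prepended to P2; sent to S
            messages.append((s, d, (s,) + p2))
    return messages


# Hypothetical state: links a -> b -> c, and node b already knows a path to c
link_at = {"a": {"b"}, "b": {"c"}}
path_at = {"b": {("c", ("b", "c"))}}
link_d_at = r2a(link_at)
print(r2b_at("b", link_d_at, path_at))
```

Both body predicates of R2b live at Z, so the only communication is the R2a shipment and the final result sent to S, each across a single link.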
Dataflow Strand Generation
  R2b: path(@S, D, P) :- linkD(S, @Z), path(@Z, D, P2), P=S∘P2.
  Strand elements (one strand per body predicate, fed from Network In):
    - New linkD tuples: Join linkD.Z = path.Z; Project path(S, D, P); Send to path.S
    - New path tuples: Join path.Z = linkD.Z; Project path(S, D, P); Send to path.S
Recursive Query Evaluation
  Semi-naïve evaluation:
    - Iterations (rounds) of synchronous computation
    - Results from the i-th iteration are used in the (i+1)-th
  [Figure: link table and path table for an example network with nodes 0-10; path tuples are derived round by round as 1-hop, 2-hop, 3-hop]
  Problem: Unpredictable delays and failures
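As a rough illustration of semi-naïve evaluation, the sketch below evaluates the reachability rules in synchronous rounds and joins only the tuples that were new in the previous round. The function name and the three-link example are assumptions; the per-round deltas correspond to the 1-hop, 2-hop and 3-hop groups in the figure.

```python
def semi_naive(links):
    """Evaluate reachable in rounds, feeding only each round's new tuples onward."""
    reach = set(links)      # round 1: R1 produces the 1-hop tuples
    deltas = [set(links)]   # deltas[i] holds the tuples first derived in round i+1
    while deltas[-1]:
        # R2 applied to the previous round's delta only, not to all of reach
        derived = {(s, d) for (s, z) in links for (z2, d) in deltas[-1] if z == z2}
        new = derived - reach
        reach |= new
        deltas.append(new)
    return reach, deltas[:-1]  # drop the final empty delta


# Hypothetical directed chain a -> b -> c -> d
links = {("a", "b"), ("b", "c"), ("c", "d")}
reach, rounds = semi_naive(links)
for i, delta in enumerate(rounds, start=1):
    print(i, sorted(delta))
```

Each round must finish everywhere before the next one starts, which is the synchrony that the slide identifies as a problem in a network with unpredictable delays and failures.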
Pipelined Semi-naïve (PSN)
  Fully-asynchronous evaluation:
    - Computed tuples in any iteration pipelined to next iteration
    - Natural for distributed dataflows
  Relaxation of semi-naïve
  [Figure: link table and path table for the example network with nodes 0-10; tuples are processed as they arrive rather than in rounds]
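A pipelined variant can be sketched by replacing the rounds with a single FIFO queue: every tuple is joined against the link table as soon as it is dequeued, without waiting for an iteration boundary. This is a simplified sketch; the function name is hypothetical, and duplicate avoidance here uses a plain set of already-seen tuples, which is an assumption standing in for the local timestamps mentioned on the next slide.

```python
from collections import deque


def pipelined_semi_naive(links):
    """Process reachable tuples one at a time, in arrival order."""
    reach = set()
    queue = deque(sorted(links))  # R1 results enter the pipeline first
    while queue:
        z, d = queue.popleft()
        if (z, d) in reach:
            continue  # duplicate tuple: triggers no further inference
        reach.add((z, d))
        # R2: join the new reachable(Z, D) tuple with link(S, Z) immediately
        for s, z2 in links:
            if z2 == z:
                queue.append((s, d))
    return reach


# Hypothetical directed chain a -> b -> c -> d
links = {("a", "b"), ("b", "c"), ("c", "d")}
print(sorted(pipelined_semi_naive(links)))
```

On this example the result is the same set of six tuples that the round-based evaluation produces, only derived without any synchronization between rounds.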
Pipelined Evaluation
  Challenges:
    - Does PSN produce the correct answer?
    - Is PSN bandwidth efficient?
        - I.e., does it make the minimum number of inferences?
  Duplicate avoidance: local timestamps
  Theorems:
    - RS_SN(p) = RS_PSN(p), where RS is the result set
    - No repeated inferences in computing RS_PSN(p)
  p(x, z) :- p1(x, y), p2(y, z), ..., pn(y, z), q(z, w)
  recursive w.r.t. p
Outline
  - Background
  - The Connection: Routing as a Query
  - P2 Declarative Networking System
      - Dataflow Generation and Execution
      - Recursive Query Processing
      - Optimizations
  - Beyond routing: Declarative Overlays
  - Conclusion
Overview of Optimizations
  Traditional: evaluate in the NW context
    - Aggregate Selections
    - Magic Sets rewrite
    - Predicate Reordering (PV/DV -> DSR)
  New: motivated by NW context
    - Multi-query optimizations:
        - Query results caching
        - Opportunistic message sharing
    - Cost-based optimizations (work-in-progress)
        - Neighborhood density function
        - Hybrid rewrites (-> Zone Routing Protocol)
Aggregate Selections
  Prune communication using running state of monotonic aggregate
    - Avoid sending tuples that do not affect value of agg
    - E.g., shortest-paths query
  Challenge in distributed setting:
    - Out-of-order (in terms of monotonic aggregate) arrival of tuples
    - Solution: Periodic aggregate selections
        - Buffer up tuples, periodically send best-agg tuples
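The basic (non-periodic) form of aggregate selection can be sketched for a shortest-path cost query: each (S, D) pair keeps the running minimum, and a tuple that cannot lower it is dropped instead of being stored and forwarded. The function name and example costs are assumptions, and the sketch assumes strictly positive link costs so that it terminates on cyclic topologies; the periodic, buffered variant from the slide is not modelled.

```python
from collections import deque


def shortest_costs(link_cost):
    """Shortest-path costs with aggregate selection on the running min."""
    best = {}  # running state of the monotonic aggregate min<C> per (S, D)
    queue = deque((s, d, c) for (s, d), c in link_cost.items())
    pruned = 0
    while queue:
        s, d, c = queue.popleft()
        if c >= best.get((s, d), float("inf")):
            pruned += 1  # cannot change the min: do not store or forward it
            continue
        best[(s, d)] = c
        # Forward the improved tuple to every node U that has a link to S
        for (u, v), c1 in link_cost.items():
            if v == s:
                queue.append((u, d, c1 + c))
    return best, pruned


# Hypothetical costs: the direct link a -> c is cheaper than a -> b -> c
link_cost = {("a", "c"): 1, ("a", "b"): 1, ("b", "c"): 1}
best, pruned = shortest_costs(link_cost)
print(best, pruned)
```

Here the two-hop tuple for (a, c) with cost 2 arrives after the direct cost of 1 is known, so it is pruned and never propagated further.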
Aggregate Selections Evaluation
  - P2 implementation of routing protocols on Emulab (100 nodes)
  - All-pairs best-path queries (with aggregate selections)
  - Aggregate selections reduce communication overhead
      - More effective when link metric is correlated with network delay
  - Periodic AS reduces communication overhead further
Outline
  - Background
  - The Connection: Routing as a Query
  - Realizing the Connection
      - P2: Declarative Routing Engine
  - Beyond routing: Declarative Overlays
  - Conclusion
Recall: Declarative Routing
  [Diagram labels: Declarative Router; P2 Engine (Declarative Queries); Control Plane; Neighbor Table updates; Forwarding Plane; Packets]
Declarative Overlays
  [Diagram labels: Declarative Overlay Node (application level); P2 Engine (Declarative Queries); Control and forwarding plane; Packets; Internet (default Internet routing)]
Declarative Overlays
  More challenging to specify:
    - Not just querying for routes using input links
    - Rules for generating overlay topology
    - Message delivery, acknowledgements, failure detection, timeouts, periodic probes, etc.
  Extensive use of timer-based event predicates:
    ping(@D, S) :- periodic(@S, 10), link(@S, D).
P2-Chord
  Routing, including:
    - Multiple successors
    - Stabilization
    - Optimized finger maintenance
    - Failure detection
  47 rules, 13 table definitions (10 pt font)
  MIT-Chord: 100x more code
  Another example:
    - Narada mesh in 16 rules
Actual Chord Lookup Dataflow
P2-Chord Evaluation
  P2 nodes running Chord on 100 Emulab nodes:
    - Logarithmic lookup hop-count and state ("correct")
    - Median lookup latency: 1-1.5 s
    - BW-efficient: 300 bytes/s/node
Moving up the stack
  Querying the overlay:
    - Routing tables are "views" to be queried
    - Queries on route resilience, network diameter, path length
  Recursive queries for network discovery:
    - Distributed Gnutella crawler on PlanetLab [IPTPS '03]
    - Distributed web crawler over DHTs on PlanetLab
  Oct '03 distributed crawl: 100,000 nodes, 20 million files
Outline
  - Background
  - The Connection: Routing as a Query
  - Realizing the Connection
  - Beyond routing: Declarative Overlays
  - Conclusion
A Sampling of Related Work
  Databases
    - Recursive queries: software analysis, trust management, distributed systems diagnosis
    - Opportunities: computational biology, data integration, sensor networks
  Networking
    - XORP: Extensible Routers
    - High-level routing specifications
        - Meta-Routing, Routing logic
Future Directions
  Declarative Networking:
    - Static checks on desirable network properties
    - Automatic cost-based optimizations
    - Component-based network abstractions
  Core Internet Infrastructure
    - Declarative specifications of ISP configurations
    - P2 deployment in routers
Distributed Data Management on Declarative Networks
  [Layered diagram, top to bottom:]
    - Data Management Applications: P2P search, network monitoring, P2P data integration, collaborative filtering, content distribution networks...
    - Distributed Queries: SQL, XML, Datalog
    - Distributed Algorithms: Consensus (Harvard), 2PC, Byzantine, Snapshots (Rice/Intel), Replication
    - P2: Declarative Networks: Customized routes, DHTs, Flood, Gossip, Multicast Mesh
  Run-time cross-layer optimizations:
    - Reoptimize data placement and queries
    - Reconfigure networks based on data and query workloads
Other Work
  Internet-Scale Query Processing
    - PIER: Distributed query processor on DHTs
    - http://pier.cs.berkeley.edu [VLDB 2003, CIDR 2005]
  P2P Search Infrastructures
    - P2P Web Search and Indexing [IPTPS 2003]
    - Gnutella measurements on PlanetLab [IPTPS 2004]
        - Distributed Gnutella crawler and monitoring
    - Hybrid P2P search [VLDB 2004]
Contributions and Summary
  P2 Declarative Networking System
    - Declarative Routing Engine
        - Extensible routing infrastructure
    - Declarative Overlays
        - Rapid prototyping of overlay networks
    - Database fundamentals
        - Query language
        - New distributed query execution strategies and optimizations
        - Semantics in dynamic networks
  Period of flux in Internet research
    - Declarative Networks can play an important role
Thank You