Compact forwarding Uma abordagem probabilstica para o encaminhamento

Compact forwarding : Uma abordagem probabilística para o encaminhamento de pacotes em redes orientadas a conteúdo Christian Esteve Rothenberg Exame de defesa de Doutorado, 15/12/2010 Universidade Estadual de Campinas (Unicamp)

Agenda Motivation: Beyond TCP/IP Problem: Forwarding in Content-Oriented Networks Approach: Compact Forwarding Questions Contributions Probabilistic Forwarding Principles Algorithmic techniques Enablers Applications

Background and Motivation information-centrism DONA content-centric networking TRIAD DPI NAT CDN P 2 P TCP/IP o etw rk (r ing )e n 2 Interconnecting hosts Telephony 1 Interconnecting wires New ID spaces id-loc clean-slate overlays IPv 6 middleboxes patching ROFL PSIRP tion u l vo 3 Interconnecting information

Content-oriented Networking Rethinking fundamentals ?

Problem • Data forwarding challenges of content-oriented networks – Focus on data, not on end-points • Re-think the packet forwarding plane – What is a suitable forwarding substrate for content-centric networks departing from the host-centric paradigm of IP? – Which are candidate features and enabling data structures? • Main abstraction is a flat label – Bit string representing any higher level information (e. g. content object, network link, multicast tree, endpoint ID). – How to forward packets labeled with flat (location-independent, unstructured, random looking) identifiers?

Longest IP prefix vs. flat label matching IPv 4 Dec. 81. 216. 171. 106 human Bin. 01010001 11011000 10101011 01101010 machine Aggr. IPv 6 Hex. ca 12: b 9 fa: 655 a: 0000: ac 2 f: ccef: f 0 ab Bin. 11001010 00010010 10111001 11111010 11001010 01011010 00000000 10101100 00000000 00101111 1100 111011110000 10101011 Hex. 256 -bit Bin. flat label 08090 A 0 B 0 D 0 E 0 F 10121314151718191 A 1 C 1 D 1 E 1 F 21222324262728292 B 2 C 2 D 2 E 11111010 11001010 01011010 00000000 10101100 00000000 00101111 1100 111011110000 10101011 11001010 00010010 10111001 11111010 01011010 00000000 10101100 111011110000 10101011 11001010 00010010 10111001 11111010 11001010 flat ID forwarding is expensive @ wire speed

Scale by aggregation • Global networks are able to scale by aggregating the address space so that state is needed only for each aggregate. – Scalability principle also known as information hiding. • Examples: – Public switched telephone number aggregation on geographical location – Domain Name System (DNS): Zone aggregation of hierarchical names – Hierarchical routing: Aggregation of IP addresses on address blocks • Problem: – Flat labels are not aggregatable – How to provide efficient wire-speed forwarding on a large universe (e. g. , 256 -bit) of flat identifiers? • A probabilistic approach: – Synthetic aggregation through suitable (lossy) compression methods applied to packet forwarding. – Transform the packet forwarding into a set membership problem solved by probabilistic data structures (i. e. , Bloom filter variations).

The role of Bloom and family Well-known probabilistic data structure Insert_element() • Efficient data set aggregator • False positive rate : f (memory / # elements) Suits wire-speed forwarding requirements • Low (bounded) packet processing time (time to hash) • Scarcity of high-speed memory Compact forwarding approach: Check_element() “yes” / no ? • Forwarding as a set membership-problem solved by variations of the Bloom filter probabilistic data structure • Relax algorithmic correctness: trade state for over-deliveries (bandwidth efficiency)

A probabilistic approach to packet forwarding Compact forwarding as two extreme set membership-problems solved with probabilistic data structures (i. e. , Bloom filters): SPSwitch: Is packet label X in forwarding port P? • State in the network • Large Bloom filters maintained in forwarding tables i. BFs: Is outbound link A in packet header Z? • State in the packet header • Small in-packet Bloom filter representing a source route

Compact forwarding • We can define compact forwarding as “a study of the trade-offs of in-network and in-packet forwarding methods that guarantee the correct delivery of packets in function of forwarding efficiency. ” – Definition 1: We say a forwarding method is compact if each forwarding table entry requires less than log(n)-bits per routable object in an n-dimension universe. – Definition 2: We say a forwarding method is compact if the datagram header size is of fixed size with independence of the forwarding directives included. • Notion of compactness: – Striving for minimal forwarding information base – Forwarding methods based on probabilistic data structures (i. e. one-sided error prone Bloom filter inspired data structures)

4 -dimensional solution space Routing / forwarding information in packets (i. e. packet header bits) Transport efficiency (Stretch, Bandwidth) Multicast-ready compact forwarding This Thesis packet routing Previous work Traditional unicast & compact routing Adaptation costs (e. g. signaling) State in network elements (i. e. FIB bits)

Contributions 1. Principles – Collection of generic and technical principles for scalable forwarding in content-oriented networks 2. Algorithmic techniques – Conception and application of algorithmic techniques to cope with the limitations of previous work in probabilistic data structures and deal with anomalies of probabilistic forwarding. 3. Applications – Validation of the compact forwarding methods in practical networking scenarios: • Internet-scale publish/subscribe networking • Scalable cloud data center networking • Inter-domain multicast

Principles • Principle 1: Separate routing from forwarding – Content-oriented routing protocols should be functionally separated from the forwarding elements. (Separation of concerns and evolvability) – Sub-Principle A: Generality: The forwarding methods should be generic enough with regard to the identifiers – Sub-Principle B: Simplicity: The forwarding methods should be based on simple per-packet primitives • Principle 2: Allow a flexible operation point – The forwarding system should enable the network architect/operator to find a sweet spot for a given networking scenario • Principle 3: Multicast-friendliness – The forwarding methods should provide native multicast capabilities • Principle 4: Receiver-controlled data plane security – The forwarding methods should be designed with security in mind, empowering the receiver with control over the delivered traffic

Compact forwarding Algorithmic techniques Compact forwarding as two extreme set membership-problems solved with probabilistic data structures (i. e. , Bloom filters): SPSwitch: Is packet label X in forwarding port P? • State in the network • Large Bloom filters maintained in forwarding tables i. BFs: Is outbound link A in packet header Z? • State in the packet header • Small in-packet Bloom filter representing a source route

In-network compact forwarding Bloom-filter-inspired • Efficient data set aggregator • Answers set membership questions (“yes”/no) – No false negatives: Inserted elements return always true • False positive performance: f (memory / # elements) – Does not depend on the nature (form, size) of the element • Low (constant) processing time and low memory requirements per entry. Our first naïve p-bank Bloom-filter-based switching approach: – Bloom filter membership-problem Is labelx in outport Py? 11001010010 … 011111001010 ?

The SPSwitch switching engine Is packet label X in forwarding port P?

Limitations of standard BFs a) lack of associated values b) expensive deletion c) no notion of time d) unbalanced usage of memory per outport We need a more flexible (probabilistic) data structure! → d-left Fingerprint Compressed Filter (FCF)* *recent results by Bonomi et al. (2006)

Statefull d-left FCF approach

Experimental results Problem: Still O(n) entries

Compact forwarding Algorithmic techniques Compact forwarding as two extreme set membership-problems solved with probabilistic data structures (i. e. , Bloom filters): SPSwitch: Is packet label X in forwarding port P? • State in the network • Large Bloom filters maintained in forwarding tables Limitations: O(n) entries (compact: few bits per element but state required per content object or per flow) i. BFs: Is outbound link A in packet header Z? • State in the packet header • Small in-packet Bloom filter representing a source route

In-packet Bloom filter compact forwarding State in the packet headers – Each network link has an identity and (a series of) Link IDs: Bloomed LID: 256 bit vector with just k=5 bit positions set to one – Forwarding on small, fixed size i. BF representing a source route – Formed by topology functions or collected on path (cf. IP switching) i. BF = LID 1 OR LID 2 LID 1 LID 2 Basic operation: “Is outbound link A in packet header Z? ” – Small forwarding tables – Very fast switching (bitwise AND operations)

i. BF forwarding Practical results Stateless multicast with 256 -bit i. BFs (2 x IPv 6 addresses) Enough for Internet-wide unicast and sparse multicast in typical WAN (35 links -> approx. 20 edge nodes) # Users Zipf distribution of multicast traffic # Groups stateful stateless

Compact forwarding Algorithmic techniques Compact forwarding as two extreme set membership-problems solved with probabilistic data structures (i. e. , Bloom filters): SPSwitch: Is packet label X in forwarding port P? • State in the network • Large Bloom filters maintained in forwarding tables Limitations: O(n) entries (compact: few bits per element but state required per content object or per flow) i. BFs: Is outbound link A in packet header Z? • State in the packet header • Small in-packet Bloom filter representing a source route Limitations: False positives (efficiency, loops, & security)

i. BF algorithmic extensions: Power of choices • Better forwarding efficiency with a simple trick – Define d different Link ID Tags (LITs) instead of a single LID – LIT has the same size as LID, and also k bits set to 1 • Route creation and packet forwarding – Calculate d different candidate z. Filters – Select the best performing z. Filter, based on some policy

i. BF algorithmic extensions: d Link ID Tags • Performance gains (better false positive rates) – better forwarding efficiency | more links • + Functional gains: Avoid system critical false positives (e. g. , inter-domain links, low BW links, loops)

i. BF algorithmic extensions: Security Shortcomings of plain i. BF forwarding • Replay attacks • Computational attacks • Traffic injection attack Goal: Secure the data path and empower the receiver. • Ensure (probabilistically) that hosts cannot send un-authorized traffic Solution: (z-Formation): Dynamically compute LIT at line speed and bind it to: • path: in-coming and out-going port • time: periodically changing keys (per node/group) • flow: packet identifier (e. g. IP 5 -tuple / content id)

i. BF algorithmic extensions: z-Formation (secure i. BFs) • Form LITs algorithmically – at packet handling time – LIT(d) = Z (I , K (t), In, Out, d) • Secret periodic key K – K(t) and K(t-1) • Input port index • Output port index • Flow ID from the packet, e. g. – Content ID – IP addresses & ports • d from the packet header

i. BF algorithmic extensions: self-routing capabilities Secure i. BF works both as a forwarding ID and a capability • To send, a host needs to know or guess a valid i. BF • i. BF become expirable and content-dependent z-Formation • Traffic injection difficult (due to binding to incoming port) • Very hard to construct one valid i. BF without knowing keys along the path – Correlation attacks limited to flow IDs • Need a cryptographically good and fast Z implementation • Allow the receiver to control the i. BF renewal (feedback loop to the sender)

BF algorithmic extensions: Compactly removing elements • How to delete elements from the i. BF? Requirements: • No false negatives upon element deletion. • Fixed memory allocation (m bits). • Low impact on the false positive rate (fpr), No previous work provides memory-efficient element deletions: • • Counting Bloom filter designs extend 1 -bit cells to c-bit counters i. e. , c*m. The d-left CBF requires about 2*m. BF with variable-length signatures (VLF) are prone to false-negatives. Spectral Bloom filters, and “an optimal Bloom filter replacement” are complex and static unpractical solutions.

BF algorithmic extensions: The deletable Bloom filter (Dl. BF) Key idea: Small bitmap to keep record of the bit regions where collisions happen.

BF algorithmic extensions: Dl. BF: From theory to practice • How many elements can be removed in practice?

Compact forwarding in practice APPLICATIONS

Publish/Subscribe Architecture: RTFM Where i. BFs were born… [LIPSIN] rendezvous layer establishes contact topology layer selects routes forwarding layer delivers data

Data Center Networking

Some issues with conventional DC designs Networking constraints of traditional L 2/L 3 hierarchical forwarding: – Fragmentation of resources (VLAN, subnetting) – – Limited server-to-server capacity (high oversubscription) Ethernet scalability (FIB size, STP, flooding, ARP broadcast) Low performance under cloud application traffic patterns Reliability: 2 is a poor choice for redundancy at scale

i. BF-based Data Center Networking Compactly represent a source route between To. Rs using a Bloom filter Carry the 96 -bit i. BF in the source and destination MAC fields Stateless forwarding by querying next-hop switches in the i. BF MAC re-writing at source and destination To. Rs (per flow state reqs. ) Benefits: Large L 2 domains, VM agility, app-specific load-balancing, resource-friendly. . .

Inter-Domain Multicast

Summary of the contributions and future work CONCLUSIONS

Conclusions • Forwarding in content-oriented networks is challenging. • This Thesis contributes to question the commonly held view that networking needs to be centred on endpoint (network interface) identifiers. • The proposed compact forwarding methods explore a new avenue by relaxing the port forwarding function on arbitrary labels to tolerate extra outcomes - trading state for bandwidth efficiency. • We have contributed to turn one-sided error data structures into practical forwarding components, paving the way for a correct, secure, and scalable content-oriented forwarding plane.

Contributions and Publications

Trabalhos Publicados Pelo Autor A. C. Esteve Rothenberg, F. Verdi and M. Magalhães. “Towards a new generation of information-oriented internetworking architectures. ” In ACM Co. Next, First Workshop on Re-Architecting the Internet (Re- Arch 08), Dec. 2008, Madrid, Spain. B. P. Jokela, A. Zahemszky, C. Esteve Rothenberg, S. Arianfar, and P. Nikander. “LIPSIN: Line Speed Publish/Subscribe Inter-Networkings. ” In ACM SIGCOMM’ 09, Aug. 2009, Barcelona, Spain. C. A. Zahemszky, A. Császár, P. Nikander and C. Esteve Rothenberg. “Exploring the Pub/Sub Routing & Forwarding Space. ” In IEEE ICC, Workshop on the Network of The Future, Jun. 2009, Dresden, Germany. D. C. Esteve Rothenberg, P. Jokela, P. Nikander, M. Särela and J. Ylitalo. “Self-routing Denialof-Service Resistant Capabilities using In-packet Bloom Filters. ” In 5 th European Conference on Computer Network Defense (EC 2 ND), Nov. 2009, Milan, Italy. 5. C. Esteve Rothenberg. “Re-architecting the Cloud Data-Center Networks. ” In IEEE Com. Soc Optical Networking Technical Committee (ONTC) Newsletter PRISM, Vol 1. No 2. March 2010. 6. F. L. Verdi, C. Esteve Rothenberg, R. Pasquini and M. F. Magalhães. “Novas Arquiteturas de Data Center para Cloud Computing. ” Book Chapter, SBRC, Gramado, Brazil, May 2010 E. C. Esteve Rothenberg, C. A. Macapuna, F. L. Verdi, M. F. Magalhães and A. Zahemszky. “Data center networking with in-packet Bloom filters. ” In 28 th Brazilian Symposium on Computer Networks (SBRC), Gramado, Brazil, May 2010.

Trabalhos Publicados Pelo Autor 8. A. Zahemszky, B. Gajic, C. Esteve Rothenberg, C. Reason, D. Trossen, D. Lagutin, J. Tuononen and K. Katsaros. “Experimentally-driven research in publish/subscribe information-centric inter-networking. ” In Tridentcom 2010, May. 2010, Berlin, Germany. F. C. Esteve Rothenberg, C. A. Macapuna, F. L. Verdi and M. F. Magalhães. “The Deletable Bloom Filter: A new member of the Bloom family. ” In IEEE Communication Letters, June 2010. G. C. Esteve Rothenberg, C. A. Macapuna, F. L. Verdi, M. F. Magalhães and A. Wiesmaier. “In-packet Bloom filters: Design and networking applications. ” Accepted in Elsevier Computer Networks. H. M. Särela, C. Esteve Rothenberg, A. Zahemszky, J. Ött and P. Nikander. “Bloom. Casting: Security in Bloom ﬁlter based multicast. ” In Nordsec 2010, Oct. 2010, Espoo, Finland. 12. M. Särela, C. Esteve Rothenberg, T. Aura, A. Zahemszky, J. Ött and P. Nikander. “Forwarding Anomalies in Bloom Filter Based. Multicast. ” Accepted in IEEE Infocom 2011. 13. Carlos A. B. Macapuna, C. Esteve Rothenberg and M. F. Magalhães. "In-packet Bloom ﬁlter based data center networking with distributed Open. Flow controllers". In 2 nd IEEE MENS 2010, Dec. 2010, Miami, Florida, USA 14. S. Tarkoma, C. Esteve Rothenberg and E. Lagerspetz. “Theory and Practice of Probabilistic Filters for Distributed Systems. ” To appear in IEEE Communications Surveys and Tutorials. .

Resultados de Propriedade Intelectual • International Patent Application PCT/EP 2008/063647, Patent: “Packet Forwarding in a Network, ” Applicant: Ericsson AB, Publication number: WO 2010/022799 (A 1), Publication date: 2010 -03 -04, Date of Filing: 200810 -10, Inventors: JOKELA, Petri, ESTEVE, Christian, KJÄLLMAN, Jimmy, NIKANDER, Pekka, RINTA-AHO, Teemu, YLITALO, Jukka, • International Patent Application PCT/EP 2009/062785, Patent: “Secure In. Packet Bloom ﬁlter applications, ” Applicant: Ericsson AB, Date of Filing: 2009 -06 -10, Inventors: ESTEVE, Christian, JOKELA, Petri, NIKANDER, Pekka, SÄRELÄ, Mikko, YLITALO, Jukka.

Future work • Active area of networking with i. BFs – GMPLS, mobility, inter-domain [PURSUIT], optical – Open questions: E. g. , Dl. BF dynamics (Keep doing insertion and deletions). Other compact and more flexible bitmaps? • Explore more applications and networking scenarios – Si. BF: Remove middlebox Bloomed Service IDs + beyond VLB – Bloom. Casting: Remove AS edges (permutating Dl. BFs) – CCN: Relief the core from PIT entries with topology-carrying i. BFs. SPSwitch engine for structured CCN names (e. g. , a/b/c/d/: hash) ? – i. BFs: fault-carrying, multi-path, path information (e. g. , measurements) • Right balance between edge-, network- and packet state? • Plus many broader architectural questions of contentoriented networks

Questions?

BACKUP

Compact forwarding in content-oriented networks Traditional Forwarding Compact Forwarding • Scale by – Structural aggregation (hierarchical IP) – Timely information (e. g. Ethernet) • Deterministic algorithms – Forward on longest prefixes (e. g. , Tree bitmap IP lookup) – Exact lookup matches (e. g. , MPLS, Ethernet) • Focus on unicast • Sender-oriented – Synthetic aggregation (lossy compression) – Trading state for over deliveries (flexible operation point) – Moving (approximate) forwarding state to packets • Probabilistic algorithms – Trade correctness for space/time requirements – Prone to one-sided errors (portforwarding w/ extra packet dups. ) • Focus on multicast • Receiver-oriented

Principles in action

Observation Fundamentals of the Internet • Collaboration – Reflected in forwarding & routing • Cooperation – Reflected in trust among participants • Endpoint-centric services – Mail, FTP, even Web – Reflected in E 2 E principle IP, full end-to-end reachability Reality in the Internet • Current economics favor senders – Receivers are forced to carry the cost of unwanted traffic • Phishing, spam, viruses – There is no trust any more • Information-centric services – Do end-points really matter? – Information retrieval through, e. g. , CDNs, P 2 P IP with middleboxes & significant decline in trust Source: EU FP 7 PSIRP Project

SPSwitch forwarding engine model [Re-Arch 08] implementation (FCF d-left hash Table) Is packet label X in forwarding port P? Problem: Still O(n) entries

Practical fpr gains

False positive performance m bit array, k independent hash functions , n elements inserted Given m and a target capacity n, there is an optimal number of hash functions k = ln 2 * m/n False positive rate depends solely on the ratio m/n fpr approx 0. 61 m/n or approx 0. 5 k Independently from the form/size of the elements

The revised main point of Bloom filters § Whenever you have a set or list or function and space is an issue, an approximate representation, like a Bloom filter may be a useful alternative. § Just be sure to consider the effects of the false positives! § Ask the right question: § Is element x not in this set? No false negatives § Is element x in this set? Handle low false positives § M / n ratio determines false positive rate § Elements can be potentially anything § Node IDs, network paths, packets, domains, ontology trees, names, interfaces, signatures, etc. 53

Design Principles • Separating Names from Locations – IP for VM identification, pure “L 2” connectivity • Source explicit routing with zero-overhead – Stateless intermediate switching based on the i. BF • Direct network control and logically centralized directory – Rack Managers install flows at To. Rs and jointly maintain topology and server dir. • Load balancing through path randomization – Exploit path multiplicity to provide oblivious routing (i. e. , traffic independent randomized packet routing) [Valiant Load Balancing ] • Unmodified end-points and plug & play – Legacy servers and applications are supported off-the-shelf. – Auto-configuration of end-hosts and switches (Role Discovery Protocol) • Design for failure – Assume any component will fail (built-in fault-tolerance)

Traffic Efficiency Optimizations § virtual links § tunneling with tree/forest support § created by topology layer when useful § alternative link identifiers § multiple identifier sets to choose from § pick set with least false positives § do-not-forward link identifiers § included in forwarding identifiers § requested downstream, created upstream 55

Security properties • z-Filter works both as a forwarding ID and a capability – To send, a host needs to know or guess a valid z. Filter • z-Formation – Bound to the incoming and outgoing ports • Traffic injection difficult (due to binding to incoming port) – Very hard to construct one without knowing keys along the path • Correlation attacks possible only for a given flow ID – Bound to the packet stream (flow ID) – Need a cryptographically good and fast Z algorithm

Injection attacks • Assuming attacker knows a z. Filter passing at h hops distance from attacker • Left y-axis shows the probability of a single packet reaching target for various fill factors • Right y-axis shows the average number of attempts for one successful injection with probability 0. 5

Discussion • Replay attacks: limited to the key lifetime – As z. Filters are tied to periodically changing keys (K(t)), one per node, the capabilities become expirable • Brute force attack: “Best” attack strategy – Assuming attack traffic of 1 M pps (1 Gbps / 1000 bits pp) > 40 min to guess (with Pr=0. 5) one 5 -hop working z. Filter (which is only usable for single host) • Re-keying time? – Trade-off between minimizing duration of unwanted traffic vs. overhead of z. Filter renewal e. g. , 1 min enough to complete transactional traffic + protect short paths • Attack detection and mitigation: – fpr increase triggers detection plus e. g. blacklist mechanism on Flow. ID

Challenges & Approach • Common challenge in data-oriented paradigms: – Take switching decisions • at wire speed (Gbps) • on a large universe (e. g. , 256 -bit hash values) • of flat (non-aggregatable) identifiers • A probabilistic approach – Let’s take advantage of the data-oriented paradigm: – Pub/sub inherently tolerates false positives – Opportunistic caching

Secure i. BF conclusions • Can we deliver packets securely without addresses? • z-Formation is a compact and secure source routing method • Forwarding identifiers are – small and efficient to compute • Capability-like properties – expirable, – bound to packet flow, – content/communication intention • Stateless – No need for per flow state – Forwarding with zero FIB table lookups

Virtual links State in network nodes One-to-one, one-to-many, many-to-one forw. structures Supporting horizontal and/or hierarchical aggregation Less overdeliveries

RTFM Architecture* • Rendezvous: Matches subscriptions to publications. • Topology: Creates and maintains delivery trees used forwarding traffic. • Forwarding: Actual data delivery operations. (label switching and fast forwarding tables) • Mediation: Node-to-node link data transfer & More (opportunistic caching, collaborative and network coding) • Metadata and hash-based identifiers Identifiers space (approx. ) • 1015 rendezvous identifiers (256 -bit Ri. D) • 1010 scope identifiers (256 -bit Si. D) • Forwarding identifies (256 -bit Fi. D) Fid *[Särelä et al. 2008] Flat IDs … Sid Rid Data

Bloom filter fundamentals m bit array k independent hash functions Hash functions are Bloom Filters with k = 1 n elements inserted Insert_element() Check_element() “yes” / no ?

Forwarding Efficiency Optimizations § virtual links § tunneling with tree/forest support § created by topology layer when useful § alternative link identifiers § multiple identifier sets to choose from § pick set with least false positives § do-not-forward link identifiers § included in forwarding identifiers § requested downstream, created upstream 64