Packet Classification Using Coarse-Grained Tuple Spaces
Haoyu Song, Jon Turner and Sarang Dharmapurikar
www.arl.wustl.edu

Overview
- Two-dimensional packet classification problem
  » in list of 2d filters, find first match for given address pair
- Example: (1011, 0111): [<101*, 10*>, <10*, 011*>, <1*, 01*>]
- Limitations of current solutions
  » fast algorithmic methods require excessive space (≥ 50x)
  » TCAM has high cost per bit, significant power usage
- Combining cross-product and tuple-space search
  » hybrid strategy with range of time-space tradeoff options
- Improving 1d lookups
  » combining tree bitmap and Bloom filters
- Possible extensions

Haoyu Song, Jonathan Turner and Sarang Dharmapurikar - 9/17/2020
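The first-match problem stated above can be sketched as a plain linear scan. This is only a reference definition of the problem, not one of the fast methods the slides discuss. The names `prefix_matches` and `first_match` are hypothetical, addresses are modeled as bit strings, and treating the pair as (source, destination) is an assumption.

```python
def prefix_matches(prefix: str, addr: str) -> bool:
    """True if the bit string addr begins with prefix (a trailing '*' is a wildcard)."""
    return addr.startswith(prefix.rstrip("*"))


def first_match(filters, src: str, dst: str):
    """Return the index of the first filter matching both fields, or None."""
    for index, (src_prefix, dst_prefix) in enumerate(filters):
        if prefix_matches(src_prefix, src) and prefix_matches(dst_prefix, dst):
            return index
    return None


# Filter list and address pair taken from the slide.
filters = [("101*", "10*"), ("10*", "011*"), ("1*", "01*")]
print(first_match(filters, "1011", "0111"))  # prints 1, i.e. <10*, 011*>
```

The first filter fails on the second field (10* does not match 0111), so the scan stops at the second filter even though the third also matches.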

Cross-Product Method
- Procedure
  » do 1d lookup on all fields
  » combine results into lookup key in cross-product table
- Direct lookup table or hash table
- Fast, but space grows as n^k for n filters, k fields

Example
  filter set: F0: <1010*, 01*>, F1: <101*, 0111*>
  packet (10100, 01110): 1d lookups return S0 and D1, giving cross-product table key S0D1

  key     filter
  S0D0    F0
  S0D1    F0
  S1D0    none
  S1D1    F1
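A minimal sketch of the cross-product method on the slide's two-filter example. The helper names are hypothetical, a linear scan stands in for a real 1d longest-prefix-match structure, and lower filter index is assumed to mean higher priority. The mapping S0 = 1010*, S1 = 101*, D0 = 01*, D1 = 0111* is inferred from the example packet.

```python
from itertools import product


def bits(prefix: str) -> str:
    return prefix.rstrip("*")


def covers(general: str, specific: str) -> bool:
    """True if every address matching `specific` also matches `general`."""
    return bits(specific).startswith(bits(general))


def longest_prefix_match(prefixes, addr: str):
    """Linear-scan stand-in for a 1d lookup; returns the longest matching prefix or None."""
    hits = [p for p in prefixes if addr.startswith(bits(p))]
    return max(hits, key=lambda p: len(bits(p)), default=None)


def build_cross_product_table(filters):
    src_prefixes = sorted({s for s, _ in filters})
    dst_prefixes = sorted({d for _, d in filters})
    table = {}
    for s, d in product(src_prefixes, dst_prefixes):
        # First filter whose two prefixes cover this pair of 1d results.
        table[(s, d)] = next(
            (i for i, (fs, fd) in enumerate(filters) if covers(fs, s) and covers(fd, d)),
            None,
        )
    return src_prefixes, dst_prefixes, table


def classify(src_prefixes, dst_prefixes, table, src: str, dst: str):
    s = longest_prefix_match(src_prefixes, src)
    d = longest_prefix_match(dst_prefixes, dst)
    if s is None or d is None:
        return None  # no filter can match if either field matches no prefix
    return table[(s, d)]


filters = [("1010*", "01*"), ("101*", "0111*")]  # F0, F1 from the slide
src_prefixes, dst_prefixes, table = build_cross_product_table(filters)
print(classify(src_prefixes, dst_prefixes, table, "10100", "01110"))  # prints 0 (F0)
```

The table has one key per pair of distinct field prefixes, which is where the n^k growth comes from: with k fields and up to n distinct prefixes per field, the key space is the product of the per-field prefix sets.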

2D Tuple Space Search
- Group by prefix length
  » hash table per group
  » up to 33 x 33 = 1,089 groups
  » in practice 30-100 occupied tuples
- Rectangle search
  » markers to guide search
  » at most 33 probes, often fewer
  » hard to update
- Pruned tuple space search
  » 1d search on src/dest fields
  » find prefix lengths that match src/dest fields of packet
  » search intersecting tuples
  » if ≤ k matching prefixes, at most k^2 probes

[Figure: tuple space grid, Source IP Prefix Length (0-32) by Destination IP Prefix Length (0-32)]
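The grouping step can be sketched as one hash table per (source length, destination length) tuple. The sketch below probes every occupied tuple, which is the simple form of tuple space search; it does not implement rectangle search or pruning. Function names are hypothetical and lower filter index is assumed to mean higher priority.

```python
from collections import defaultdict


def build_tuple_space(filters):
    """Map (src_len, dst_len) -> {(src_bits, dst_bits): filter index}."""
    tuples = defaultdict(dict)
    for index, (src_prefix, dst_prefix) in enumerate(filters):
        s, d = src_prefix.rstrip("*"), dst_prefix.rstrip("*")
        # setdefault keeps the earliest filter if two filters share a key.
        tuples[(len(s), len(d))].setdefault((s, d), index)
    return tuples


def tuple_space_search(tuples, src: str, dst: str):
    """Probe every occupied tuple once; return (best filter index, probe count)."""
    best, probes = None, 0
    for (src_len, dst_len), table in tuples.items():
        probes += 1
        hit = table.get((src[:src_len], dst[:dst_len]))
        if hit is not None and (best is None or hit < best):
            best = hit
    return best, probes


filters = [("101*", "10*"), ("10*", "011*"), ("1*", "01*")]
tuples = build_tuple_space(filters)
print(tuple_space_search(tuples, "1011", "0111"))  # prints (1, 3)
```

Each probe is a single exact-match hash lookup because all filters in a tuple share the same pair of prefix lengths. The cost is one probe per occupied tuple, which is the number the pruned and coarse-grained variants try to reduce.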

Coarse-Grained Tuple Space
- Select coarse-grained partition of tuple space
- Build cross-product table per sub-space
- Search procedure
  » 1d lookups for LPM
  » probe each subspace
  » terminate early if possible
- Pruning
  » identify candidate subspaces during 1d lookup
  » probe selected sub-spaces
- Space/time tradeoff

[Figure: tuple space grid partitioned into coarse sub-spaces, Source IP Prefix Length (0-32) by Destination IP Prefix Length (0-32)]
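One way to read the procedure above is sketched below: prefix lengths are cut into ranges, each rectangular sub-space gets its own small cross-product table, and a lookup probes every sub-space with the longest matching prefix found inside it. This is an illustration under assumptions, not the authors' implementation: the boundary lists, helper names, and linear-scan LPM are hypothetical, and early termination and pruning are omitted.

```python
from bisect import bisect_left
from itertools import product


def bits(prefix: str) -> str:
    return prefix.rstrip("*")


def covers(general: str, specific: str) -> bool:
    return bits(specific).startswith(bits(general))


def lpm(prefixes, addr: str):
    """Linear-scan stand-in for a 1d longest-prefix-match lookup."""
    hits = [p for p in prefixes if addr.startswith(bits(p))]
    return max(hits, key=lambda p: len(bits(p)), default=None)


def build_coarse_tuple_space(filters, src_bounds, dst_bounds):
    """Bounds are inclusive upper ends of length ranges, e.g. [3] splits into <=3 and >3."""
    groups = {}
    for index, (s, d) in enumerate(filters):
        region = (bisect_left(src_bounds, len(bits(s))), bisect_left(dst_bounds, len(bits(d))))
        groups.setdefault(region, []).append((index, s, d))
    subspaces = []
    for members in groups.values():
        src_set = sorted({s for _, s, _ in members})
        dst_set = sorted({d for _, _, d in members})
        table = {}
        for s, d in product(src_set, dst_set):
            # members are in filter order, so the first hit has the lowest index.
            hit = next((i for i, fs, fd in members if covers(fs, s) and covers(fd, d)), None)
            if hit is not None:
                table[(s, d)] = hit
        subspaces.append((src_set, dst_set, table))
    return subspaces


def classify(subspaces, src: str, dst: str):
    best = None
    for src_set, dst_set, table in subspaces:  # one table probe per sub-space
        hit = table.get((lpm(src_set, src), lpm(dst_set, dst)))
        if hit is not None and (best is None or hit < best):
            best = hit
    return best


filters = [("1010*", "01*"), ("101*", "0111*"), ("1*", "0*")]
subspaces = build_coarse_tuple_space(filters, src_bounds=[3], dst_bounds=[2])
print(classify(subspaces, "10100", "01110"))  # prints 0
```

On this three-filter example each sub-space holds one filter, so the three tables store three entries in total, whereas a single cross-product table over the whole set would need 3 x 3 = 9 keys. The price is one probe per sub-space instead of one probe overall, which is the space/time tradeoff named on the slide.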

Performance of Basic Algorithm
- Equal size divisions of 2d tuple space
- Ratio of cross-products to filter set size
- 2 x 2 partition brings space usage to 2x minimum
  » maximum of four probes required
  » compared to 30-90 for simple tuple space search
- Pruning of limited use for filter sets of size < 10^4

Performance of Best Configurations
[Figure: performance plot; surviving annotations are 4x, 3x, 2x]

Alternate Partitioning Approaches
- Arbitrary sub-spaces are possible
  » potential for fewer regions with good space efficiency
- Preliminary results mixed
  » may be useful for smaller filter sets
- More evaluation needed

Note: filters of form <prefix, *> and <*, prefix> stored in 1d data structures

[Figure: tuple space grid with irregular sub-spaces, Source IP Prefix Length (0-32) by Destination IP Prefix Length (0-32)]

Fast 1d Lookups

Tree Bitmap
- Multibit trie
- Co-located children
- Bitmaps for
  » prefix nodes
  » subtree presence
- 4 bit stride implies 8 memory accesses

Hashing + Bloom Filters
- Expand prefixes to “standard” lengths
- Off-chip hash table per length
- On-chip Bloom filters to avoid unproductive probes
- Large space requirements for good worst-case performance

[Figure: multibit trie with node bitmaps; Bloom filters in front of off-chip hash tables for length 1 (0, 1), length 3 (110, 111, 000, 001) and length 5 (10100, 10101, 10110, 10111)]
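The hashing approach can be sketched as follows: prefixes are expanded to a few standard lengths, each length gets a hash table (a Python dict stands in for the off-chip table), and a Bloom filter per length is consulted first so that most fruitless probes are skipped. The `BloomFilter` class, its sizing, and the use of SHA-256 as the hash are illustrative assumptions, not details from the slides.

```python
import hashlib
from itertools import product


class BloomFilter:
    def __init__(self, num_bits: int = 256, num_hashes: int = 3):
        self.num_bits, self.num_hashes = num_bits, num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for clarity only

    def _positions(self, key: str):
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key: str) -> None:
        for pos in self._positions(key):
            self.bits[pos] = 1

    def might_contain(self, key: str) -> bool:
        return all(self.bits[pos] for pos in self._positions(key))


def build(prefixes: dict, standard_lengths):
    """prefixes maps bit strings (no '*') to results; every prefix must fit a standard length."""
    tables = {length: {} for length in standard_lengths}
    for prefix in sorted(prefixes, key=len):  # longer prefixes overwrite shorter expansions
        target = min(l for l in standard_lengths if l >= len(prefix))
        for tail in product("01", repeat=target - len(prefix)):
            tables[target][prefix + "".join(tail)] = prefixes[prefix]
    blooms = {}
    for length, table in tables.items():
        blooms[length] = BloomFilter()
        for key in table:
            blooms[length].add(key)
    return blooms, tables


def lookup(blooms, tables, addr: str):
    """Longest length first; probe a table only when its Bloom filter reports a possible hit."""
    probes = 0
    for length in sorted(tables, reverse=True):
        key = addr[:length]
        if len(key) == length and blooms[length].might_contain(key):
            probes += 1  # counts hash-table probes, including false positives
            if key in tables[length]:
                return tables[length][key], probes
    return None, probes


blooms, tables = build({"1": "A", "10": "C", "110": "B"}, standard_lengths=[1, 3])
print(lookup(blooms, tables, "11011")[0])  # prints B
```

Expansion is what makes the tables large: the length-2 prefix 10 becomes two length-3 entries (100 and 101). A Bloom filter never reports a false negative, so a skipped table cannot hide a match; a false positive only costs one extra probe.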

Fast and Compact 1d Lookups
- Insert tree bitmap subtree roots into off-chip hash tables and on-chip Bloom filters
- Lookup prefix of subtree roots in Bloom filters
  » if match on length k and all shorter lengths, probe off-chip table for length k
- Reduction in on-chip memory for Bloom filters
  » shape-shifting trie yields further space reduction

[Figure: tree bitmap subtrees, with subtree roots feeding Bloom filters and hash tables]

1d Lookup Performance
- 200K IPv4 prefixes
- 5 bit stride for tree bitmap
- 8 bit on-chip “root table”
- 4 Bloom filters
  » 1 BF entry for every 2 prefixes
  » 1 off-chip probe (4 incl. FP)
- 2 Bloom filters
  » 1 BF entry for every 6 prefixes
  » 2 off-chip probes (4 incl. FP)

Practical Configuration
- Configure 1d lookups for 1 off-chip probe each (excluding false positives)
  » about 5 bits per prefix for Bloom filters with low FP rate
- Record <prefix, *> and <*, prefix> filters in 1d lookup data structures
  » also proposed in recent paper by Kounavis et al.
- Divide remaining filters among four subspaces
  » approximately 2 off-chip hash table entries per filter
  » at most four probes
- With single QDR SRAM at 200 MHz, 32 bit word size can do 200 million probes per second
  » about 33 million packets/second
  » 40 byte packets at 10 Gb/s
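The throughput figures above can be checked with a short calculation. The slide gives the probe rate and the resulting packet rate but not the probes-per-packet count; six probes per packet (one per 1d lookup on two fields, plus four sub-space probes) is an assumption that reproduces the stated 33 million packets/second. Function names are hypothetical.

```python
def packets_per_second(probe_rate: float, probes_per_packet: int) -> float:
    """Packet rate the memory can sustain at a given probe budget per packet."""
    return probe_rate / probes_per_packet


def line_rate_packets_per_second(bits_per_second: float, packet_bytes: int) -> float:
    """Packet arrival rate for back-to-back packets of a fixed size."""
    return bits_per_second / (packet_bytes * 8)


PROBE_RATE = 200e6             # probes per second, from the slide
PROBES_PER_PACKET = 2 * 1 + 4  # assumption: two 1d lookups at one probe each, plus four sub-space probes

supported = packets_per_second(PROBE_RATE, PROBES_PER_PACKET)
required = line_rate_packets_per_second(10e9, 40)
print(f"{supported / 1e6:.1f} Mpps supported, {required / 1e6:.2f} Mpps required")
# prints: 33.3 Mpps supported, 31.25 Mpps required
```

Under these assumptions a single memory channel covers 40 byte packets at 10 Gb/s with a small margin; any extra probes caused by Bloom filter false positives would eat into that margin.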

Possible Extensions
- More extensive evaluation
  » scaling to larger filter sets: 100-200K filters
  » integrated evaluation of 1d and 2d lookups
  » systematic evaluation of alternate partitioning strategies
- Alternate representations of filter sub-spaces
  » any filter set data structure is candidate
  » using decision trees, can skip 1d lookups
- Generalization to more dimensions
  » handling fields with ranges (for port numbers)
  » coarse-grained grouping of tuple-spaces defined on “nesting level”
  » can we beat TCAM?