Outline
◦ Introduction
◦ Background
◦ Proposed Scheme (UCPE)
◦ Experiment Results
◦ Conclusion
National Cheng Kung University CSIE Computer & Internet Architecture Lab
Introduction (1/3): IP Address Lookup
Longest matching prefix lookup on a 4-bit destination IP.
Prefixes: a = 0*, b = 00*, c = 001*, d = 10*
◦ Destination 0001: match b (00*), the longest matching prefix
◦ Destination 1111: no match
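The lookup on this slide can be reproduced with a minimal sketch. The function name `longest_prefix_match` and the dictionary layout are illustrative assumptions, not part of the slides; the linear scan over prefix lengths is only the simplest way to show the rule, not the trie-based method discussed later.

```python
from typing import Dict, Optional

# Prefix table from the slide: prefix bits (without the trailing '*') -> label.
PREFIXES: Dict[str, str] = {"0": "a", "00": "b", "001": "c", "10": "d"}


def longest_prefix_match(addr_bits: str, table: Dict[str, str]) -> Optional[str]:
    """Return the label of the longest prefix of addr_bits in table, or None."""
    # Try the longest candidate first, so the first hit is the longest match.
    for length in range(len(addr_bits), 0, -1):
        label = table.get(addr_bits[:length])
        if label is not None:
            return label
    return None


print(longest_prefix_match("0001", PREFIXES))  # b
print(longest_prefix_match("1111", PREFIXES))  # None
```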
Introduction (2/3): BGP routing table update
◦ IPMA reports that BGP announces and withdraws routing prefixes at about 25 prefix updates/sec, i.e., one update every 40 ms on average.
◦ Of all the prefixes updated over a period of time, only 2.5% are unique.
◦ Many of the routing updates are therefore repeated changes to the same prefixes.
Introduction (3/3): Trie-based IP Lookup Scheme
Prefixes: a = 000*, b = 01*, c = 10*, d = 1000*, e = 1001*, f = 110*, g = 111*
◦ Input IP 1011: match c
◦ Input IP 1001: match e
[Figure: the prefixes stored in a binary trie (one bit per level) and in a multibit trie (two bits per level, with entries 00, 01, 10, 11 in each node).]
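A binary-trie lookup for the two inputs on this slide could look like the sketch below. The class and function names (`TrieNode`, `insert`, `lookup`) are hypothetical; the slides show only the trie drawing, not an implementation.

```python
class TrieNode:
    """One binary-trie node: up to two children and an optional prefix label."""

    def __init__(self):
        self.children = {}   # '0' or '1' -> TrieNode
        self.label = None    # set when a prefix ends at this node


def insert(root, prefix_bits, label):
    node = root
    for bit in prefix_bits:
        node = node.children.setdefault(bit, TrieNode())
    node.label = label


def lookup(root, addr_bits):
    """Walk the trie bit by bit and remember the last labelled node seen."""
    node, best = root, root.label
    for bit in addr_bits:
        node = node.children.get(bit)
        if node is None:
            break
        if node.label is not None:
            best = node.label
    return best


root = TrieNode()
for bits, name in [("000", "a"), ("01", "b"), ("10", "c"), ("1000", "d"),
                   ("1001", "e"), ("110", "f"), ("111", "g")]:
    insert(root, bits, name)

print(lookup(root, "1011"))  # c
print(lookup(root, "1001"))  # e
```

The walk inspects one bit per step, so a 32-bit address can need up to 32 node visits; this is the cost that the multibit trie trades against memory.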
Motivation
◦ Search time in a multibit-trie-based IP lookup approach can be bounded, so fast search speed can be guaranteed.
◦ The multibit trie approach also incurs more memory cost and update cost because of prefix expansion.
◦ The proposed algorithm finds a multibit trie construction that reduces the update cost caused by prefix expansion.
Outline
◦ Introduction
◦ Background: basic scheme of multibit tries; Controlled Prefix Expansion
◦ Proposed Scheme (UCPE)
◦ Experiment Results
◦ Conclusion
Basic Scheme of Multi-bit Trie (1/4)
Stride: the number of bits inspected at one level.
Examples:
◦ 4-level 8.8.8.8 (8-16-24-32): strides 8, 8, 8, 8
◦ 3-level 16.8.8 (16-24-32): strides 16, 8, 8
Basic Scheme of Multi-bit Trie (2/4)
◦ Fixed-stride trie (FST): all nodes at the same level have the same stride.
◦ Variable-stride trie (VST): otherwise.
Prefixes: a = 000*, b = 01*, c = 10*, d = 1000*, e = 1001*, f = 110*, g = 111*
[Figure: the binary trie for these prefixes, an FST with stride 2 at both levels, and a VST whose root has stride 2 and whose child nodes have stride 1 or 2.]
Basic Scheme of Multi-bit Trie (3/4)
[Figure: the stride-2 FST drawn two ways, as binary trie nodes and as multibit trie nodes. Root entries: 00, 01 = b, 10 = c, 11. Child node under 00: a, a, empty, empty. Child node under 10: d, e, empty, empty. Child node under 11: f, f, g, g.]
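The expanded entries in the stride-2 FST can be derived mechanically. The sketch below is an illustration under assumptions: the function name `expand_to_levels` and the flat per-level dictionaries are hypothetical simplifications (a real multibit trie stores each level as separate nodes with child pointers).

```python
def expand_to_levels(prefixes, boundaries):
    """Expand each prefix to the first level boundary that can hold it.

    prefixes:   dict of prefix bits -> label
    boundaries: cumulative bit positions of the levels, e.g. [2, 4]
    Returns one dict per level mapping expanded bits -> label.
    """
    tables = [dict() for _ in boundaries]
    # Shorter prefixes first, so a longer prefix overwrites a shorter one
    # wherever their expansions collide (longest prefix wins).
    for bits, label in sorted(prefixes.items(), key=lambda kv: len(kv[0])):
        idx = next(i for i, b in enumerate(boundaries) if b >= len(bits))
        pad = boundaries[idx] - len(bits)
        for n in range(2 ** pad):
            suffix = format(n, "b").zfill(pad) if pad else ""
            tables[idx][bits + suffix] = label
    return tables


PREFIXES = {"000": "a", "01": "b", "10": "c", "1000": "d",
            "1001": "e", "110": "f", "111": "g"}

level1, level2 = expand_to_levels(PREFIXES, [2, 4])
print(level1)  # {'01': 'b', '10': 'c'}
print(level2["0000"], level2["0001"], level2["1100"], level2["1101"])  # a a f f
```

Prefixes a, f and g each have length 3 and are expanded into two 4-bit entries, which is the extra memory and update work that the later slides try to control.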
Basic Scheme of Multi-bit Trie (4/4)
◦ A multibit trie is a trie in which each node has $2^k$ children, where $k$ is the stride.
◦ 8.8.8.8 (8-16-24-32): every node has $2^8$ entries.
◦ 16.8.8 (16-24-32): the root node has $2^{16}$ entries, and nodes at the other levels have $2^8$ entries.
◦ Which trie is better? For 4 levels, 16.4.4.8 (16-20-24-32) has lower memory cost (number of children).
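The relation between the boundary notation (16-20-24-32), the stride notation (16.4.4.8) and the per-node size can be checked with a few lines. The helper names `strides` and `node_sizes` are illustrative assumptions.

```python
def strides(boundaries):
    """Convert cumulative level boundaries (e.g. 16-20-24-32) into strides."""
    result, previous = [], 0
    for boundary in boundaries:
        result.append(boundary - previous)
        previous = boundary
    return result


def node_sizes(boundaries):
    """Number of entries in one node at each level: 2 ** stride."""
    return [2 ** s for s in strides(boundaries)]


print(strides([16, 20, 24, 32]))     # [16, 4, 4, 8]
print(node_sizes([8, 16, 24, 32]))   # [256, 256, 256, 256]
print(node_sizes([16, 24, 32]))      # [65536, 256, 256]
```

These are sizes of a single node; the total memory also depends on how many nodes exist at each level, which is what the dynamic program in the next section accounts for.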
Controlled Prefix Expansion
◦ A storage optimization for multibit tries proposed by Srinivasan and Varghese in 1999 [1].
◦ For a given k and a given prefix table, Controlled Prefix Expansion (CPE) finds the k-level multibit trie with the minimum memory requirement.
◦ Based on dynamic programming.
[1] V. Srinivasan and G. Varghese, "Fast address lookups using controlled prefix expansion," ACM Transactions on Computer Systems, vol. 17, no. 1, pp. 1-40, Feb. 1999.
CPE Fixed-Stride Algorithm (1/2)
◦ $T[W, k]$: the minimum memory cost over all k-level tries covering prefixes of up to W bits. For IPv4, $W = 32$.
◦ $node(i)$: the number of non-leaf nodes at binary trie level $i$.
When $k = 1$: $T[W, 1] = 2^W$. Example: $T[32, 1]$ is a single level of stride 32 with memory cost $2^{32}$.
When $k > 1$: find the position $m$ of level $k-1$:
$T[W, k] = \min_{k-1 \le m \le W-1} \{ T[m, k-1] + node(m) \times 2^{W-m} \}$
Example: for $T[32, 4]$, Level 3 ends at bit $m$, with $m$ ranging from 3 to 31, and Level 4 spans the remaining $32 - m$ bits.
◦ Memory cost of Levels 1 to 3 $= T[m, k-1]$
◦ Memory cost of Level 4 $=$ (number of nodes) $\times$ (node size) $= node(m) \times 2^{32-m}$
CPE Fixed-Stride Algorithm (2/2)
General step: multibit level $k-1$ ends at binary trie level $m$, and multibit level $k$ ends at binary trie level $j$.
◦ Cost of Levels 1 to $k-1$ $= T[m, k-1]$
◦ Cost of Level $k$ $= node(m) \times 2^{j-m}$
◦ $j$: highest level in the binary trie
◦ $k$: highest level in the multibit trie
◦ $node(i)$: number of non-leaf nodes at binary trie level $i$
Controlled Prefix Expansion
◦ The time complexity of CPE for fixed-stride tries is $O(kW^2)$, where $k$ is the total number of levels in the multibit trie and $W$ is the maximum prefix length.
◦ Following the same idea, the minimum memory cost of variable-stride tries can also be found.
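The fixed-stride dynamic program can be sketched as follows. This is an illustration that follows the cost terms as written on the slides ($T[j, 1] = 2^j$ and $node(m) \times 2^{j-m}$); the function names `nonleaf_counts` and `cpe_fixed_stride` and the 4-bit toy table are assumptions, and the original paper [1] may index trie levels differently.

```python
def nonleaf_counts(prefixes, width):
    """node[i] = number of non-leaf binary-trie nodes at depth i (root = 0)."""
    nodes = {""}
    for p in prefixes:
        for i in range(1, len(p) + 1):
            nodes.add(p[:i])
    node = [0] * (width + 1)
    for n in nodes:
        if (n + "0") in nodes or (n + "1") in nodes:
            node[len(n)] += 1
    return node


def cpe_fixed_stride(prefixes, width, k):
    """Return (minimum memory cost, level boundaries) for a k-level FST."""
    node = nonleaf_counts(prefixes, width)
    inf = float("inf")
    # T[r][j]: minimum cost of covering bits 1..j with exactly r levels.
    T = [[inf] * (width + 1) for _ in range(k + 1)]
    split = [[None] * (width + 1) for _ in range(k + 1)]
    for j in range(1, width + 1):
        T[1][j] = 2 ** j
    for r in range(2, k + 1):
        for j in range(r, width + 1):
            for m in range(r - 1, j):
                cost = T[r - 1][m] + node[m] * 2 ** (j - m)
                if cost < T[r][j]:
                    T[r][j], split[r][j] = cost, m
    # Walk the recorded split points back from the last level.
    boundaries, j = [], width
    for r in range(k, 1, -1):
        boundaries.append(j)
        j = split[r][j]
    boundaries.append(j)
    return T[k][width], boundaries[::-1]


PREFIXES = ["000", "01", "10", "1000", "1001", "110", "111"]
print(cpe_fixed_stride(PREFIXES, width=4, k=2))  # (10, [3, 4])
```

For the seven example prefixes with $W = 4$ and $k = 2$, the split at $m = 2$ (the stride-2 FST drawn earlier) costs $4 + 3 \times 4 = 16$ entries, while the split at $m = 3$ costs $8 + 1 \times 2 = 10$, so the program picks boundaries 3-4. The three nested loops give the $O(kW^2)$ running time stated above.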
Proposed scheme
We propose Update-aware Controlled Prefix Expansion (UCPE), which follows the concept of CPE. For a given k and a selected prefix table:
◦ the original CPE algorithm finds the k-level trie with minimum storage cost (storage-optimized);
◦ our algorithm finds the k-level multibit trie with minimum update cost (update-optimized).
Updating Multibit trie
Update steps of a prefix:
1. Search (read)
2. Write element
Prefixes: a = 000*, b = 01*, c = 1*, d = 1000*, e = 1001*, f = 110*, g = 111*
[Figure: the binary trie for these prefixes and the corresponding two-level multibit trie.]
Update cost = (number of reads) × (cost of read) + (number of writes) × (cost of write)
◦ Update cost of a $= 1 \times$ (cost of read) $+ 2^{4-3} \times$ (cost of write)
◦ Update cost of b $= 1 \times$ (cost of read) $+ 2^{4-2} \times$ (cost of write)
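The two example costs can be evaluated numerically. In this sketch the function name `update_cost` and its parameters are illustrative assumptions; the read and write weights 1 and 3 are taken from the $T_R : T_W$ ratio of about 1:3 reported later in the deck, and the single read per update follows the two worked examples on this slide.

```python
def update_cost(prefix_len, level_end, reads, t_read, t_write):
    """Cost of updating one prefix that is expanded to level_end bits.

    A prefix of length prefix_len expands into 2 ** (level_end - prefix_len)
    entries, and each expanded entry needs one write.
    """
    writes = 2 ** (level_end - prefix_len)
    return reads * t_read + writes * t_write


# Prefix a = 000* (length 3) and b = 01* (length 2), both expanded to 4 bits.
print(update_cost(3, 4, reads=1, t_read=1, t_write=3))  # 7
print(update_cost(2, 4, reads=1, t_read=1, t_write=3))  # 13
```

The write term grows exponentially with the distance between the prefix length and the end of its level, which is why the choice of level boundaries affects update time.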
UCPE Fixed-Stride Algorithm (1/4)
◦ $Opt_u[W, k]$: the minimum update cost over all k-level tries covering prefixes of up to W bits. For IPv4, $W = 32$.
◦ $T_R$: memory access cost of a read
◦ $T_W$: memory access cost of a write (store)
◦ $prob_j(l)$: the probability that a prefix in the update traffic has length $l$, among prefixes of length 1 to $j$. In practice, we use the prefix length distribution of the table as the probability.
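One way to estimate $prob_j(l)$ from a prefix table is sketched below. The normalization $prob_j(l) = pfx(l) / \sum_{i=1}^{j} pfx(i)$ is an assumption based on the definition above and on $pfx(l)$ (the number of prefixes of length $l$) from a later slide; the slides do not spell the formula out, and the function name `prob` is hypothetical.

```python
from collections import Counter


def prob(prefixes, j):
    """Estimate prob_j(l) for l = 1..j from the prefix length distribution."""
    pfx = Counter(len(p) for p in prefixes)          # pfx[l]: prefixes of length l
    total = sum(pfx[l] for l in range(1, j + 1))     # prefixes of length 1..j
    if total == 0:
        return {}
    return {l: pfx[l] / total for l in range(1, j + 1)}


PREFIXES = ["000", "01", "10", "1000", "1001", "110", "111"]
p3 = prob(PREFIXES, 3)
print(p3[2], p3[3])  # 0.4 0.6
```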
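UCPE Fixed-Stride Algorithm (2/4)
When $k = 1$:
[Figure: $Opt_u[32, 1]$, a single level of stride 32 from Level 0 to Level 1. A prefix of length $l$ is expanded over the remaining $32 - l$ bits. The update cost formula shown with the figure is not preserved.]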
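UCPE Fixed-Stride Algorithm (3/4)
When $k > 1$: find the position $m$ of level $k-1$.
[Figure: $Opt_u[32, 4]$. Level 3 ends at bit $m$, with $m$ ranging from 3 to 31, and the update cost of Levels 1 to 3 is expressed through $Opt_u[m, k-1]$. Level 4 spans the remaining $32 - m$ bits (node size $2^{32-m}$), and a prefix of length $l$ in Level 4 is expanded over $32 - l$ bits. The prefix update cost formula for Level 4 is not preserved.]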
UCPE Fixed-Stride Algorithm (4/4)
[Figure: general step. Multibit level $k-1$ ends at binary trie level $m$, and Level $k$ ends at binary trie level $j$. The cost of the upper levels is $Opt_u[m, k-1]$; the cost formula for Level $k$ is not preserved.]
◦ $prob(l)$: the probability that a prefix in the update traffic has length $l$, among prefixes of length 1 to $j$.
◦ $pfx(l)$: the number of prefixes of length $l$ in the prefix table.
UCPE Algorithm for Variable-Stride Tries (1/2)
[Figure: root node R at Level 0 with stride $s$; its subtrees $T_1, T_2, \ldots, T_{n-1}, T_n$ start at Level 1 and extend down to Level k. The update cost formula for the root node is not preserved.]
◦ Update cost of the subtrees = sum of the costs of covering $T_1$ through $T_n$ using k levels
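UCPE Algorithm for Variable-Stride Tries (2/2)
[Figure: node N at Level 0 with stride $s$; its descendants $Q_1, Q_2, \ldots, Q_k$ at Level 1 root the subtries that extend down to Level k. The two cost formulas shown with the figure are not preserved.]
◦ $height(N)$: height of node N in the binary trie B.
◦ $D_s(N)$: the set of all descendants of N that are at level $s$ of N in the binary trie B.
◦ $pfx(l)$: number of prefixes of length $l$.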
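The two helper quantities $height(N)$ and $D_s(N)$ can be computed directly on a binary trie. In this sketch a trie node is identified by its bit string from the root (the root is the empty string); that representation and the function names `trie_nodes`, `height` and `descendants_at` are assumptions made for illustration.

```python
def trie_nodes(prefixes):
    """All binary-trie nodes, each named by its bit path from the root ('')."""
    nodes = {""}
    for p in prefixes:
        for i in range(1, len(p) + 1):
            nodes.add(p[:i])
    return nodes


def height(nodes, n):
    """height(N): length of the longest downward path starting at node n."""
    return max(len(x) for x in nodes if x.startswith(n)) - len(n)


def descendants_at(nodes, n, s):
    """D_s(N): descendants of node n that lie exactly s levels below it."""
    return {x for x in nodes if x.startswith(n) and len(x) == len(n) + s}


PREFIXES = ["000", "01", "10", "1000", "1001", "110", "111"]
NODES = trie_nodes(PREFIXES)
print(height(NODES, ""))                      # 4
print(sorted(descendants_at(NODES, "", 2)))   # ['00', '01', '10', '11']
print(sorted(descendants_at(NODES, "10", 2))) # ['1000', '1001']
```

In a variable-stride dynamic program, a node N may choose any stride $s$ from 1 to $height(N)$, and the nodes in $D_s(N)$ become the roots of the next level.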
Prefix backup policy
Prefixes: a = 000*, b = 01*, c = 10*, d = 1000*, e = 1001*, f = 110*, g = 111*
Update: insert 1100* (h), then delete 1100*.
[Figure: the binary trie and the multibit trie. After h is inserted, the expanded entry 1100 of f holds h.]
Prefixes overwritten by an update need to be backed up so that they can be restored. Auxiliary data structure for the backup:
◦ Array
◦ Binary trie
◦ Linked list
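The insert and delete of h on this slide can be traced with a small sketch. Everything here is an illustrative assumption rather than the authors' implementation: the class name `FixedStrideTable`, the flat entry dictionary, the 2-4 level boundaries, and the use of a plain dictionary of original prefixes as the auxiliary backup structure.

```python
class FixedStrideTable:
    """Expanded prefix table plus a backup of the original prefixes."""

    def __init__(self, boundaries):
        self.boundaries = boundaries  # cumulative level boundaries, e.g. [2, 4]
        self.backup = {}              # auxiliary data: original prefix -> label
        self.entries = {}             # expanded bits -> (label, original length)

    def _level(self, bits):
        idx = next(i for i, b in enumerate(self.boundaries) if b >= len(bits))
        start = self.boundaries[idx - 1] if idx > 0 else 0
        return start, self.boundaries[idx]

    def _expansions(self, bits):
        pad = self._level(bits)[1] - len(bits)
        if pad == 0:
            return [bits]
        return [bits + format(n, "b").zfill(pad) for n in range(2 ** pad)]

    def insert(self, bits, label):
        self.backup[bits] = label
        for e in self._expansions(bits):
            current = self.entries.get(e)
            # Overwrite only entries owned by a shorter (or the same) prefix.
            if current is None or current[1] <= len(bits):
                self.entries[e] = (label, len(bits))

    def delete(self, bits):
        del self.backup[bits]
        start, _ = self._level(bits)
        for e in self._expansions(bits):
            current = self.entries.get(e)
            if current is None or current[1] != len(bits):
                continue  # entry belongs to a longer prefix; leave it alone
            # Restore the longest backed-up prefix of this level that covers e.
            for length in range(len(bits) - 1, start, -1):
                if e[:length] in self.backup:
                    self.entries[e] = (self.backup[e[:length]], length)
                    break
            else:
                del self.entries[e]


table = FixedStrideTable([2, 4])
for bits, name in [("000", "a"), ("01", "b"), ("10", "c"), ("1000", "d"),
                   ("1001", "e"), ("110", "f"), ("111", "g")]:
    table.insert(bits, name)

table.insert("1100", "h")
print(table.entries["1100"][0], table.entries["1101"][0])  # h f
table.delete("1100")
print(table.entries["1100"][0])  # f
```

Without the backup of f, deleting h would leave entry 1100 with no way to recover that it was originally an expansion of 110*. Maintaining the backup adds writes on every update, which is one reason the write cost $T_W$ depends on the backup policy.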
Memory access cost issue
◦ Changing the weights of $T_R$ and $T_W$ gives different results.
◦ In an implementation, different backup policies lead to different write costs ($T_W$).
◦ In our case, $T_R : T_W$ is about 1:3.
Experiment Data
Prefix tables:
Database      Number of prefixes   Number of 24-bit prefixes   Percentage of 24-bit prefixes
canada        157118               85938                       54.6%
as120k        127071               69678                       54.8%
oix-2002-4    124824               68978                       55.2%
oix-2005-4    163574               85305                       52.1%
funnet        41709                25206                       60.4%
Number of multibit trie levels: 4 to 8
Prefix Length Distribution of prefix tables
[Figure: four panels (as120k, funnet, canada, oix-2005-4), each plotting the number of prefixes against prefix length (0 to 32).]
FST implementation (1/2)
Implemented trie schemes (DP solutions for canada and funnet):
1. CPE
2. UCPE(1,3) (proposed scheme with $T_R : T_W$ = 1:3)
3. UCPE(1,1) (proposed scheme with $T_R : T_W$ = 1:1)
DP solution for canada, listed in the column order CPE, UCPE(1,3), UCPE(1,1):
◦ 4-level FST: 16-21-24-32; 17-21-24-32 (two entries only)
◦ 5-level FST: 16-20-22-24-32; 17-21-24-26-32 (two entries only)
◦ 6-level FST: 12-17-20-22-24-32; 16-20-22-24-26-32; 17-21-24-26-27-32
◦ 7-level FST: 11-16-18-20-22-24-32; 16-19-21-24-26-27-32; 17-21-24-26-27-28-32
◦ 8-level FST: 8-12-16-18-20-22-24-32; 16-20-22-24-26-27-28-32; 17-21-24-26-27-28-29-32
FST implementation (2/2)
Simulation environment:
◦ 2.4-GHz Pentium IV, 512 KB L2 cache
◦ 1 GB RAM
◦ gcc 3.3 compiler, optimization level 3
Performance measurements of FST:
◦ Average lookup time
◦ Average update time
◦ Total memory cost
Update Performance of FST for canada & funnet
[Figure: two panels (canada, funnet) plotting update time in clocks against the number of levels (4 to 8) for CPE, UCPE(1,3), and UCPE(1,1). Labeled values: 2698 and 1988 for canada; 2197 and 1897 for funnet.]
Lookup Performance of FST for canada & funnet
[Figure: two panels (canada, funnet) plotting lookup time in clocks against the number of levels (4 to 8) for CPE, UCPE(1,3), and UCPE(1,1). Labeled values: 1352, 997, and 838 for canada; 1012, 835, and 668 for funnet.]
Memory Usage of FST
[Figure: two panels (canada, funnet) plotting memory usage in KB against the number of levels (4 to 8) for CPE, UCPE(1,3), and UCPE(1,1). Labeled values: 6444, 5058, and 4509 for canada; 2178 and 1368 for funnet.]
Update Performance of VST for canada
[Figure: update execution time in clocks against the number of levels (3 to 7) for CPE, UCPE(1,1), and UCPE(1,3). Labeled values: 2128 and 1620.]
Lookup Performance of VST for canada
[Figure: lookup time in clocks against the number of levels (3 to 7) for CPE, UCPE(1,3), and UCPE(1,1). Labeled values: 1723 and 1661.]
Memory Usage of VST
[Figure: memory cost in thousands of elements against the number of levels (3 to 7) for CPE and UCPE(1,3). Labeled values: 1529.489 and 735.293 (thousands of elements).]
Conclusion
◦ Our trie schemes provide faster lookup times and faster insert/delete times than the original CPE scheme [1].
◦ For multibit tries with a large number of levels, the improvement in update and search speed from our scheme is more significant, at the price of a slightly larger memory cost.
[1] V. Srinivasan and G. Varghese, "Fast address lookups using controlled prefix expansion," ACM Transactions on Computer Systems, vol. 17, no. 1, pp. 1-40, Feb. 1999.
Thanks for your attention! Thank you for your participation.
Q&A