Traffic (1993-2000)
Traffic: Verbal, Data/stat, Mod/sim, Analysis, Synthesis
• Heavy tails (HT) in net traffic?
• Careful measurements
• Appropriate statistics
• Connecting traffic to application behavior
• “Optimal” web layout
HT files → HT traffic
Heavy-tailed files are streamed out on the net, creating fractal Gaussian internet traffic (Willinger, …).
[Figure: traffic against time; heavy-tailed files, log(> size) against log(file size), p ∝ s^(-α), α > 1.0.]
Traffic (1993)
Traffic: Verbal
• Traffic is “bursty”?
Traffic (1993-2000)
Traffic: Verbal, Data/stat
• Bursty?
• Careful measurements
• Appropriate statistics
Why?
Heavy-tailed files
Traffic: Verbal, Data/stat, Mod/sim
Long in space becomes long in time. Why?
[Figure: traffic against time; heavy-tailed files, log(> size) against log(file size), p ∝ s^(-α), α > 1.0.]
Traffic: Verbal, Data/stat, Mod/sim, Analysis
What?
[Figure: cumulative frequency against size of events, log (base 10) on both axes, decimated data. Series: data compression (Huffman); WWW files, Mbytes (Crovella); forest fires, 1000 km² (Malamud).]
Probability that a packet is in a file bigger than x. Probability that a file is bigger than x.
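The two probabilities above can be compared on synthetic data. The following is a minimal sketch, not taken from the slides: the function names, the Pareto sampler, and the parameters (`alpha=1.2`, `s_min=1.0`) are illustrative assumptions, and each unit of file size is treated as one packet.

```python
import random

def file_ccdf(sizes, x):
    """Fraction of files strictly larger than x."""
    return sum(1 for s in sizes if s > x) / len(sizes)

def packet_ccdf(sizes, x):
    """Fraction of all packets that sit in files strictly larger than x.

    Assumes the number of packets in a file is proportional to its size.
    """
    total = sum(sizes)
    return sum(s for s in sizes if s > x) / total

def pareto_sizes(n, alpha=1.2, s_min=1.0, seed=0):
    """Hypothetical heavy-tailed file sizes (Pareto with tail exponent alpha)."""
    rng = random.Random(seed)
    return [s_min * rng.paretovariate(alpha) for _ in range(n)]

if __name__ == "__main__":
    sizes = pareto_sizes(100_000)
    for x in (1, 10, 100, 1000):
        print(f"x={x:5d}  P(file > x)={file_ccdf(sizes, x):.4f}  "
              f"P(packet in file > x)={packet_ccdf(sizes, x):.4f}")
```

For any sample, the packet-weighted tail is at least as large as the file tail, because the files above a threshold are larger than average.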
[Figure: cumulative frequency against size of events, log (base 10). Web files: slope -1. Fires: slope -1/2. Codewords.]
[Figure: cumulative frequency against size of events, log (base 10). Data compression: exponential. WWW files (Mbytes): slope -1. Forest fires (1000 km²): slope -1/2.]
[Figure: same three data sets. Exponential: all events are close in size.]
[Figure: same three data sets. Slopes -1 and -1/2: most events are small, but the large events are huge.]
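The slopes quoted on these plots (-1 and -1/2) are slopes of the cumulative frequency on log-log axes. As a rough sketch of how such a slope could be read off data, the code below fits a least-squares line to the empirical cumulative frequency of synthetic Pareto samples. The samples, function names, and the plain least-squares fit are assumptions for illustration; they are not the measurement method used for the data in the slides.

```python
import math
import random

def empirical_ccdf(sizes):
    """Return (size, fraction of events at least that large), sorted by size."""
    xs = sorted(sizes)
    n = len(xs)
    return [(x, (n - i) / n) for i, x in enumerate(xs)]

def loglog_slope(points):
    """Least-squares slope of log10(cumulative frequency) against log10(size)."""
    lx = [math.log10(x) for x, _ in points]
    ly = [math.log10(p) for _, p in points]
    mx = sum(lx) / len(lx)
    my = sum(ly) / len(ly)
    num = sum((a - mx) * (b - my) for a, b in zip(lx, ly))
    den = sum((a - mx) ** 2 for a in lx)
    return num / den

def pareto_sample(n, alpha, seed=0):
    """Hypothetical event sizes with a power-law tail of exponent alpha."""
    rng = random.Random(seed)
    return [rng.paretovariate(alpha) for _ in range(n)]

if __name__ == "__main__":
    for alpha in (1.0, 0.5):
        slope = loglog_slope(empirical_ccdf(pareto_sample(20_000, alpha)))
        print(f"alpha={alpha}: fitted log-log slope {slope:.2f}")
```

An exponential sample would not give a straight line on these axes, which is the contrast the plots draw.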
Data + Model/Theory
[Figure: cumulative frequency against size of events, log (base 10), for DC, WWW, and FF.]
[Figure: cumulative frequency against size of events, log (base 10), decimated data. WWW files, Mbytes (Crovella).]
• Most files are small (mice)
• Most packets are in large files (elephants)
[Diagram: Sources, Router queues, Network]
• Mice: delay sensitive
• Elephants: bandwidth sensitive
Unfortunate interaction of files with congestion control
[Figure: traffic against time; heavy-tailed files, log(> size) against log(file size), p ∝ s^(-α), α > 1.0.]
Why?
[Figure: cumulative frequency against size of events, log (base 10). Data compression: exponential.]
All events are close in size.
Source coding for data compression
Based on the frequencies of source word occurrences, select code words to minimize message length.
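One standard way to select code words from symbol frequencies is Huffman coding, which the earlier plot credits for the data compression data. The sketch below is a minimal, generic Huffman construction; the function names and the example message are illustrative assumptions, not material from the slides.

```python
import heapq
from collections import Counter

def huffman_code(freqs):
    """Map each symbol to a prefix-free bit string, given symbol -> frequency."""
    if len(freqs) == 1:
        return {sym: "0" for sym in freqs}
    # Heap entries: (weight, unique tie-breaker, {symbol: partial code word}).
    heap = [(w, i, {sym: ""}) for i, (sym, w) in enumerate(freqs.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        # Merge the two lightest subtrees; prefix their code words with 0 and 1.
        w1, _, c1 = heapq.heappop(heap)
        w2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in c1.items()}
        merged.update({s: "1" + c for s, c in c2.items()})
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1
    return heap[0][2]

def encode(message, code):
    """Concatenate the code words for each symbol of the message."""
    return "".join(code[ch] for ch in message)

if __name__ == "__main__":
    message = "abracadabra"  # hypothetical source message
    code = huffman_code(Counter(message))
    for sym, word in sorted(code.items(), key=lambda kv: len(kv[1])):
        print(sym, word)
    print("encoded length in bits:", len(encode(message, code)))
```

Frequent symbols receive short code words and rare symbols long ones, which is what minimizes the expected message length.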
Data
[Figure: DC data.]
How well does the model predict the data?
Data + Model
[Figure: DC data with model.]
How well does the model predict the data? Not surprising, because the file was compressed using Shannon theory. Small discrepancy due to integer lengths.
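The effect of integer lengths can be illustrated with a small calculation. The sketch below assumes Shannon code lengths, ceil(-log2 p), and two made-up probability distributions; the slides do not say which code construction or source statistics were used. The average code length L then satisfies H ≤ L < H + 1, where H is the entropy, with equality only when every -log2 p is already an integer.

```python
import math

def entropy_bits(probs):
    """Shannon entropy in bits per symbol."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def shannon_lengths(probs):
    """Integer code word lengths ceil(-log2 p); all probabilities must be positive."""
    return [math.ceil(-math.log2(p)) for p in probs]

def average_length(probs, lengths):
    """Expected code word length in bits per symbol."""
    return sum(p * l for p, l in zip(probs, lengths))

if __name__ == "__main__":
    examples = {
        "dyadic": [0.5, 0.25, 0.125, 0.125],   # hypothetical distribution
        "non-dyadic": [0.4, 0.3, 0.2, 0.1],    # hypothetical distribution
    }
    for name, probs in examples.items():
        h = entropy_bits(probs)
        avg = average_length(probs, shannon_lengths(probs))
        print(f"{name}: entropy {h:.3f} bits, average length {avg:.3f} bits")
```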
Generalized “coding” problems
Data compression
• Minimize avg file transfer
• No feedback
• Discrete (0-d) topology
Web
• Minimize avg file transfer
• Feedback
• 1-d topology
A toy website model (= 1-d grid HOT design)
Traffic: Verbal, Data/stat, Mod/sim, Analysis, Synthesis
Document split into N files to minimize download time.
[Figure: probability of user access, with splits labeled “Wasteful” and “Hard to navigate”.]
[Figure: splits labeled “Just right”, “Wasteful”, and “Hard to navigate”.]
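A toy version of the split problem can be solved exactly with dynamic programming. The sketch below rests on assumptions that are not in the slides: the document is a row of unit-size sections, each with a given access probability; a user who wants one section downloads the whole file containing it; and the cost is the expected downloaded size for a fixed number of files N. Navigation cost is not modeled, so the “hard to navigate” side of the trade-off enters only through the fixed N. The name `best_split` and the example probabilities are hypothetical.

```python
def best_split(probs, n_files):
    """Cut a 1-d document into n_files contiguous files.

    probs[j] is the access probability of section j (unit size).
    Returns (expected download size, list of (start, end) with end exclusive).
    """
    m = len(probs)
    if not 1 <= n_files <= m:
        raise ValueError("n_files must be between 1 and the number of sections")

    # prefix[j] = total access probability of sections [0, j)
    prefix = [0.0]
    for p in probs:
        prefix.append(prefix[-1] + p)

    def file_cost(i, j):
        # Probability that the file [i, j) is requested, times its size.
        return (prefix[j] - prefix[i]) * (j - i)

    inf = float("inf")
    # cost[k][j]: best expected cost of covering sections [0, j) with k files.
    cost = [[inf] * (m + 1) for _ in range(n_files + 1)]
    cut = [[0] * (m + 1) for _ in range(n_files + 1)]
    cost[0][0] = 0.0
    for k in range(1, n_files + 1):
        for j in range(k, m + 1):
            for i in range(k - 1, j):
                c = cost[k - 1][i] + file_cost(i, j)
                if c < cost[k][j]:
                    cost[k][j] = c
                    cut[k][j] = i

    # Walk the stored cut points backwards to recover the file boundaries.
    files = []
    j = m
    for k in range(n_files, 0, -1):
        i = cut[k][j]
        files.append((i, j))
        j = i
    files.reverse()
    return cost[n_files][m], files

if __name__ == "__main__":
    access = [0.6, 0.2, 0.1, 0.1]  # hypothetical access probabilities
    print(best_split(access, 2))
```

In this toy setting the frequently accessed sections end up in small files and the rarely accessed ones are lumped into large files.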
More complete website models (Zhu, Yu)
• Detailed models
  – user behavior
  – content and hyperlinks
• Necessary for real web layout optimization
• Statistics consistent with simpler models
• Improved protocol design (TCP)
• Commercial implications still unclear
Data
[Figure: cumulative frequency against size of events, log (base 10), for DC and WWW.]
Data + Model/Theory
[Figure: DC and WWW data with model/theory.]
Data + Model/Theory
[Figure: WWW data with model/theory.]
Are individual websites distributed like this? Roughly, yes.
Data + Model/Theory
[Figure: DC and WWW data with model/theory.]
How has the data changed since 1995?
Traffic (1993-2000)
Topics: Traffic, Topology, Layering, C&D
Stages: Verbal, Data/stat, Mod/sim, Analysis, Synthesis
Theory and the Internet
Topics: Traffic, Topology, Layering, C&D
Stages: Verbal, Data/stat, Mod/sim, Analysis, Synthesis
[Diagram: Sources, Router queues, Network; Mice and Elephants]
• Mice: delay sensitive
• Elephants: bandwidth sensitive
Unfortunate interaction of files with congestion control
Better control: fortunate interaction of files with improved congestion control
High variability in context
More high variability
• Heterogeneity
• Human behavior
• Actuating
Extend
• Optimization
• Layer/distribute
• Dynamics/control
Today:
• Simplify/broaden
• Look back/sideways
Develop
• Delays
• Actuation