
Self-Similarity of Network Traffic Presented by Wei Lu Supervised by Niclas Meier 05/06 2004

Table of Contents
• Network Traffic Study
  – Motivation
  – Measurement
  – Modeling
• Classic Model, Poisson or Markovian
• Self-Similar Model
  – What’s Self-Similarity
  – Definition of Self-Similarity
  – Explanation of Self-Similarity
  – Impact on network performance
  – Adapting to Self-Similarity

Motivation for Network Traffic Study
• Understanding network traffic behavior is essential for all aspects of network design and operation
  – Component design
  – Protocol design
  – Provisioning
  – Management
  – Modeling and simulation

Network Traffic Measurement
• Collect data or packet traces showing packet activity on the network for different network applications
• Purpose
  – Understand the traffic characteristics of existing networks
  – Develop models of traffic for future networks
  – Useful for simulations, planning studies

Network Traffic Modeling
In the past…
• Traffic modeling in the world of telephony was the basis for initial network models
  – Assumed Poisson arrival process
  – Assumed exponential call duration
  – Well-established queuing literature based on these assumptions
  – Enabled very successful engineering of telephone networks

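Classic Model
• Poisson Process
• Markov Chain [Figure: two-state chain with states 0 and 1]
• ON-OFF model: fixed-rate arrivals while ON, no arrivals while OFF
• Interrupted Poisson Process: Poisson arrivals while Active, no arrivals while Idle
[Figure: two-state transition diagrams for the ON-OFF model and the Interrupted Poisson Process]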

The Story Begins with Measurement
• In 1989, Leland and Wilson began taking high-resolution traffic traces at Bellcore
  – Ethernet traffic from a large research lab
  – Mostly IP traffic (a little NFS)
  – Four data sets over a three-year period

Actual Network Traffic vs. Poisson
[Figure: network traffic measurement compared with a Poisson traffic model; annotation "5, 8, 2, mean 5"] [Chun Zhang 2003]

What’s Self-Similarity
• Self-similarity describes the phenomenon where a certain property of an object is preserved with respect to scaling in space and/or time. (Such objects are also called fractals.)
• If an object is self-similar, its parts, when magnified, resemble the shape of the whole.

Definition of Self-Similarity
• Self-similar processes are the simplest way to model processes with long-range dependence
• The autocorrelation function $r(k)$ of a process with long-range dependence is not summable:
  – If $\sum_k r(k) = \infty$: Long Range Dependence
    • e.g. $r(k) \sim k^{-\beta}$ as $k \to \infty$, for $0 < \beta < 1$
    • Autocorrelation function follows a power law
    • Slower decay than exponential process
  – If $\sum_k r(k) < \infty$: Short Range Dependence
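The summability criterion above can be checked with a short worked comparison. This is a sketch, writing the autocorrelation as $r(k)$ with power-law exponent $\beta$, and using a hypothetical geometric decay rate $a$ (not in the slides) to stand in for an exponentially decaying, short-range-dependent process:

```latex
\begin{align*}
\sum_{k=1}^{K} k^{-\beta}
  &\ge \int_{1}^{K+1} x^{-\beta}\,dx
   = \frac{(K+1)^{1-\beta}-1}{1-\beta}
   \xrightarrow[K\to\infty]{} \infty,
  && 0<\beta<1 \quad \text{(power law: not summable, LRD)} \\
\sum_{k=1}^{\infty} a^{k}
  &= \frac{a}{1-a} < \infty,
  && 0<a<1 \quad \text{(exponential decay: summable, SRD)}
\end{align*}
```

The first bound holds because $k^{-\beta}$ is decreasing, so each term is at least the integral of $x^{-\beta}$ over $[k, k+1]$.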

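Self-Similarity contd.
• Zero-mean stationary time series $X = (X_t;\ t = 1, 2, 3, \dots)$; the m-aggregated series $X^{(m)} = (X^{(m)}_k;\ k = 1, 2, 3, \dots)$ is formed by summing X over non-overlapping blocks of size m.
• X is H-self-similar (distributional self-similarity) if, for all positive m, $X^{(m)}$ has the same distribution as X rescaled by $m^{H}$.
  – $\mathrm{PDF}\{X^{(m)}_t\} = \mathrm{PDF}\{m^{H} X_t\}$
• X is second-order self-similar if, for all m, the variance and the autocorrelation function $r^{(m)}(k)$ of the series $X^{(m)}$ satisfy the following (here $X^{(m)}$ is normalized by the block size m, i.e. taken as the block average):
  – $\mathrm{Var}(X^{(m)}) = \sigma^2 m^{-\beta}$, and
  – $r^{(m)}(k) = r(k)$, $k \ge 0$ [asymptotically: $r^{(m)}(k) \to r(k)$ as $m \to \infty$]
• The degree of self-similarity is expressed as the speed of decay of the series autocorrelation function, using the Hurst parameter H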

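Graphic Tests, e.g. Variance-time plots
• The variance of $X^{(m)}$ is plotted vs. m on a log-log plot
• Slope $(-\beta) > -1$ indicates self-similarity (SS)
• $H = 1 - \beta/2$
  – LRD: ½ < H < 1
  – Degree of SS/LRD increases as $H \to 1$
• $\log \mathrm{Var}(X^{(m)}) = \log(\sigma^2 m^{-\beta}) = 2\log\sigma - \beta \log m$ (Y axis: $\log \mathrm{Var}(X^{(m)})$; X axis: $\log m$)
[Figure: variance-time plot with two lines: β = 0.6, H = 0.7 (LRD) and β = 1, H = 0.5 (SRD); as H increases, traffic is more bursty]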
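The variance-time test above can be sketched in a few lines of Python. This is a minimal illustration, not a tuned estimator: the function names `aggregate` and `hurst_variance_time` are hypothetical, the aggregated series is assumed to be the block average (so that its variance scales as m^(-β)), and the slope is fitted by ordinary least squares on the log-log points.

```python
import numpy as np

def aggregate(x, m):
    """Average x over non-overlapping blocks of size m (trailing remainder dropped)."""
    n = (len(x) // m) * m
    return np.asarray(x[:n], dtype=float).reshape(-1, m).mean(axis=1)

def hurst_variance_time(x, block_sizes):
    """Estimate (H, beta) from the slope of log Var(X^(m)) against log m."""
    log_m = np.log10(block_sizes)
    log_var = np.log10([aggregate(x, m).var() for m in block_sizes])
    # Degree-1 fit returns [slope, intercept]; the slope is -beta.
    slope, _intercept = np.polyfit(log_m, log_var, 1)
    beta = -slope
    return 1.0 - beta / 2.0, beta
```

For independent samples (for example `np.random.default_rng(0).normal(size=100_000)` with block sizes 1, 2, 4, ..., 256) the fitted slope should be close to -1, giving H near 0.5; a trace with long-range dependence would be expected to give a shallower slope and H between 0.5 and 1.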

Modeling Self-Similarity
• Superposition of High Variability ON-OFF Sources
  – Extension to traditional ON-OFF models by allowing the ON and OFF periods to have infinite variance (high variability or Noah Effect)
[Figure: three ON-OFF sources X1(t), X2(t), X3(t) and their superposition S3(t), which takes values 0 to 3, plotted against time]
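The superposition of high-variability ON-OFF sources can be sketched as a small simulation. The assumptions here are not from the slides: ON and OFF durations are drawn from a Pareto distribution with a hypothetical shape `alpha` (infinite variance for 1 < alpha < 2), time is slotted, each source emits one unit per slot while ON, and the function names are illustrative.

```python
import numpy as np

def pareto_duration(rng, alpha, x_min):
    """One Pareto(alpha) duration >= x_min, rounded up to whole slots."""
    # Generator.pareto draws Lomax samples; adding 1 and scaling gives classical Pareto.
    return int(np.ceil(x_min * (1.0 + rng.pareto(alpha))))

def on_off_source(rng, n_slots, alpha=1.5, x_min=1.0):
    """Return a 0/1 array: 1 in slots where the source is ON, 0 where it is OFF."""
    out = np.zeros(n_slots, dtype=int)
    t = 0
    on = rng.random() < 0.5          # random initial state
    while t < n_slots:
        d = pareto_duration(rng, alpha, x_min)
        if on:
            out[t:t + d] = 1         # slicing past the end is safely truncated
        t += d
        on = not on                  # alternate ON and OFF periods
    return out

def superpose(n_sources, n_slots, seed=0, alpha=1.5, x_min=1.0):
    """Aggregate traffic S_N(t): number of sources that are ON in each slot."""
    rng = np.random.default_rng(seed)
    total = np.zeros(n_slots, dtype=int)
    for _ in range(n_sources):
        total += on_off_source(rng, n_slots, alpha, x_min)
    return total
```

With `superpose(3, 1000)` the result corresponds to S3(t) for three sources and takes values between 0 and 3. A commonly cited result for this construction is that the aggregate approaches self-similar behavior with H = (3 - alpha)/2 as the number of sources grows; the sketch does not verify that limit.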

Explanation of Self-Similarity
• Consider a set of processes which are either ON or OFF
  – The distribution of ON time is heavy-tailed (wide range of different values, including large values with non-negligible probability)
    • File sizes on a server are heavy-tailed
    • The transfer times have the same type of characteristics
  – The distribution of OFF time is heavy-tailed
    • Some sources model phenomena that are triggered by humans (e.g. HTTP sessions) and have extremely long idle periods

Impact on Network Performance
• Self-similar burstiness can lead to the amplification of packet loss.
• The burstiness cannot be smoothed.
• Limited effectiveness of buffering
  – Queue length distribution decays slower than exponentially, vs. the exponential decay associated with Markovian input

Impact, contd.
• Mean queue length vs. buffer capacity at a bottleneck router when fed with self-similar input with varying degrees of LRD but equal traffic intensity
• α ≈ 2 (α: tail index of the heavy-tailed input): weak long-range correlation (LRC); a buffer capacity of about 60 kB suffices to contain the input’s variability, and the average buffer occupancy remains below 5 kB
• α ≈ 1: strong LRC; an increase in buffer capacity is accompanied by an increase in buffer occupancy
[Kihong Park, Walter Willinger]
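The kind of experiment described above, measuring queue occupancy at a bottleneck for different buffer capacities, can be approximated with a slotted finite-buffer queue. This is a hedged sketch rather than the setup used by Park and Willinger: the function name `simulate_queue`, the slotted service model, and the units are all assumptions made for illustration.

```python
import numpy as np

def simulate_queue(arrivals, service_rate, buffer_size):
    """Slotted finite-buffer queue.

    arrivals[i] is the work arriving in slot i; up to service_rate units are
    served per slot; work exceeding buffer_size is dropped and counted as lost.
    Returns (occupancy per slot, total work lost).
    """
    q = 0.0
    lost = 0.0
    occupancy = np.empty(len(arrivals))
    for i, a in enumerate(arrivals):
        q = max(0.0, q + a - service_rate)   # serve after adding arrivals
        if q > buffer_size:                  # overflow: drop the excess
            lost += q - buffer_size
            q = float(buffer_size)
        occupancy[i] = q
    return occupancy, lost
```

Feeding the same arrival trace through `simulate_queue` for a range of `buffer_size` values and recording `occupancy.mean()` gives a mean-queue-length vs. buffer-capacity curve for that trace.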

Adapt to Self-Similarity
• Flexible resource allocation
  – Increase bandwidth to accommodate periods of “burstiness”. This could be wasteful in times of low traffic intensity; adaptive adjustment can be an effective countermeasure.
  – Increase the buffer size to absorb periods of “burstiness”.
  – Tradeoff: increase both appropriately.

Current Status
• Many people (vendors) chose to ignore self-similarity
• People want to blame the protocols for observed behavior
• Multi-resolution analysis may provide a means for better models
• Lots of opportunity!!

Questions?