A Hierarchical Characterization of a Live Streaming Media Workload
E. Veloso, V. Almeida, W. Meira, A. Bestavros, S. Jin
Proceedings of the Internet Measurement Workshop, ACM SIGCOMM, Nov. 2002

Outline
• Introduction
• Data Collection
• Client Layer Characteristics
• Session Layer Characteristics
• Transport Layer Characteristics
• Conclusions

Introduction
• Workload characterization is important for
  - Performance evaluation and prediction
  - Capacity planning
• Rejecting clients of a live stream is not a viable solution
  - The value of a live stream is its liveness
  - Rejected clients are lost paying customers

Introduction
• Only a small number of studies have characterized pre-recorded streaming media workloads
• This paper provides a characterization of live streams

Introduction
• Compared to stored streams, live streams exhibit
  - Stronger temporal patterns in the workload
  - Fewer possible VCR functions
  - Weaker correlations between different variables
    • Users are less likely to stop viewing when QoS degrades

Data Collection
• A popular live show in Brazil, a “Reality TV Show” in early 2002 that lasted 90 days
• The live streams provided feeds captured from one of the cameras embedded in the environment surrounding the contestants

Data Collection
• Each log entry contains
  - Client identification: e.g., IP address, player ID
  - Client environment specification: e.g., OS version, CPU
  - Requested object identification: e.g., URI of stream
  - Transfer statistics: e.g., loss rate, average bandwidth
  - Server load statistics: e.g., server CPU utilization
  - Other information: e.g., referer URI, HTTP status
  - Timestamp, in seconds, of when the log entry was generated

Client Layer Characteristics
• Focus on the characteristics of the client population
• Clients are identified by their unique player ID

Client Topological and Geographical Distribution
• The plotted distributions follow a Zipf profile with parameter α = 1.29, 1.49, and 5.4, respectively
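A Zipf profile means that the item of rank r accounts for a share proportional to r^(-α). The sketch below is only an illustration on synthetic data, not the authors' fitting procedure: it builds ideal Zipf frequencies and recovers α as the negated least-squares slope of log(frequency) against log(rank). The function names are hypothetical.

```python
import math

def zipf_frequencies(n_items, alpha):
    """Normalized Zipf frequencies: f(r) proportional to r**(-alpha)."""
    weights = [r ** -alpha for r in range(1, n_items + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def estimate_alpha(frequencies):
    """Negated least-squares slope of log(frequency) vs. log(rank)."""
    ranked = sorted(frequencies, reverse=True)
    xs = [math.log(r) for r in range(1, len(ranked) + 1)]
    ys = [math.log(f) for f in ranked]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return -sxy / sxx
```

On real, noisy counts a log-log regression of this kind is sensitive to the sparse tail, so the estimate should be read as approximate.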

Temporal behavior of number of active clients
• Diurnal effect on the live content
• Periodic
• Depends on the day of the week

Client Arrival Process
• The client arrival process is not Poisson
• It can be approximated by a sequence of piece-wise-stationary Poisson arrival processes
[Figures: interarrival times of clients from the logs; interarrival times of a piece-wise-stationary Poisson process]
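One way to read "piece-wise-stationary" is that the arrival rate is held constant within each short interval and changes only between intervals. The sketch below generates arrivals under that reading. The interval length, the per-interval rates, and the function name are assumptions for illustration and do not come from the paper.

```python
import random

def piecewise_poisson_arrivals(rates, interval_length, rng):
    """Arrival times of a Poisson process whose rate is constant within
    each interval of `interval_length` seconds (one rate per interval)."""
    arrivals = []
    for i, rate in enumerate(rates):
        start = i * interval_length
        end = start + interval_length
        if rate <= 0:
            continue  # no arrivals in an idle interval
        # Exponential interarrival times at this interval's rate. Restarting
        # the clock at each interval boundary is valid because the
        # exponential distribution is memoryless.
        t = start + rng.expovariate(rate)
        while t < end:
            arrivals.append(t)
            t += rng.expovariate(rate)
    return arrivals
```

For example, `piecewise_poisson_arrivals([0.1, 2.0, 0.5], 900.0, random.Random(0))` produces a quiet, a busy, and a moderate 15-minute interval in sequence.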

Client Interest Profile
• Transfer frequency is used as a measure of client interest in the content
• Client interest in a single object follows a Zipf distribution

Session Layer Characteristics
• Focus on individual client activity
• The trace does not explicitly identify the delimiters of a session
  - The authors choose a session timeout parameter Toff to determine the number of sessions
  - Toff = 3600 seconds
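The timeout rule can be sketched as follows: a client's transfers are merged into one session as long as the idle gap between them does not exceed Toff. Representing transfers as (start, end) pairs in seconds, and treating a gap of exactly Toff as still inside the session, are assumptions of this sketch rather than details stated on the slide.

```python
T_OFF = 3600  # session timeout in seconds, the value quoted above

def split_sessions(transfers, t_off=T_OFF):
    """Group one client's transfers into sessions.

    `transfers` is a list of (start, end) times in seconds. A new session
    starts when the idle gap since the latest end time exceeds `t_off`.
    Returns a list of (session_start, session_end) pairs.
    """
    sessions = []
    for start, end in sorted(transfers):
        if sessions and start - sessions[-1][1] <= t_off:
            # Still within the timeout: extend the current session.
            prev_start, prev_end = sessions[-1]
            sessions[-1] = (prev_start, max(prev_end, end))
        else:
            sessions.append((start, end))
    return sessions
```

The ON time of a session is then `session_end - session_start`, and the OFF time is the gap between one session's end and the next session's start.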

Session ON/OFF Time
• ON times are highly variable
  - Due to the live content rather than temporal behavior
  - Lognormal
• OFF times form ripples around specific values
  - At multiples of a day => clients revisit daily or every two days
  - Exponential
[Figure: session ON time vs. session starting time]
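As a rough illustration of the two fitted distributions, the sketch below alternates lognormal ON times with exponential OFF times for a single synthetic client. The parameters `mu`, `sigma`, and `mean_off` are placeholders chosen by the caller, not values from the paper, and the sketch ignores the daily ripples in the OFF times noted above.

```python
import random

def simulate_on_off(n_sessions, mu, sigma, mean_off, rng):
    """Return (start, end) times, in seconds, of `n_sessions` sessions."""
    sessions = []
    t = 0.0
    for _ in range(n_sessions):
        on = rng.lognormvariate(mu, sigma)  # ON time: lognormal
        sessions.append((t, t + on))
        off = rng.expovariate(1.0 / mean_off)  # OFF time: exponential
        t += on + off
    return sessions
```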

Transport Layer Characteristics
• Focus on individual unicast data transfers
• Temporal behavior of the number of concurrent transfers
  - Periodic over weekly and daily periods
  - Similar to the temporal behavior of the number of active clients
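The number of concurrent transfers over time can be derived from per-transfer start and end times with a simple sweep over sorted events. This is a minimal sketch assuming each transfer is available as a (start, end) pair in seconds. The function name is hypothetical, and the paper's own tooling is not described on the slide.

```python
def concurrency_timeline(transfers):
    """Return a list of (time, active_count) points, one per event.

    Each transfer contributes +1 at its start and -1 at its end. At equal
    timestamps, ends sort before starts, so a transfer ending at time t
    is not counted as overlapping one that starts at t.
    """
    events = []
    for start, end in transfers:
        events.append((start, 1))
        events.append((end, -1))
    events.sort()
    timeline = []
    active = 0
    for time, delta in events:
        active += delta
        timeline.append((time, active))
    return timeline
```

The peak load is then `max(count for _, count in timeline)`, and bucketing the timeline by hour of day or day of week would expose the periodic patterns listed above.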

Temporal behavior of transfer interarrival times
• The request arrival process is also periodic and non-stationary
• Due to the diurnal behavior

Transfer Length & Client Stickiness
• Similar to the session ON time
• The long tail shows the willingness of clients to “stick” to the live object

Transfer Bandwidth
• Bounded by congestion
• Bounded by client connection speed

Representativeness of Findings
• Compared the findings with another live show, “Live News & Sports”
  - Sports news and soccer player interviews
  - 28558 requests from 12867 distinct clients within 2 weeks
• Interarrival times depend on the content

Conclusions
• Client Layer
  - Arrival process can be modeled by a piece-wise stationary Poisson process
  - Identity of the client making a request can be modeled by a Zipf distribution
• Session Layer
  - ON times follow a lognormal distribution
  - OFF times follow an exponential distribution
• Transfer Layer
  - Arrival process can be modeled by a piece-wise stationary Poisson process
  - Transfer bandwidth is primarily determined by client connection speed, while 10% of transfers are severely limited by congestion

Comments
• Piece-wise Poisson process
  - A good way to model the client arrival process
  - But it requires a priori knowledge of the average client arrival rate in each of a number of short periods
  - The client arrival pattern also depends on the content
  - Hard to use in practice