DiffProbe: Detecting ISP Service Discrimination. Partha Kanuparthy, Constantine Dovrolis

Net Neutrality. Recent FCC-ISP debates: the Comcast throttling dispute, etc. FCC broadband mapping framework: tools to estimate performance; $350M in stimulus funds.

What is Service Discrimination? ISPs can classify certain apps as low-priority and service them accordingly. Discrimination can manifest as relatively high delays or relatively high loss rates. An ISP can also shape traffic, which leads to low throughput (and hence both delay and loss). ShaperProbe: first step.

Goals. Problem: Is an application's traffic being classified as low-priority by an ISP? Is the ISP doing loss discrimination, delay discrimination, or both? Can we identify the scheduler type? Solution (DiffProbe): compare the performance of normal and application traffic sent simultaneously. Identifying discrimination is not easy: 1. Congestion events can be short-lived (µs to ms scales), so it is a bad idea to compare delays or loss rates measured at different times. 2. A customer may see the same performance for both flows when there is no cross-traffic, so it is a bad idea to call this no-discrimination.

Delay Discrimination: Practice. Non-discriminatory schedulers (single queue): First-Come-First-Serve (FCFS). Discriminatory schedulers (multiple classes): Strict Priority (SP), Weighted Fair Queuing (WFQ), WRR. Delay discrimination creates a difference in delay distributions.

Loss Discrimination: Practice. Non-discriminatory buffer managers: DropTail (DT), Random Early Detection (RED). Discriminatory buffer managers: Weighted RED (WRED), Drop-from-Longest-Queue. Loss discrimination creates a difference in loss rates.

Rest of the talk: high-level design; detecting delay discrimination; detecting loss discrimination; the DiffProbe tool; ShaperProbe.

High-level design. Send normal (P) and application (A) traffic simultaneously. Measure one-way delays (OWDs) and lost packets for each flow. [Figure labels: DiffProbe server; application traffic (A); normal traffic (P).]

Avoiding Classification. The P flow has to be: (1) sufficiently different from the A flow to avoid classification, e.g. by altering payload, ports, or gaps; (2) sufficiently similar to the A flow that both flows observe the same network performance when there is no discrimination: the same packet size distribution in A and P, and each P packet sent at about the same time as an A packet.

Probing Patterns. Create two probing structures using A and P. Balanced Load Period (BLP): send both flows at their normal rates. Load Increase Period (LIP): scale up the P flow's rate. Why create the LIP? To maximize the chances of queuing in the ISP network.

Discrimination Identifiability. The user does not always "see" discrimination: with no high-priority backlog, the low-priority class gets the link capacity. We use the BLP to detect unidentifiable conditions for delay discrimination. The condition is identifiable when the P delays created during the LIP are larger than those during the BLP: 90th percentile of P's delays during the LIP > median of P's delays during the BLP.
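The identifiability condition above can be sketched in a few lines of Python. This is a minimal illustration, not DiffProbe's code: the names `percentile` and `lip_induced_queuing` are hypothetical, and linear interpolation between order statistics is an assumed percentile definition.

```python
from statistics import median

def percentile(values, q):
    """q-th percentile (0-100) with linear interpolation (assumed definition)."""
    if not values:
        raise ValueError("values must be non-empty")
    ordered = sorted(values)
    rank = (len(ordered) - 1) * q / 100.0
    lo = int(rank)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (rank - lo)

def lip_induced_queuing(p_delays_blp, p_delays_lip):
    """True if the 90th percentile of P's LIP delays exceeds the median of P's BLP delays."""
    return percentile(p_delays_lip, 90) > median(p_delays_blp)
```

If the function returns False, the LIP did not create noticeably more queuing than the BLP, and a run would be treated as unidentifiable for delay discrimination.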

Overview: high-level design; detecting delay discrimination; detecting loss discrimination; the DiffProbe tool.

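Detecting Delay Discrimination. We observe the empirical delay distributions of the A and P flows during the LIP. No delay discrimination: the two distributions match (figure: FCFS and Comcast). Delay discrimination: the two distributions differ (figure: emulated WRR with weights 1:3).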

Detecting Delay Discrimination (2). Pre-processing. Pairing: consider only those (A, P) sample pairs that were sent within an MTU transmission time, τ, of each other. Discard delay values in the τ-neighborhood of the estimated propagation delay, since such samples do not see queuing. Subtract the propagation delay estimate from the samples.
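The pre-processing steps above could look roughly like the following sketch. Several details are assumptions made for illustration and are not stated in the slides: the propagation delay is estimated as the minimum observed one-way delay, each P sample is paired at most once, the τ filter is applied to each delay value individually, and the function names are hypothetical.

```python
def pair_samples(a_samples, p_samples, tau):
    """Pair A and P samples whose send times differ by at most tau.

    Each input is a list of (send_time, one_way_delay), sorted by send_time.
    Returns a list of (a_delay, p_delay) pairs.
    """
    pairs = []
    j = 0
    for a_time, a_delay in a_samples:
        # Skip P samples sent too early to pair with this A sample.
        while j < len(p_samples) and p_samples[j][0] < a_time - tau:
            j += 1
        if j < len(p_samples) and abs(p_samples[j][0] - a_time) <= tau:
            pairs.append((a_delay, p_samples[j][1]))
            j += 1  # assumption: use each P sample at most once
    return pairs

def preprocess(a_samples, p_samples, tau):
    """Return (a_queuing_delays, p_queuing_delays) after pairing and filtering."""
    pairs = pair_samples(a_samples, p_samples, tau)
    if not pairs:
        return [], []
    # Assumption: the minimum observed delay approximates the propagation delay.
    prop = min(min(a, p) for a, p in pairs)
    # Drop values within tau of the propagation delay, then subtract it.
    a_out = [a - prop for a, _ in pairs if a - prop > tau]
    p_out = [p - prop for _, p in pairs if p - prop > tau]
    return a_out, p_out
```

All times are in the same unit (seconds in the test data used here); τ would be set to the transmission time of one MTU-sized packet on the path.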

Detecting Delay Discrimination (3). Hypothesis test for a difference between the A and P delay distributions: 1. Null hypothesis: equal distributions. 2. Compute the Kullback-Leibler (KL) divergence of the pre-processed samples. 3. Compute the KL divergences of uniform random partitions of the combined samples. 4. Is (2) > (3)? Test for the direction of the difference: compare all higher percentiles (50th to 90th) of the A and P delay distributions, then redo the test, swapping A and P as inputs. If this test fails, we state that delay discrimination is unknown.

Delay Discrimination: Accuracy. Evaluated using simulations: discrimination using SP and WFQ; Skype iSAC packet trace as the A flow; schedulers FCFS, SP, WFQ; cross-traffic from interactive TCP sessions (200 users), with half of the user traffic classified low-priority; BLP and LIP durations of 30 s. Results: 90+% accuracy among detectable trials (95% confidence, 2% error margin); WFQ with weights 1:1.5 is similar to FCFS.

SP or WFQ? Both SP-like and WFQ-like scheduling create different delays. Idea: some P packets serviced just after an A packet would see only A's non-preemption delay (if any) under SP, but would see A's queuing delays under WFQ. Method: choose a subset of P samples, those received very close to, but after, an A packet. [Figure: distribution of the P subset under SP (low-priority non-preemption delay) and under WFQ 1:2 (queuing delay).]

Detecting Loss Discrimination. Estimate the loss rates of the A and P flows during the LIP as the fraction of packets lost in each flow. No loss discrimination: the two loss rates match. Loss discrimination: the two loss rates differ (figure: emulated WRR 1:3 with Drop-Longest-Queue).

Detecting Loss Discrimination (2). Pre-processing, to estimate the two loss rates: pairing, the same as that for delay discrimination, ensures that the A and P flows sample the same congestion events under DropTail/RED. Use the two-proportion test on the two loss rates. Unidentifiability: fewer than 10 dropped packets in each flow.
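A two-proportion test on the loss rates can be sketched as a pooled two-sided z-test. This is a generic textbook version, not necessarily the exact variant DiffProbe uses; the function name, the 0.05 significance level, and reading the unidentifiability rule as "both flows have fewer than 10 losses" are assumptions.

```python
import math

def two_proportion_test(lost_a, sent_a, lost_p, sent_p, alpha=0.05, min_losses=10):
    """Return (p_value, reject_equal_loss_rates), or None if unidentifiable."""
    # Assumed reading of the rule: too few drops in both flows to decide.
    if lost_a < min_losses and lost_p < min_losses:
        return None
    rate_a = lost_a / sent_a
    rate_p = lost_p / sent_p
    pooled = (lost_a + lost_p) / (sent_a + sent_p)
    se = math.sqrt(pooled * (1 - pooled) * (1 / sent_a + 1 / sent_p))
    if se == 0:
        return 1.0, False  # identical degenerate rates, nothing to reject
    z = (rate_a - rate_p) / se
    # Two-sided p-value under the standard normal: 2 * (1 - Phi(|z|)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return p_value, p_value < alpha
```

For example, 100 losses out of 1000 A packets against 20 out of 1000 P packets gives a very small p-value, so equal loss rates would be rejected.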

Loss Discrimination: Accuracy. Buffer sizes set according to the bandwidth-delay product. 90+% accuracy for discriminating configurations. WFQ 1:1.5 is similar to DT (similar loss rates). [Figures: WRED accuracy versus f, the min queue threshold of normal flows; Drop-Longest-Queue (WFQ) vs. DT.]

Implementing DiffProbe. Runs as client-server (~7500 LoC). Classifier types: port, payload. A flow: Skype and Vonage voice traces. P flow: randomize the payload and port of the A flow. LIP and BLP durations: 30 s each. Pre-probing: estimate path capacity using packet trains.
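The slide does not say which packet-train estimator is used for pre-probing. One common approach, shown here purely as an assumed illustration, divides the bits received after the first packet by the train's dispersion at the receiver; the function name is hypothetical.

```python
def estimate_capacity_bps(recv_times, packet_size_bytes):
    """Estimate capacity from the arrival times of one back-to-back packet train.

    recv_times: receiver timestamps in seconds, in arrival order.
    The first packet only starts the clock, so N packets carry N - 1 packets' worth of bits.
    """
    if len(recv_times) < 2:
        raise ValueError("need at least two packets")
    dispersion = recv_times[-1] - recv_times[0]
    if dispersion <= 0:
        raise ValueError("timestamps must be increasing")
    return (len(recv_times) - 1) * packet_size_bytes * 8 / dispersion
```

With 11 packets of 1500 bytes arriving 1.2 ms apart, the estimate is about 10 Mbps. Cross-traffic can stretch or compress the train, so a real tool would typically combine several trains.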

Experiments. Emulations: discriminating link configured using tc; Pareto cross-traffic; SP, WRR, and Drop-Longest-Queue discriminators; no false positives or false negatives. Real-world experiments (Skype and Vonage): we do not have ground truth; a high p-value of the KL test is a good "indicator" of no discrimination; one ISP showed multi-path routing, which created different delays. [Figure: KL-test p-values for access ISP runs.]

Validation. ISPs have so far not disclosed details of application discrimination practices (if any): no ground truth! Discrimination: a significant difference in the delays and/or losses of A and P. Why? Controlled-environment trials! Validation ideas?

ShaperProbe. A pre-probing module of DiffProbe to answer: Can we detect traffic shaping by ISPs? What is the shaping configuration? Key idea: probe and detect level shifts in the rate, which are the token bucket signature. Example: upload, 7 Mbps -> 2 Mbps in 8 s.
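To make the level-shift idea concrete, here is a minimal sketch that looks for a single downward shift in a series of per-interval received rates. It is not ShaperProbe's actual algorithm: the median-based split search, the `min_ratio` and `min_run` thresholds, and the function name are all assumptions.

```python
from statistics import median

def detect_level_shift(rates, min_ratio=1.5, min_run=3):
    """Find one downward level shift in a list of per-interval rates.

    Returns (shift_index, rate_before, rate_after), or None if no split has a
    before/after median ratio of at least min_ratio.
    """
    best = None
    for i in range(min_run, len(rates) - min_run + 1):
        before = median(rates[:i])
        after = median(rates[i:])
        if after > 0 and before / after >= min_ratio:
            # Prefer the split whose two segments are flattest around their medians.
            cost = (sum(abs(r - before) for r in rates[:i])
                    + sum(abs(r - after) for r in rates[i:]))
            if best is None or cost < best[0]:
                best = (cost, i, before, after)
    return None if best is None else best[1:]
```

For a trace sampled once per second that holds 7 Mbps for 8 s and then 2 Mbps, the function reports a shift at index 8. Under a simple token bucket model, and assuming the bucket starts full, such a trace would suggest a bucket depth of roughly (7 - 2) Mbps x 8 s = 40 Mbit.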

ShaperProbe (contd.). Deployed at Google M-Lab; 60,000+ runs so far. Who shapes traffic? ... among 700+ other ASes.

Thank You! partha@cc.gatech.edu

Detecting Delay Discrimination (3). Hypothesis test for a difference between the A and P delay distributions. Null hypothesis: equal distributions. Compute the Kullback-Leibler (KL) divergence of the pre-processed samples; call it the observed KL divergence. Bootstrap: compute the KL divergences of uniform random partitions of the combined samples; this gives us a KL distribution. Reject the null hypothesis if the p-value is < 0.05.
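The KL test above can be sketched as a permutation-style bootstrap. The histogram binning, the add-one smoothing that keeps the divergence finite, the number of trials, and all function names are assumptions made for this illustration; the slides only specify the overall procedure and the 0.05 threshold.

```python
import math
import random

def histogram(samples, lo, hi, bins):
    """Smoothed bin probabilities over [lo, hi] (add-one smoothing, assumed)."""
    counts = [0] * bins
    width = (hi - lo) / bins or 1.0
    for x in samples:
        counts[min(int((x - lo) / width), bins - 1)] += 1
    total = len(samples) + bins
    return [(c + 1) / total for c in counts]

def kl_divergence(a, p, bins=20):
    """KL divergence between the binned distributions of samples a and p."""
    pooled = list(a) + list(p)
    lo, hi = min(pooled), max(pooled)
    pa = histogram(a, lo, hi, bins)
    pp = histogram(p, lo, hi, bins)
    return sum(x * math.log(x / y) for x, y in zip(pa, pp))

def kl_test(a, p, trials=500, alpha=0.05, seed=0):
    """Return (p_value, reject_null) for the null hypothesis of equal distributions."""
    rng = random.Random(seed)
    observed = kl_divergence(a, p)
    pooled = list(a) + list(p)
    exceed = 0
    for _ in range(trials):
        # Uniform random partition of the combined samples into two groups.
        rng.shuffle(pooled)
        if kl_divergence(pooled[:len(a)], pooled[len(a):]) >= observed:
            exceed += 1
    p_value = (exceed + 1) / (trials + 1)
    return p_value, p_value < alpha
```

The p-value is the fraction of random partitions whose divergence is at least as large as the observed one, so a small value means the A/P split is unusual compared with arbitrary splits of the same data.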

Detecting Delay Discrimination (4). Test for the direction of the difference (run if the KL test rejects the null hypothesis): compare the higher percentiles of the A and P delay distributions, then redo the test, swapping A and P as inputs. If this test fails, we state that delay discrimination is unknown.
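A minimal sketch of this percentile comparison follows. The decision rule used here, requiring one flow's delay to be strictly larger at every one of the 50th, 60th, 70th, 80th, and 90th percentiles, is an assumption; the slides only say that the higher percentiles are compared and that the test is repeated with A and P swapped. Function names are hypothetical.

```python
def percentile(values, q):
    """q-th percentile (0-100) with linear interpolation (assumed definition)."""
    ordered = sorted(values)
    rank = (len(ordered) - 1) * q / 100.0
    lo = int(rank)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (rank - lo)

def dominates(x, y, quantiles=(50, 60, 70, 80, 90)):
    """True if x has larger delays than y at every listed percentile."""
    return all(percentile(x, q) > percentile(y, q) for q in quantiles)

def discriminated_flow(a_delays, p_delays):
    """Return 'A', 'P', or 'unknown' for the flow that sees higher delays."""
    if dominates(a_delays, p_delays):
        return "A"
    if dominates(p_delays, a_delays):  # same test with inputs swapped
        return "P"
    return "unknown"
```

When the percentile curves cross, neither call succeeds and the result is "unknown", matching the fallback stated on the slide.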

SP or WFQ? (2). For the distribution of this subset of P samples: SP if the 95th percentile of P delay ≈ the 5th percentile; WFQ-like otherwise. [Figures: distribution of the P subset under SP and WFQ; WFQ-SP accuracy.]
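The classification rule above could be sketched as follows. The slides do not quantify "≈" or "very close", so the `tolerance` and `gap` parameters, like the function names, are assumptions for illustration.

```python
import bisect

def percentile(values, q):
    """q-th percentile (0-100) with linear interpolation (assumed definition)."""
    ordered = sorted(values)
    rank = (len(ordered) - 1) * q / 100.0
    lo = int(rank)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (rank - lo)

def p_after_a(a_recv_times, p_samples, gap):
    """Delays of P samples received within gap after some A packet.

    a_recv_times: sorted A receive times; p_samples: list of (recv_time, delay).
    """
    subset = []
    for recv_time, delay in p_samples:
        # Index of the first A arrival at or after recv_time; i - 1 is strictly earlier.
        i = bisect.bisect_left(a_recv_times, recv_time)
        if i > 0 and recv_time - a_recv_times[i - 1] <= gap:
            subset.append(delay)
    return subset

def classify_scheduler(p_subset_delays, tolerance):
    """'SP' if the 5th-95th percentile spread is within tolerance, else 'WFQ-like'."""
    spread = percentile(p_subset_delays, 95) - percentile(p_subset_delays, 5)
    return "SP" if spread <= tolerance else "WFQ-like"
```

A narrow spread means the selected P packets waited only for a packet already in service, as expected under SP; a wide spread suggests they also waited behind queued A traffic, as under WFQ.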