Potential Transition Opportunities in Networking and Cloud Computing

R. Srikant, in collaboration with Atilla Eryilmaz, Bin Li, Yi Lu, Joseph Lubars, Ness Shroff, and Qiaomin Xie

Outline
• Wireless Networks
  • Scheduling algorithm whose delay is nearly insensitive to file-size distributions beyond the mean
  • Importance: Heavy tails and correlations often result in poor network performance
• Cloud Computing
  • Resource-allocation algorithm whose performance is insensitive to job-size distributions beyond the mean
  • Importance: Heavy tails and correlations often result in poor network performance
• Network Deanonymization
  • Identifying adversaries in social networks using side information
  • Ongoing work: algorithms whose performance improves as the tail of the node-degree distribution becomes heavier
  • Importance: Node degree distribution is typically heavy-tailed, and in/out degrees are correlated

Wireless Networks: Military Relevance
• Wireless Networks: Of fundamental importance to military communication
• Shared medium: who should communicate when?
• How to assign priorities to different users?

Wireless Networks: An Abstraction
• Each link is represented by a queue
• The queue contains backlogged packets at the link
• Very few assumptions are needed about the process generating the packets
• For example, packets could be generated by successive files with heavy-tailed, highly correlated distributions

Goals
1. Fairness
  • Like round-robin scheduling
  • What does this mean for wireless networks?
2. Good throughput
  • If possible, achieve maximum throughput
3. Small delay, even with heavy-tailed traffic
  • Using typical algorithms, widely varying file sizes can result in large delays.

Our Solution (Part 1)
• Service Times

Our Solution (Part 2)
• Given an interference graph:
  • Low-complexity distributed implementation is possible
  • Distributed implementations may take a long time to converge to the correct solution
  • Greedy approximations work quite well in practice
• Transmit over a set of links L that maximizes
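The greedy approximation mentioned on this slide can be illustrated with a minimal sketch. The slide's actual objective is not recoverable from the source, so the per-link `weights` here are hypothetical placeholders, and the function name `greedy_schedule` and the dictionary-based interference graph are assumptions made for illustration only. The sketch picks links in descending weight order and skips any link that interferes with one already chosen.

```python
def greedy_schedule(weights, conflicts):
    """Pick a conflict-free set of links greedily by descending weight.

    weights:   dict link -> nonnegative weight (hypothetical; the objective
               used in the talk is not given in the source)
    conflicts: dict link -> set of links that interfere with it (symmetric)
    """
    schedule = set()
    blocked = set()
    # Visit links from heaviest to lightest; ties are broken by link name.
    for link in sorted(weights, key=lambda l: (-weights[l], l)):
        if weights[link] <= 0 or link in blocked:
            continue
        schedule.add(link)
        # Every neighbor of a scheduled link must stay silent.
        blocked |= conflicts.get(link, set())
    return schedule


# Path-shaped interference graph: a - b - c - d
conflicts = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}}
weights = {"a": 3, "b": 5, "c": 4, "d": 1}
chosen = greedy_schedule(weights, conflicts)
print(chosen)  # links b and d, total weight 6
```

On this small example the greedy choice has total weight 6, while the best conflict-free set (a and c) has weight 7, which shows why greedy is an approximation rather than an exact solution.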

Remarks on the Algorithm
• 802.11: No explicit consideration of workload at a link
  • Backoff when channel is busy
• Queue-based MaxWeight: Assign weights to links based on the number of backlogged packets; designed for long-lived flows; no short-term fairness considerations; long queues receive priority for long periods of time
• Age-based MaxWeight: Assign weights to links based on the age of the Head-of-the-Line (HoL) packet; designed for file arrivals and departures; no short-term fairness considerations; older flows receive priority for long periods of time
• Our Solution:
  • Maximizes throughput
  • Delay is “insensitive” to the file-size distribution beyond the mean
  • Short-term fairness is considered explicitly

Simulation Results: Insensitivity
[Figure: average delay versus arrival rate (0 to 0.1) under the TSLS-based policy, with curves for M = 10, M = 50, and M = 100]

Simulation Results: Fairness
[Figure: variance of interservice time versus arrival rate for M = 100, comparing the TSLS-based policy, the proportional scheduler, and the age-based policy]

Cloud Computing
• Data centers have racks and racks of servers and are at the heart of most big-data mining applications
• Each job arrives in the form of a virtual machine (VM): a VM requests a certain amount of CPU, a certain amount of memory, a certain amount of disk space, etc.
• Goal: when a VM arrives, which server should you assign it to?

VMs and Servers
• Each VM occupies a certain amount of CPU and memory
• Each VM can execute a job in multiple highly correlated stages
• A new VM cannot be assigned to some servers
[Figure: a server whose CPU and memory are partly occupied by VM 1, VM 2, and VM 3]

Random Assignment
• When a new VM arrives and multiple servers are available, to which server should we assign the VM?
• Need a solution with low complexity and good performance
• Exhaustive search to perform load balancing is infeasible when there are thousands of servers
• A simple, low-complexity solution: assign the VM to a server picked at random from among the available servers
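The random-assignment baseline described on this slide can be sketched in a few lines. The resource names (`cpu`, `mem`), the dictionary layout, and the function names `fits` and `assign_randomly` are illustrative assumptions, not taken from the talk. A server counts as "available" here if it has enough free capacity in every resource the VM requests.

```python
import random


def fits(vm, free):
    """True if every resource the VM requests is available on the server."""
    return all(free.get(r, 0) >= need for r, need in vm.items())


def assign_randomly(vm, servers, rng=random):
    """Place `vm` on a uniformly random server that can host it.

    Returns the chosen server id, or None if no server fits (VM is blocked).
    """
    available = [sid for sid, free in servers.items() if fits(vm, free)]
    if not available:
        return None
    sid = rng.choice(available)
    # Reserve the requested resources on the chosen server.
    for r, need in vm.items():
        servers[sid][r] -= need
    return sid


servers = {
    "s1": {"cpu": 4, "mem": 8},
    "s2": {"cpu": 1, "mem": 2},   # too small for the VM below
    "s3": {"cpu": 8, "mem": 16},
}
vm = {"cpu": 2, "mem": 4}
placed_on = assign_randomly(vm, servers, random.Random(0))
print(placed_on)
```

Only one pass over the servers is needed per arrival, which is what keeps the complexity low compared with an exhaustive load-balancing search.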

Performance Measure

Our Solution
[Figure: Machine 1 versus Machine 2, showing the resources used by VMs on each]

Numerical Results

Remarks on VM Routing Problem
• Basic algorithm works for VMs with multi-dimensional resource requirements
  • One has to define BestFit appropriately: double exponential decay in blocking probability continues to hold
• Same idea works if multiple VMs arrive for a single job that needs to be executed in parallel
• The dynamic nature of the problem eliminates many of the computational difficulties associated with knapsack/bin-packing problems
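The slides do not say how BestFit is defined for multi-dimensional requirements, so the following is only one possible reading, sketched under a stated assumption: among the servers that can host the VM, choose the one left with the smallest total normalized slack. The scoring rule, the name `best_fit`, and the data layout are all hypothetical.

```python
def best_fit(vm, servers, capacity):
    """Return the feasible server left with the least normalized slack.

    vm:       dict resource -> amount requested
    servers:  dict server id -> dict resource -> free amount
    capacity: dict resource -> per-server capacity, used to normalize
    """
    best_id, best_slack = None, None
    for sid in sorted(servers):
        free = servers[sid]
        if any(free.get(r, 0) < need for r, need in vm.items()):
            continue  # the VM does not fit on this server
        # Assumed score: total leftover fraction across all resources.
        slack = sum((free[r] - vm.get(r, 0)) / capacity[r] for r in capacity)
        if best_slack is None or slack < best_slack:
            best_id, best_slack = sid, slack
    return best_id


capacity = {"cpu": 8, "mem": 16}
servers = {
    "s1": {"cpu": 8, "mem": 16},   # empty server
    "s2": {"cpu": 3, "mem": 6},    # mostly full, but the VM still fits
    "s3": {"cpu": 1, "mem": 12},   # too little CPU
}
choice = best_fit({"cpu": 2, "mem": 4}, servers, capacity)
print(choice)  # s2
```

Packing the VM onto the fullest feasible server keeps the empty server free for a later, larger request; other scalarizations of the leftover vector would be equally valid choices for this sketch.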

Graph Matching
[Figure: social network data with named nodes (Alice, Bob, Carol) alongside anonymous communication data]

We are interested in Very Large Graphs

Existing Algorithm
• Start with an initial set of “seed” matches
• In practice, some seeds are usually available
[Figure: two graphs to be matched, one with nodes Alice, Bob, Carol, Dan and one with nodes A, B, C, D]

Existing Algorithm (Part 2)
• For each potential match, count the number of “witnesses”
[Figure: the example graphs with nodes Alice, Bob, Carol, Dan and A, B, C, D]

Existing Algorithm (Part 3)
• Add the match with the most witnesses to the seed set
[Figure: the example graphs with nodes Alice, Bob, Carol, Dan and A, B, C, D]

Existing Algorithm (Part 4)
• Repeat until no more pairs have witnesses
[Figure: the example graphs with nodes Alice, Bob, Carol, Dan and A, B, C, D]
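The four steps of the existing algorithm can be put together in a short sketch. The slides do not define "witness" precisely, so this assumes the usual reading: a witness for a candidate pair (u, v) is an already matched pair (a, b) in which a is a neighbor of u and b is a neighbor of v. The function names, the adjacency-set representation, and the example edges are illustrative assumptions; only the node names come from the slides.

```python
def count_witnesses(u, v, g1, g2, matched):
    """Number of matched pairs (a, b) with a adjacent to u and b adjacent to v."""
    return sum(1 for a in g1[u] if matched.get(a) in g2[v])


def percolate(g1, g2, seeds):
    """Grow a matching from `seeds` by repeatedly adding the best-witnessed pair."""
    matched = dict(seeds)
    while True:
        used = set(matched.values())
        best, best_score = None, 0
        for u in sorted(g1):
            if u in matched:
                continue
            for v in sorted(g2):
                if v in used:
                    continue
                score = count_witnesses(u, v, g1, g2, matched)
                if score > best_score:
                    best, best_score = (u, v), score
        if best is None:          # no unmatched pair has a witness
            return matched
        matched[best[0]] = best[1]


# Hypothetical edges for the example node sets.
g1 = {"Alice": {"Bob", "Carol"}, "Bob": {"Alice", "Carol"},
      "Carol": {"Alice", "Bob", "Dan"}, "Dan": {"Carol"}}
g2 = {"A": {"B", "C"}, "B": {"A", "C"}, "C": {"A", "B", "D"}, "D": {"C"}}
result = percolate(g1, g2, {"Alice": "A", "Bob": "B"})
print(result)
```

With the two seeds, (Carol, C) has two witnesses and is added first, after which (Dan, D) gains its single witness. This version rescans all candidate pairs in every round, which is fine for a toy example but would need incremental witness counts for very large graphs.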

Our Contribution (Ongoing Work)
• Goal: Improve the accuracy of matches once the algorithm has run
• Idea: Swap two matches if it improves the number of witnesses
[Figure: the example graphs with nodes Alice, Bob, Carol, Dan and A, B, C, D]
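A rough sketch of the swap idea follows. The slides do not specify how the "number of witnesses" of a whole matching is measured, so this assumes it is the number of edges in the first graph whose endpoints are matched to adjacent nodes in the second graph. The names `agreement` and `refine_by_swaps` and the example edges are hypothetical, and this is not presented as the authors' actual procedure.

```python
from itertools import combinations


def agreement(g1, g2, matched):
    """Count G1 edges whose endpoints are matched to adjacent G2 nodes."""
    total = 0
    for u in matched:
        for w in g1[u]:
            if w in matched and matched[w] in g2[matched[u]]:
                total += 1
    return total // 2   # each undirected edge was counted from both ends


def refine_by_swaps(g1, g2, matched):
    """Swap the images of two matched nodes whenever that raises agreement."""
    matched = dict(matched)
    improved = True
    while improved:
        improved = False
        current = agreement(g1, g2, matched)
        for u, w in combinations(sorted(matched), 2):
            matched[u], matched[w] = matched[w], matched[u]
            score = agreement(g1, g2, matched)
            if score > current:
                current, improved = score, True
            else:  # undo a swap that does not help
                matched[u], matched[w] = matched[w], matched[u]
    return matched


# Hypothetical edges for the example node sets.
g1 = {"Alice": {"Bob", "Carol"}, "Bob": {"Alice", "Carol"},
      "Carol": {"Alice", "Bob", "Dan"}, "Dan": {"Carol"}}
g2 = {"A": {"B", "C"}, "B": {"A", "C"}, "C": {"A", "B", "D"}, "D": {"C"}}
noisy = {"Alice": "A", "Bob": "B", "Carol": "D", "Dan": "C"}
refined = refine_by_swaps(g1, g2, noisy)
print(agreement(g1, g2, noisy), agreement(g1, g2, refined))  # 2 4
```

Starting from a matching in which Carol and Dan are exchanged, the only swap that raises the agreement is the one that corrects them, so the refinement recovers the intended matching on this toy example. Like any local search, it can stop at a matching that no single swap improves.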

Results

Ongoing Work
• In reality, some edges and nodes may be missing in one of the graphs
• Graphs can be directed, with correlated in- and out-degrees
• Networks with heavy-tailed degree distributions may be easier to identify: some nodes have very large degrees and can be identified first, and then we can bootstrap from this information to identify others
• How can we quantify the above observation? Possibly in terms of the number of automorphisms of a graph?