A Prediction-based Fair Replication Algorithm in Structured P2P Systems
Xianshu Zhu, Dafang Zhang, Wenjia Li, Kun Huang
Presented by: Xianshu Zhu
College of Computer & Communication, Hunan University, P. R. China

Outline
- Introduction
- Contribution
- PFR (Prediction-based Fair Replication)
- Performance Evaluation
- Conclusion and Future Work

Introduction
- Query Hotspot
- Structured Peer-to-Peer Network
- Summary of Replication Schemes

Query Hotspot: the number of requests for a popular object increases dramatically, which leads to dropped queries and severe performance degradation.
(Figure: a network of nodes B to J in which the node holding a popular file becomes a query hotspot.)

Structured P2P Network
Advantages:
- Scalability
- Efficient searching
Disadvantage: the implementation of a structured P2P network assumes that all data items are equally popular, so no mechanism handles the hotspot problem.

Replication Schemes
Basic idea:
- Distribute replicas of popular data items to various lightly loaded nodes
- Distribute load fairly onto each node
Questions when applying a replication technique:
- Replica creation: time, number, location
- Replica utilization

Replication Schemes
Classification according to replica location:
- Path replication: high replication overhead
- Owner replication
- Random replication

Replication Schemes
Classification according to replica location:
- Path replication
- Owner replication: Gopalakrishnan proposed LAR. Drawbacks: (1) it can create a new query hotspot; (2) low replication speed
- Random replication

Replication Schemes
Classification according to replica location:
- Path replication
- Owner replication
- Random replication

Contribution
Design goals:
- Reduce dropped queries while introducing only minimal replication overhead
- Minimize the drawbacks of the LAR algorithm (owner replication)
Approach: a Prediction-based Fair Replication algorithm (PFR) that distributes load almost fairly onto each node in order to meet the design goals above.

Contribution
Fairness goal of PFR:
- Adaptively determine the replication speed and replication location according to each node's predicted load fraction

PFR - Appropriate Replication Time
To keep system performance at a high level, preventive action should be taken before a query hotspot actually occurs.
Period Exponential Weight Prediction Algorithm:
Predict(n+1) = Current(n) + PredictDiff(n+1)
where PredictDiff(n+1) is the predicted traffic difference between the nth and (n+1)th intervals.
(Figure: timeline of intervals 1, 2, through n-1 and n, with the current time at interval n and Predict(n+1) marked at interval n+1.)
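
The update rule on this slide can be sketched in a few lines of Python. The slide gives only Predict(n+1) = Current(n) + PredictDiff(n+1) and does not say how PredictDiff is computed, so this sketch assumes an exponentially weighted average of past interval-to-interval differences with a hypothetical smoothing factor `alpha`. The class, method, and parameter names are illustrative and are not taken from the authors' implementation.

```python
class LoadPredictor:
    """Sketch of Predict(n+1) = Current(n) + PredictDiff(n+1).

    Assumption (not from the slides): PredictDiff is an exponentially
    weighted average of the observed differences between consecutive
    measurement intervals.
    """

    def __init__(self, alpha=0.5):
        if not 0.0 < alpha <= 1.0:
            raise ValueError("alpha must be in (0, 1]")
        self.alpha = alpha          # hypothetical smoothing factor
        self.current = None         # load observed in the latest interval
        self.predicted_diff = 0.0   # smoothed difference between intervals

    def observe(self, load):
        """Record the load measured in the interval that just ended."""
        if self.current is not None:
            diff = load - self.current
            # Recent differences weigh more than older ones.
            self.predicted_diff = (
                self.alpha * diff + (1.0 - self.alpha) * self.predicted_diff
            )
        self.current = load

    def predict(self):
        """Return the predicted load for the next interval."""
        if self.current is None:
            raise ValueError("no observations yet")
        return self.current + self.predicted_diff


predictor = LoadPredictor(alpha=0.5)
for load in [10, 20, 40]:   # queries seen in three consecutive intervals
    predictor.observe(load)
print(predictor.predict())
```

Each update is constant time and needs only the previous interval's values, which is in line with the low computation overhead and online use that the deck claims for its predictor.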

PFR - Appropriate Replication Time
Period Exponential Weight Prediction Algorithm:
- Incurs only low computation overhead
- Applicable to online prediction
Our replication strategy is set based on the predicted load.

PFR - Fairly-decided Replication Speed
Replication Speed = (number of nodes chosen to hold replicas) / (number of all nodes encountered along the query path)
(Figure: example query path through nodes A to F with a replication speed of 3/6.)

PFR - Fairly-decided Replication Speed
Replication level (node homogeneity):
Predicted load fraction    Replication speed
0.8 to 1                   N
0.7 to 0.8                 3N/4
0.6 to 0.7                 N/2
0.5 to 0.6                 N/4
0.3 to 0.5                 1
below 0.3                  DON'T create replicas
N: total number of nodes along a query path
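
The replication levels on this slide can be read as a lookup from predicted load fraction to the number of replicas to create: 0.8 to 1 gives N, 0.7 to 0.8 gives 3N/4, 0.6 to 0.7 gives N/2, 0.5 to 0.6 gives N/4, 0.3 to 0.5 gives 1, and anything lower creates no replicas. The sketch below encodes that reading. Two details are assumptions, because the slides do not state them: the lower bound of each range is treated as exclusive, and fractional speeds are rounded down with a floor of one replica (the deck's worked example uses N/4 = 1 for N = 6). The names `LEVELS` and `replication_speed` are hypothetical.

```python
# (exclusive lower bound of the level, speed as a function of path length n)
LEVELS = [
    (0.8, lambda n: n),
    (0.7, lambda n: (3 * n) // 4),
    (0.6, lambda n: n // 2),
    (0.5, lambda n: n // 4),
    (0.3, lambda n: 1),
]


def replication_speed(load_fraction, n):
    """Return how many nodes on a query path of n nodes get a replica.

    Assumptions: range lower bounds are exclusive, and fractional speeds
    are rounded down but never below one replica.
    """
    for lower_bound, speed in LEVELS:
        if load_fraction > lower_bound:
            return max(1, speed(n))
    return 0  # lightly loaded node: do not create replicas


for fraction in (0.9, 0.55, 0.2):
    print(fraction, replication_speed(fraction, n=6))
```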

PFR - Replication & Replica Utilization
(Figure: worked example on a query path of N = 6 nodes, A to F, with predicted load fractions A: 0.9, B: 0.3, C: 0.55, D: 0.3, E: 0.15, F: 0.25. Node A holds the requested files and has replication speed RS = N; node C has RS = N/4 = 1. The figure shows the replica location hints "B, D, E, F: A" and "E: C" being passed along the path.)
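
One way to picture replica placement for this example is to give replicas to the least-loaded nodes on the query path first. The slides say only that replicas go to lightly loaded nodes chosen by predicted load fraction, so the selection rule below is an assumption: the function name `choose_replica_holders`, the least-loaded-first ordering, and the hypothetical `max_load` cutoff of 0.5 are illustrative and not taken from the authors' algorithm.

```python
def choose_replica_holders(owner, predicted_load, speed, max_load=0.5):
    """Pick up to `speed` nodes on the query path to hold replicas.

    Assumptions (not from the slides): the owner never holds its own
    replica, nodes at or above `max_load` are skipped, and the remaining
    nodes are taken in order of increasing predicted load fraction.
    """
    candidates = [
        (load, node)
        for node, load in predicted_load.items()
        if node != owner and load < max_load
    ]
    candidates.sort()  # least-loaded first; ties broken by node name
    return [node for _, node in candidates[:speed]]


# Predicted load fractions from the deck's example path (N = 6).
path_load = {"A": 0.9, "B": 0.3, "C": 0.55, "D": 0.3, "E": 0.15, "F": 0.25}

print(choose_replica_holders("A", path_load, speed=6))  # RS = N
print(choose_replica_holders("C", path_load, speed=1))  # RS = N/4 = 1
```

Under these assumptions node A's replicas land on E, F, B, and D, and node C's single replica lands on E, which agrees with the hints "B, D, E, F: A" and "E: C" in the example.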

Performance Evaluation
Simulation setup: a heavily modified version of the Chord simulator from MIT and the LAR implementation code.
Parameter                                 Value
System size                               1000
Number of data items                      32767
Node capacity                             10 per sec
Node's queue length                       32
Time each network hop takes               25 ms
Average system load                       25%
Number of queries generated per sec       500
Prediction interval                       1 s

Performance Evaluation
Workload: 90% of the input queries are directed to 1 item.
(Figure: Number of Queries Dropped Over Time, LAR vs. PFR; the plot carries a 28% annotation.)

Performance Evaluation
(Figure: Total Number of Documents Replicated, LAR vs. PFR.)

Performance Evaluation
(Figure: Total Number of Finger Tables Replicated, LAR vs. PFR.)

Performance Evaluation
(Figure: Total Number of Replica Location Hints Created, LAR vs. PFR.)

Conclusion
The Prediction-based Fair Replication algorithm conducts fair replication through:
- Appropriate replication time
- Fairly-decided replication speed
- Fairly-decided replication location
- High replica utilization rate
Performance evaluation:
- Notably decreases the number of dropped queries
- Low replication overhead

Future Work
- Taking node heterogeneity into consideration

Thank you!