Licentiate Seminar: On Measurement and Analysis of Internet Backbone Traffic
Wolfgang John
Department of Computer Science and Engineering
Chalmers University of Technology
Göteborg, Sweden

Why measure Internet traffic? (1)
The Internet is changing in size
[Figure labels: ARPANET, 1969; Internet, 1983; 2005]
Licentiate Seminar Wolfgang John 2008-02-29

Why measure Internet traffic? (2)
The Internet is changing in application

Why measure Internet traffic? (3)
• The Internet
  – is constantly developing
  – is used differently in different locations
  – is heterogeneous
INTERconnected NETworks (INTER + NET)
The Internet is not understood in its entirety!

Why measure Internet traffic? (4)
• Operational purpose
  – Troubleshooting, provisioning, planning, …
• Scientific purpose
  – Protocols, infrastructure and services
  – Performance properties
  – Internet simulation models
  – Security measures

Thesis Objectives
1. Guidelines for Internet measurement
2. Current traffic characteristics
3. Traffic decomposition
4. Inconsistent behavior

Outline
• Measurement approaches
• Internet measurement challenges
• The MonNet project
• Scientific contribution
• Results
  – Four studies included
• Conclusions
[Side labels grouping the items: Measurement, Analysis]

Measurement approaches
[Diagram: taxonomy of network traffic measurement]
• Passive / Active
• Software / Hardware
• Online / Offline
• Statistical summaries / Flows / Packets
• Complete / Headers
• Different protocol levels / Transport layer

Internet measurement challenges (1)
• Legal considerations
• Ethical and moral considerations
• Operational considerations
• Technical considerations

Measurement challenges (3)
Technical considerations
• Data amount
  – Exhausting I/O and storage access speeds
• Data reduction techniques
  – Filtering, sampling, packet truncation
• Timing
  – Clock synchronization
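
The three data reduction techniques named on this slide can be illustrated with a minimal sketch. It treats each packet as a Python bytes object; the function names, the 1-in-n systematic sampling scheme, and the default snap length of 64 bytes are assumptions chosen for illustration, not the MonNet configuration.

```python
from typing import Callable, Iterable, Iterator, List


def filter_packets(packets: Iterable[bytes],
                   keep: Callable[[bytes], bool]) -> Iterator[bytes]:
    """Filtering: keep only packets that satisfy a predicate."""
    return (p for p in packets if keep(p))


def sample_packets(packets: Iterable[bytes], n: int) -> Iterator[bytes]:
    """Systematic sampling: keep the first packet and every n-th one after it."""
    return (p for i, p in enumerate(packets) if i % n == 0)


def truncate_packets(packets: Iterable[bytes], snaplen: int) -> Iterator[bytes]:
    """Packet truncation: keep only the first snaplen bytes of each packet."""
    return (p[:snaplen] for p in packets)


def reduce_trace(packets: Iterable[bytes],
                 keep: Callable[[bytes], bool],
                 n: int = 10,
                 snaplen: int = 64) -> List[bytes]:
    """Apply the three reductions in sequence: filter, then sample, then truncate."""
    return list(truncate_packets(sample_packets(filter_packets(packets, keep), n), snaplen))
```

Filtering first means sampling operates only on traffic of interest; a different order would give a different sample, so the order is itself a design choice.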

The MonNet Project (1)
Technical Solution
[Diagram labels: Processing Platform and Storage; Measurement Node 1; Measurement Node 2; splitter; Göteborg, 10 Gbps; Borås, 10 Gbps]

The MonNet Project (2)
Measurement location
• April 2006
  148 traces (20 minutes)
  11 billion packets, 7.6 TB of data
• Sept. – Nov. 2006
  554 traces (10 minutes)
  28 billion packets, 19.5 TB of data
[Diagram labels: Internet; Regional ISPs; StudentNet; Stockholm; Borås; Göteborgs Univ.; Chalmers Univ.; Other smaller Univ. and Institutes]
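
As a rough plausibility check, the dataset figures on this slide imply an average packet size and an average data rate per trace. The sketch below assumes that "TB of data" means decimal terabytes of traffic carried during the traces and that every trace has exactly the stated duration; both are assumptions, and the function name is hypothetical.

```python
def dataset_averages(traces, minutes_per_trace, packets, terabytes):
    """Return (mean packet size in bytes, mean rate per trace in Mbit/s)."""
    total_bytes = terabytes * 1e12          # assumption: decimal terabytes
    total_seconds = traces * minutes_per_trace * 60
    mean_packet_bytes = total_bytes / packets
    mean_rate_mbps = total_bytes * 8 / total_seconds / 1e6
    return mean_packet_bytes, mean_rate_mbps


# Figures taken from the slide.
april = dataset_averages(148, 20, 11e9, 7.6)
autumn = dataset_averages(554, 10, 28e9, 19.5)
```

Under these assumptions both collections average a little under 700 bytes per packet.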

Scientific Contribution
[Diagram: Studies I to IV placed by level of complexity (packet level, flow level, traffic classes) under two themes, traffic characterization and quantification of inconsistent behavior; one item marked "Upcoming"]

Study I: Packet Level Analysis
• Updated packet-level characteristics of Internet traffic
• Inconsistencies in headers will appear
  – Network attacks and malicious traffic
  – Active OS fingerprinting
  – Buggy applications or protocol stacks
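
To make "inconsistencies in headers" concrete, here is a minimal sketch that checks a raw IPv4/TCP packet against a few well-formedness rules. The particular checks (IHL and TCP data offset below 5 words, total length shorter than the header, SYN and FIN set together) are examples chosen for illustration; the slide does not list the anomalies the study actually examined, and the function name is hypothetical.

```python
import struct
from typing import List


def header_inconsistencies(packet: bytes) -> List[str]:
    """Return descriptions of inconsistencies found in a raw IPv4/TCP packet."""
    if len(packet) < 20:
        return ["truncated IPv4 header"]
    version_ihl, _tos, total_length = struct.unpack("!BBH", packet[:4])
    version, ihl = version_ihl >> 4, version_ihl & 0x0F
    if version != 4:
        return ["not IPv4"]

    issues = []
    if ihl < 5:
        issues.append("IHL below minimum of 5 words")
    if total_length < ihl * 4:
        issues.append("total length smaller than header length")

    ip_header_len = max(ihl, 5) * 4
    protocol = packet[9]
    # Protocol 6 is TCP; need at least 14 bytes to reach the flags field.
    if protocol == 6 and len(packet) >= ip_header_len + 14:
        tcp = packet[ip_header_len:]
        data_offset = tcp[12] >> 4
        flags = tcp[13]
        if data_offset < 5:
            issues.append("TCP data offset below minimum of 5 words")
        if flags & 0x02 and flags & 0x01:
            issues.append("SYN and FIN both set")
    return issues
```

A well-formed packet yields an empty list, so the function can be mapped over a header trace and the non-empty results counted per category.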

Study II: Flow level analysis
• High level analysis does not necessarily show differences → detailed analysis does!
• 2 main reasons for directional differences:
  – Malicious traffic
    • the Internet is “unfriendly”
  – P2P
    • Göteborg is a P2P source
    • P2P is changing traffic characteristics, e.g. packet sizes, TCP termination, TCP option usage

Study III: Classification Method (1)
• Classification of flow traffic without payload
• Heuristics to identify nature of endpoints
• Rules based on connection patterns and port numbers
  – 5 rules for P2P traffic
  – 10 rules to classify other types of traffic
    • remove ‘false positives’ from P2P
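
The general shape of payload-free, rule-based classification can be sketched as follows. The slide does not list the study's 5 + 10 rules, so the two rules below are hypothetical stand-ins: a port-number rule for non-P2P classes, and a connection-pattern rule that flags host pairs communicating over both TCP and UDP as P2P candidates. The port table, the `Flow` type, and the class names are likewise assumptions.

```python
from collections import defaultdict
from typing import Dict, Iterable, NamedTuple


class Flow(NamedTuple):
    src: str
    sport: int
    dst: str
    dport: int
    proto: str  # "tcp" or "udp"


# Hypothetical port table for non-P2P traffic classes.
WELL_KNOWN = {25: "mail", 53: "dns", 80: "web", 443: "web"}


def classify(flows: Iterable[Flow]) -> Dict[Flow, str]:
    """Label each flow using header fields only (no payload inspection)."""
    flows = list(flows)

    # Connection-pattern rule (illustrative): record which transport
    # protocols each unordered host pair uses.
    protos = defaultdict(set)
    for f in flows:
        protos[frozenset((f.src, f.dst))].add(f.proto)

    labels = {}
    for f in flows:
        known = WELL_KNOWN.get(f.dport) or WELL_KNOWN.get(f.sport)
        if known:
            # Port rule takes precedence, which also removes P2P
            # false positives such as DNS over both TCP and UDP.
            labels[f] = known
        elif len(protos[frozenset((f.src, f.dst))]) == 2:
            labels[f] = "p2p"
        else:
            labels[f] = "unclassified"
    return labels
```

Letting the rules for other traffic classes override the P2P heuristic mirrors the slide's point that those rules serve to remove false positives from the P2P class.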

Study III: Classification Method (2)
Comparison of classification methods for P2P traffic
[Figure axes: # connections in 10^6; Amount of data in TB]

Study III: Classification Method (3)
• Previous classification methods on packet header traces don’t work well on backbone data
• Proposal of refined and updated heuristics
  – Simple and fast method to decompose traffic
  – No payload required
  – Effectively used even on short traces (10 min)
• 0.2% of the data left unclassified

Study IV: Classification Results (1)
[Figure: Tuesday, 18.04.2006]

Study IV: Classification Results (2)
[Figure: Application breakdown, April till Nov. 2006]

Study IV: Classification Results (3)
[Figure: Connection establishment for traffic classes]

Study IV: Classification Results (4)
• Behavior of P2P traffic
  – Unsuccessful TCP connection attempts increasing
  – Serving peers terminate with FIN and RST: decreased from 20% to 8%
  – UDP overlay traffic doubled
• TCP options deployment differs
  – P2P behaves as expected
  – Web traffic shows artifacts of the client-server pattern, e.g. popular web servers neglecting the SACK option
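
The connection-level categories mentioned on this slide (unsuccessful attempts, termination with FIN and RST) can be derived from TCP flags alone. The sketch below is one possible way to do so, assuming each connection is given as a sequence of (from_initiator, flags) pairs; the input representation, the function name, and the category labels are assumptions, not the study's definitions.

```python
from typing import Iterable, Set, Tuple


def connection_outcome(segments: Iterable[Tuple[bool, Set[str]]]) -> str:
    """Summarize a TCP connection from the flags seen in its segments."""
    syn = synack = fin = rst = False
    for from_initiator, flags in segments:
        if from_initiator and "SYN" in flags and "ACK" not in flags:
            syn = True                      # opening SYN from the initiator
        if not from_initiator and {"SYN", "ACK"} <= flags:
            synack = True                   # responder accepted the attempt
        if "FIN" in flags:
            fin = True
        if "RST" in flags:
            rst = True

    if not syn:
        return "start not observed"         # trace began mid-connection
    if not synack:
        return "unsuccessful attempt"
    if fin and rst:
        return "terminated with FIN and RST"
    if fin:
        return "terminated with FIN"
    if rst:
        return "terminated with RST"
    return "not terminated"
```

Separating "start not observed" from "unsuccessful attempt" matters on short traces, where many connections begin before the capture starts.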

Summary
1. Guidelines for Internet measurement
   • Experiences of the MonNet project
2. Current traffic characteristics
   • Packet and flow level
3. Traffic decomposition
   • Traffic classification method
4. Inconsistent behavior
   • Packet header anomalies
   • Malicious traffic flows

General remarks
• The Internet today is essential, but still not understood entirely
• Large-scale traffic measurements are uncommon
  – A lot of analysis is done on outdated datasets
• Each study generated as many questions as answers
• Reconsider measurement process (duration, payload, …)
• A lot of open questions …
… get more answers in two years …