Visualizing Multi-dimensional Clusters, Trends, and Outliers using Star Coordinates
Author: Eser Kandogan
Reporter: Tze Ho-Lin
2007/5/9
SIGKDD, 2001
Outline
- Motivation
- Objectives
- Methodology: Star Coordinates
- Interaction techniques
- Evaluation
- Conclusion
- Personal Comments
Motivation
- Real datasets typically contain more than three attributes, and representing and making sense of such multi-dimensional data has been challenging.
Objectives
- The objective of this paper is to relieve the curse of dimensionality in knowledge discovery through simple data representations derived from familiar, easy-to-understand lower-dimensional representations.
Conclusion
- Star Coordinates aims to provide a representation of higher-dimensional space that is built on well-known simple representations and on dynamic interactions, allowing users to discover trends, outliers, and clusters easily.
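As a rough illustration of the representation summarized here, the following minimal sketch lays out one axis per attribute at equal angles around a shared origin and places each record at the sum of the axis vectors weighted by its attribute values. The function names (`star_axes`, `project`) are hypothetical, and the min-max normalization is an assumption made for this sketch, not a detail taken from these slides.

```python
import math

def star_axes(n_dims, radius=1.0):
    # Hypothetical helper: one 2D axis vector per attribute, spaced at equal angles.
    return [
        (radius * math.cos(2 * math.pi * i / n_dims),
         radius * math.sin(2 * math.pi * i / n_dims))
        for i in range(n_dims)
    ]

def project(record, mins, maxs, axes):
    # Map one record to 2D as the sum of axis vectors weighted by the
    # min-max normalized attribute values (normalization is an assumption here).
    x = y = 0.0
    for value, lo, hi, (ax, ay) in zip(record, mins, maxs, axes):
        t = (value - lo) / (hi - lo) if hi > lo else 0.0
        x += t * ax
        y += t * ay
    return x, y

# Four attributes; a record at the maximum of attribute 0 and the minimum elsewhere.
axes = star_axes(4)
p = project([10.0, 0.0, 0.0, 0.0], [0.0] * 4, [10.0] * 4, axes)
print(p)
```

Under this sketch, interactive scaling would correspond to changing the length of one axis vector and rotation to changing its angle before re-projecting the records.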
Personal Comments
- Application
  - Data visualization
- Advantage
  - Simple and easy to understand
- Disadvantage
  - The figures in this paper are rough.
Scaling
Rotation
Range Selection
Histogram
Footprints
Sticks
Evaluation - Figure 12
Evaluation - Figure 13
Evaluation - Figure 14. Data point distribution after removing state, area code, phone number, and total minutes and calls for day, evening, night, and international calls.
Evaluation - Figure 15. Data points partitioned into four clusters based on international service plan and voice mail plan membership.
Evaluation - Figure 16. Total day charge and number of customer service calls play the most significant role in churn for customers without international and voice mail service plans.