Chapter 20: Cluster Analysis
Cluster Analysis
Cluster analysis is a class of techniques used to classify objects or cases into relatively homogeneous groups called clusters. Objects in each cluster tend to be similar to each other and dissimilar to objects in the other clusters.
Areas of application:
1. Segmenting the market
2. Understanding buyer behavior
3. Identifying new product opportunities
4. Selecting test markets
5. Reducing data
Source: Naresh K. Malhotra, Marketing Research: An Applied Orientation, 4th ed.
Conducting Cluster Analysis
It is a six-step process:
1. Formulate the problem
2. Select a distance measure
3. Select a clustering procedure
4. Decide on the number of clusters
5. Interpret and profile clusters
6. Assess the validity of clustering
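Cluster Analysis Process
Formulate the problem
Formulating the problem means selecting the variables on which the clustering is based. Inclusion of even one or two irrelevant variables may distort an otherwise useful clustering solution.
Select a distance or similarity measure
The objective of clustering is to group similar objects together. The distance measure determines how similar or dissimilar the objects being clustered are. The measures are:
1. Euclidean distance (the square root of the sum of squared differences on each variable) or its square
2. City-block or Manhattan distance (the sum of absolute differences on each variable)
3. Chebychev distance (the maximum absolute difference on any variable)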
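The three distance measures listed above can be illustrated with a minimal sketch. The function names and the two respondents (rated on three attitude variables) are hypothetical and serve only to show how the measures differ on the same pair of objects.

```python
import math

def squared_euclidean(a, b):
    # Sum of squared differences across all variables.
    return sum((x - y) ** 2 for x, y in zip(a, b))

def euclidean(a, b):
    # Square root of the squared Euclidean distance.
    return math.sqrt(squared_euclidean(a, b))

def city_block(a, b):
    # Sum of absolute differences (Manhattan distance).
    return sum(abs(x - y) for x, y in zip(a, b))

def chebychev(a, b):
    # Largest absolute difference on any single variable.
    return max(abs(x - y) for x, y in zip(a, b))

# Hypothetical respondents, illustrative values only.
respondent_a = [6, 4, 7]
respondent_b = [2, 3, 1]

print(squared_euclidean(respondent_a, respondent_b))  # 53
print(city_block(respondent_a, respondent_b))         # 11
print(chebychev(respondent_a, respondent_b))          # 6
```

If the variables are measured in very different units, they are usually standardized before any of these distances are computed; otherwise the variable with the largest scale dominates the result.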
Cluster Analysis Process
Select a clustering procedure
Two approaches:
1. Hierarchical
   a. Agglomerative
      i. Linkage methods
         - Single linkage
         - Complete linkage
         - Average linkage
      ii. Variance methods (Ward's method)
      iii. Centroid methods
   b. Divisive
2. Nonhierarchical
   a. Sequential threshold
   b. Parallel threshold
   c. Optimizing partitioning
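To make the agglomerative linkage methods concrete, here is a naive sketch that starts with every object in its own cluster and repeatedly merges the two closest clusters. It is not how SPSS or any particular package implements clustering; the function name `agglomerate`, its parameters, and the sample points are assumptions chosen for illustration. The only difference between single, complete, and average linkage is how the distance between two clusters is summarized from the pairwise object distances (minimum, maximum, or mean).

```python
import math
from itertools import combinations

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def agglomerate(points, n_clusters, linkage="single"):
    """Merge the two closest clusters until n_clusters remain.

    Returns the clusters (lists of point indices) and the list of
    distances at which each merge took place.
    """
    combine = {
        "single": min,                           # nearest pair of members
        "complete": max,                         # farthest pair of members
        "average": lambda d: sum(d) / len(d),    # mean over all pairs
    }[linkage]
    clusters = [[i] for i in range(len(points))]
    merge_distances = []
    while len(clusters) > n_clusters:
        best = None
        for a, b in combinations(range(len(clusters)), 2):
            pairwise = [euclidean(points[i], points[j])
                        for i in clusters[a] for j in clusters[b]]
            d = combine(pairwise)
            if best is None or d < best[0]:
                best = (d, a, b)
        d, a, b = best
        merge_distances.append(d)
        clusters[a] = clusters[a] + clusters[b]  # a < b, so index a stays valid
        del clusters[b]
    return clusters, merge_distances

# Hypothetical data: two visibly separated groups of three objects each.
points = [(1.0, 1.0), (1.5, 1.2), (1.2, 0.8),
          (6.0, 6.0), (6.3, 5.8), (5.9, 6.4)]

for method in ("single", "complete", "average"):
    clusters, _ = agglomerate(points, n_clusters=2, linkage=method)
    print(method, sorted(sorted(c) for c in clusters))
```

On well-separated data such as this, all three linkage rules recover the same two groups; they tend to diverge when clusters are elongated or overlap.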
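Cluster Analysis Process
Decide on the number of clusters
The guidelines are:
1. Theoretical, conceptual, or practical considerations may suggest a certain number of clusters.
2. In hierarchical clustering, the distances at which clusters are combined can be used as a criterion.
3. In nonhierarchical clustering, the ratio of total within-group variance to between-group variance can be plotted against the number of clusters.
4. The relative sizes of the clusters should be meaningful.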
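For hierarchical clustering, one common heuristic is to read the agglomeration schedule and stop just before the merge at which the combining distance jumps sharply. The sketch below assumes that heuristic; the function name `suggest_n_clusters` and the schedule values are hypothetical, and the result should be weighed against the other guidelines rather than taken as definitive.

```python
def suggest_n_clusters(merge_distances):
    """Suggest a cluster count from an agglomeration schedule.

    merge_distances[i] is the distance at which the (i + 1)-th merge
    occurred, so n objects produce n - 1 entries. The suggestion is the
    number of clusters that exist just before the largest jump.
    """
    n_objects = len(merge_distances) + 1
    jumps = [later - earlier
             for earlier, later in zip(merge_distances, merge_distances[1:])]
    k = max(range(len(jumps)), key=jumps.__getitem__)
    # Merges 0..k have been made before the big jump: k + 1 merges in total.
    return n_objects - (k + 1)

# Hypothetical schedule for six objects: the last merge is far more costly.
schedule = [0.28, 0.36, 0.41, 0.50, 7.07]
print(suggest_n_clusters(schedule))  # 2
```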
Cluster Analysis Process
Interpret and profile the clusters
Interpreting and profiling involves examining the cluster centroids. The centroids represent the mean values of the objects contained in the cluster on each of the variables. The variables that significantly differentiate between clusters can be identified via discriminant analysis and one-way analysis of variance.
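A minimal sketch of profiling follows: it computes each cluster's centroid and a one-way ANOVA F ratio for a single variable across clusters. The function names, the data, and the cluster labels are hypothetical; a full analysis would also compare the F ratio against its critical value or report a p-value, which this sketch omits.

```python
from collections import defaultdict
from statistics import mean

def centroids(rows, labels):
    # Mean of every variable within each cluster.
    groups = defaultdict(list)
    for row, label in zip(rows, labels):
        groups[label].append(row)
    return {label: [mean(col) for col in zip(*members)]
            for label, members in groups.items()}

def f_ratio(values, labels):
    # One-way ANOVA F ratio: between-cluster mean square over
    # within-cluster mean square, for one variable.
    groups = defaultdict(list)
    for value, label in zip(values, labels):
        groups[label].append(value)
    grand_mean = mean(values)
    k, n = len(groups), len(values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2
                     for g in groups.values())
    ss_within = sum((v - mean(g)) ** 2
                    for g in groups.values() for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

# Hypothetical ratings on two variables for six respondents in two clusters.
rows = [[6.0, 2.0], [7.0, 3.0], [5.0, 3.0],
        [2.0, 3.0], [1.0, 2.0], [3.0, 3.0]]
labels = [1, 1, 1, 2, 2, 2]

print(centroids(rows, labels))
print(f_ratio([r[0] for r in rows], labels))  # large: variable 1 separates the clusters
print(f_ratio([r[1] for r in rows], labels))  # near zero: variable 2 does not
```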
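Cluster Analysis Process
Assess reliability and validity
1. Perform cluster analysis on the same data using different distance measures and compare the results.
2. Use different methods of clustering and compare the results.
3. Split the data randomly into halves, perform clustering separately on each half, and compare the cluster centroids across the two subsamples.
4. Delete variables randomly, perform clustering on the reduced set of variables, and compare the results with those obtained from the full set.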
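Comparing two clustering solutions requires some measure of agreement. The slides do not name one; the sketch below assumes the Rand index, which is the share of object pairs that both solutions treat the same way (together in both, or apart in both). The label vectors are hypothetical. Because it works on pairs, the index does not depend on how the clusters happen to be numbered.

```python
from itertools import combinations

def rand_index(labels_a, labels_b):
    """Proportion of object pairs on which two clusterings agree."""
    pairs = list(combinations(range(len(labels_a)), 2))
    agreements = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agreements / len(pairs)

# Hypothetical memberships for six objects from three clustering runs.
ward_solution = [1, 1, 1, 2, 2, 2]
single_linkage_solution = [2, 2, 2, 1, 1, 1]   # same grouping, labels swapped
other_solution = [1, 1, 2, 2, 2, 2]            # one object assigned differently

print(rand_index(ward_solution, single_linkage_solution))  # 1.0
print(rand_index(ward_solution, other_solution))
```

A value of 1.0 means the two solutions group the objects identically; lower values indicate that the result is sensitive to the choice of method or distance measure.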
Cluster Analysis Process: SPSS Windows
1. Analyze > Classify > Hierarchical Cluster
2. Statistics > check Agglomeration schedule; under Cluster Membership, check Range of solutions and enter 2-4 (for 2 to 4 clusters)
3. Plots > check Dendrogram and Icicle
4. Method > Cluster Method > Ward's method
5. OK