Quick & Simple Introduction to Multidimensional Scaling
Professor Tony Coxon
Hon. Professorial Research Fellow, University of Edinburgh (apm.coxon@ed.ac.uk)
- See www.tonycoxon.com for information on me
- See www.newmdsx.com for an information resource on MDS and NewMDSX programs/documentation
- See: “The User’s Guide to MDS” and “Key Texts in MDS” (readings), Heinemann 1982. Available as pdf at £15 from newmdsx

What is Multidimensional Scaling?
A student’s definition:
- “If you are interested in how certain objects relate to each other … and if you would like to present these relationships in the form of a map then MDS is the technique you need” (Mr Gawels, KUB)
A good start!
MDS is a family of models structured by D-T-M:
- (DATA) the empirical information on inter-relationships between a set of “objects”/variables is given in a set of dis/similarity data
- (TRANSFORMATION) which are then re-scaled (according to permissible transformations for the data / level of measurement), in terms of
- (MODEL) the assumptions of the model chosen to represent the data

MDS Solution
… to produce a SOLUTION, consisting of:
1. a CONFIGURATION, which is a
   i. pattern of points representing the “objects”
   ii. located in a space of a small number of dimensions (hence SSA – “Smallest-Space Analysis”)
   iii. where the distances between the points represent the dis/similarities between the data-points
   iv. as perfectly as possible (the imperfection/badness of fit is measured by Stress)
“Low stress is desirable; No stress is perfection”

Distances & Maps
- Given a map, it’s easy to calculate the (Euclidean) distances between the points.
- MDS operates the other way round: given the “distances” [data], find the map [configuration] which generated them
  - … and MDS can do so when all but ordinal information has been jettisoned (fruit of the “non-metric revolution”)
  - even when there are missing data and in the presence of considerable “noise”/error (MDS is robust).
- MDS thus provides at least
  - [exploratory] a useful and easily-assimilable graphic visualization of a complex data set (Tukey: “A picture is worth a thousand words”)
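The two directions described on this slide (map to distances, distances back to map) can be sketched in a few lines. This is only an illustration under stated assumptions: it uses classical (Torgerson) metric scaling, which is the simplest way to invert exact Euclidean distances, not the non-metric procedure the slide alludes to. The function names and the five example points are made up for the sketch.

```python
import numpy as np

def euclidean_distances(points):
    """Map -> distances: all pairwise Euclidean distances between rows."""
    X = np.asarray(points, dtype=float)
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def classical_mds(D, n_dims=2):
    """Distances -> map, via classical (Torgerson) scaling.

    Assumption: D holds (near-)Euclidean distances. This is metric
    scaling, not non-metric (ordinal) MDS.
    """
    D = np.asarray(D, dtype=float)
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n      # centring matrix
    B = -0.5 * J @ (D ** 2) @ J              # scalar products about the centroid
    eigvals, eigvecs = np.linalg.eigh(B)
    order = np.argsort(eigvals)[::-1][:n_dims]
    lam = np.clip(eigvals[order], 0.0, None)  # guard against tiny negatives
    return eigvecs[:, order] * np.sqrt(lam)

# Hypothetical "map" of five points in two dimensions
map_points = np.array([[0.0, 0.0], [3.0, 0.0], [3.0, 4.0], [0.0, 4.0], [1.0, 2.0]])
D = euclidean_distances(map_points)
recovered = classical_mds(D, n_dims=2)
```

The recovered configuration reproduces the input distances, but its orientation is arbitrary: it may be rotated, reflected or shifted relative to the original map, since distances carry no information about those.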

What is like MDS?
Related and Special-case Models:
- Metric Scalar Products Models:
  o *PRINCIPAL COMPONENTS ANALYSIS
  o FACTOR ANALYSIS (+ communalities)
- Metric and Non-Metric Ultrametric Distance, Discrete models
  o *Hierarchical Clustering
  o *Partition Clustering (CONPAR)
  o Additive Clustering (2- and 3-way)
- Metric Chi-squared Distance Model for 2W2M and 3W data / Tables
  o *Simple (2W2M) and Multiple (3W) Correspondence Analysis
- BECAUSE OF NON-METRIC (MONOTONE) REGRESSION, MDS ALSO OFFERS ORDINAL EQUIVALENTS OF:
  o *ANOVA
  o other simple composition models … *UNICON
(All models with asterisk * exist as programs within NewMDSX)

How does MDS differ from other Multivariate Methods?
Compared to other multivariate methods, MDS models are usually:
- distribution-free
  - (though MLE models do exist – Ramsay’s MULTISCALE)
- make conservative (non-metric) demands on the structure of the data,
- are relatively unaffected by non-systematic missing data,
- can be used with a very wide variety of types of data:
  - direct data (pair comparisons, ratings, rankings, triads, sortings)
  - derived data (profiles, co-occurrence matrices, textual data, aggregated data)
  - measures of association/correlation etc. derived from simpler data, and
  - tables of data.
- range of transformations
  - monotonic (ordinal), linear/metric (interval), but also log-interval, power, “smoothness” – even “maximum variance non-dimensional scaling” (Shepard)

How does MDS differ from other Multivariate Methods (2)?
Compared to other multivariate methods, MDS models also offer:
- range of models: chiefly distance (Euclidean, but also City-block), factor/vector (scalar-products), simple composition (additive).
- Also there are hierarchies of models:
  - Similarity models: 2W1M METRIC – 3W2M INDSCAL – IDIOSCAL (honest!)
  - Preference models: vector – distance – weighted distance – rotated, weighted (PREFMAP)
  - Procrustes rotation for putting configurations into maximum conformity, and then increasingly complex transformations: PINDIS
- the solutions are visually assimilable & readily interpretable
- the structure is not limited to dimensional information – also other simple structures (“horseshoes”, radex/circumplex, clusters, directions).

Weaknesses in MDS
There ARE any??!
- Relative ignorance of the sampling properties of stress
- prone-ness to local minima solutions
  - (but less so, and interactive programs like PERMAP allow thousands of runs to check)
- a few forms of data/models are prone to degeneracies (especially MD Unfolding – but see new PREFSCAL in SPSS)
- difficulty in representing the asymmetry of causal models
  - though external analysis is very akin to dependent-independent modelling,
  - there are convergences with GLM in hybrid models such as CLASCAL (INDSCAL with parameterization of latent classes)

CHARACTERIZATION OF BASIC MDS & TERMINOLOGY
Structure of MDS specifiable in terms of D-T-M
DATA (specifies input data shape and content)
DATA MATRIX INPUT:
- WAY: ‘dimensionality’ of array (2, 3, 4 ...)
- MODALITY: No. of distinct sets (to be represented) (1, 2, 3 …)
  - NB: Modality < or = Way
- Common examples:
  - 2W1M: basic models (LTM, UTM, FSM)
  - 2W2M: rectangular, joint (conditional) mapping
  - 3W2M: (“stack” of 2W1M) Individual Differences Scaling
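As a rough illustration of the WAY / MODALITY vocabulary, the three common data shapes can be written as arrays. The sizes, the set labels and the helper `way_and_mode` are hypothetical, chosen only for this sketch.

```python
import numpy as np

n_objects, n_subjects = 4, 3  # illustrative sizes only

# 2W1M: square objects-by-objects dis/similarity matrix (one set of points)
two_way_one_mode = np.zeros((n_objects, n_objects))
# 2W2M: rectangular subjects-by-objects matrix (two sets of points)
two_way_two_mode = np.zeros((n_subjects, n_objects))
# 3W2M: a "stack" of 2W1M matrices, one per subject
three_way_two_mode = np.zeros((n_subjects, n_objects, n_objects))

def way_and_mode(array, set_labels):
    """Return (way, modality) given a label for the set each axis indexes."""
    if len(set_labels) != array.ndim:
        raise ValueError("need exactly one set label per array axis")
    way = array.ndim                  # number of axes of the array
    modality = len(set(set_labels))   # number of distinct sets
    return way, modality

shapes = [
    way_and_mode(two_way_one_mode, ("objects", "objects")),
    way_and_mode(two_way_two_mode, ("subjects", "objects")),
    way_and_mode(three_way_two_mode, ("subjects", "objects", "objects")),
]
```

Because the modality counts distinct labels among the axes, it can never exceed the way, which is the "Modality < or = Way" rule on the slide.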

CHARACTERIZATION OF BASIC MDS (2)
TRANSFORMATION (form or type of rescaling performed on data)
- Non-Metric / Ordinal: M(d)
  - Monotonic Increasing (sims) or Decreasing (dissims)
    - Order / inequality
      - Strong / Guttman: (j, k) > (l, m) -> d(j, k) > d(l, m)
      - Weak / Kruskal: (j, k) > (l, m) -> d(j, k) ≥ d(l, m)
    - Equality / ties
      - Primary: (j, k) = (l, m) -> d(j, k) = OR ≠ d(l, m)
      - Secondary: (j, k) = (l, m) -> d(j, k) = d(l, m)
- Metric / Linear
  - Linear: L(d) = a + b(d)
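The weak (Kruskal) form of the monotone transformation is usually computed by monotone regression. A minimal sketch, assuming the configuration distances have already been listed in the rank order of the data dissimilarities, is the pool-adjacent-violators procedure below. The function name and the example values are illustrative, not taken from any MDS program.

```python
def monotone_regression(distances_in_data_order):
    """Least-squares weakly increasing fit (pool-adjacent-violators).

    Input: configuration distances, listed in increasing order of the
    corresponding data dissimilarities.
    Output: fitted values that never decrease along that order.
    """
    blocks = []  # each block is [sum of values, number of values]
    for value in distances_in_data_order:
        blocks.append([float(value), 1])
        # Merge backwards while the previous block mean exceeds the last one
        while len(blocks) > 1 and (
            blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]
        ):
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
    fitted = []
    for total, count in blocks:
        fitted.extend([total / count] * count)
    return fitted

# The second and third distances are out of order, so they are averaged.
fitted = monotone_regression([1.0, 3.0, 2.0, 4.0])
```

Pairs that violate the data order are replaced by their common mean, so the fitted values satisfy "≥" rather than ">", which is exactly what distinguishes the weak criterion from the strong one.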

CHARACTERIZATION OF BASIC MDS (3)
MODEL: Euclidean Distance
  d(j, k) = sqrt( sum over a of [x(j, a) - x(k, a)]^2 )
where x(i, a) is the co-ordinate of point i on dimension a in the solution configuration X of low dimension.
- The basic model is Euclidean distance, but other Minkowski metrics are available, including:
  - City Block Model: d(j, k) = sum over a of |x(j, a) - x(k, a)|
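Both distance models named on this slide are members of the Minkowski family, which differ only in an exponent p. A small sketch follows; the function name and the example points are hypothetical.

```python
def minkowski_distance(x_j, x_k, p=2.0):
    """Minkowski distance between two points given as coordinate sequences.

    p = 2 gives the Euclidean model, p = 1 the City Block model.
    """
    return sum(abs(a - b) ** p for a, b in zip(x_j, x_k)) ** (1.0 / p)

point_j = (0.0, 0.0)
point_k = (3.0, 4.0)
euclidean = minkowski_distance(point_j, point_k, p=2.0)   # straight-line distance
city_block = minkowski_distance(point_j, point_k, p=1.0)  # sum of coordinate differences
```

For these two points the Euclidean distance is 5 and the City Block distance is 7, which shows how the choice of metric changes the distances a configuration implies.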

(Badness of) FIT: Stress
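The slide does not say which stress formula is meant. As an assumption, the sketch below uses Kruskal's Stress-1, the most widely quoted variant: the square root of the summed squared differences between configuration distances and their fitted (monotonically rescaled) values, divided by the summed squared distances. The function and argument names are illustrative.

```python
import math

def stress1(distances, fitted_values):
    """Kruskal's Stress-1 (assumed variant; the slide gives no formula).

    distances:     configuration distances, one per pair of points
    fitted_values: the rescaled data (disparities) for the same pairs
    """
    numerator = sum((d - f) ** 2 for d, f in zip(distances, fitted_values))
    denominator = sum(d ** 2 for d in distances)
    return math.sqrt(numerator / denominator)

perfect = stress1([3.0, 4.0], [3.0, 4.0])      # distances match fitted values
imperfect = stress1([1.0, 2.0], [1.0, 1.0])    # one pair is misfitted
```

A value of zero means the distances reproduce the rescaled data exactly; larger values indicate worse fit.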

Types of Analysis
- INTERNAL: If the analysis depends solely on the input data, it is termed “internal”, but
- EXTERNAL: If the analysis uses, in addition to the input data / solution, information relating to the same points (but from another source), it is termed “external”.
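One simple form of external analysis is property fitting: a configuration is held fixed and an externally measured property of the same points is regressed on the coordinates, giving a direction through the space along which the property increases. The sketch below is an assumption-laden illustration using ordinary least squares. The function name, the four-point configuration and the property values are invented for the example.

```python
import numpy as np

def fit_property_vector(config, prop):
    """Direction in a fixed configuration along which an external property increases.

    config: (n_points, n_dims) coordinates from a previous (internal) analysis
    prop:   n_points values of a property measured from another source
    """
    X = np.asarray(config, dtype=float)
    y = np.asarray(prop, dtype=float)
    Xc = X - X.mean(axis=0)   # centre the coordinates
    yc = y - y.mean()         # centre the property
    coef, *_ = np.linalg.lstsq(Xc, yc, rcond=None)
    norm = np.linalg.norm(coef)
    return coef / norm if norm > 0 else coef

# Hypothetical configuration and a property that depends only on dimension 1
config = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
prop = [0.0, 2.0, 0.0, 2.0]
direction = fit_property_vector(config, prop)
```

Here the fitted direction lies along the first dimension, because the property values change only with the first coordinate.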