VOStat: Arming Astronomers with Advanced Statistics
Caltech: A. Mahabal, M. Graham, S. G. Djorgovski, R. Williams
Penn State: J. Babu (PI), E. Feigelson
CMU: R. Nichol, D. Vanden Berk, L. Wasserman
VOStat - HEAD 2004 Ashish Mahabal http://www.vostat.org NSF DMS-0101360

Use of statistics
• 15,000 astronomical studies per year
• 5% have “statistics” in their abstract
• 20% treat variable objects or multivariate datasets

Traditional methods
• Fourier transform (Fourier 1807)
• Least squares and chi-square (Legendre 1805, Pearson 1901)
• Kolmogorov-Smirnov test (Kolmogorov 1933)
• Principal Component Analysis (Hotelling 1936)

VOStat
• Web-based service
• Simple and sophisticated statistical routines
• Large datasets
• Public domain (R) / specially written
• General purpose and Virtual Observatory

VOStat
• ASCII / VOTable as input (can be used as an intermediate block for a VO-based pipeline)
• CGI routines as prototypes (a few thousand lines)
• Webservices (Java GUI)
  - hundreds of thousands of lines (limited by R’s capabilities)
  - distributed, multi-OS, multi-language

Examples of available functions
• Descriptive statistics (e.g. boxplot)
• Two- and k-sample tests (e.g. Wilcoxon rank-sum test)
• Density estimation (e.g. kernel smoothing)
• Correlation and regression (e.g. PCA)
• Censored data (e.g. survival)
• Multivariate classification (e.g. hierarchical clustering)
• External functions (e.g. K-density)
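
As an illustration of the two-sample tests listed above, here is a minimal Python sketch of the Wilcoxon rank-sum test using the large-sample normal approximation. It is not VOStat's code (VOStat relies on R routines); the function name `rank_sum_test` is hypothetical, and the variance deliberately omits the tie correction, so treat the p-value as approximate for small or heavily tied samples.

```python
import math

def rank_sum_test(x, y):
    """Wilcoxon rank-sum test, normal approximation, no tie correction.

    Returns (w, z, p): the rank sum of sample x, its z-score, and the
    two-sided p-value.
    """
    # Pool both samples, remembering which sample each value came from.
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    ranks = [0.0] * len(pooled)
    i = 0
    while i < len(pooled):
        # Find the run of tied values starting at position i.
        j = i
        while j + 1 < len(pooled) and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        midrank = (i + j) / 2 + 1  # average of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[k] = midrank
        i = j + 1
    w = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    n1, n2 = len(x), len(y)
    mean_w = n1 * (n1 + n2 + 1) / 2
    sd_w = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (w - mean_w) / sd_w
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail probability
    return w, z, p
```

Calling `rank_sum_test(sample_a, sample_b)` on two lists of magnitudes or colors returns the statistic and an approximate p-value; a small p suggests the two samples are not drawn from the same distribution.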

User-friendly GUI
• Columns are autoselected (and can be deselected)
• Parameter choices for functions are conveniently placed
• Can be used from your own webpages on tables residing elsewhere

Toy Demos
• Rediscovering the HR diagram
• Rediscovering the Fundamental Plane (FP) of globular clusters
• Looking for outliers in color-color space

Rediscovering HR diagram
• Hyades stars (Hipparcos main catalog)
• Mean/median/boxplot
• Density estimation (histogram)
• Kernel smoothing
• Correlation matrix
• X-Y plot
• Multivariate clustering
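
The kernel smoothing step in the list above can be sketched as a Gaussian kernel density estimate. This is an illustrative Python version, not the R routine VOStat calls; the name `gaussian_kde` and the default bandwidth (the normal-reference rule 1.06 σ n^(-1/5)) are assumptions, since the slides do not say which kernel or bandwidth was used.

```python
import math
import statistics

def gaussian_kde(sample, bandwidth=None):
    """Return a function estimating the density of `sample` at a point."""
    n = len(sample)
    if bandwidth is None:
        # Assumed default: normal-reference rule of thumb.
        bandwidth = 1.06 * statistics.stdev(sample) * n ** (-1 / 5)
    # Normalisation so that the estimate integrates to one.
    norm = 1.0 / (n * bandwidth * math.sqrt(2 * math.pi))

    def density(x):
        # Sum of Gaussian bumps, one centred on each data point.
        return norm * sum(
            math.exp(-0.5 * ((x - xi) / bandwidth) ** 2) for xi in sample
        )

    return density
```

Evaluating the returned function on a grid of B-V values gives a smooth alternative to the histogram; a smaller `bandwidth` shows more detail at the cost of more noise.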

• X-Y plot between Vmag and B-V reveals the famous structure in the dataset: the color-magnitude diagram of bright stars showing the main sequence, giant branch (with red clump stars), and a few Hyades white dwarfs.

Fundamental Plane (FP) of globular clusters
• Matrix of pairwise correlation coefficients
• Pairwise plots
• Principal Component Analysis
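
The first and last steps above can be sketched in a few lines of Python: a Pearson correlation matrix, followed by the leading principal component obtained by power iteration on that matrix. This is an assumption-laden illustration, not the VOStat implementation (which uses R); the function names are hypothetical, and power iteration is used here only because it needs no external library.

```python
import math

def correlation_matrix(columns):
    """Pearson correlation between every pair of equal-length columns."""
    n = len(columns[0])
    means = [sum(col) / n for col in columns]
    centered = [[v - m for v in col] for col, m in zip(columns, means)]
    norms = [math.sqrt(sum(v * v for v in col)) for col in centered]
    return [
        [sum(a * b for a, b in zip(ci, cj)) / (ni * nj)
         for cj, nj in zip(centered, norms)]
        for ci, ni in zip(centered, norms)
    ]

def first_component(matrix, iterations=200):
    """Leading eigenvector and eigenvalue of a symmetric PSD matrix.

    Uses power iteration from a fixed, non-uniform start vector; it can
    fail in the rare case where that start is orthogonal to the answer.
    """
    vec = [1.0 / (i + 1) for i in range(len(matrix))]
    for _ in range(iterations):
        nxt = [sum(m * v for m, v in zip(row, vec)) for row in matrix]
        length = math.sqrt(sum(v * v for v in nxt))
        vec = [v / length for v in nxt]
    # Rayleigh quotient of the unit vector gives the eigenvalue.
    eigenvalue = sum(
        v * sum(m * u for m, u in zip(row, vec)) for v, row in zip(vec, matrix)
    )
    return vec, eigenvalue
```

Dividing the eigenvalue by the number of columns gives the fraction of the total (standardized) variance carried by the first component; a value close to one means the parameters are nearly confined to a single direction.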

• Core parameters as a group tend to be highly correlated, unlike the half-light parameters. This is indicative of the dynamical evolution driven by the core collapse.

Exploring outliers
• Palomar-QUEST synoptic sky survey
• 9 mix-and-match colors from 8 filters
• Aim: finding outliers in color-color space for spectroscopic follow-up
• 1000 random objects

Boxplot
• Reveals relationships between colors (mean, median, overlap, outliers)
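
The numbers behind a boxplot are easy to compute directly. The sketch below is a hypothetical helper (`boxplot_stats` is not a VOStat function) that assumes the common Tukey convention of flagging points more than 1.5 interquartile ranges beyond the quartiles; the slides do not state which convention the plotted boxplots used.

```python
import statistics

def boxplot_stats(values):
    """Summary statistics a boxplot displays, with 1.5*IQR outlier fences."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    # Whiskers extend to the most extreme points still inside the fences.
    inside = [v for v in values if low <= v <= high]
    return {
        "mean": statistics.fmean(values),
        "median": median,
        "q1": q1,
        "q3": q3,
        "whiskers": (min(inside), max(inside)),
        "outliers": sorted(v for v in values if v < low or v > high),
    }
```

Running this once per color column and comparing the medians, boxes, and outlier lists side by side gives the same information the boxplot conveys graphically.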

Clustering
• K-means provides cluster centers along with withinss (the within-cluster sum of squares) and a list of possible outliers
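
To make the K-means outputs concrete, here is a small Python sketch that returns cluster centers, labels, and the per-cluster within-cluster sum of squares (what R's `kmeans` reports as `withinss`). It uses plain Lloyd iterations, which is not R's default algorithm, and the outlier rule (distance to the assigned center greater than `factor` times the cluster's median distance) is purely an assumption: the slides do not say how the list of possible outliers was produced.

```python
import math
import random
import statistics

def kmeans(points, k, iterations=50, seed=0):
    """Lloyd's algorithm; returns (centers, labels, withinss)."""
    rng = random.Random(seed)
    centers = [tuple(p) for p in rng.sample(points, k)]

    def assign(current):
        # Index of the nearest center for every point.
        return [min(range(k), key=lambda c: math.dist(p, current[c]))
                for p in points]

    for _ in range(iterations):
        labels = assign(centers)
        updated = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                updated.append(
                    tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                updated.append(centers[c])  # keep the center of an empty cluster
        if updated == centers:
            break
        centers = updated
    labels = assign(centers)
    withinss = [
        sum(math.dist(p, centers[lab]) ** 2
            for p, lab in zip(points, labels) if lab == c)
        for c in range(k)
    ]
    return centers, labels, withinss

def possible_outliers(points, centers, labels, factor=3.0):
    """Assumed rule: flag points far from their center relative to the median."""
    dists = [math.dist(p, centers[lab]) for p, lab in zip(points, labels)]
    flagged = []
    for c in range(len(centers)):
        in_cluster = [d for d, lab in zip(dists, labels) if lab == c]
        if not in_cluster:
            continue
        limit = factor * statistics.median(in_cluster)
        flagged += [p for p, d, lab in zip(points, dists, labels)
                    if lab == c and d > limit]
    return flagged
```

A large `withinss` for one cluster relative to the others is itself a hint that the cluster contains stragglers, which is why the two outputs are worth reading together.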

K-density
• Probability-density association for outliers

Visual confirmation (found from 1000 random objects)

Summary
• Web-based
• VO compatible
• Public domain and specialized routines