GSC 2.2 Classification: GSC II Annual Meeting

GSC 2.2 Classification
GSC II Annual Meeting, October 2001

Single Plate Classification
Decision tree classifier:
  – Use ranks to handle plate-to-plate variation
  – 5000+ objects in training set
  – OC1 oblique decision tree (Murthy et al.)
  – Build several decision trees and let them vote
  – Classification categories: star / nonstar / defect
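The rank idea can be illustrated with a minimal sketch. Replacing each feature value by its fractional rank within its own plate makes features comparable across plates whose zero points or scales differ. The function name and the sample values are hypothetical; this is not the GSC II pipeline code, and the OC1 trees themselves are not reproduced here.

```python
def fractional_ranks(values):
    """Map each value to its rank within one plate, scaled to [0, 1].

    Tied values receive the average rank of their group.
    """
    n = len(values)
    if n == 0:
        return []
    if n == 1:
        return [0.5]
    # Indices of the values in ascending order.
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Extend j to cover the run of values tied with position i.
        j = i
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg / (n - 1)
        i = j + 1
    return ranks


# Two hypothetical plates measuring the same three objects with different
# zero points and scales: the raw values differ, the ranks agree.
plate_a = [10.0, 30.0, 20.0]
plate_b = [105.0, 190.0, 140.0]
ranks_a = fractional_ranks(plate_a)
ranks_b = fractional_ranks(plate_b)
```

A classifier trained on such ranks sees the same inputs for both plates, which is the property the first bullet relies on.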

GSC 2.2 Classification
Unlike astrometry and photometry, where one best value was selected per object (per bandpass), GSC 2.2 classification can combine multi-plate information to improve the final classifications and counter some known weaknesses.

Multi-Plate Voting
For each object:
• Collect all single-plate measurements
  – Even from plates not being exported, e.g. IV-N
• Override defect -> nonstar if N(obs) > 1
  – Matched objects are likely to be real objects
• Eliminate 25 µm scan data if 15 µm data exist
  – Classifier is poorly tuned for these scans
• Majority vote of remaining measurements
  – Voting classifiers is known to improve results
• Break ties in favor of nonstars
  – Compensates for known bias
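The voting rules above can be sketched as a short function. This is an illustrative reading of the slide, not the catalog's actual code: the `(label, scan_um)` record layout and the function name are assumptions, and the rules are applied in the order the slide lists them, with N(obs) counted before any 25 µm data are dropped.

```python
from collections import Counter


def vote_classification(measurements):
    """Combine single-plate classifications for one object.

    `measurements` is a list of (label, scan_um) tuples, where label is
    'star', 'nonstar' or 'defect' and scan_um is the scan pixel size in
    microns (15 or 25). This layout is assumed for illustration only.
    """
    if not measurements:
        raise ValueError("at least one measurement is required")
    obs = list(measurements)

    # Override defect -> nonstar when the object was matched on more
    # than one plate: matched objects are likely to be real.
    if len(obs) > 1:
        obs = [("nonstar" if lab == "defect" else lab, s) for lab, s in obs]

    # Drop 25 um scans whenever at least one 15 um scan exists.
    if any(s == 15 for _, s in obs):
        obs = [(lab, s) for lab, s in obs if s != 25]

    # Majority vote over the remaining measurements.
    counts = Counter(lab for lab, _ in obs)
    top = max(counts.values())
    winners = [lab for lab, n in counts.items() if n == top]

    # Break ties in favor of nonstar.
    if "nonstar" in winners:
        return "nonstar"
    return winners[0]
```

For example, `vote_classification([("star", 25), ("star", 25), ("nonstar", 15)])` returns `"nonstar"`, because the two 25 µm votes are discarded once a 15 µm measurement is available.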

Auxiliary Information: the Source Status Flag
• GSC 2.2 provides a wealth of additional information about each object via the source status flag.
• Much of this information is pertinent to the quality of the final classification.
• Informed users can further optimize their results (e.g., guide star selection) with this auxiliary data.

Status Flag Details
10-digit decimal mask with relevant info; columns are labeled 0987654321, left to right:
    0: blend status
    9: incomplete processing
    8: classification voters
    7: classification unanimity
  654: photometric details (V, J, F)
    3: centroider details
   21: number of plate observations
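A minimal sketch of splitting the flag into its fields, assuming the column labels read left to right as printed (column 0 is the leading digit, column 1 the last) and that flags with fewer than ten digits are zero-padded on the left. The function and key names are hypothetical; the slides give only the digit positions, not the meaning of each digit value, so the fields are returned undecoded.

```python
def decode_status_flag(flag):
    """Split a GSC 2.2 source status flag into its digit fields.

    Assumes columns '0987654321' map left to right onto a zero-padded
    10-digit decimal string. Field values are returned as-is.
    """
    value = int(flag)
    if value < 0 or value > 9_999_999_999:
        raise ValueError("status flag must fit in 10 decimal digits")
    digits = f"{value:010d}"
    return {
        "blend_status": int(digits[0]),              # column 0
        "incomplete_processing": int(digits[1]),     # column 9
        "classification_voters": int(digits[2]),     # column 8
        "classification_unanimity": int(digits[3]),  # column 7
        "photometric_details": digits[4:7],          # columns 6, 5, 4
        "centroider_details": int(digits[7]),        # column 3
        "n_plate_observations": int(digits[8:10]),   # columns 2, 1
    }


fields = decode_status_flag(1234567890)
```

Keeping the photometric field as a three-character string preserves one digit per bandpass without assuming which digit belongs to which band.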

Classification and the Status Flag
0: blend status
  – Poorly tuned for blends => lower confidence
9: incomplete processing
  – No features computed => lower confidence
8: classification voters
  – Multiple voters => higher confidence
  – 25 µm voters => lower confidence

Classification and the Status Flag
7: classification unanimity
  – Unanimous vote => higher confidence
654: photometric details (V, F, J)
3: centroider details
21: number of plate observations
  – More voters => higher confidence

Bright Objects
• Tycho stars are included in GSC 2.2
  – Classification was set to star for these objects
  – Status flag = 999900 for Tycho stars
• GSC 1 data were omitted from GSC 2.2
  – Classifications were excluded from voting
  – GSC 1 classifier is superior for m < 14
• Include GSC 1 classification in the next export

Evaluating Performance: Not a simple problem
• What to measure?
  – Correctness; completeness; contamination
• Magnitude and latitude variations
• What to compare against?
  – GSC II was constructed because there is nothing comparable to it!
  – Nonstar ≠ galaxy
  – Automatically classified samples are less reliable
  – Visually classified samples are few and small
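The three quantities named above can be computed from a labeled comparison sample as in the sketch below. The definitions used are the common ones and are an assumption here, since the slides do not define them: correctness is the fraction of all objects classified correctly, completeness is the fraction of true stars recovered as stars, and contamination is the fraction of objects classified as stars that are not stars. The function name and the sample labels are hypothetical.

```python
def star_sample_metrics(true_labels, predicted_labels):
    """Return (correctness, completeness, contamination) for the star class.

    A metric whose denominator is zero is returned as None.
    """
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    pairs = list(zip(true_labels, predicted_labels))
    n_total = len(pairs)
    n_correct = sum(1 for t, p in pairs if t == p)
    n_true_star = sum(1 for t, _ in pairs if t == "star")
    n_pred_star = sum(1 for _, p in pairs if p == "star")
    n_star_hit = sum(1 for t, p in pairs if t == "star" and p == "star")

    correctness = n_correct / n_total if n_total else None
    completeness = n_star_hit / n_true_star if n_true_star else None
    contamination = (
        (n_pred_star - n_star_hit) / n_pred_star if n_pred_star else None
    )
    return correctness, completeness, contamination


# Hypothetical five-object comparison sample.
truth = ["star", "star", "star", "nonstar", "nonstar"]
catalog = ["star", "star", "nonstar", "star", "nonstar"]
correctness, completeness, contamination = star_sample_metrics(truth, catalog)
```

Binning the sample by magnitude and latitude before calling the function would expose the variations the slide mentions.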

NPM/SPM Stars vs magnitude & latitude

NPM/SPM Galaxies vs magnitude & latitude

SDSS Stars and Galaxies vs magnitude

Accuracy vs the real questions
• How complete is my sample of nonstars?
• How pure is my sample of stars?
• What is the probability that the GSC 2.2 classification of this object is correct?
The answers depend on your sample, as well as on the properties of the catalog. A single quoted accuracy does not suffice.

Accuracy vs the real questions
P(Ts|S) = [P(S|Ts) * P(Ts)] / P(S)
This formulation is:
  – Responsive to magnitude and latitude variations
  – Adaptable to a priori effects of sampling
  – Adaptable to your favorite galaxy model
  – Computable (we think! In progress)
  – Answers the real questions
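A small numerical sketch shows why the answer depends on the sample. It assumes that Ts means "the object is truly a star" and S means "the catalog classifies it as a star", which the slides do not spell out, and it expands P(S) over a two-class truth (star or nonstar), ignoring defects. The function name and all rates and priors are made up for illustration.

```python
def posterior_star(p_s_given_ts, p_s_given_tn, prior_ts):
    """P(Ts|S) by Bayes' rule for a two-class (star / nonstar) truth.

    p_s_given_ts: probability a true star is classified as a star
    p_s_given_tn: probability a true nonstar is classified as a star
    prior_ts:     fraction of true stars in the sample, P(Ts)
    """
    # Total probability of a 'star' classification, P(S).
    p_s = p_s_given_ts * prior_ts + p_s_given_tn * (1.0 - prior_ts)
    if p_s == 0.0:
        raise ValueError("P(S) is zero; the posterior is undefined")
    return p_s_given_ts * prior_ts / p_s


# Same hypothetical classifier, two hypothetical samples:
# a star-dominated field and a field where stars are the minority.
crowded = posterior_star(0.95, 0.10, prior_ts=0.9)
sparse = posterior_star(0.95, 0.10, prior_ts=0.3)
```

With these made-up inputs the probability that a "star" classification is correct is about 0.99 in the star-dominated sample and about 0.80 in the other, even though the classifier itself is unchanged.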