How to represent coverage: temporal, spectral, positional
Clive Page
AstroGrid Project, University of Leicester
2003 March 19

Registry properties
• Registry does not contain definitive information on coverage of any dataset.
• False positives are not of great concern: at worst they waste some search time.
• False negatives, however, may fail to locate data needed by the user.
• Registry never answers “yes” or “no”, only “maybe” or “no”.

Typical query
• Find all datasets which may contain information about objects:
  – located in <small patch of sky>
  – observed in <wavelength range>
  – between <date 1> and <date 2>
  – ...
• All three of these imply range searches.

(1) Coarse-grained Registry
• One entry per resource or data collection,
  – e.g. one for the whole HST-WFPC2 data archive.
  – Perhaps only 100 to 1000 entries in total.
• Should answer questions like:
  – Is there any HST-WFPC2 field covering <position>?
  – If the answer is “maybe”, then the HST archive has to be scanned for more information.

(2) Fine-grained Registry
• One entry per observation per instrument.
  – Perhaps 10^5 to 10^6 entries in whole registry.
• Should answer questions like:
  – Which HST-WFPC2 fields (if any) cover <position>?
  – May store spatial, spectral, and temporal metadata about each observation.
  – Can directly retrieve required dataset.
• Much more selective, useful scientifically.

Time
• Simplest representation: for each data resource store the start and end dates of observation (e.g. from the FITS DATE-OBS and DATE-END keywords).
• User’s query may contain a range of dates of interest: the Registry can easily find all resources for which the ranges overlap.
• A more complex representation might be needed if:
  – A single resource has sparse coverage of the time axis, e.g. many single observations over a long period of time.
  – The user has a long list of dates of interest, e.g. wants to find observations of some periodic phenomenon.
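
The start/end representation reduces a date query to an interval-overlap test. A minimal sketch in Python, assuming each registry entry stores one closed date interval; the names `TimeCoverage`, `may_overlap`, `find_candidates` and the two archive entries are hypothetical, made up for illustration:

```python
from datetime import date
from typing import NamedTuple


class TimeCoverage(NamedTuple):
    """One registry entry: a resource name and its observation date range."""
    name: str
    start: date
    end: date


def may_overlap(res: TimeCoverage, q_start: date, q_end: date) -> bool:
    # Two closed intervals overlap when each one starts before the other ends.
    return res.start <= q_end and q_start <= res.end


def find_candidates(registry, q_start: date, q_end: date):
    # Returns names of resources that *may* hold data in the query range.
    return [r.name for r in registry if may_overlap(r, q_start, q_end)]


# Hypothetical entries, not real archive dates.
registry = [
    TimeCoverage("archive_a", date(1994, 1, 1), date(2002, 12, 31)),
    TimeCoverage("archive_b", date(2000, 2, 1), date(2003, 3, 1)),
]
```

A match here only means "maybe": a resource with sparse coverage between its start and end dates can pass the test without holding any observation in the requested range.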

Wavelength range
• Could represent by an enumerated list of attributes, e.g.
  – Radio, optical, X-ray, etc.
• But
  – Bands are not very well defined, e.g. where does soft X-ray turn into XUV?
  – Probably need to subdivide, e.g. near UV, far UV, but where do you stop?
  – Even with many terms, it does not provide much selectivity for queries.

Wavelength representation
• Proposed solution as for time: for each resource store the <low> and <high> wavelengths/frequencies observed, and specify <low> and <high> values for queries.
• High selectivity: can even find e.g. neutral hydrogen surveys by specifying suitable ranges.
• Units need to be agreed: hertz, metres, electron volts? Inter-conversion is easy.
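
To illustrate how easy the inter-conversion is, the sketch below normalises everything to wavelength in metres before applying the same overlap test as for time. Choosing metres as the canonical unit is an assumption made here, not something the slides settle; the function names are hypothetical. Because frequency and photon energy are inversely proportional to wavelength, the low and high ends swap on conversion, so the range is re-sorted.

```python
# Physical constants (exact SI values).
C = 299_792_458.0        # speed of light, m/s
H = 6.62607015e-34       # Planck constant, J s
EV = 1.602176634e-19     # one electron volt in joules


def to_metres(value: float, unit: str) -> float:
    """Convert a wavelength (m), frequency (Hz) or photon energy (eV) to metres."""
    if unit == "m":
        return value
    if unit == "Hz":
        return C / value                 # lambda = c / nu
    if unit == "eV":
        return H * C / (value * EV)      # lambda = h c / E
    raise ValueError(f"unsupported unit: {unit}")


def wavelength_range(low: float, high: float, unit: str):
    """Return (min, max) wavelength in metres for a range given in any unit."""
    a, b = to_metres(low, unit), to_metres(high, unit)
    return (min(a, b), max(a, b))        # ends swap for Hz and eV


def ranges_overlap(r, q) -> bool:
    # Same closed-interval overlap test as used for dates.
    return r[0] <= q[1] and q[0] <= r[1]


# Hypothetical survey covering 1.40 to 1.43 GHz, which brackets the
# neutral hydrogen line near 1420.4 MHz (about 21.1 cm).
hi_survey = wavelength_range(1.40e9, 1.43e9, "Hz")
```

A query for `(0.210, 0.212)` metres overlaps `hi_survey`, while a query for 0.1 to 10 keV, converted with `wavelength_range(100.0, 10_000.0, "eV")`, does not.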

Wavebands – use bitmap?
• Patricio Ortiz has suggested using a bitmap for wavelength.
• Observable spectrum covers ~15 decades.
  – 2 bits/decade: 30-bit word bit-mask.
  – 20 bits/decade: 300 bits or 38 bytes.
• To find matching resources, do a bit-wise AND of the user’s bitmask with that of each resource.
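
A minimal sketch of the bitmask idea at 2 bits per decade. The slides do not say where the 15 decades start, so the lower edge of 10^-12 m used here is an assumption, as are all function names. Each bit stands for half a decade of wavelength, and a resource or query sets every bit its range touches.

```python
import math

LOG_MIN = -12.0          # assumed lower edge: 1e-12 m
DECADES = 15             # assumed upper edge: 1e3 m
BITS_PER_DECADE = 2
N_BITS = DECADES * BITS_PER_DECADE   # 30 bits, fits in one machine word


def bin_index(wavelength_m: float) -> int:
    """Index of the logarithmic bin containing this wavelength, clamped to range."""
    i = math.floor((math.log10(wavelength_m) - LOG_MIN) * BITS_PER_DECADE)
    return min(max(i, 0), N_BITS - 1)


def band_mask(low_m: float, high_m: float) -> int:
    """Bitmask with one bit set for every bin touched by [low_m, high_m]."""
    mask = 0
    for i in range(bin_index(low_m), bin_index(high_m) + 1):
        mask |= 1 << i
    return mask


def may_match(resource_mask: int, query_mask: int) -> bool:
    # Non-zero AND means the two ranges share at least one bin.
    return (resource_mask & query_mask) != 0


optical = band_mask(4e-7, 7e-7)        # falls inside a single bin
xray = band_mask(1.2e-10, 1.2e-8)      # spans several bins
query = band_mask(3e-7, 5e-7)
```

With these values `may_match(optical, query)` is true and `may_match(xray, query)` is false. The bins are coarse: a query for 3.3e-7 to 3.5e-7 m also matches `optical` although the ranges do not actually overlap, which is the kind of false positive the registry tolerates.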

Sky Coverage
• We should try to handle instrumental archives, with individual pointings, and find which ones match the user’s query.
• Examples:
  – HST: ~20,000 pointings, each covering 10 sq. arcmin.
  – XMM-Newton: ~1000 pointings, each 0.2 sq. deg.
• Two-dimensional problem, not as simple.

Sky coverage representations
• Store (RAmin, RAmax, DECmin, DECmax):
  – Suitable for finding which HST field covered this position.
  – Not suitable for answering a question as to whether any HST field ever covered this position.
• Bitmap:
  – 1° resolution → 41,253 pixels, a bitmap of 5 kbytes.
  – OK to represent coverage of all observed fields of a given instrument, inefficient if used for individual fields.
• List of pixels:
  – using HTM or HEALPix to convert (RA, DEC) → integer.
  – OK for single fields, inefficient for a whole collection of data.
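
The first two representations can be sketched together: a per-field bounding-box test, and a per-collection bitmap built from those boxes. This is a simplified illustration, not HTM or HEALPix. It assumes a plain 1° × 1° grid in RA and Dec (64,800 unequal-area cells, about 8 kbytes) rather than the 41,253 equal-area pixels quoted above, and it assumes a field with `ra_min > ra_max` wraps through RA = 0. All names are hypothetical.

```python
from typing import NamedTuple

N_RA, N_DEC = 360, 180   # assumed 1-degree grid, not equal-area


class Field(NamedTuple):
    """Bounding box of one pointing, in degrees."""
    ra_min: float
    ra_max: float
    dec_min: float
    dec_max: float


def field_contains(f: Field, ra: float, dec: float) -> bool:
    """Does this single field's bounding box contain the position?"""
    if not (f.dec_min <= dec <= f.dec_max):
        return False
    if f.ra_min <= f.ra_max:
        return f.ra_min <= ra <= f.ra_max
    return ra >= f.ra_min or ra <= f.ra_max   # box wraps through RA = 0


def cell(ra: float, dec: float) -> int:
    """Grid cell number for a position."""
    i = int(ra) % N_RA
    j = min(int(dec + 90.0), N_DEC - 1)
    return j * N_RA + i


def collection_bitmap(fields) -> bytearray:
    """One bit per grid cell, set if any field's box touches that cell."""
    bits = bytearray(N_RA * N_DEC // 8)
    for f in fields:
        ra_span = (f.ra_max - f.ra_min) % 360.0
        i0 = int(f.ra_min)
        n_i = int(f.ra_min + ra_span) - i0
        j0 = int(f.dec_min + 90.0)
        j1 = min(int(f.dec_max + 90.0), N_DEC - 1)
        for di in range(n_i + 1):
            for j in range(j0, j1 + 1):
                k = j * N_RA + (i0 + di) % N_RA
                bits[k >> 3] |= 1 << (k & 7)
    return bits


def maybe_covered(bits: bytearray, ra: float, dec: float) -> bool:
    """Coarse answer to 'has any field ever covered this position?'"""
    k = cell(ra, dec)
    return bool(bits[k >> 3] & (1 << (k & 7)))


# Two hypothetical pointings; the first straddles RA = 0.
fields = [Field(359.9, 0.1, -30.1, -29.9), Field(10.2, 10.4, 41.0, 41.2)]
bits = collection_bitmap(fields)
```

The bitmap answers the collection-level question in one lookup, but only as "maybe" or "no": `maybe_covered(bits, 0.5, -30.5)` is true because the position shares a cell with the first field, even though `field_contains` rejects it. Confirming a hit still requires checking the individual fields.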