APOGEE-2 and Data Infrastructure
Jon Holtzman (NMSU)
APOGEE team (Steve Majewski, PI)

Top Level Science Requirements
First large-scale, systematic, uniform spectroscopic study of all major Galactic stellar populations, in order to:
• understand chemical evolution at the precision, multi-element level (including the preferred, most common metals: C, N, O)
• tightly constrain Galactic chemical evolution (GCE) and dynamical models (bulge, disk, halo)
• access typically ignored, dust-obscured populations (figure: grey areas of the map have A_V > 1)
• probe Galactic dynamics and substructure with very precise velocities
• make order-of-magnitude leaps:
  – a sample ~2-3 orders of magnitude larger than previous high-R GCE surveys
  – ~2 orders of magnitude more high-S/N, high-R near-IR spectra than previously taken

APOGEE Spectrograph
APOGEE: cryogenic, multi-object, near-IR, fiber-fed, high-resolution spectrograph (→ John Wilson, UVa)
• 300 fibers
• R ~ 22,000
• 1.51-1.7 microns across 3 Hawaii-2RG 2K x 2K near-IR detectors
Photo: Garrett Ebelke, Mike Skrutskie, John Wilson & Fred Hearty with the opened APOGEE-2N instrument. By G. Damke.
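The quoted resolving power can be turned into a rough resolution-element size using the definition R = lambda / delta_lambda. The sketch below is illustrative arithmetic only; the function names are hypothetical, and it assumes a constant R across the band, which the slide does not state.

```python
# Rough size of one resolution element, from R = lambda / delta_lambda.
# Assumes a constant resolving power across the band (an idealization).
C_KM_S = 299_792.458  # speed of light in km/s


def resolution_element_angstrom(wavelength_micron, resolving_power):
    """Width of one resolution element in Angstroms."""
    return wavelength_micron * 1.0e4 / resolving_power


def velocity_resolution_km_s(resolving_power):
    """Velocity width corresponding to one resolution element."""
    return C_KM_S / resolving_power


if __name__ == "__main__":
    R = 22_000
    for lam in (1.51, 1.60, 1.70):
        width = resolution_element_angstrom(lam, R)
        print(f"{lam:.2f} micron: {width:.3f} Angstrom per resolution element")
    print(f"velocity resolution: {velocity_resolution_km_s(R):.1f} km/s")
```

At R ~ 22,000 this gives a resolution element of roughly 0.7-0.8 Angstrom across the band, or about 14 km/s in velocity.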

APOGEE Overview
Figure: Teff ~ 3900 K, log g ~ 0.7, different metallicities
• APOGEE wavelength coverage includes lines for 15+ elemental abundances
• Significant molecular contributions from CO, OH, CN
• Abundances derived from modeling spectra: automated matching against multidimensional spectral libraries
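The idea of matching an observed spectrum against a library can be sketched with a simple chi-square search over grid nodes. This is a toy illustration, not the ASPCAP/FERRE implementation: the library values, parameter keys, and function names below are made up, and the sketch only picks the nearest grid node rather than interpolating within the library.

```python
# Toy chi-square match of an observed spectrum against a small library.
# All fluxes and parameter values here are invented for illustration.


def chi_square(observed, errors, model):
    """Sum of squared, error-normalized residuals between data and model."""
    return sum(((o - m) / e) ** 2 for o, m, e in zip(observed, model, errors))


def best_grid_match(observed, errors, library):
    """Return the parameter tuple whose model spectrum minimizes chi-square.

    library maps a parameter tuple, e.g. (Teff, [M/H]), to a list of fluxes
    sampled on the same pixels as the observed spectrum.
    """
    return min(library, key=lambda p: chi_square(observed, errors, library[p]))


# Hypothetical 4-pixel library at fixed Teff, varying metallicity.
library = {
    (3900, -0.5): [1.00, 0.90, 0.95, 1.00],
    (3900, 0.0): [1.00, 0.80, 0.90, 1.00],
    (3900, 0.5): [1.00, 0.70, 0.85, 1.00],
}
observed = [0.99, 0.81, 0.91, 1.01]
errors = [0.01, 0.01, 0.01, 0.01]

best = best_grid_match(observed, errors, library)
print("best-matching grid node:", best)
```

A production fit would interpolate between grid nodes and optimize over many dimensions at once; the grid search above only conveys the matching criterion.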

SDSS-IV/APOGEE-2: 2014-2020 Dual Hemisphere Observations

APOGEE-2 Overview

APOGEE-2 Survey Plan
             North   South
Core         80%     90%
Goal         10%     5%
Reserve / Ancillary: 5%, 5%, 2.5%
• “Core” = science driving SRD requirements
• “Goal” = valued science not in “core”
• “Reserve” = unallocated time; allows for evolution in understanding of priorities/opportunities, plus contingency
• “Ancillary” = program modeled on the successful APOGEE-1 program
The ancillary science program (+ reserve?) remains a viable route for new science opportunities.

APOGEE-2 Goal Science
• Core science will be centered on:
  – Bulge
  – Disk
  – Halo
  – Satellites (dSph, MCs)
  – Open clusters
  – Globular clusters
  – Stellar ages and physics through asteroseismology & gyrochronology
• In addition, we will be doing programs on:
  – Eclipsing binaries (fibers on existing plates)
  – M dwarfs (fibers on existing plates)
  – Young star clusters (specific plates)
  – Substellar companions (specific plates)
  – KOIs + KOI control sample (specific plates)

Targeting
• Main survey targets selected using relatively simple color cuts
• Most targets selected from the 2MASS catalog
• Reddening estimated using mid-IR photometry
• Main survey targets selected via a dereddened color cut:
  – (J-K)_0 > 0.5
  – (J-K)_0 > 0.8
• Calibration, goal, and ancillary targets selected as needed
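The dereddened color cut above can be sketched as a simple filter. This is a minimal illustration, assuming the reddening E(J-K) has already been estimated for each star; the star records, field names, and function names are hypothetical, and the slide does not say which fields use which threshold.

```python
# Minimal sketch of a dereddened (J-K)_0 color cut.
# The catalog entries below are invented; E_JK stands for an already
# estimated color excess E(J-K) for each star.


def dereddened_jk(j_mag, k_mag, e_jk):
    """Return the dereddened color (J-K)_0 = (J-K) - E(J-K)."""
    return (j_mag - k_mag) - e_jk


def select_targets(stars, min_color):
    """Names of stars whose dereddened color exceeds min_color."""
    return [
        s["name"]
        for s in stars
        if dereddened_jk(s["J"], s["K"], s["E_JK"]) > min_color
    ]


stars = [
    {"name": "A", "J": 11.0, "K": 10.2, "E_JK": 0.1},  # (J-K)_0 ~ 0.7
    {"name": "B", "J": 12.0, "K": 11.7, "E_JK": 0.0},  # (J-K)_0 ~ 0.3
    {"name": "C", "J": 12.5, "K": 11.0, "E_JK": 0.5},  # (J-K)_0 ~ 1.0
]

print(select_targets(stars, 0.5))  # looser cut
print(select_targets(stars, 0.8))  # redder cut
```

Keeping the threshold as a parameter lets the same function serve both cuts listed on the slide.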

Observing
• APOGEE fields use standard SDSS plug plates
• A typical field includes 230 science targets, 35 hot stars for telluric absorption measurement, and 35 sky fibers for sky subtraction
• Fields are observed for ~1 hour of exposure per visit, to achieve S/N ~ 100 per half-resolution element at H = 12.2 if observed for three visits; each visit is usually split into 8 exposures, with detector dithering for exposure pairs
• Fields are observed multiple times to identify RV variables and to build up S/N
• For APOGEE-2, there will be MaNGA “piggy-back” observing for massive halo coverage in the north
• The NMSU 1 m telescope also feeds APOGEE-N when available, in single-object mode (bright and calibration targets)
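The S/N target quoted above (S/N ~ 100 at H = 12.2 in three visits) can be scaled to other visit counts and magnitudes under a simple assumption. The sketch below assumes source-photon-limited noise, so that S/N grows as the square root of flux times exposure time; that scaling and the function names are assumptions for illustration, not the survey's scheduling logic.

```python
import math

# Scaling of S/N with visits and magnitude, assuming source-photon-limited
# noise: S/N is proportional to sqrt(flux * total exposure time).
# Reference point from the slide: S/N ~ 100 at H = 12.2 after three visits.


def snr_after_visits(snr_three_visits, n_visits):
    """S/N after n_visits, given the S/N reached after three visits."""
    return snr_three_visits * math.sqrt(n_visits / 3.0)


def visits_needed(target_snr, h_mag, ref_snr=100.0, ref_h=12.2, ref_visits=3):
    """Whole number of visits needed to reach target_snr at magnitude h_mag."""
    flux_ratio = 10 ** (-0.4 * (h_mag - ref_h))  # flux relative to H = ref_h
    visits = ref_visits * (target_snr / ref_snr) ** 2 / flux_ratio
    return math.ceil(visits)


if __name__ == "__main__":
    print(f"S/N after one visit at H=12.2: {snr_after_visits(100.0, 1):.1f}")
    print("visits for S/N=100 at H=13.2:", visits_needed(100.0, 13.2))
    # Fiber budget from the slide: science + telluric + sky fibers.
    print("fibers per plate:", 230 + 35 + 35)
```

Under this assumption a single visit reaches S/N of about 58 at H = 12.2, and a star one magnitude fainter needs roughly 2.5 times as many visits for the same S/N.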

Data infrastructure
– Data infrastructure for APOGEE-2 will be similar to that of APOGEE-1, generalized to multiple observatories
– APOGEE raw data and data products are stored on the Science Archive Server (SAS)
– Reduction and analysis software is (mostly) managed through the SDSS SVN repository
– Raw and reduced data are described through the SDSS datamodel
– Data and processing are documented via SDSS web pages and technical papers

Raw data
– The APOGEE instrument reads out continuously (every ~10 s) as data are accumulating, 3 chips at 2048 x 2048 each
  • Raw data are stored on the instrument control computer (current capacity is several weeks of data)
  • Individual readouts are “annotated” with information from the telescope and stored on the “analysis” computer (current capacity is several months). For APOGEE-1, these frames were archived to local disks that are “shelved” at APO (currently 20 x 3 TB disks); this is currently NOT being done for APOGEE-2
– “Quick reduction” software at the observatory assembles the data into data cubes and compresses them (lossless) for archiving on SAS
  • Maximum daily compressed data volume ~60 GB
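The readout cadence and detector format above imply a substantial uncompressed data rate. The following back-of-the-envelope sketch assumes 16-bit (2-byte) pixels and exactly one readout every 10 s; neither number is stated on the slide, and the function names are hypothetical.

```python
# Back-of-the-envelope uncompressed data rate for continuous readouts.
# Assumptions (not from the slide): 2 bytes per pixel, one read every 10 s.


def raw_read_bytes(n_chips=3, nx=2048, ny=2048, bytes_per_pixel=2):
    """Size in bytes of one readout of all chips."""
    return n_chips * nx * ny * bytes_per_pixel


def uncompressed_gb(exposure_seconds, read_interval_s=10.0):
    """Uncompressed volume in GB (10^9 bytes) for a given exposure time."""
    n_reads = int(exposure_seconds // read_interval_s)
    return n_reads * raw_read_bytes() / 1.0e9


if __name__ == "__main__":
    print(f"one readout: {raw_read_bytes() / 1.0e6:.1f} MB")
    print(f"one hour of exposure: {uncompressed_gb(3600):.1f} GB uncompressed")
```

Under these assumptions one hour of exposure produces about 9 GB before compression, which is why the frames are assembled into cubes and losslessly compressed before archiving.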

Raw data
• Does not include NMSU 1 m + APOGEE data
• LCO data will be concurrent
• Total 2.5 m raw data to date: ~22 TB

Initial processing
• “Quick reduction” software estimates S/N (at H = 12.2), which is inserted into the plate database for use in autoscheduling decisions
• APOGEE-1:
  – Data transferred to SAS the next day, transferred to NMSU later that day, processed with the full pipeline the following day; updated S/N loaded into platedb, initial QA inspection
• APOGEE-2 proposal:
  – Process data at SAS location (Utah) and/or
  – Improve “quick reduction” S/N

Pipeline processing
• Three main stages (+1 post-processing)
  – APRED: processing of individual visits (multiple exposures at different detector spectral dither positions) into visit-combined spectra, with initial RV estimates. Can be done daily
  – APSTAR: combine multiple visits into combined spectra, with final RV determination
    • For APOGEE-1, has been run annually (DR10: year 1, DR11: year 1 + year 2, DR12: years 1-3); TBD for APOGEE-2
  – ASPCAP: process combined (or resampled visit) spectra through the stellar parameters and chemical abundances pipeline
    • For APOGEE-1, has been run several times
  – ASPCAP/RESULTS: apply calibration relations to derived parameters, set flag values for these
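Combining several visits into one spectrum, as in the APSTAR stage, is commonly done with an inverse-variance weighted mean. The sketch below shows that generic technique only; it assumes the visit spectra have already been shifted to a common rest frame and resampled onto the same pixels, and it is not a description of the actual APSTAR weighting scheme. The function name is hypothetical.

```python
# Generic inverse-variance weighted combination of visit spectra.
# Assumes all visits are already RV-corrected and on a common pixel grid.


def combine_visits(fluxes, errors):
    """Combine visits pixel by pixel.

    fluxes, errors: lists of visits, each a list of per-pixel values.
    Returns (combined_flux, combined_error) as lists.
    """
    n_pix = len(fluxes[0])
    combined, combined_err = [], []
    for i in range(n_pix):
        weights = [1.0 / err[i] ** 2 for err in errors]
        wsum = sum(weights)
        combined.append(sum(w * f[i] for w, f in zip(weights, fluxes)) / wsum)
        combined_err.append(wsum ** -0.5)  # error of the weighted mean
    return combined, combined_err


# Two toy visits of a 2-pixel spectrum with equal errors.
flux, err = combine_visits(
    fluxes=[[1.0, 2.0], [3.0, 2.0]],
    errors=[[1.0, 1.0], [1.0, 1.0]],
)
print(flux, err)
```

With equal errors the result reduces to a plain average, and the combined error shrinks as one over the square root of the number of visits.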

APOGEE data products
• Raw data: data cubes (apR)
• Processed exposures (fairly specialized interest?)
  – 2D images (ap2D)
  – Extracted spectra (ap1D)
  – Sky subtracted and telluric corrected (apCframe)
• Visit spectra
  – Combine multiple exposures at different dither positions
  – apVisit files: native wavelength scale, but with wavelength array
• Combined spectra
  – Combine multiple visits, requires relative RVs
  – apStar files: spectra resampled to a log(lambda) scale
• Derived products from spectra
  – Radial velocities and scatter from multiple measurements (done during combination)
  – Stellar parameters/chemical abundances from best-fitting template
    • Parameters: Teff, log g, microturbulence (fixed), [M/H], [alpha/M], [C/M], [N/M]
    • Abundances for 15 individual elements
  – aspcapStar and aspcapField files: stellar parameters of best fit, pseudocontinuum-normalized spectra, and best-fitting templates
• Wrap-up catalog files (allStar, allVisit)
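Resampling to a log(lambda) scale is convenient because a radial-velocity shift then becomes the same pixel offset at every wavelength. The sketch below illustrates this with a hypothetical grid; the starting wavelength, step size, and function names are illustrative assumptions rather than values taken from the slides, and the non-relativistic Doppler formula is used.

```python
import math

# Why a log(lambda) grid helps: a Doppler shift is a constant pixel offset.
# Grid parameters below are illustrative assumptions.
C_KM_S = 299_792.458  # speed of light in km/s


def log_lambda_grid(lambda_start, dlog, n_pix):
    """Wavelengths evenly spaced in log10(lambda)."""
    return [lambda_start * 10 ** (dlog * i) for i in range(n_pix)]


def pixel_velocity_km_s(dlog):
    """Velocity width of one pixel for a log10 step of dlog."""
    return C_KM_S * (10 ** dlog - 1.0)


def rv_shift_pixels(rv_km_s, dlog):
    """Pixel offset produced by a radial velocity (non-relativistic)."""
    return math.log10(1.0 + rv_km_s / C_KM_S) / dlog


if __name__ == "__main__":
    dlog = 6.0e-6  # hypothetical log10 step per pixel
    grid = log_lambda_grid(15100.0, dlog, 5)  # Angstroms
    print([round(w, 3) for w in grid])
    print(f"pixel width: {pixel_velocity_km_s(dlog):.2f} km/s")
    print(f"shift for 30 km/s: {rv_shift_pixels(30.0, dlog):.2f} pixels")
```

Because the offset does not depend on wavelength, visits taken at different velocities can be aligned with a single shift before they are combined.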

APOGEE data volume
Raw data:
• 2.5 m + APOGEE: ~7 TB/year for APOGEE-1; ~10 TB/year with MaNGA co-observing
• 1 m + APOGEE: ~1 TB/year
• LCO + APOGEE: ~5 TB/year
• TOTAL APOGEE-1 + APOGEE-2: ~100 TB
Processed data:
• Visit files: ~3 TB/year (80% individual exposure reductions)
• Combined star files: ~500 GB/100,000 stars
• ASPCAP files:
  – raw FERRE files: ~500 GB/100,000 stars
  – bundled output: ~100 GB/100,000 stars
• TOTAL APOGEE-1 + APOGEE-2 (one reduction!): ~50 TB
Processing time: DR12 took 2-3 weeks for all stages

APOGEE data access
• “Flat files” available via SDSS SAS:
  – all intermediate and final data product files
  – summary “wrap-up” files (catalog)
  – described by the SDSS datamodel
• “Catalog files” available via SDSS CAS: apogeeVisit, apogeeStar, aspcapStar, apogeeField, apogeeObject, apogeePlate
• Spectrum files available via SDSS API and web interface to the SAS database (SASDB)
• Planning 3 public data releases in SDSS-IV (plus possible additional internal releases):
  – DR15: July 2017 (data through July 2016)
  – DR17: July 2019 (data through July 2018; first APOGEE-S)
  – DR19: Dec 2020 (all data)

APOGEE software products
• apogeereduce: IDL reduction routines (apred and apstar)
• aspcap
  – speclib: management of spectral libraries, but not all input software (no stellar atmospheres code, limited spectral synthesis code)
  – ferre: F95 code to interpolate in libraries, find best fit
  – idlwrap: IDL code to manage ASPCAP processing
• apogeetarget: IDL code for targeting

APOGEE pipeline processing
• Software is all installed and running on Utah servers
• Software is already in pipeline form (a few lines per full reduction step to distribute and complete among multiple machines/processors)
• Some need to improve distribution of knowledge and operation among the team
• Some external data/software required for ASPCAP operation:
  – Generation of stellar atmospheres (Kurucz and/or MARCS)
  – Generation of synthetic spectra (ASSET, but considering MOOG and TURBOSPECTRUM)

APOGEE-S processing
• Raw data taken with APOGEE-S at LCO are currently expected to be transferred daily to SAS
• Some bandwidth testing has been done, and sufficient bandwidth is anticipated
• Data processing will then proceed in ~identical fashion to APOGEE-N data, assuming that the instrument delivers data of comparable quality
• APOGEE-N and APOGEE-S data will be stored/flagged separately, since they are not likely to be totally homogeneous

Personnel

APOGEE data personnel (not including targeting)
• Project management: Majewski, Sobeck, Hearty
• Key data collaboration members: Holtzman, Allende-Prieto
• Key external participants: Shetrone (pipeline coordinator), Nidever (reduction lead), (Meszaros, ASPCAP; Smith, ASPCAP)
• In-kind personnel contributions: Carrera (ASPCAP)
• Paid postdoc+ personnel:
  – Ana Garcia-Perez (calibration lead, 1.0 FTE for one year)
  – Neville Shane (database interfaces, 1.0 FTE for duration TBD)
  – Duy Nguyen (reduction, 1.0 FTE for duration TBD)
• Paid grad student personnel:
  – Michael Hayden (daily reduction, 0.5 grad student for project duration)
  – Diane Feuillet (1 m operation, 0.5 grad student for project duration)
  – Nick Troup (ASPCAP, 1.0 grad student for 1 year, possibly more)

APOGEE software responsibilities
• apogeereduce
  – developer: Nidever, Holtzman, (Nguyen)
  – operation: Holtzman, (Hayden, Nidever, Nguyen)
• ASPCAP
  – grids:
    • ASSET: Allende Prieto / Koesterke
    • Turbospec: Zamora, Garcia-Hernandez, Sobeck, Garcia-Perez, Holtzman
    • MOOG: Shetrone, Holtzman, others
  – speclib
    • postprocessing: Allende-Prieto, Holtzman
  – ferre: Allende Prieto
  – idlwrap: Holtzman, Garcia-Perez
  – Operation: Holtzman (Troup)
  – Analysis: Garcia-Perez, Carrera, Meszaros, Garcia-Hernandez, Troup

Challenges / issues
• Data analysis: automated abundances still challenging, work needed
  – Accommodation for variable LSF
  – Cooler star grids and improvements
  – Error analysis and propagation
• Data access:
  – Support of multiple interfaces (CAS + SASDB)
• Software management
  – Aging software (and developers?) / deep knowledge
  – IDL based
• Personnel
  – Need personnel who take initiative: adequacy of staffing level depends more on people than on FTE
• Southern hemisphere operations development

END