ODI: Open Data Interface

ESA FP Days, ESTEC, 16/12/2019

D. Heynderickx, DH Consultancy, Belgium
P. Wintoft, Solar Analytics, Sweden

Project overview

• Open Data Interface (ODI) is a database system for ingesting, processing, storing and retrieving space environment (and other) data and metadata in a MySQL (MariaDB) database.
• Development started in 2008 (Swedish Institute of Space Physics, DH Consultancy); continuous development and maintenance since then.
• Current contractor team: DH Consultancy (D. Heynderickx), Solar Analytics (P. Wintoft)
• Name of the TO in TEC-EES: Hugh Evans
• Currently supports ~330 datasets.
• Extensible with user-added functionality.
• Available for download at the European Space Software Repository (https://essr.esa.int/): server, client, datasets

ODI components

• ODI server
  • Database configuration and setup
  • Data file download and parser scripts
  • Calculation of geographic and magnetic coordinates
  • Processing hooks
  • Definition of metadata
• ODI client
  • REST, Excel interfaces to the database
  • APIs in various programming languages
• Dataset definitions
  • Configuration files, download and parser scripts for 330 space environment datasets

ODI server software (I)

• PHP engine for database communication and process flow
• Database setup and configuration script
  • Create database and user accounts on the MySQL server
  • Store configuration parameters (location of data files, SPACE-TRACK credentials, user email, …)
• Download scripts
  • Built-in support for wget
  • Dataset specific downloads (e.g. SFTP)
• Data parsers
  • NASA/GSFC CDFexport tool to ingest CDF data
  • Python tool to ingest NetCDF data
  • Dataset specific parser scripts
• cron/quartz setup for automated download and ingestion
• User may add functionality triggered by hooks on creation, download and ingestion

ODI server software (II)

• Coordinates for Earth orbiting spacecraft
  • Download of TLEs from https://space-track.org
  • Generation of spacecraft coordinates (GEI) using NASA/JPL SPICE library (http://naif.jpl.nasa.gov/)
• Magnetic coordinates for spacecraft in Earth’s magnetosphere
  • UNILIB library for calculation of L, L*, MLT, …
  • For fixed or record varying pitch angle(s)
  • IGRF + OPQ field models
• A single command triggers the whole processing suite (data download, ingestion, pre- and/or post-processing), manually or as a cron/quartz job, e.g.:

    ./get_ingest.php goes_gp_mag_1m_rt

Database structure

[Figure: database structure diagram]

How to define a new dataset

• Create configuration file configuration.txt
• Create a CDF type skeleton file
• If required, write download and/or parser scripts
• If required, write processing hook scripts
• Run ./create_dataset.php <dataset name> to set up the SQL data table, create table columns and insert metadata
• Run ./get_ingest.php <dataset name> to ingest the data files, with optional download and processing hooks
• If required, run ./cron_install.php to add an entry in crontab

PROBA1/SREM L2 configuration file:

    raw_data_dir=SREM-DC/proba1/srem/L2
    file_name_pattern=SREMPROBA1_PACC_*_L2.cdf.gz
    platform=PROBA1
    platform_type=satellite
    instrument=SREM
    skeleton_file=SREM_PACC_L2.skt
    settings_file=SREM_PACC_L2.set
    availability=public
    download_script=wget_generic
    wget_cut_dirs=3
    wget_url=http://srem.psi.ch/datarepo/L2/proba1/
    cron_schedule=30 4 * * *
    #quartz_schedule=0 30 4 * * ?
    indexed_columns=epoch odi_unilib_l
    SPACETRACK_satnum=26958
    UNILIB_PREFIX=
    UNILIB_CALCULATE_LSTAR=false

DSCOVR real time data configuration file:

    online_data=true
    platform=DSCOVR
    platform_type=satellite
    instrument=FC
    skeleton_file=FC_RT.skt
    parser_file=../parser/DSCOVR_rt.php

DSCOVR real time data parser script (afopen, date_to_cdfepoch and LF are not PHP built-ins and are presumably provided by the ODI server code):

    // Fetch the 6-hour real time plasma JSON stream
    $rname = "http://services.swpc.noaa.gov/products/solar-wind/plasma-6-hour.json";
    $json = file_get_contents($rname);
    $data = json_decode($json);
    $tfile = afopen("load.tmp", "w");
    // Drop the first row, which holds the column names
    array_shift($data);
    foreach ($data as $row)
    {
      // Split "date time.milliseconds" into its two parts
      $parts = explode(".", $row[0]);
      $epoch = $parts[0];
      $ms = intval($parts[1]);
      $cdf_epoch = date_to_cdfepoch($epoch, $ms);
      $values = array();
      for ($i = 1; $i < count($row); $i++)
      {
        // Replace missing values by the fill value
        if (is_null($row[$i]))
          $values[] = -999.9;
        else
          $values[] = $row[$i];
      }
      $values = implode(", ", $values);
      // One argument per format specifier: CDF epoch, date string, milliseconds, data values, line feed
      fprintf($tfile, "%20.3f, %s, %3d, %s, NULL%s", $cdf_epoch, $epoch, $ms, $values, LF);
    }
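The DSCOVR parser script relies on date_to_cdfepoch to turn a timestamp into a CDF epoch before writing a load line. A minimal Python sketch of that row-to-load-line step follows, assuming the standard CDF_EPOCH convention (milliseconds since 0000-01-01 00:00:00) and timestamps of the form YYYY-MM-DD HH:MM:SS.mmm. The function names date_to_cdf_epoch and rows_to_load_lines are hypothetical and not part of ODI, and the output format is simplified compared with the slide.

```python
from datetime import datetime

# CDF_EPOCH counts milliseconds from 0000-01-01 00:00:00. Python dates start
# at year 1, so the 366 days of (leap) year 0 are added explicitly.
MS_PER_DAY = 86_400_000
YEAR0_DAYS = 366


def date_to_cdf_epoch(stamp: str) -> float:
    """Convert 'YYYY-MM-DD HH:MM:SS[.mmm]' to CDF_EPOCH milliseconds."""
    date_part, _, frac = stamp.partition(".")
    dt = datetime.strptime(date_part, "%Y-%m-%d %H:%M:%S")
    ms = int(frac[:3].ljust(3, "0")) if frac else 0
    days = dt.toordinal() - 1 + YEAR0_DAYS  # days since 0000-01-01
    seconds = dt.hour * 3600 + dt.minute * 60 + dt.second
    return float(days * MS_PER_DAY + seconds * 1000 + ms)


def rows_to_load_lines(rows, fill=-999.9):
    """Turn decoded JSON rows into comma separated load lines.

    The first row is assumed to hold the column names and is skipped;
    missing values (None) are replaced by the fill value used on the slide.
    """
    lines = []
    for row in rows[1:]:
        epoch = date_to_cdf_epoch(row[0])
        values = [fill if v is None else v for v in row[1:]]
        lines.append(", ".join([f"{epoch:.3f}"] + [str(v) for v in values]))
    return lines
```

Keeping the epoch conversion separate from the line formatting makes it easy to check the conversion against a known reference instant before any data are loaded.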

Metadata browser

• PHP script to render a web page with metadata information on datasets and variables
• Uses ODI PHP client -> very simple script

ODI processing hooks

• User defined process hooks
  • Scripts (batch, shell or PHP) started by triggers
  • Post-creation, pre/post download and ingestion
  • Generic: run on all datasets
  • Dataset specific scripts
• Post-ingest example: copy GOES GEI and magnetic coordinates from one dataset to others

ODI processing hooks (examples)

• EMU data processing (GALEM)
  • Post-ingest hook
  • Copies GEI and magnetic coordinates from level 0 datasets into l1, l2 datasets
• NGRM data processing (SSA P3-SWE-XXI)
  • Post-ingest hook
  • Copies data from “raw” dataset to level 0 science dataset
  • Merges in spacecraft state vectors from the OEM dataset
  • Calculates magnetic coordinates using the ODI UNILIB tool
• VALIRENE data cleaning and calibration
  • Post-creation hook
  • Copies data from a “raw” dataset
  • Merges in cleaning flags
  • Applies calibration factors
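The distinction between generic hooks (run on all datasets) and dataset specific hooks can be illustrated with a small registry. This is only a sketch of the dispatch idea in Python: ODI itself starts batch, shell or PHP scripts, and the class, trigger identifiers and dataset names below are hypothetical.

```python
from collections import defaultdict

# Trigger names mirror the list on the slide; the spelling is an assumption.
TRIGGERS = (
    "post_create",
    "pre_download",
    "post_download",
    "pre_ingest",
    "post_ingest",
)


class HookRegistry:
    """Maps (trigger, dataset) pairs to callables; dataset=None means generic."""

    def __init__(self):
        self._hooks = defaultdict(list)

    def register(self, trigger, func, dataset=None):
        if trigger not in TRIGGERS:
            raise ValueError(f"unknown trigger: {trigger}")
        self._hooks[(trigger, dataset)].append(func)

    def run(self, trigger, dataset):
        """Run the generic hooks first, then the dataset specific ones."""
        results = []
        for key in ((trigger, None), (trigger, dataset)):
            for func in self._hooks.get(key, []):
                results.append(func(dataset))
        return results
```

Running generic hooks before dataset specific ones is a design choice of this sketch; the source does not state the order ODI uses.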

ODI client software

• HTTP/REST (server/client), JSON output
• HAPI (NASA Heliophysics API: https://github.com/hapiserver/data-specification) server/client
• Metadata browser
• Java SE and MySQL Connector/J JDBC driver
  • APIs for PHP, Java, IDL, Matlab, Jython, Python
  • Standardised procedure syntax
  • Outputs in language specific objects
• Excel interface
• IDL example (RENELLA LARB model production):

    oDB = OBJ_NEW('ODI_JDBC')
    oDB->connect
    query = "SELECT * FROM dataset_sampex_pet_ref0 WHERE …"
    oDB->query, nrows=nrows
    res = oDB->getRows()

  res is an IDL structure of arrays.
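For the HAPI interface, a data request is an HTTP GET with a small set of query parameters. The sketch below builds such a URL in Python using the HAPI 2.x parameter names (id, time.min, time.max, parameters, format); HAPI 3.x renames some of them. The base URL, dataset identifier and the helper name hapi_data_url are hypothetical, and which HAPI version and output formats an ODI installation serves is not stated in the source.

```python
from urllib.parse import urlencode


def hapi_data_url(base_url, dataset_id, start, stop, parameters=None):
    """Build a HAPI 2.x data request URL.

    base_url is the server's HAPI root (ending in /hapi); start and stop
    are ISO 8601 time strings delimiting the requested interval.
    """
    query = {"id": dataset_id, "time.min": start, "time.max": stop}
    if parameters:
        # HAPI expects a comma separated list of parameter names
        query["parameters"] = ",".join(parameters)
    # JSON output is optional in the HAPI specification; CSV is the default
    query["format"] = "json"
    return f"{base_url.rstrip('/')}/data?{urlencode(query)}"


# Example call with placeholder values
url = hapi_data_url(
    "https://example.org/hapi",
    "demo_dataset",
    "2003-01-01T00:00:00Z",
    "2003-01-02T00:00:00Z",
    ["countrate_1"],
)
```

The resulting URL can be passed to any HTTP client; no request is sent by the sketch itself.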

Datasets supported in ODI distribution (~330)

• ACE: archive and real time EPAM, SWEPAM, MAG, SIS
• DSCOVR: real time IMF and plasma data (JSON streams)
• GOES: archive (SMS01–GOES15) and real time data
• SREM: PROBA1, Integral, GioveB, Rosetta, Herschel, Planck
• Magnetic and solar indices (Kp, Dst, F10.7, ISN, OMNI, …)
• Interplanetary particle datasets: HELIOS, IMP8, Voyager, Pioneer, Wind
• Radiation belt missions: AZUR/EI-88, GPS/CXD, S3-3/PT, CRRES/MEA/HEEF/PROTEL, UARS/PEM, SAMPEX/PET, NOAA/POES/SEM2, XMM/ERMD, PROBA-V/EPT, RBSP/HOPE/MAGEIS/REPT/RPS, HIMAWARI/SEDA
• “Proprietary” datasets: MIR/REM, STRV1B/REM, AMPTE/UKS, EQUATOR-S, ISEE1/WIM/KED, Meteosat/SEM, Galileo/EMU, TSX-5/CEASE

ODI applications

• ESA projects
  • HIERRAS, SEPEM, SEDAT, SPENVIS, SAAPS, JHelioViewer, ESPREM, SAWS-ASPECS, SREN
  • RENELLA, VALIRENE, ecIRENE, PEM, SRREM, GALEM
  • SSA: P2-SWE-II (I-ESC SGIArv), VSWMC, P2-SWE-XIII (SaRIF), P3-SWE-XXI (NGRM)
• EC FP7 projects
  • SEPServer
  • SPACECAST, SPACESTORM
  • EURISGIC

Why SQL for time series data?

• Removes dependence on data files: read once and forget
• CDF epoch is the primary key -> very fast data selection by time range
• SQL provides very powerful and efficient data processing and retrieval functionality
• Applications can be developed independent of dataset

SQL tips and tricks

• Time averaging

    SELECT FLOOR(cdf_epoch/86400.0E3) AS day, AVG(FPDU_1)
    FROM dataset_sampex_pet_h
    WHERE …
    GROUP BY day ASC

• (L, α0) binning

    SELECT AVG(fpdu_1), FLOOR(odi_unilib_l*100)/100 AS l, FLOOR(odi_unilib_alpha_eq) AS alpha
    FROM dataset_sampex_pet_h_ref0
    WHERE fpdu_1>=0 AND fpdu_quality_esa_1=0
    GROUP BY l, alpha

• Complex queries
• Queries combining datasets

VALIRENE data selection query (PROBA1/SREM counts) for IRENE validation:

    SELECT cdf_epoch,
      odi_position_1 AS X, odi_position_2 AS Y, odi_position_3 AS Z,
      odi_unilib_l AS mcl, odi_unilib_b_calc AS mcb, odi_unilib_alpha_eq AS mca0,
      30.0 AS deltat, 90.0 AS pitchangle,
      countrate_1 AS f1, countrate_2 AS f2, countrate_3 AS f3, countrate_4 AS f4,
      countrate_5 AS f5, countrate_6 AS f6, countrate_7 AS f7, countrate_8 AS f8,
      countrate_9 AS f9, countrate_10 AS f10, countrate_11 AS f11, countrate_12 AS f12,
      countrate_13 AS f13, countrate_14 AS f14, countrate_15 AS f15
    FROM dataset_proba1_srem_pacc_v0
    WHERE countrate_1>=0 AND countrate_2>=0 AND countrate_3>=0 AND countrate_4>=0 AND
      countrate_5>=0 AND countrate_6>=0 AND countrate_7>=0 AND countrate_8>=0 AND
      countrate_9>=0 AND countrate_10>=0 AND countrate_11>=0 AND countrate_12>=0 AND
      countrate_13>=0 AND countrate_14>=0 AND countrate_15>=0 AND
      (odi_unilib_l>1 AND odi_unilib_l<5 AND
       DEGREES(ATAN2(odi_position_2, odi_position_1))>-120 AND
       DEGREES(ATAN2(odi_position_2, odi_position_1))<90 AND
       DEGREES(ACOS(odi_position_3/SQRT(POW(odi_position_1, 2)+POW(odi_position_2, 2)+POW(odi_position_3, 2))))>70) AND
      cdf_epoch>=datetocdfepoch('2003-01-01 00:00') AND
      cdf_epoch<=datetocdfepoch('2003-12-31 23:59.999')
    ORDER BY cdf_epoch ASC
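The time averaging query works because CDF epoch is stored in milliseconds, so dividing by 86400.0E3 and truncating yields a day index that can be used as a grouping key. The Python sketch below runs the same idea against an in-memory SQLite database, which stands in for MySQL here only so that the example is self-contained. The table name dataset_demo and the helper daily_means are hypothetical, and CAST(... AS INTEGER) replaces FLOOR because not every SQLite build provides FLOOR (the two agree for positive epochs).

```python
import sqlite3


def daily_means(rows):
    """Return [(day_index, mean_value), ...] for (cdf_epoch_ms, value) rows."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE dataset_demo (cdf_epoch REAL PRIMARY KEY, fpdu_1 REAL)"
    )
    con.executemany("INSERT INTO dataset_demo VALUES (?, ?)", rows)
    result = con.execute(
        """
        SELECT CAST(cdf_epoch / 86400.0E3 AS INTEGER) AS day, AVG(fpdu_1)
        FROM dataset_demo
        WHERE fpdu_1 >= 0          -- drop negative fill values such as -999.9
        GROUP BY day
        ORDER BY day ASC
        """
    ).fetchall()
    con.close()
    return result


# Three valid samples on two consecutive days plus one fill value
t0 = 62167219200000.0  # CDF epoch of 1970-01-01 00:00:00
sample = [
    (t0, 1.0),
    (t0 + 3_600_000, 3.0),
    (t0 + 7_200_000, -999.9),
    (t0 + 86_400_000, 10.0),
]
means = daily_means(sample)
```

The same grouping pattern extends to other bin widths by changing the divisor, for example 3600.0E3 for hourly averages.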

Future work

• Updates
  • Client interfaces
  • NetCDF parser
  • REST and HAPI interfaces
  • Documentation
• Generic parser for CSV files
• Enhance functionality for ingesting FITS and PDS data files
• Parsing of SPASE metadata (http://spase-group.org)
• Support for new datasets (e.g. GOES-R)
• Support for JSON data types
• Software maintenance
• Dataset maintenance
• User support / help desk
• Any suggestions?