Status of CDI service: connections, NetCDF introduction





















Status of CDI service (connections, NetCDF introduction, CDI ingestion service and set-up of CDI ingestion pilot)
By Dick M. A. Schaap, Technical Coordinator
Trieste, Italy, 3-4 March 2015, Combined EMODNet Chemistry 2 TWG and SeaDataNet II TTG
CDI service for discovery and unified access of data
Already 106 data centres connected and more under way
SeaDataNet as driver behind many portals
[Diagram: more than 100 data centres (NODCs, HOs, GEOs, BIOs, ICES, PANGAEA) and more than 500 European data originators feed the CDI Data Discovery and Access service (Physics, Chemistry, Geology, Biology). Its aggregated collection for data discovery and access is served as a total collection (GEOSS portal, IODE ODP portal), as regional subsets (Black Sea portal, Caspian Sea portal) and as thematic subsets (Geo-Seas portal, Bathymetry).]
CDI Data Discovery & Access service
Coverage March 2015: > 1.71 million CDI entries from 106 data centres in 34 countries and 546 originators for physics, chemistry, geology, geophysics, bathymetry and biology; years 1800 – 2015; 85% unrestricted or under SeaDataNet licence
CDI Data Discovery & Access service 1 March 2015: 106 data centres connected
Situation (1 March 2015) with DM and Marine-ID
- In total 94 data centres connected by DM and 12 by interim solution.
- New DM V1.4.5 available for NetCDF support.
- 91 monitored by NAGIOS and 93 by Robot Shopper.
- Marine-ID now used by 80 data centres.
NetCDF introduction into CDI service
- SeaDataNet NetCDF (CF) format defined for profiles, trajectories and timeseries; indicated in vocabulary L24 by CFPOINT.
- Delivery to users of the CDI service started with DM V1.4.4.
- Principles:
  - Partners with pre-processed files, MEDATLAS and ODV (modus 3 or 1), and with database-generated ODV files (modus 2) can rely on the DM software to generate NetCDF (CF) files on the fly.
  - MARIS has to add CFPOINT as an extra file type to the CDI metadata for this to work.
- MARIS has upgraded the CDI shopping mechanism to include CFPOINT.
- Initial test with IFREMER went ok. Additional test with IEO revealed an issue, repaired by installing DM V1.4.5.
- Data centres should indicate to MARIS a one-time CFPOINT addition to a CDI range. Thereafter data centres themselves have to deliver new / updated CDIs including CFPOINT where appropriate.
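The one-time CFPOINT addition can be pictured with a minimal sketch. It assumes, as one reading of the principles above, that a CDI entry qualifies when it already lists ODV or MEDATLAS as a format; the function name and the list-of-codes representation are hypothetical and are not part of the DM or CDI software.

```python
# L24 format codes named on the slide. Treating ODV and MEDATLAS as the
# formats that qualify for conversion is an assumption for illustration.
CONVERTIBLE_FORMATS = {"ODV", "MEDATLAS"}


def with_cfpoint(formats):
    """Return a copy of the format list with CFPOINT added where it applies.

    Hypothetical helper: CFPOINT is appended only if the entry already
    offers a convertible format and does not list CFPOINT yet.
    """
    result = list(formats)
    if "CFPOINT" not in result and CONVERTIBLE_FORMATS & set(result):
        result.append("CFPOINT")
    return result


print(with_cfpoint(["ODV"]))             # ['ODV', 'CFPOINT']
print(with_cfpoint(["ODV", "CFPOINT"]))  # ['ODV', 'CFPOINT']
```

Whether a given entry should really receive CFPOINT also depends on the feature type (profile, trajectory or timeseries), which this sketch does not check.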
NetCDF introduction into CDI service: IFREMER Argo floats, requesting a sample set
NetCDF introduction into CDI service: choosing NetCDF (CF) format
NetCDF introduction into CDI service: RSM – CFPOINT OK, downloaded in NetCDF (CF) format
D9.2 incl. updating D4.5 and D5.3: Harvesting of CDI XMLs – planned implementation
- Objective is to upgrade the submission and processing of CDI entries from data centres to the CDI portal service by means of harvesting and ingestion.
- IFREMER has adopted GeoNetwork for supporting the CDI XML output of MIKADO and making it available by means of a local CS-W service.
- A testbed has been set up with IFREMER, BSH and IEO, which have installed and configured the GeoNetwork software as provided by IFREMER.
- MARIS has tested central harvesting of CDI XML from the GeoNetwork CS-W service (which could also be provided by local software other than GeoNetwork).
- Final check needed on XML consistency and on the criterion for harvesting only entries since a specific date.
Harvesting of CDI XMLs – check on subset
- MARIS has implemented a simple HTML page at: http://www.gasandoil.nl/csw_hits/check_hits.php
- This allows pilot data centres to check whether the harvest selection works as intended.
- MARIS selects on the CDI tag var06 = REVISION-DATE OF DATASET; this should be stable and represent the date that the CDI XML was generated / updated.
- Query example for IEO for updates with revision date > 2014-12-31:
  http://www.seadatanet.ieo.es/geonetwork-sdn/srv/eng/cswcdi?SERVICE=CSW&VERSION=2.0.2&REQUEST=GetRecords&constraintLanguage=CQL_TEXT&CONSTRAINT_LANGUAGE_VERSION=1.1.0&CONSTRAINT=csw%3ArevisionDate%3E%272014-12-31%27&outputFormat=application%2Fxml&resultType=hits&maxRecords=10&ElementSetName=brief
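The query example can be assembled from its parts, which makes the revision-date criterion easier to vary. The sketch below uses only the Python standard library; the endpoint and parameter names are copied from the example, while the helper name `build_hits_query` is hypothetical.

```python
from urllib.parse import urlencode

# Endpoint taken from the IEO query example.
CSW_ENDPOINT = "http://www.seadatanet.ieo.es/geonetwork-sdn/srv/eng/cswcdi"


def build_hits_query(endpoint, revised_after, max_records=10):
    """Build a CSW GetRecords URL asking for hits revised after a date.

    revised_after is a date string such as "2014-12-31". The helper is
    illustrative and only mirrors the parameters of the example query.
    """
    params = {
        "SERVICE": "CSW",
        "VERSION": "2.0.2",
        "REQUEST": "GetRecords",
        "constraintLanguage": "CQL_TEXT",
        "CONSTRAINT_LANGUAGE_VERSION": "1.1.0",
        "CONSTRAINT": f"csw:revisionDate>'{revised_after}'",
        "outputFormat": "application/xml",
        "resultType": "hits",
        "maxRecords": max_records,
        "ElementSetName": "brief",
    }
    # urlencode percent-encodes ':', '>', the quotes and '/' in the values.
    return endpoint + "?" + urlencode(params)


url = build_hits_query(CSW_ENDPOINT, "2014-12-31")
print(url)
```

With `resultType=hits` the service is asked only for a count of matching records, which is what a quick check of the harvest selection needs.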
Harvesting of CDI XMLs – check on subset
- The test page performs a series of date tests and gives results.
- [Screenshot: test results for IFREMER]
- Clicking on Details gives the corresponding CDI XML records with their Local_CDI_IDs.
- Action: IFREMER, IEO and BSH to check whether the revision date criterion performs ok.
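A date test of this kind comes down to reading the hit count from a CSW response. The sketch below parses a `GetRecordsResponse` with the Python standard library; the sample response and its count are made up for illustration and are not output from any pilot data centre, and this is not the code behind the MARIS test page.

```python
import xml.etree.ElementTree as ET

# Namespace of CSW 2.0.2 responses.
CSW_NS = {"csw": "http://www.opengis.net/cat/csw/2.0.2"}

# Illustrative response only; the number of matched records is invented.
SAMPLE_RESPONSE = """
<csw:GetRecordsResponse xmlns:csw="http://www.opengis.net/cat/csw/2.0.2">
  <csw:SearchStatus timestamp="2015-03-01T12:00:00"/>
  <csw:SearchResults numberOfRecordsMatched="42" numberOfRecordsReturned="0"
                     elementSet="brief" nextRecord="1"/>
</csw:GetRecordsResponse>
""".strip()


def count_hits(response_xml):
    """Return numberOfRecordsMatched from a CSW GetRecords response."""
    root = ET.fromstring(response_xml)
    results = root.find("csw:SearchResults", CSW_NS)
    if results is None:
        raise ValueError("response has no csw:SearchResults element")
    return int(results.attrib["numberOfRecordsMatched"])


print(count_hits(SAMPLE_RESPONSE))  # 42
```

Running the same count for several revision dates and comparing the numbers with what the data centre expects gives a simple check of the criterion.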
D4.5 and D5.3: Ingestion of harvested CDI XMLs – status of online CMS for data centres
- Challenge: central CDI ingestion taking into account the staging process and the relational model CDI – coupling table – local data.
- MARIS applies a staging process for populating new and updated CDI entries received from data centres; each step follows only if the previous one is ok:
  1. Validation of syntax and semantics
  2. Duplicates check => report to data centre for check
  3. Import of CDIs incl. GML validation
  4. CDIs in Import CDI service and user interface for visual check by data centres
  5. Data centres must update Coupling Table and arrange Local Data sets
  6. CDIs moved to production CDI service for public use
- Agreed to upgrade this into an online system that data centres can manage themselves => establishing data centre self-responsibility + 24/7.
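The "if ok" chaining of the staging process can be sketched as an ordered list of gates, where a batch reaches production only when every gate passes. This is a minimal illustration, not the MARIS implementation: the stage names paraphrase the steps above and the boolean flags on the batch are hypothetical stand-ins for the real checks.

```python
from typing import Callable, Dict, List, Tuple

Check = Callable[[Dict[str, bool]], bool]

# Each gate reads a hypothetical flag that stands in for the real check.
STAGES: List[Tuple[str, Check]] = [
    ("syntax and semantics validation", lambda b: b["xml_valid"]),
    ("duplicates check", lambda b: b["no_duplicates"]),
    ("import incl. GML validation", lambda b: b["gml_valid"]),
    ("visual check in import service", lambda b: b["visually_approved"]),
    ("coupling table and local data updated", lambda b: b["coupling_updated"]),
]


def run_staging(batch: Dict[str, bool]) -> Tuple[bool, List[str]]:
    """Run the gates in order and stop at the first one that is not ok.

    Returns (moved_to_production, names_of_gates_passed).
    """
    passed: List[str] = []
    for name, check in STAGES:
        if not check(batch):
            return False, passed
        passed.append(name)
    return True, passed


good = {"xml_valid": True, "no_duplicates": True, "gml_valid": True,
        "visually_approved": True, "coupling_updated": True}
stalled = dict(good, gml_valid=False)

print(run_staging(good)[0])     # True
print(run_staging(stalled)[1])  # gates passed before the batch stalled
```

Returning the list of passed gates makes it easy to report to the data centre where a batch stopped.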
Ingestion of harvested CDI XMLs – status of online CMS for data centres CMS under development
Ingestion of harvested CDI XMLs – status of online CMS for data centres
- Principle is that per data provider only ONE harvested batch is processed at a time; when it is ready, the next harvest takes place, at regular intervals (e.g. each week).
- System runs in steps through a batch process in which the data provider is asked to interact only a few times:
  - Check identified XML errors (syntax, semantics) for possible repair in the next batch
  - Check identified potential duplicates (against import and production) and take action to delete real duplicates from the import database
  - Check the overall remaining CDI import and submit the action for moving to production (after confirming an up-to-date coupling table and new data availability)
- The overall process works in batch mode: for example, moving from import to production takes place each night because it also requires a lot of indexing.
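The one-batch-per-provider principle can be sketched as a small gate in front of the harvester. The class and method names below are hypothetical; the sketch only shows the bookkeeping, not the harvesting or the nightly move to production.

```python
class BatchScheduler:
    """Allow at most one open batch per data provider (illustrative only)."""

    def __init__(self):
        self._active = {}       # provider -> number of the open batch
        self._last_number = {}  # provider -> last batch number handed out

    def try_start_harvest(self, provider):
        """Open a new batch, or return None while one is still in progress."""
        if provider in self._active:
            return None
        number = self._last_number.get(provider, 0) + 1
        self._last_number[provider] = number
        self._active[provider] = number
        return number

    def finish_batch(self, provider):
        """Close the open batch (moved to production or removed)."""
        self._active.pop(provider, None)


scheduler = BatchScheduler()
print(scheduler.try_start_harvest("IEO"))  # 1
print(scheduler.try_start_harvest("IEO"))  # None, batch 1 is still open
scheduler.finish_batch("IEO")
print(scheduler.try_start_harvest("IEO"))  # 2
```

A periodic job, for example weekly, would call `try_start_harvest` for each provider and simply skip those whose previous batch has not been closed yet.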
Ingestion of harvested CDI XMLs – status of online CMS for data centres CMS with all possible ‘status’ examples
Ingestion of harvested CDI XMLs – status of online CMS for data centres
[Screenshot annotations: batch no; prevailing status; option to retrieve log of XML errors; option to inspect CDIs in import; options to retrieve logs, to inspect CDIs and to mark CDIs in import for deletion; possible interactions for data provider; click to mark for removal or moving to production (see next page)]
Ingestion of harvested CDI XMLs – status of online CMS for data centres Final action: Removing remaining Batch or moving to Production
CDI ingestion pilot
- Pilot in SeaDataNet II with a number of data centres: IFREMER, IEO, BSH.
- They will get access to the new online CMS for trying it out.
- Test takes place in the operational environment, but MARIS will monitor the steps, with roll-back options.
- Data providers must still also provide CDI XML by mail for comparison.
- Note: CDI deletions continue as specials by email from data centres to MARIS.
END