Engineering Reports Day 2 November 2017
EN Reports
• Day 1
  – Build 8a Release
  – Tool Working Group
• Day 2
  – Digital Object Identifiers
  – PDS Home Page Discussion
  – Search Discussion
  – 2018 Tech Session
Digital Object Identifiers November 2017
Agenda
• Background on OSTI/DOI & September Workshop
• OSTI Pilot and Funding Status
• Moving Forward
OSTI Overview
• OSTI became a member and an allocation agency for DataCite in 2011.
• OSTI assigns Digital Object Identifiers (DOIs) to Department of Energy datasets
  – Registers those DOIs with DataCite.
• The Interagency Data ID Service (IAD) provides this same service to other U.S. Federal Agencies
  – Desire to make their data available for citation and discovery long into the future.
  – Developed and operated by OSTI
• OSTI has recently agreed to support PDS DOI registrations
Data ID Clients DOE A2EDAP AMES Lab AmeriFlux Biogeochemistry Feedbacks CDIAC TCCON CXIDB DOE-GDR DOE-MHKDR ESS-DIVE FluxNet ISCN LBNL-MP NGEE-Tropics NREL ORNL-ARM ORNL-NGEEA ORNL-OLCF Other Federal Agencies USQCD UW-PEGASUS WTR-SFA ARCUS.org PNNL LANL SLAC BNL ORNL JLAB NHAAP DRPOWER NETL-EDX ORNL-NSCAT IMMM-SFA NIH-NIMH USDA-ADC DOTNTL BRICS NASA-PDS US_EPA NITRD IMPACT
Two-day “research data” focused workshop
• Held by the Department of Energy's (DOE) Office of Scientific and Technical Information (OSTI)
  – On September 13-14, 2017
  – At OSTI in Oak Ridge, TN
• Follows 2016 Data ID Service Workshop
  – Held at Stanford Linear Accelerator Laboratory in April 2016
• Share information about the
  – DOE Data ID Service
  – Interagency Data ID Service
Broader Goals of the Workshop
• Gain a deeper understanding of researchers' data needs
  – Determine how to better support those needs.
• Desire to tackle evolving changes in the data world
  – Influence the growth and enhancements to the products and services provided
• Discuss issues that data researchers are beginning to confront
• Present initial plans for IAD redesign
  – Discuss the evolving metadata behind data citations
Required Metadata*
• DataCite
  – Identifier
  – Creator
  – Title
  – Publisher
  – Publication Year
  – Resource Type
  – Landing Page
• OSTI
  – Research Org/Submitting ORG
  – URL
  – Contact Name
  – Contact ORG
  – Contact email
  – Contact Phone
* Per Crystal Sherline, OSTI
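The required fields on this slide lend themselves to a simple completeness check before a record is submitted. The following is a minimal sketch, assuming a record is held as a plain dictionary; the key names and the `missing_fields` helper are hypothetical labels that mirror the slide's field list, not the actual DataCite or OSTI schema element names.

```python
# Hypothetical key names; they mirror the slide's field list, not an official schema.
DATACITE_REQUIRED = [
    "identifier", "creator", "title", "publisher",
    "publication_year", "resource_type", "landing_page",
]
OSTI_REQUIRED = [
    "research_org", "url", "contact_name",
    "contact_org", "contact_email", "contact_phone",
]


def missing_fields(record):
    """Return the required keys that are absent or empty in `record`."""
    required = DATACITE_REQUIRED + OSTI_REQUIRED
    return [key for key in required if not str(record.get(key) or "").strip()]


# Example: a partially filled record still lacks most required fields.
record = {
    "identifier": "10.17189/1408898",
    "title": "LADEE Neutral Mass Spectrometer Data",
    "publication_year": 2014,
}
print(missing_fields(record))
```

A check like this could run at the node before metadata is handed to EN, so that gaps are caught before registration rather than after.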
Output of Workshop
• Current
  – NASA PDS is classified as an Interagency Data (IAD) Client
  – Current metadata support is very minimal
    • OSTI and DataCite are considering future upgrades which may expand it
    • Mapping PDS into OSTI’s IAD service will be a lossy mapping
  – OSTI desires to continue to provide our DOIs
    • Our primary contact is Crystal Sherline
• Future
  – OSTI is impressed with the PDS’s information architecture
  – Interested in our future needs, especially in regard to linking data, documents, software, and people
  – Improve PDS practices with data citation
  – In contact with IAD redesign lead, Neal Ensor
    • Starting on collection development implementation
Getting a persistent identifier in place will allow the process to evolve.
Assigning Credit for PDS Data¹
• Need an appropriate classification scheme for assigning credit
  – The two-tier author/editor system that works for journals isn't really quite good enough for the work involved in generating archive quality data
• Acknowledgements and References
  – E.g., archive collections to acknowledge the grants that paid for their creation.
• Link data to papers
• Link data to other related data
• And we should also consider author identifiers
  – E.g., ORCIDs
¹ A. Raugh, Semi-Random thoughts Re: PDS / ADS – DPS, email Wed 10/11/2017
PDS OSTI Pilot and Status
• At the November 2016 F2F, PDS launched a pilot project for creating and registering PDS4 data using Digital Object Identifiers (DOI)
  – Pilot project successfully concluded ‘test’ DOI registrations of Bundles & Collections from ATMOS, PPI, SBN.
• At the April 2017 F2F, results were presented and MC approved using OSTI/IAD as DOI registration service for PDS.
  – GSFC is working to get an IAA in place.
  – OSTI has agreed to allow PDS to use their service while the IAA is put in place.
Technical Background
• A limited set of updatable metadata is sent to OSTI which provides a link (landing page) into PDS.
  – PDS metadata is extracted and mapped to the OSTI-provided metadata
• DOIs are then cited in publications.
  – Benna, Mehdi; Lyness, Eric. LADEE Neutral Mass Spectrometer Data. 2014. https://doi.org/10.17189/1408898
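The citation on this slide follows a "creators, title, year, DOI link" pattern that can be assembled from the registered metadata. A minimal sketch, assuming that pattern; the function names `doi_url` and `format_citation` are hypothetical, and this is not an official PDS citation format.

```python
def doi_url(doi):
    """Turn a bare DOI such as '10.17189/1408898' into its resolver URL."""
    return "https://doi.org/" + doi.strip()


def format_citation(creators, title, year, doi):
    """Build 'Creator; Creator. Title. Year. URL' from basic metadata."""
    authors = "; ".join(creators)
    return f"{authors}. {title}. {year}. {doi_url(doi)}"


citation = format_citation(
    ["Benna, Mehdi", "Lyness, Eric"],
    "LADEE Neutral Mass Spectrometer Data",
    2014,
    "10.17189/1408898",
)
print(citation)
```

Generating the citation string from the same fields that were sent to OSTI keeps the landing page and the published citation consistent with each other.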
Metadata Required*
What ESDIS Requires
• DataCite
  – Identifier
  – Creator
  – Title
  – Publisher
  – Publication Year
  – Resource Type
  – Landing Page
What OSTI Requires
• OSTI
  – Research Org/Submitting ORG
  – URL
  – Contact Name
  – Contact ORG
  – Contact email
  – Contact Phone
Registering DOIs
• OSTI created production account for PDS registrations.
  – EN registered DOIs for the following LADEE NMS products to test the process:
Product | DOI
Bundle | 10.17189/1408898
Collection Data Calibrated | 10.17189/1408892
Collection Data Derived | 10.17189/1408897
Collection Data Raw | 10.17189/1408893
Collection Document | 10.17189/1408894
Moving forward with DOIs
• Assuming a successful IAA will be in place,
  – EN will work with nodes to use OSTI services to register and link PDS4 data to a DOI
    • This includes having nodes review the metadata mappings for their data
    • This will allow citations to move forward today
  – Recommend DDWG launch a DOI tiger team to determine whether any IM changes are needed to better support citations for future PDS deliveries and submit SCR, as appropriate
  – Update PDS website to provide citation instructions for PDS4 data registered with DOIs
    • Links into website updates
Backup
PDS Home Page Discussions November 2017
Introduction
• The PDS home page is operating with parallel content for PDS3 and PDS4.
  – This was necessary to develop PDS4 without perturbing PDS3.
• PDS4 content has grown significantly at both the home page and across nodes.
• PDS-wide migration from PDS3 to PDS4 is a good opportunity to improve the layout.
• EN will propose an upgrade project for PDS.
Project Structure
• EN would like to make quick progress. At the same time there is also an opportunity to address broader web space architecture needs and integration.
• Proposal: Create a tiger team (the intention is that it is not permanent) with node representatives, which will
  – Task 1: Address navigation, content, and transparency of PDS4 information on the PDS home page in support of PDS3-to-PDS4 migration (review beta changes at April 2018 F2F).
  – Task 2: Update look-and-feel and integration of websites across PDS (MC proposal and concept due at August F2F). This is bigger!
Task 1: Navigating PDS4 Information
• EN has an initial site map to be discussed by TT
• Address PDS3 vs PDS4 content
  – Determine amount of visibility for PDS3 standards information
• PDS4 Information
  – Integrate into PDS website
  – Improve organization and access
  – Address gaps in information
  – Improve access to training material
  – Add information to support PDS3-to-PDS4 migration
• Develop proposal for transparency of information for MC review
• Produce improved layout for review at the April F2F
Task 2: Upgrade PDS Web Space
• Develop a proposed concept and plan to be presented to the PDS Management Council which addresses the following:
  – Include overall architecture and organization
  – Increase leveraging of APIs
  – Improve linking of content across nodes
  – Propose new look-and-feel for home page
  – Integrate DOI landing page improvements
  – Propose schedule for implementation
Questions/Comments
Search Discussion November 2017
Agenda
• Search Service Background and Capabilities
• Deployment at EN
• Metrics
• API Access
• Search UI (Mission Archive Pages, Search Tools, and Data)
• Discussion and Recommendations
Approach
Definition: EN uses the term search to encompass a set of software services to extract, query, classify, and present data results.
PDS releases a search service that has three components:
• Search Service: A generalized service based on Apache Solr
• Search Core: A tool to build an integrated index to drive search
• Service UI: A presentation layer for showing search results
Search Service
• Search Service is based on Apache Solr, one of the most widely used open source search engines and NoSQL databases in the world
  – Highly scalable
  – Many science data systems use Solr or a similar tool, Elasticsearch
• Apache Solr provides the ability to
  – Extract and classify data based on a set of parameters
  – Build an online, massive database for sub-second querying based on a set of parameters
  – Provide REST-based APIs to query; very powerful query language
  – Set rankings
Search Service Today @ EN
• Configured by the PDS4 model
  – PDS3 and PDS4 catalog and context information
  – Includes resources such as archive support pages and search tools that relate to context information
• Numerous queries can be made through REST-based APIs
  – Return all missions with target X
  – Return all data sets for mission Y
  – Return a list of instruments used on mission Z
  – Return a list of supporting tools for target T
  – Etc.
• We can control the ranking algorithm and we can tweak results, but we are still dependent on the metadata we have
Driving Metrics from Search Service
Search API
• Uses Apache Solr REST-based API
• Support for other APIs (e.g., PDAP)
• Documentation available
  – https://pdsengineering.jpl.nasa.gov/development/pds4/8.0.0/search/docs/pds4_pds_search_protocol.pdf
• Available for use by PDS, IPDA, and others
• Provides a basis for different user interfaces to navigate PDS
Examples
• PDS Protocol
  – Simple Syntax
    • https://pds.nasa.gov/services/search?term=cassini saturn
    • https://pds.nasa.gov/services/search?investigation=cassini-huygens&target=saturn
  – Advanced Syntax (Solr/Lucene)
    • https://pds.nasa.gov/services/search?q=investigation:cassini-huygens AND target:saturn AND start-time:[2001-01-01T00:00.000Z TO 2011-12-31T23:59.999Z]
• PDAP Protocol
  – https://pds.nasa.gov/services/search/pdap?RESOURCE_CLASS=DATA_SET&TARGET_NAME=SATURN
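The example URLs above contain spaces and colons, which a client would normally percent-encode before sending. A minimal sketch of building both syntaxes with Python's standard library, assuming the endpoint and parameter names shown on the slide; the helper names `simple_query` and `advanced_query` are hypothetical, and the code only constructs URLs without contacting the service.

```python
from urllib.parse import quote, urlencode

BASE_URL = "https://pds.nasa.gov/services/search"  # endpoint shown in the examples


def simple_query(**params):
    """Simple syntax: one key=value pair per search parameter."""
    return BASE_URL + "?" + urlencode(params, quote_via=quote)


def advanced_query(clauses):
    """Advanced syntax: join field:value clauses with AND into a single q parameter."""
    q = " AND ".join(f"{field}:{value}" for field, value in clauses.items())
    return BASE_URL + "?" + urlencode({"q": q}, quote_via=quote)


print(simple_query(term="cassini saturn"))
print(simple_query(investigation="cassini-huygens", target="saturn"))
print(advanced_query({"investigation": "cassini-huygens", "target": "saturn"}))
```

Passing `quote_via=quote` encodes spaces as `%20` rather than `+`; either form is a common convention for query strings, and which one the service prefers is not stated in the slides.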
Architecture
Search Service supports the PDS and PDAP protocols, enabling development of other portals and applications on this infrastructure.
Search Service indexes metadata from multiple Registry Service instances to support a given search interface.
PDS Search UI
• Uses search API to query database and classify resources into mission archive pages, search tools, and data available across PDS nodes.
  – Resources must be available and registered
  – Metadata needs to be reviewed because it affects rankings
  – Additional curated metadata can be used to improve rankings
• Constraints can be applied using “facets” that are configured by the PDS model.
  – Use of the facets is fairly common practice
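For readers unfamiliar with the term, a facet is a metadata field whose distinct values are counted across the results and offered as filters. A minimal sketch of the idea in plain Python; the records, field names, and helper functions are made up for illustration, and this does not show how Solr or the PDS Search UI compute facets internally.

```python
from collections import Counter

# Made-up search results; field names and values are illustrative only.
results = [
    {"title": "Result A", "target": "saturn", "resource_type": "data"},
    {"title": "Result B", "target": "saturn", "resource_type": "search tool"},
    {"title": "Result C", "target": "titan", "resource_type": "data"},
]


def facet_counts(records, field):
    """Count how many records carry each value of `field`."""
    return Counter(r[field] for r in records if field in r)


def apply_facet(records, field, value):
    """Keep only the records whose `field` equals `value`."""
    return [r for r in records if r.get(field) == value]


print(facet_counts(results, "target"))
print(apply_facet(results, "resource_type", "data"))
```

Because facet values come straight from the registered metadata, inconsistent values for the same target or mission would show up as separate facet entries, which is one reason the metadata review mentioned above matters.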
Mission Archive Pages
• Top choice to send users
• Registered using PDS4 model
  – Mission Information
  – User Guides
  – Data Access
• Need missing pages registered
Search Tools
• Search tools are registered as services based on the PDS4 model.
  – Nodes and EN can work together to improve description, keywords, etc.
  – Resources are related to missions, targets, etc.
• Need to reduce number of registered search tool instances at nodes (e.g., Galileo Image Search to Image Search)
  – Can be accomplished by ensuring search parameters are passed on
  – This will greatly help ranking of target-based searches
Data
• Queries can target specific parameters or span the whole set of parameters
• Returned data includes targets, collections, investigations, instruments, bundles, data sets
• Unexpected results should be examined to understand the assigned metadata
Target-centric missions often have many different targets observed.
Discussion & Recommendations
• Problematic metadata is going to create problematic results.
  – We need to address these over time.
• In PDS4, create a mission archive page for every new and migrated mission.
• Review node search tool rankings. For some, like Image Atlas, we’d like to move to one search link that uses the search parameters to configure itself.
• We need nodes to help tune search results
  – There could be conflicts of rankings (e.g., raising and lowering of results)
  – Metadata improvements can be made to tune results
• Recommendation: Use Tool Working Group to coordinate search algorithm and search tool registration and integration across the nodes
  – Include tuning of results
  – Report progress at Spring F2F
• Recommendation: Need external review and comments
2018 Tech Session November 2017
Next Tech Session
• Date: Feb 13-15, 2018
• Location: Pasadena, CA
• Venue: To be arranged by NRESS
• Key topics: Information Model, PDS4 Training, Tools, System Services, Future Planning
Day 1 Agenda (Full Day) Theme Topic Format Welcome Information Model LDD Issues Workshop Bundle Creation (Context, XML, etc.) Workshop IM 2.0 Discussion Tools Tool Overview (status, progress, plan) Deployable service for webifying command-line tool Discussion Workshop Validation In-depth Tools open discussion (installation, desired features) Discussion PDS4 Training Discussion Demo
Day 2 Agenda (Full Day)
Theme | Topic | Format
Information Model | LDD Solutions | Discussion
Software Services | Service Overview (status, plan) | Discussion
Software Services | Containerized deployment | Discussion
Software Services | Search In-depth (harvest config, registry & search) | Workshop/Discussion
Software Services | Service open issues | Discussion
Technical | Breakout Session(s) | Discussion/Workshop
Day 3 Agenda (Half Day)
Theme | Topic | Format
DOI | IM Improvements; Process | Discussion
Migration | Node Migration Processes and Tool Support | Discussion
Planning | Future | Discussion
Wrap up | |
Next Steps
• Finalize Agenda
• Finalize Web page
  – https://pdsengineering.jpl.nasa.gov/content/2018_tech_session
• Finalize logistics
Backup
Proposed Site Map
• Home
  – PDS description along with quick links.
  – Slider to highlight latest releases and announcements.
• Nodes
  – Highlight Node resources. A step beyond linking to Node sites.
• Data
  – Develop high-level search page.
  – Current interface becomes advanced search feature.
• Tools
  – Highlight key tools and link to the Tool Registry.
• Documentation
  – Improved organization focusing on PDS4 content
• Support
  – Migrate content from the Resource Catalog popup.