Paolo Manghi Institute of Information Science and Technologies

Research data discovery in OpenAIRE
Paolo Manghi, Institute of Information Science and Technologies - CNR
@openaire_eu

• Populating the OpenAIRE scholarly communication graph
• Searching over the OpenAIRE graph
OpenAIRE - EOSC Hub - EC meeting | Amsterdam | 15th Dec 2017

The OpenAIRE Graph

Building the graph and Dashboards
[Diagram: labels recoverable from the slide]
• Content providers: publications repositories, data repositories, registries, software repositories, OA journals, CRIS systems
• Acquisition: harvesting, uploading
• Policies: GUIDELINES, TERMS OF USE
• Info Space Services: validation, cleaning, de-duplication, inference, brokering
• Graph entities: Funder, Funding, Organization, Project, Community, Result, Publication, Data Source, Software, ORP
• OpenAIRE Dashboards: research communities, researchers (all), content providers, innovators, research managers, funders, Research Infras

OpenAIRE Data Model and Flows
[Diagram: labels recoverable from the slide]
• Entities: Funder, Funding, Organization, Project, Community, Source, Result
• Result types: Publication, Research Data, Software, Other res. products
• Flows: deposition, mining, harvesting

The OpenAIRE scholarly communication graph
Building and maintaining an open metadata scholarly communication graph of interlinked scientific products, in turn linked to Open Access information, funding information and community views
• Complete
• De-duplicated
• Participatory

Content Acquisition Policy
ALL Literature, Research data, Software, Other research products
• Respecting the OpenAIRE guidelines (DataCite metadata)
• Using PIDs with resolvers
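
As a rough illustration of the policy above, the sketch below checks whether a metadata record would qualify for acquisition. It assumes that both conditions apply to every product type, which the slide does not state explicitly. The `Record` fields, the set of PID schemes treated as resolvable, and the function name are hypothetical and are not taken from the OpenAIRE guidelines.

```python
from dataclasses import dataclass
from typing import Optional

# Product types named on the slide.
ACCEPTED_TYPES = {"literature", "research data", "software", "other research product"}

# Assumption: PID schemes treated here as having a public resolver.
RESOLVABLE_SCHEMES = {"doi", "handle", "ark"}


@dataclass
class Record:
    """Hypothetical, simplified metadata record."""
    type: str
    pid_scheme: Optional[str] = None
    pid_value: Optional[str] = None


def is_acceptable(record: Record) -> bool:
    """Return True if the record has an accepted type and a resolvable PID."""
    has_type = record.type.lower() in ACCEPTED_TYPES
    has_pid = (
        record.pid_scheme is not None
        and record.pid_scheme.lower() in RESOLVABLE_SCHEMES
        and bool(record.pid_value)
    )
    return has_type and has_pid
```

A record without a PID, or with a type outside the four listed classes, is rejected; checking conformance to the guidelines themselves (the DataCite metadata fields) is left out of this sketch.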

Harvesting: Revised Classification of Research Products
Publications
• Article
• Preprint
• Report
• …
Datasets
• Dataset
• Collection
• Clinical Trials
• …
Software
• Research Software
• …
Other Research Products
• Service
• Workflow
• Interactive Resource
• …
Sources: Institutional/publication repositories, Journals/publishers, Data repositories, Software repositories, Other Products repositories
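
One way to picture the classification is as a lookup from a harvested subtype to its top-level class. The sketch below is only an illustration: the grouping of subtypes follows one reading of the slide layout, the elided entries ("…") are deliberately left out, and the names `CLASSIFICATION` and `classify` are hypothetical.

```python
from typing import Optional

# Subtypes listed on the slide, grouped by top-level class (assumed grouping).
CLASSIFICATION = {
    "Publications": ["Article", "Preprint", "Report"],
    "Datasets": ["Dataset", "Collection", "Clinical Trials"],
    "Software": ["Research Software"],
    "Other Research Products": ["Service", "Workflow", "Interactive Resource"],
}

# Reverse index: lowercase subtype -> top-level class.
_SUBTYPE_TO_CLASS = {
    subtype.lower(): top_level
    for top_level, subtypes in CLASSIFICATION.items()
    for subtype in subtypes
}


def classify(subtype: str) -> Optional[str]:
    """Return the top-level class for a subtype, or None if it is not listed."""
    return _SUBTYPE_TO_CLASS.get(subtype.strip().lower())
```

Returning `None` for unknown subtypes keeps the decision about unlisted product types outside this helper.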

Content acquisition policy transition: from Oct 2018 to November 2018
[Charts: four panels (Literature, Research Data, Software, Other research products) comparing record counts in October 2018 and November 2018. Callouts recoverable from the slide: 100+Mi, 10+Mi, 40 Mi links, 80+K.]

Exploring the graph

Discovery of data in OpenAIRE
• Search, browse, claim, and interlink products
• Navigation between interlinked objects

Search plans in OpenAIRE
Data maturity-driven search
• Search datasets used in at least K papers
Community-driven search
• Search for data in a community or used cross-community
Search beyond dataset metadata
• From dataset files or from related entities (publications, project)
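
The maturity-driven search can be sketched as a filter over publication-to-dataset links. The example below is a minimal illustration, assuming the links are available as `(publication_id, dataset_id)` pairs; the function name and the identifiers are hypothetical and do not reflect an OpenAIRE API.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple


def datasets_used_in_at_least(links: Iterable[Tuple[str, str]], k: int) -> List[str]:
    """Return the datasets linked from at least k distinct publications."""
    citing: Dict[str, Set[str]] = defaultdict(set)
    for publication_id, dataset_id in links:
        # A set ignores repeated links between the same pair.
        citing[dataset_id].add(publication_id)
    return sorted(d for d, pubs in citing.items() if len(pubs) >= k)


# Hypothetical links: d1 is used by two papers, d2 by one.
example_links = [("p1", "d1"), ("p2", "d1"), ("p2", "d1"), ("p3", "d2")]
print(datasets_used_in_at_least(example_links, 2))
```

Counting distinct publications rather than raw links keeps duplicated link records, which a de-duplicated graph should not contain anyway, from inflating the usage count.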

General challenges raised by experience
Low quality metadata
• Scientists should take metadata curation and interlinking with other scientific products seriously
• Systems should be prepared to add new metadata/link information to existing depositions, to reflect the evolution of the domain
Metadata for citation vs. metadata for reuse within or across disciplines
• Datasets have different descriptions, driven by the intended usage, which in turn determine the possible searches
Varying granularity among communities
• Communities should adopt a granularity level adequate to the intended discovery

Thank you!
Paolo Manghi
paolo.manghi@isti.cnr.it