The OpenAIRE Infrastructure: On Measuring Research Impact - The EGI use-case
Paolo Manghi (paolo.manghi@isti.cnr.it), Natalia Manola (natalia@di.uoa.gr)
Outline
• The What and How of OpenAIRE
• Supporting research communities
• Contexts, categories and concepts
• User input
• Results and analytics
• Looking Ahead
Developing the Open Science Commons - Sept 25, Amsterdam
OpenAIRE in a nutshell
European data infrastructure for scholarly communication
• Facilitates discovery of research outcomes across disciplines
• Promotes and implements Open Access
• Interlinks and contextualizes research outcomes
• Integrates publication, data and software repositories and CRIS systems
• Monitors research outputs and measures research impact
  • Open Access policy evaluation
  • Funding schemes: return on investment through impact
  • Research initiatives: research impact
• Provides both the human and the technical infrastructure to make this possible!
[Figure: OpenAIRE services overview. Data providers (institutional and thematic publication repositories, Open Access journals, data repositories, data journals, institutional CRIS systems, national funding sources, and the CERN/OpenAIRE "catch-all" repository) supply metadata, usage data and PDFs under the guidelines for data interoperability. OpenAIRE de-duplicates, enriches, text-mines, links and classifies publications, datasets, authors, projects, organizations and data providers. Services shown: deposit of publications and data, support (NOADs), visualization and management of enhanced publications, curation and collaboration, search and browse, research impact (citations, usage statistics), linked content, statistics services for project coordinators and funders, and APIs. Figures shown: 8,700,000 OA publications, 460 validated repositories.]
Added Value: Integrated Scientific Information System
Entities: Datasets, Authors, Publications, Data Providers, Projects, Organizations
• 8.7 million publications
• 7 million authors
• 460+ data providers
• 90K publications linked to projects
• 2 funders
• 700 datasets linked to publications
• 33K organizations
• 2731 publications linked to EGI Research Communities
BEHIND THE SCENES
Internal data flow
[Figure: internal data flow. Components shown: Harvesting and Data source import, End-user claims, Native Information Space, De-duplication, Public Information Space, Off-line Human Data Curation, Inferring (Data Inference), Enriched Information Space, and the OpenAIRE Portal (Discovery & Impact measure).]
RESEARCH ANALYTICS
Monitoring OA policy: Research Output Measures
• FP7: 66K publications, 7.5K projects
• FP7 timeline (total)
• FP7 breakdowns
Classification
Text mining - Supervised techniques
Beyond the Obvious
Text mining - Unsupervised techniques (topic modeling)
Example 1: FP7 programmes connected through scientific publications
• Research trends
• Structural effects
• Interactive graphs
• Providing overview
Example 2: How FP7 programme areas are related
EGI & OPENAIRE
• 1-year pilot ended in May 2014
• Official service release: Oct 2014 at www.openaire.eu
Supporting communities
• Enriched OpenAIRE data model
  • Context (e.g. "EGI")
  • Category ("Virtual Organizations")
  • Concept ("alice")
• Text mining algorithms tailored to community needs, integrated into the OpenAIRE text mining framework
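The context, category and concept hierarchy can be pictured as a small nested structure. The following is only an illustrative sketch: the class names, the `paths` helper and the " / " path format are assumptions made for this example and are not taken from the OpenAIRE data model itself.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    name: str  # e.g. "alice"


@dataclass
class Category:
    name: str  # e.g. "Virtual Organizations"
    concepts: list[Concept] = field(default_factory=list)


@dataclass
class Context:
    name: str  # e.g. "EGI"
    categories: list[Category] = field(default_factory=list)

    def paths(self):
        """Yield one 'context / category / concept' string per concept."""
        for category in self.categories:
            for concept in category.concepts:
                yield f"{self.name} / {category.name} / {concept.name}"


# Hypothetical instance built from the names used on the slide.
egi = Context(
    "EGI",
    [Category("Virtual Organizations", [Concept("alice"), Concept("lhcb")])],
)

for path in egi.paths():
    print(path)
```

A publication tagged with such a path can then be grouped at any of the three levels (all of EGI, all Virtual Organizations, or a single VO).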
What OpenAIRE does behind the scenes
• Extract the full text from publications
  • if the text is structured, use the "funding" and "acknowledgements" fields
• Scan the text for matches against any of the EGI organization names provided
• For each match, search the surrounding context for
  • general terms and suggested acknowledgements (using word pairs), to add a confidence value to the match and eliminate false matches
• For EC projects, search not only for the project acronym (e.g. EGI-InSPIRE) but also for the grant ID (261323)
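The steps above can be sketched roughly as follows. This is a simplified illustration, not the OpenAIRE implementation: the word-pair list, the window size, the scoring formula and the threshold are all assumptions chosen for the example; only the acronyms and grant IDs come from the slides.

```python
import re

# Acronyms and grant IDs as given on the slides.
GRANT_IDS = {"EGI-InSPIRE": "261323", "WeNMR": "261572"}

# Hypothetical word pairs that suggest an acknowledgement context.
SUPPORT_PAIRS = {
    ("acknowledge", "the"),
    ("supported", "by"),
    ("provided", "by"),
    ("funded", "by"),
    ("use", "of"),
}


def find_matches(text, window=10, threshold=0.5):
    """Return {organization: confidence} for matches above the threshold.

    A match is an occurrence of the acronym or the grant ID. Its confidence
    grows with the number of known word pairs found within `window` words
    on either side (assumed scoring: two pairs give full confidence).
    """
    words = [w.lower() for w in re.findall(r"[\w-]+", text)]
    best = {}
    for i, word in enumerate(words):
        for org, grant_id in GRANT_IDS.items():
            if word not in (org.lower(), grant_id):
                continue
            context = words[max(0, i - window): i + window + 1]
            pairs = set(zip(context, context[1:]))
            confidence = min(1.0, len(pairs & SUPPORT_PAIRS) / 2)
            if confidence >= threshold:
                best[org] = max(best.get(org, 0.0), confidence)
    return best


acknowledged = (
    "The authors acknowledge the use of resources provided by the European "
    "Grid Infrastructure through EGI-InSPIRE (grant 261323)."
)
print(find_matches(acknowledged))
print(find_matches("We visited the EGI-InSPIRE booth."))
```

The second call shows why the context check matters: the acronym appears, but nothing around it looks like an acknowledgement, so the match is dropped as a likely false positive.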
How to identify EGI
Text mining on PDFs from repositories and on publisher metadata identifies publications associated with EGI in the following ways:
• Associated with EGI projects
  • Publication "enabledBy EGI: XYZ"
  • Publication "supportedBy EGI: XYZ"
• Associated with a certain Virtual Organisation (VO) or National Grid Infrastructure (NGI)
  • Publication "used EGI"
  • Publication "used NGI: XYZ"
  • Publication "producedBy VO: XYZ"
• Associated with a certain EGI scientific discipline
  • Publication "related to EGI Scientific Discipline: XYZ"
What the EGI community should do
STEP 1: Use a proper acknowledgement in the publication

Organisation Name: WeNMR
Type: EC Project
Grant ID: 261572
Suggested Acknowledgement: "The WeNMR project (European FP7 e-Infrastructure grant, contract no. 261572, www.wenmr.eu), supported by the European Grid Initiative (EGI) through the national GRID Initiatives of Belgium, France, Italy, Germany, the Netherlands (via the Dutch BiG Grid project), Portugal, Spain, UK, South Africa, Taiwan and the Latin America GRID infrastructure via the Gisela project is acknowledged for the use of web portals, computing and storage facilities." and the following article describing the WeNMR portals should be cited: Wassenaar et al. (2012). WeNMR: Structural Biology on the Grid. J. Grid. Comp., 10: 743-767.

Organisation Name: EGI-InSPIRE
Type: EC Project
Grant ID: 261323
Suggested Acknowledgement: The authors acknowledge the use of resources provided by the European Grid Infrastructure. For more information, please reference the EGI-InSPIRE paper (http://go.egi.eu/pdnon).

Organisation Name: ALICE
Type: VO
Grant ID: n/a
Suggested Acknowledgement: The ALICE collaboration gratefully acknowledges the resources and support provided by all Grid centres and the Worldwide LHC Computing Grid (WLCG) collaboration.

Organisation Name: LHCb
Type: VO
Grant ID: n/a
Suggested Acknowledgement: The Tier 1 computing centres are supported by IN2P3 (France), KIT and BMBF (Germany), INFN (Italy), NWO and SURF (The Netherlands), PIC (Spain), GridPP (United Kingdom). We are thankful for the computing resources put at our disposal by Yandex LLC (Russia), as well as to the communities behind the multiple open source software packages that we depend on.

Organisation Name: NGI:PT
Type: NGI
Grant ID: n/a
Suggested Acknowledgement: This work makes use of results produced with the support of the Portuguese National Grid Initiative. More information in https://wiki.ncg.ingrid.pt
What the EGI community should do
STEP 2
• Option 1: follow the OpenAIRE guides
  • Publish in an OA journal or deposit in an OA repository, preferably one compatible with the OpenAIRE 2.0+ guidelines (i.e., with a link to funding)
• Option 2: use the OpenAIRE portal "claiming" service to associate
  • any publication (within OpenAIRE or not) to EGI
  • results to additional EGI information: VO, classification, relationship
User Input
What does it look like
Aggregated statistics
Lessons learned & Best practices
• Mandates on how to write acknowledgements are crucial but often missing.
• Collect beforehand as much information as you can that may help with the mining. Even information you do not expect to help may prove useful in the end.
• Clean and normalize your input data (character encoding, stop-word removal, character case, special characters, etc.).
• Design your data mining methods to be very tolerant. In our case, the suggested acknowledgements never appeared in the input texts exactly as given.
• Curate the results manually to tune your data mining methods. Yes, it is very labor intensive, but without it you will be blind to your mistakes.
• Design and implement your data processing methods to work in a streamed fashion and to be performant. Streamed design solves the "data bigger than memory" problem; performance design solves the "having to wait one week for results" problem.
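Three of these practices (normalization, tolerant matching, streamed processing) fit in a short sketch. It is an illustration under stated assumptions, not the method used in the pilot: the stop-word list, the 0.8 threshold, the one-record-per-line input and the use of `difflib.SequenceMatcher` as the similarity measure are all choices made for this example.

```python
import re
import unicodedata
from difflib import SequenceMatcher

# Illustrative stop-word list; a real one would be much longer.
STOP_WORDS = {"the", "of", "by", "and", "a", "for"}


def normalize(text):
    """Strip accents, lower-case, drop punctuation and stop words."""
    decomposed = unicodedata.normalize("NFKD", text)
    no_accents = "".join(c for c in decomposed if not unicodedata.combining(c))
    words = re.findall(r"[a-z0-9]+", no_accents.casefold())
    return " ".join(w for w in words if w not in STOP_WORDS)


def stream_matches(lines, target, threshold=0.8):
    """Lazily yield (line_number, score) for lines resembling `target`.

    `lines` can be any iterable, including an open file, so only one
    line is held in memory at a time.
    """
    wanted = normalize(target)
    for number, line in enumerate(lines, start=1):
        score = SequenceMatcher(None, wanted, normalize(line)).ratio()
        if score >= threshold:
            yield number, score


suggested = "supported by the Portuguese National Grid Initiative"
sample = [
    "Supported by the Português National GRID initiative.",
    "We thank our colleagues for discussions.",
]
for number, score in stream_matches(sample, suggested):
    print(number, round(score, 2))
```

Because `stream_matches` is a generator, passing it an open file handle instead of a list processes arbitrarily large inputs without loading them fully, which is the "data bigger than memory" point above.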
Roadmap
• Release
  • Results of inference visible from the portal
  • Claim user interfaces available from the portal
• Plan
  • Production release: ready by 1st of October 2013
  • Add more communities (e.g., FET)
Thank you!
Looking forward to your questions and feedback
www.openaire.eu
@openaire_eu
facebook.com/groups/openaire
linkedin.com/groups/OpenAIRE-3893548
paolo.manghi@isti.cnr.it