Everything Counts in Large Amounts: Measuring the Impact of Usage Activity in Open Access Scholarly Environments
DI4R, December 2017, Brussels
Dimitris Pierrakos, Athena Research Center
Jochen Schirrwagen, Bielefeld University
@openaire_eu

Overview
● OpenAIRE infrastructure and Usage Statistics Service.
● Usage Data Collection strategies.
● Using Piwik for tracking and analytics.
● Applying COUNTER rules.
● Metrics in the Repository Manager Dashboard.
● Relation to Open Metrics and Next Generation Metrics.

OpenAIRE2020
• A pan-European Research Information platform to monitor OA research outcomes from EC and other national funders.
• Research analytics tools to promote new scientific metrics & support evidence-based decision-making.
• Implementation of an OpenAIRE usage statistics service for usage data collected from data providers.

Usage Statistics in OpenAIRE
● Task in OpenAIRE2020 covers:
  ○ aligning policies and standards for gathering and sharing of usage data -> guidelines
  ○ considering legal aspects (data protection / data privacy)
  ○ relating usage statistics to other kinds of metrics
  ○ collecting and processing of usage data and producing consolidated, standards-based usage statistics
● Task team: Athena Research Center, University of Bielefeld, University of Minho, Jisc IRUS-UK, Couperin + NOADs

Usage Statistics in the OpenAIRE Infrastructure
● OpenAIRE collects ~21 million documents from 980 compatible data providers
● currently 32 active data providers participating in Usage statistics + IRUS-UK
● Usage statistics deployment under CC0
  ○ in OpenAIRE dashboard, portal and API.

Usage Statistics Service Features
● Tracking of views and downloads / collecting COUNTER reports
  ○ Push or Pull collection workflows.
● Anonymisation of IP-addresses.
● Metadata de-duplication enables accumulation of views and downloads for same documents
● COUNTER Code of Practice compatibility.
  ○ standards based usage statistics.
  ○ enables comparability with statistics from other data sources.
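The accumulation enabled by metadata de-duplication can be pictured with a small Python sketch. It assumes a hypothetical mapping from each repository-level record ID to a de-duplicated representative ID; the identifiers, the function name and the data are illustrative and do not reflect OpenAIRE's actual schema.

```python
from collections import defaultdict

def accumulate_usage(events, dedup_map):
    """Sum views and downloads per de-duplicated record.

    events: iterable of (record_id, views, downloads) tuples.
    dedup_map: hypothetical mapping record_id -> representative ID;
               records absent from the map represent themselves.
    """
    totals = defaultdict(lambda: {"views": 0, "downloads": 0})
    for record_id, views, downloads in events:
        representative = dedup_map.get(record_id, record_id)
        totals[representative]["views"] += views
        totals[representative]["downloads"] += downloads
    return dict(totals)

# Illustrative data: the same article held by two repositories,
# plus one record that has no duplicate.
events = [
    ("repoA:oai:123", 10, 4),
    ("repoB:oai:987", 5, 1),
    ("repoC:oai:555", 2, 0),
]
dedup_map = {"repoA:oai:123": "dedup::1", "repoB:oai:987": "dedup::1"}
totals = accumulate_usage(events, dedup_map)
```

With this input the two copies of the article contribute to a single combined count, while the unmatched record keeps its own.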

Piwik Analytics platform
• World's leading open-source analytics platform.
• Valuable insights into website traffic and visitor activity.
• Piwik collects and stores PII (personally identifiable information).
• Keeps full data ownership and can control who has access.
• Robot filtering plugin.
• Compliant with EU regulations.
• Recommended by privacy organizations such as ULD (Germany) and CNIL (France).

Piwik Facts: Piwik vs. Google Analytics
• Number of hits per month: Piwik unlimited; Google Analytics 10 million
• Number of user accounts per login: Piwik unlimited; Google Analytics 10
• Data storage time: Piwik unlimited; Google Analytics 25 months
• Number of properties (websites, apps etc.) tracked per account: Piwik unlimited; Google Analytics 50
• Custom variables: Piwik 5; Google Analytics 5
• Data export: Piwik unlimited; Google Analytics 5000 rows
• Real-time analytics: Piwik offers real-time web analytics in all of its reports; GA monitors user activity right after it happens, although the period of delay is not explicitly stated.

2-Tiers Collection Workflows for Usage Statistics
[Workflow diagram. Labels: Repository, CRIS, eJournal; National Statistics Node, Publisher; IP-Anonym.; PUSH (tracked event); PULL (COUNTER Report); Metadata-Index; processing script; UsageStatistics-DB]

Tier-1: Push Usage Statistics Tracking Workflow
• An institutional repository is registered in Piwik.
• Server side tracking: Plugins (DSpace) or patches (EPrints) using Piwik’s HTTP API.
• Usage Activity is tracked and logged at Piwik platform in real time.
• Information is transferred offline, using Piwik’s API, to OpenAIRE’s DBs for statistical analysis.
• Statistics are deployed via OpenAIRE’s Portal or SUSHI-Lite API.
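As a rough illustration of the server-side push step, the sketch below builds a request URL for Piwik's Tracking HTTP API using the documented query parameters idsite, rec, url, action_name, ua, urlref and download. The host names, site ID and helper name are assumptions made for this example; the OpenAIRE plugins may assemble their requests differently.

```python
from urllib.parse import urlencode

def build_tracking_url(piwik_base, site_id, item_url, action_name,
                       user_agent, referrer="", is_download=False):
    """Build a Piwik Tracking HTTP API request URL for one usage event.

    piwik_base is a placeholder for the base URL of a Piwik installation.
    """
    params = {
        "idsite": site_id,           # ID under which the repository is registered
        "rec": 1,                    # required for the hit to be recorded
        "url": item_url,             # URL of the requested item
        "action_name": action_name,  # human-readable title of the action
        "ua": user_agent,            # visitor's User-Agent, passed on by the server
        "urlref": referrer,          # referrer URL, may be empty
    }
    if is_download:
        params["download"] = item_url  # marks the event as a download
    return piwik_base.rstrip("/") + "/piwik.php?" + urlencode(params)

# Hypothetical repository item view (all values are placeholders).
tracking_url = build_tracking_url(
    "https://piwik.example.org/",
    42,
    "https://repo.example.org/handle/123/456",
    "View item",
    "Mozilla/5.0",
)
```

The sketch only constructs the URL; a plugin would then issue the HTTP request, for instance with `urllib.request.urlopen`, and should not block the repository page if the tracker is unreachable.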

Tier-1: Piwik Tracking Parameters
Parameter: Description
• idSite: the ID of the repository
• idVisit: a visitor/session ID (an 8 byte binary string)
• visitIP: the IP address of the visitor (optionally anonymized)
• action: the action performed (view, download, outlink, etc.)
• url: the URL of the requested item
• timestamp: the date & time of the request
• OAI-PMH Identifier: the Open Archives Initiative identifier of the item being viewed/downloaded
• agent: the Web browser and the operating system of the visitor
• referrer: the URL linked to the item requested

Data Protection Aspects
● Usage events can be considered privacy-sensitive information (user-agent, ip-address, ...)
● Usage statistics services must comply with data protection laws and regulations for both usage data- and service-providers
  ○ but legal situation differs between the countries
  ○ OpenAIRE must comply with the EU General Data Protection Regulation
● Tracking plugins issued by OpenAIRE anonymize usage data already on the client-side
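One common form of client-side anonymisation is to zero the trailing bytes of the visitor's IP address before the event leaves the repository. The sketch below shows that idea for IPv4 only; the number of masked bytes and the function name are assumptions, not a description of what the OpenAIRE plugins actually do.

```python
import ipaddress

def anonymize_ip(ip, masked_bytes=2):
    """Zero the last `masked_bytes` bytes of an IPv4 address.

    IPv6 is deliberately left out of this sketch.
    """
    if not 0 <= masked_bytes <= 4:
        raise ValueError("masked_bytes must be between 0 and 4")
    address = ipaddress.IPv4Address(ip)
    # Keep the leading bytes, clear the trailing ones.
    mask = (0xFFFFFFFF << (8 * masked_bytes)) & 0xFFFFFFFF
    return str(ipaddress.IPv4Address(int(address) & mask))

masked = anonymize_ip("192.168.23.45")        # two bytes masked
lightly_masked = anonymize_ip("192.168.23.45", masked_bytes=1)
```

Masking more bytes gives stronger privacy protection at the cost of coarser session and geolocation data, so the value is a policy choice rather than a technical one.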

Usage Activity in real time

Real time Visitor Map

Cleaning and Consolidation
• Applying data processing rules according to COUNTER Code of Practice:
  • i.e. counting requests depending on session duration, tracing double-clicks
• Bot filtering
  • Piwik Bot Plugin
  • COUNTER Robots Working Group
• Link of usage event with metadata record in OpenAIRE
• Accumulate views and counts of de-duplicated records
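The double-click rule can be sketched as follows. The 30-second window, the event tuple layout and the choice to keep the later of two clicks are assumptions for illustration; the exact thresholds should be taken from the COUNTER Code of Practice release in use.

```python
def filter_double_clicks(events, window_seconds=30):
    """Collapse repeated requests for the same item by the same visitor.

    events: iterable of (visitor_id, item_id, action, timestamp) tuples,
            with timestamp in seconds. A request made within window_seconds
            of the previous request by the same visitor for the same item and
            action is treated as a double click; only the later one is kept.
    """
    kept = []
    last_index = {}  # (visitor, item, action) -> position in `kept`
    for visitor, item, action, ts in sorted(events, key=lambda e: e[3]):
        key = (visitor, item, action)
        idx = last_index.get(key)
        if idx is not None and ts - kept[idx][3] <= window_seconds:
            kept[idx] = (visitor, item, action, ts)  # replace the earlier click
        else:
            last_index[key] = len(kept)
            kept.append((visitor, item, action, ts))
    return kept

# Illustrative events: visitor v1 clicks the same item twice within 10 seconds.
raw_events = [
    ("v1", "item1", "view", 0),
    ("v1", "item1", "view", 10),
    ("v1", "item1", "view", 100),
    ("v2", "item1", "view", 5),
    ("v1", "item1", "download", 12),
]
cleaned = filter_double_clicks(raw_events)
```

Views and downloads are tracked under separate keys, so a download shortly after a view of the same item is not discarded.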

Repository Pilot Statistics

Tier-2: Collecting (Pull) Consolidated Usage Statistics Reports
• Gathering of consolidated statistics reports from aggregation services, such as IRUS-UK, using protocols such as SUSHI-Lite.
• Statistics are stored in OpenAIRE’s DB for statistical analysis.
• Statistics are deployed via OpenAIRE’s Portal or SUSHI-Lite API.

OpenAIRE Usage Statistics DB
[Database figure. Legible labels: entityId/orid, source]

OpenAIRE Repository Manager Dashboard
● Four steps to join OpenAIRE Usage Statistics
  1. Download.
  2. Configure.
  3. Deploy.
  4. Validate (by OpenAIRE).
● Or enter SUSHI endpoint to let OpenAIRE collect COUNTER reports

Content Provider Dashboard Start Page

Content Manager’s Datasource selection for Metrics

Enable Metrics for selected Datasource

Configure Metrics for selected Datasource

Summarized Usage Statistics on the content provider level

Usage Statistics on the Article Level

SUSHI-Lite Interface
● Available as beta with the help of IRUS-UK
  ○ http://beta.services.openaire.eu/usagestats/sushilite/
● Supports COUNTER R4 compatible reports:
  ○ Article Reports (AR) and Book Reports (BR) using identifiers like openaire, doi, oai-record-id
  ○ Item Reports (IR)
  ○ Repository Reports (RR) using identifiers issued by OpenAIRE or OpenDOAR
  ○ Journal Reports (JR) using identifiers like ISSN
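A report request against the beta endpoint might be assembled as in the sketch below. Only the base URL comes from the slide; the `GetReport/` path, the query parameter names (`Report`, `Release`, `BeginDate`, `EndDate`, `ItemIdentifier`, `RepositoryIdentifier`), the report code and the DOI are assumptions modelled on common SUSHI-Lite conventions and should be checked against the service documentation.

```python
from urllib.parse import urlencode

SUSHI_LITE_BASE = "http://beta.services.openaire.eu/usagestats/sushilite/"

def build_report_url(report, begin_date, end_date, item_identifier=None,
                     repository_identifier=None, release=4):
    """Build a SUSHI-Lite style report request URL.

    The path and parameter names are assumed, not taken from the slides.
    """
    params = {
        "Report": report,          # e.g. an Article Report code
        "Release": release,        # COUNTER release of the requested report
        "BeginDate": begin_date,   # start of the reporting period (YYYY-MM)
        "EndDate": end_date,       # end of the reporting period (YYYY-MM)
    }
    if item_identifier is not None:
        params["ItemIdentifier"] = item_identifier
    if repository_identifier is not None:
        params["RepositoryIdentifier"] = repository_identifier
    return SUSHI_LITE_BASE + "GetReport/?" + urlencode(params)

# Hypothetical request for an article report identified by a placeholder DOI.
report_url = build_report_url(
    "AR1", "2017-01", "2017-06", item_identifier="doi:10.1234/example"
)
```

The response would then be fetched over HTTP and parsed as JSON; that step is omitted here because the response layout is not given in the slides.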

SUSHI response example (JSON): Repository Report, Item Report

OpenAIRE: A Usage Statistics Hub for Responsible Metrics
• Quantitative indicators for research
  • Governance
  • Management
  • Assessment
• Dimensions
  • Robust metrics in terms of accuracy and scope;
  • Humble metrics recognizing that quantitative evaluation should support qualitative, expert assessment;
  • Open and Transparent metrics;
  • Diverse metrics by field in order to support the plurality of research and researcher career paths across the system;
  • Reflexive metrics for recognising, anticipating and updating the systemic and potential effects of indicators.

Considering the HLEG Altmetrics Recommendations
• Standardization: following COUNTER Code of Practice
  • by update to COUNTER R5
  • by contribution to COUNTER Robots Working Group
• Put usage statistics into context with conventional and alternative metrics and (open) peer review

Next Steps
● Develop Piwik plugins for other Repository platforms (e.g. Fedora, Samvera)
● Promote the service to content provider managers
● Support national usage statistics initiatives to become a node in OpenAIRE Usage Statistics
● Contribute to the Open Metrics concept and vision
● Activities in OpenAIRE-Advance starting in 2018:
  ○ support LA Referencia to set up a regional usage statistics network and interlink
  ○ working towards Open Metrics

Collaboration with EOSC-hub
● Standardize usage statistics to enable assessment of research impact
  ○ Standardize usage statistic metrics across OpenAIRE and EOSC-hub
  ○ Collaborate with RDA (e.g. Make Data Count BoF working group)
  ○ Promote common guidelines to and across communities
  ○ Take EC rules and GDPR regulations into account
● Enable the collection/aggregation of usage stats from content providers
  ○ Adopt OpenAIRE and EOSC-hub services for collecting user statistics, services in scope:
    ■ EGI: Accounting System, AppDB
    ■ EUDAT: DPMT, B2SHARE, B2FIND, B2SAFE
  ○ Adopt OpenAIRE Usage Statistics Services to collect user stats for all products of science
    ■ e.g. literature, datasets, software, research objects
    ■ Integrating with EOSC-hub services for usage statistics/metrics

References
● OpenAIRE Usage Statistics Deliverable Report
  ○ https://doi.org/10.5281/zenodo.1034163
● Repository Tracking Plugins (GitHub)
  ○ https://github.com/openaire/OpenAIRE-Piwik-DSpace
  ○ https://github.com/openaire/EPrints-OAPiwik
● SUSHI-Lite API (beta)
  ○ http://beta.services.openaire.eu/usagestats/sushilite/

Q&A
Dimitris Pierrakos, dpierrakos@gmail.com
Jochen Schirrwagen, jochen.schirrwagen@uni-bielefeld.de
