DESIGN AND IMPLEMENTATION OF THE AUSTRALIAN NATIONAL DATA SERVICE

- Slides: 25

DESIGN AND IMPLEMENTATION OF THE AUSTRALIAN NATIONAL DATA SERVICE
Andrew Treloar
Director, ANDS Establishment Project

Outline
§ Context
§ Establishment
§ Structure
§ Future

Context

Australian Code for the Responsible Conduct of Research
§ Describes the responsibilities of institutions and researchers, including in management of research data and primary materials
§ Joint initiative of Universities Australia, ARC, NHMRC
§ Institutions are to:
  § retain research data and primary materials
  § provide secure research data storage and record-keeping facilities
  § identify ownership of research data and primary materials
  § ensure security and confidentiality of research data and primary materials
§ Researchers are to:
  § retain research data and primary materials
  § manage storage of research data and record-keeping facilities
  § maintain confidentiality of research data and primary materials
http://www.nhmrc.gov.au/publications/synopses/r39syn.htm

National Collaborative Research Infrastructure Strategy
$542 M** over the five years 2007-2011
• Evolving bio-molecular platforms and informatics
• Integrated biological systems
• Characterisation
• Fabrication
• Biotechnology products
• Optical and radio astronomy
• Integrated marine capability
• Structure and evolution of the Australian continent
• Networked biosecurity framework
• Population health and clinical data linkage
• Terrestrial ecosystem research network
**Note: scaled to EU or US economies, this is analogous to 1 B USD per annum

Platforms for Collaboration: Major Investments 2007-2011
[Diagram; area labels: Compute, Data, Interoperation, Access]
§ NCI - $26 M: capability computing, advanced models
§ ANDS - $24 M: the data commons, data federations
§ AAF+AREN - $6 M: research connectivity, seamless reach
§ ARCS - $20 M: collaboration services, research workflows
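The four figures on this slide can be added up to see how the investment is split between the programs. The short Python sketch below does only that arithmetic; it assumes the four amounts listed here are the complete set for this slide, which the source does not state, and the variable names are illustrative.

```python
# Amounts in millions of dollars, copied from the slide above.
# Assumption: these four items are the full list for this slide.
investments_m = {
    "NCI": 26,
    "ANDS": 24,
    "AAF+AREN": 6,
    "ARCS": 20,
}

# Combined figure for the four listed programs.
total_m = sum(investments_m.values())

# Each program's share of that combined figure, as a percentage.
shares_pct = {
    name: 100 * amount / total_m for name, amount in investments_m.items()
}

for name, amount in investments_m.items():
    print(f"{name}: ${amount} M ({shares_pct[name]:.1f}%)")
print(f"Total: ${total_m} M")
```

On these figures, the data program (ANDS) and the compute program (NCI) together account for roughly two thirds of the listed investment.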

Establishment

The ANDS Blueprint
§ Towards the Australian Data Commons (TADC)
§ Developed during 2007 by ANDS Technical Working Group
§ Mapped out coherent vision of what needs to be done in the data space
§ Available at http://www.pfc.org.au/bin/view/Main/Data

TADC: Why Data? Why Now?
§ We are in an era of increasingly data-intensive research
§ Almost all data is now born digital
§ Increasing amount of data generated (semi-)automatically
§ “Consequently, increasing effort and therefore funding will necessarily be diverted to data and data management over time” (Towards the Australian Data Commons (TADC), p. 4)

TADC: Need for standardisation
§ Software and hardware keep getting cheaper, wetware keeps getting more expensive
§ Fixing data management problems is enormously labour intensive and costly
§ “Consequently, standardisation within forms of data and simplification in the frameworks around retention, storage, access and use of data, and the elimination of differences whose resolution requires labour, must be made, if the on-going keeping and reuse of data is to remain affordable” (TADC, p. 5)

TADC: Role of data federations
§ With more data online, more can be done
§ It is now possible to answer questions unrelated to the reasons the data was originally collected
§ Increasing focus on cross-disciplinary science
§ “Consequently greater clarity is needed over control and access to community-funded data, and the means of aggregating, federating and accessing such data are increasingly important” (TADC, p. 5)

The ANDS Vision
§ “As a vision, ANDS sets out to transform the disparate collections of research data around Australia into a cohesive corpus of research resources.” (TADC, p. 5)

ANDS Assumptions
§ ANDS doesn’t have enough money to fund storage
§ Thus it is predicated on institutionally-supported solutions (Australia lacks discipline-specific data centres)
§ ANDS aims to leverage existing activity, and coordinate/fund new activity
§ ANDS will only start to build the Australian Research Data Commons
§ ANDS governance and management arrangements are sized for the current funding

ANDS Establishment Project
§ Spent first eight months of 2008 doing detailed planning, consulting and consensus building to turn vision into reality
§ Generated the first of a series of annual business plans
§ http://ands.org.au/andsinterimbusinessplan-final.pdf

Structure

Realising the Vision

ANDS Delivery Structure
§ ANDS has been structured as four inter-related and co-ordinated service delivery programs:
  § Developing Frameworks
  § Providing Utilities
  § Seeding the Commons
  § Building Capabilities
§ Plus candidate service development activities funded through National eResearch Architecture Taskforce projects

Developing Frameworks
§ Influencing relevant national policies
§ Building common understanding of data management issues and solutions across government, research funding agencies, and research-intensive organisations
§ Encouraging moves in favour of discipline-acceptable default sharing practices
§ Largely centralised, with some specialised outsourcing

Providing Utilities
§ Building and delivering national technical services to support the data commons
§ Examples:
  § Discovery (see poster #16!)
  § Persistent identifier minting and management
  § Collections registry to underpin discovery
§ Outsourced delivery and insourced technical framework development
§ Providing capability within ANDS for integration of existing systems into Australian Data Commons
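To make the utilities named above more concrete, here is a minimal Python sketch of how persistent identifier minting and a collections registry could fit together. It is an illustration only, not the ANDS services or their API: the class names, the `example-pid` prefix, the `example.org` URLs and the keyword-matching discovery are all assumptions introduced for this example.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class CollectionRecord:
    """One registry entry (hypothetical structure, not an ANDS schema)."""
    identifier: str  # persistent identifier; never changes once minted
    title: str
    location: str    # current URL of the collection; may change over time
    keywords: list = field(default_factory=list)


class CollectionsRegistry:
    """Toy in-memory registry that mints identifiers and supports discovery."""

    def __init__(self, prefix="example-pid"):
        self._prefix = prefix  # hypothetical identifier namespace
        self._records = {}

    def mint(self, title, location, keywords=()):
        """Create a new identifier and register the collection under it."""
        identifier = f"{self._prefix}/{uuid.uuid4().hex}"
        self._records[identifier] = CollectionRecord(
            identifier, title, location, list(keywords)
        )
        return identifier

    def update_location(self, identifier, new_location):
        """Point an existing identifier at a new location (the 'management' part)."""
        self._records[identifier].location = new_location

    def resolve(self, identifier):
        """Return the current location for an identifier."""
        return self._records[identifier].location

    def discover(self, term):
        """Find records whose title contains the term or whose keywords match it."""
        needle = term.lower()
        return [
            record
            for record in self._records.values()
            if needle in record.title.lower()
            or any(needle == keyword.lower() for keyword in record.keywords)
        ]


registry = CollectionsRegistry()
pid = registry.mint(
    "Reef temperature survey", "http://repo.example.org/reef", ["marine"]
)
# The collection moves; the identifier stays the same.
registry.update_location(pid, "http://archive.example.org/reef")
print(registry.resolve(pid))
print([record.title for record in registry.discover("marine")])
```

The point of the sketch is the separation of concerns: the identifier is stable while the location behind it can change, and discovery works from the registry's descriptions rather than from the data itself.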

Seeding the Commons
§ In targeted areas (because not enough resource to do everything), working to improve:
  § fabric for data management
  § amount of content
  § state of data capture and management
§ Plus, opportunistic content recruitment in year 1
§ Selection process to identify targets
§ Placement of ANDS-funded staff, together with co-investment

Building Capabilities
§ Improving level of capability for research data management and research access to data
§ Train-the-trainer model
§ Focussing both on early career researchers and research IT support staff
§ Building community around data management concerns
§ Largely distributed

Future

Australian Strategic Roadmap Review
§ Data Storage (p. 21)
  § National data-fabric, based on institutional nodes
§ Shared Data (p. 22)
  § More ANDS
§ Coordination Component (p. 23)
  § Integration of eResearch activities
§ Inclusion of data itself as collaborative research infrastructure (p. 9)
§ Expertise as an enabling infrastructure (p. 23)
§ Addition of humanities and social sciences
§ http://www.innovation.gov.au/ScienceAndResearch/Documents/Strategic%20Roadmap%20Aug%202008.pdf
§ Not yet funded…

What do we want?
§ More researchers re-using more data more often
§ So we need to:
  § lower the costs and raise the benefits
  § have data be seen as a first class research output
  § drive culture change in all the roles
  § build partnerships between those with responsibilities

Questions?
andrew.treloar@ands.org.au
http://andrew.treloar.net/
ross.wilkinson@ands.org.au
http://ands.org.au/